Subtitle Design for 2026: 4 Settings That Decide If Viewers Stay

2026-04-22
Kevin Wong

A friend who makes knowledge-focused YouTube videos sent me a 10-minute test cut last week. The subtitles were auto-generated and the text itself was fine. Three minutes in, he frowned at his own video: one line had 22 Chinese characters crammed into it — the viewer couldn't finish reading before the next cue fired. A few segments later, cues lingered past the cut, stalling the pacing. The content was solid. Retention was going to suffer anyway.

The part of subtitling most people underestimate isn't accuracy. It's the readability settings — how long a cue sits on screen, how long the line is, what font it uses, how big it is. Get those wrong and even a perfectly transcribed video loses viewers. These are the 4 settings I've landed on after a lot of back-and-forth on my own videos and the test footage we run through Subanana's AI subtitle tool. If you make content for a HK bilingual audience — Cantonese-English code-switching, mixed CJK and Latin text — the numbers matter even more, because each language has different width rules.

1. Characters per line: 15 Chinese characters max, 37 English characters max

The most common first mistake is stuffing too much into one line.

City University of Hong Kong's Department of Linguistics and Translation publishes subtitle guidelines that work cleanly for HK bilingual video: Chinese subtitles should show one line at a time, capped at 13–15 Chinese characters per line. English subtitles can go up to two lines, max 37 characters per line (letters, numbers, punctuation, spaces all counted). Break at natural phrase boundaries; never slice a sentence mid-clause.

The logic underneath is simple. The viewer is splitting attention three ways: watching the picture, listening to the audio, reading the subtitle. A line that's too long pushes past what they can scan before the next cue appears, and the comprehension thread breaks.

How to check in practice:

  • Chinese cues: count characters. Over 15 → break the line.
  • English cues: over 37 characters (spaces included) → wrap to a second line. Past two lines → split into two cues.
  • Bilingual cues (Chinese + English stacked): each language follows its own rule on its own line. Don't mix counts.
  • In Subanana's editor, the CPS (characters-per-second) flag marks cues that either cram too many characters into too little time or linger too long with too little text. You don't have to count manually; the editor lights up the problem rows for you.
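If you want to run the same check outside an editor, the counting rules above can be sketched in a few lines of Python. This is a minimal sketch, not Subanana's implementation: the function names are made up for illustration, and it assumes that "Chinese character" means a CJK ideograph in the U+4E00 to U+9FFF range and that any line containing one is a Chinese line.

```python
import re

MAX_ZH = 15  # Chinese characters per line (upper end of the 13-15 guideline)
MAX_EN = 37  # English characters per line, spaces and punctuation included

# Assumption: only CJK Unified Ideographs count toward the Chinese limit.
CJK = re.compile(r"[\u4e00-\u9fff]")


def line_too_long(line: str) -> bool:
    """Return True if one subtitle line exceeds the limit for its language."""
    cjk_count = len(CJK.findall(line))
    if cjk_count > 0:
        # Any line containing ideographs is treated as a Chinese line.
        return cjk_count > MAX_ZH
    # Otherwise count every character, spaces included.
    return len(line) > MAX_EN


def check_cue(text: str) -> list[str]:
    """Return a list of problems for one cue; lines are separated by newlines."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if line_too_long(line):
            problems.append(f"line {number} too long")
    return problems
```

Because each line is judged on its own, a stacked bilingual cue is handled without mixing counts: the Chinese line is checked against 15 and the English line against 37.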

2. Font size: scaled to resolution, not to what looks fine on your monitor

The second setting people miss is font size. Your viewers are watching on phones, tablets, laptops, and the occasional TV. If you size to what reads comfortably on a 27-inch monitor, the same subtitle is a squint on an iPhone and a screen-hog on a TV.

The BBC's subtitle accessibility guide recommends a minimum subtitle height of 44 px for HD video. The general rule is to set font size to between 1/20 and 1/10 of the frame height:

  • 1080p (1080 px tall): font size roughly 54–108 px.
  • 4K (2160 px tall): roughly 108–216 px.
  • Vertical short-form (Reels, Shorts, TikTok): the frame is narrow, so cues need to be shorter. Aim for 8–10 Chinese characters or ~25 English characters per line, and push font size toward the upper end of the 1/20–1/10 range.
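The arithmetic behind those ranges is just two divisions of the frame height. A small sketch, with a hypothetical helper name:

```python
def font_size_range(frame_height_px: int) -> tuple[int, int]:
    """Return (smallest, largest) font size in px for a given frame height.

    Applies the 1/20 to 1/10 of frame height rule of thumb.
    """
    return frame_height_px // 20, frame_height_px // 10


print(font_size_range(1080))  # (54, 108)
print(font_size_range(2160))  # (108, 216)
```

The result is only a starting range; the phone preview is still what decides the final size.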

One habit that saves a lot of re-exports: preview on a phone before exporting. A subtitle that reads fine on your laptop can easily cover half your subject's face in a vertical frame.

3. Font and colour: readability first, decoration later

The subtitle font isn't where you show off design taste. It's text the viewer needs to read in under a second, often on top of a moving image. Skip handwritten, decorative, or overly thin fonts. Pick a font with clean, structurally consistent strokes — sans-serif (黑體) for most use cases, serif (明體/宋體) if the content calls for it.

A few free-for-commercial-use fonts that cover Traditional Chinese cleanly:

On colour, the only thing that matters is contrast:

  • Light backgrounds → dark text (dark grey or black). Avoid pure #000000 — it can fringe under aggressive compression. #1a1a1a holds up better.
  • Dark backgrounds → light text (white or pale yellow), with a semi-transparent black box or outline so the subtitle doesn't wash out when the shot goes bright.
  • Bilingual subtitles — give the two languages different colours so the eye separates them at a glance. A common pattern is white for the primary language and pale yellow for the secondary, stacked two lines.
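If you want a number rather than an eyeball judgement, one option is the contrast-ratio formula from the WCAG 2.x accessibility guidelines. That is a web standard swapped in here for illustration, not something the subtitle guidelines above prescribe, and the function names are hypothetical. The sketch compares two hex colours and returns a ratio between 1 (identical) and 21 (black on white).

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour written as '#rrggbb'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(colour_a: str, colour_b: str) -> float:
    """WCAG contrast ratio between two colours; argument order does not matter."""
    la, lb = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)
```

Swapping pure black for #1a1a1a costs very little by this measure: the ratio against white drops from 21 to a little over 17, still far above the 4.5 that WCAG asks for normal text. A subtitle sits on a moving picture rather than a flat colour, so this only measures the text against its box or outline.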

If you're exporting bilingual subtitles, Subanana writes both languages into a single SRT — Chinese + English (or Chinese + Japanese, Chinese + Korean, etc.) stacked on the same cue. You don't have to hand-merge two files or re-align timecodes afterwards.
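For reference, a stacked bilingual cue in SRT is just two text lines under one timecode. The sketch below builds one by hand; it illustrates the file format only, not Subanana's exporter, and the helper names are made up.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def bilingual_cue(index: int, start: float, end: float,
                  primary: str, secondary: str) -> str:
    """Return one SRT cue with the two languages stacked on separate lines."""
    timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
    return f"{index}\n{timing}\n{primary}\n{secondary}\n"


print(bilingual_cue(1, 1.0, 3.5, "他們現在不需要這樣做", "(English line here)"))
```

Cues in a file are separated by a blank line, so joining the returned strings with "\n" gives a complete SRT.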

4. Timing: cues sit longer than you'd guess

The last setting, and the one most people get wrong, is how long each subtitle stays on screen.

The rules I use:

  • Minimum 1 second per cue, even if the line is only two characters long. Under a second, the viewer's eye doesn't have time to land.
  • Maximum 6 seconds. Past that, viewers start wondering if the subtitle has frozen.
  • Hold ~0.5 seconds after the audio ends. Gives the viewer tail-time to finish reading.
  • Never appear before the audio. Subtitles that pre-announce the next line are the fastest way to pull a viewer out of the video.
  • If two cues are separated by less than ~0.2 seconds, merge them. Flashing one cue off and the next on inside that window is visual noise.
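Those five rules are mechanical enough to express as code. The following is a rough sketch under a few assumptions: each spoken segment arrives with its audio start and end in seconds, the `Cue` class and function names are hypothetical, and "merge" is read literally as joining the two cues into one.

```python
from dataclasses import dataclass

MIN_DURATION = 1.0      # seconds on screen, even for a two-character line
MAX_DURATION = 6.0      # longer than this starts to look frozen
HOLD_AFTER_AUDIO = 0.5  # tail-time after the speech ends
MERGE_GAP = 0.2         # cues closer together than this get merged


@dataclass
class Cue:
    start: float
    end: float
    text: str


def cue_from_audio(audio_start: float, audio_end: float, text: str) -> Cue:
    """Build a cue from a spoken segment, applying hold, minimum and maximum."""
    start = audio_start                    # never appear before the audio
    end = audio_end + HOLD_AFTER_AUDIO     # hold after the audio ends
    duration = min(max(end - start, MIN_DURATION), MAX_DURATION)
    return Cue(start, start + duration, text)


def merge_close_cues(cues: list[Cue]) -> list[Cue]:
    """Merge neighbouring cues separated by less than MERGE_GAP seconds."""
    merged: list[Cue] = []
    for cue in cues:
        if merged and cue.start - merged[-1].end < MERGE_GAP:
            prev = merged[-1]
            merged[-1] = Cue(prev.start, max(prev.end, cue.end),
                             prev.text + " " + cue.text)
        else:
            merged.append(cue)
    return merged
```

A merged cue can end up longer than 6 seconds or past the per-line character limit, so it is worth re-running the length and duration checks after merging. Joining with a space suits English; for Chinese text you would likely join without one.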

Tuning this by hand on a 10-minute video is a 40-minute job. Subanana splits the work into two layers:

  • Timecodes come directly from the STT model — they're aligned to the audio waveform, not guessed. CPS flags then mark any cue where the density is off (too much text per second, or too long a hang).
  • Text goes through a separate LLM pass that flags likely misheard words and same-sounding wrong characters (e.g. 再見 misheard as 在見, or English their vs there). Each suggestion is propose-and-confirm — you approve or reject; nothing auto-applies. Scope: the LLM only edits text, never touches timecodes, and only catches substitution errors. It does not detect dropped characters (漏字) and it does not second-guess timing. Think of it as a spell-checker-style layer that surfaces likely issues for you to review.
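CPS itself is a simple ratio: characters in the cue divided by seconds on screen. A minimal sketch of that kind of flag follows. The function names are hypothetical, and the thresholds are left as parameters because the values Subanana uses are not stated here; the numbers in the example call are placeholders only.

```python
def cps(text: str, start: float, end: float) -> float:
    """Characters per second for one cue; line breaks are not counted."""
    duration = end - start
    if duration <= 0:
        raise ValueError("cue must end after it starts")
    return len(text.replace("\n", "")) / duration


def cps_flag(text: str, start: float, end: float,
             min_cps: float, max_cps: float) -> str:
    """Classify a cue as 'too dense', 'too sparse' or 'ok'."""
    rate = cps(text, start, end)
    if rate > max_cps:
        return "too dense"   # too much text for the time on screen
    if rate < min_cps:
        return "too sparse"  # cue hangs around with very little text
    return "ok"


# Placeholder thresholds, for illustration only.
print(cps_flag("字" * 20, 0.0, 1.0, min_cps=2, max_cps=9))  # too dense
```

Spaces count as characters in this version, which matches how the English line limit is counted earlier in the article.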

For Cantonese creators: if your source is spoken Cantonese but you want written-Chinese subtitles (closer to Mandarin grammar), the 口語→書面語 toggle rewrites the register in one click. Spoken 佢哋而家唔使咁做 becomes written 他們現在不需要這樣做. Same meaning, different register — the one HK viewers read vs the one they speak.

Doing all 4 in Subanana

If you'd rather not hand-adjust each cue, the full Subanana flow:

  1. Upload a file or paste a link. Direct uploads for .mp4 / .mov / .webm / .ogg (up to 15 GB / 3 hours on paid plans), or paste a public YouTube, Instagram, or Facebook URL — Subanana fetches and transcribes from the link, no local download needed.
  2. Pick the source language. Cantonese (廣東話 / Hong Kong), Mandarin (華語), English, Japanese, Korean, and more (80+ in total). Subanana routes the audio to whichever STT model currently benchmarks best for that language.
  3. Handle 3 things in the editor: review CPS flags for cues that are too dense or too sparse, walk through the LLM's proposed text corrections one by one, and nudge any timecodes that need it. The bulk of the 4-setting work is already done by this point.
  4. For Cantonese source, toggle 口語→書面語 if you want the written-Chinese register instead of spoken Cantonese.
  5. Export. Six standalone formats — SRT, VTT, TXT, DOCX, XLSX, MD — plus a ZIP bundle. Bilingual dual-language SRT is one checkbox. One-click burned-in video export is there too, if you want a finished MP4 rather than a sidecar file.

Cantonese accuracy on clean audio sits around 95% in my testing; Mandarin is a bit higher. Paid plans start at US$9/mo (approx HK$68/mo, annual billing) with a 60-minute monthly quota — enough to subtitle a handful of short videos and feel out the editor. Full plan comparison is on the pricing page.

Subtitles that stay on screen the right length, at the right size, in a readable font, and under the character limit — that's what keeps viewers through the last third of your video. Dial in the 4 settings, let AI take the slow parts, and try Subanana on a real cut before committing.