Audio description
Also: AD, video description, described video, audio described
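A narrated description of visual content in a video (actions, scene changes, on-screen text, expressions) for blind and low-vision viewers. WCAG 1.2.3 (Level A) accepts either audio description or a media alternative, 1.2.5 (Level AA) requires audio description for pre-recorded video, and 1.2.7 (Level AAA) covers extended audio description.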
Audio description is a narrated track that describes the visual content of a video for users who can’t see it: scene changes, character actions, expressions, on-screen text. It plays during natural pauses in the dialogue, narrated by either a professional voice talent or synthesised speech.
Audio description is the counterpart to captions: captions transcribe audio for viewers who can’t hear; audio description narrates video for viewers who can’t see.
What WCAG requires
Three criteria specifically cover audio description:
- 1.2.3 Audio Description or Media Alternative (Pre-recorded) — Level A — pre-recorded video must have either audio description OR a full alternative text presentation.
- 1.2.5 Audio Description (Pre-recorded) — Level AA — audio description is required for all pre-recorded video.
- 1.2.7 Extended Audio Description (Pre-recorded) — Level AAA — where pauses in the original audio are insufficient for adequate description, the video can be paused programmatically to allow longer description segments.
Level AA is the practical floor: most legal regimes that reference WCAG require AA conformance, which includes 1.2.5.
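As a minimal sketch of how the levels stack, the lookup below lists which of the audio description criteria apply at a given conformance target. The names `AD_CRITERIA` and `criteriaForTarget` are hypothetical, chosen for illustration; only the criterion numbers and levels come from the list above.

```typescript
// WCAG conformance levels are cumulative: AA includes A, AAA includes both.
type Level = "A" | "AA" | "AAA";

interface Criterion {
  id: string;
  name: string;
  level: Level;
}

// The audio description criteria discussed above (illustrative table).
const AD_CRITERIA: Criterion[] = [
  { id: "1.2.3", name: "Audio Description or Media Alternative (Pre-recorded)", level: "A" },
  { id: "1.2.5", name: "Audio Description (Pre-recorded)", level: "AA" },
  { id: "1.2.7", name: "Extended Audio Description (Pre-recorded)", level: "AAA" },
];

const LEVEL_RANK: Record<Level, number> = { A: 1, AA: 2, AAA: 3 };

// Returns the ids of every audio description criterion at or below the target level.
function criteriaForTarget(target: Level): string[] {
  return AD_CRITERIA
    .filter((c) => LEVEL_RANK[c.level] <= LEVEL_RANK[target])
    .map((c) => c.id);
}
```

Calling `criteriaForTarget("AA")` returns `["1.2.3", "1.2.5"]`, which is why a site targeting AA cannot rely on a text alternative alone.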
Standard vs extended
- Standard audio description uses the natural pauses in the dialogue. Works well for dialogue-driven content with natural silences. Fails for fast-paced, dialogue-dense content where there’s no room.
- Extended audio description pauses the video to insert longer description segments. Adds significantly to total runtime; less common in commercial content.
Most professional audio description is standard.
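The choice between the two can be sketched as a simple fit check: measure the dialogue-free gap where a description should start and see whether the narration fits. This is an illustrative sketch, not a production scheduling tool; the types `DialogueCue` and `Description` and the functions `gapAt` and `classify` are hypothetical names, and all times are assumed to be in seconds.

```typescript
interface DialogueCue {
  start: number; // seconds
  end: number;   // seconds
}

interface Description {
  at: number;       // when the narration should begin, in seconds
  duration: number; // how long the narration takes to speak, in seconds
}

type Placement = "standard" | "extended";

// Length of dialogue-free time starting at `at`, up to the next cue or the end of the video.
function gapAt(at: number, cues: DialogueCue[], videoEnd: number): number {
  // No gap at all if `at` falls inside a dialogue cue.
  if (cues.some((c) => at >= c.start && at < c.end)) {
    return 0;
  }
  const nextStarts = cues.filter((c) => c.start >= at).map((c) => c.start);
  const gapEnd = nextStarts.length > 0 ? Math.min(...nextStarts) : videoEnd;
  return Math.max(0, gapEnd - at);
}

// "standard" if the narration fits in the natural pause, otherwise "extended".
function classify(desc: Description, cues: DialogueCue[], videoEnd: number): Placement {
  return desc.duration <= gapAt(desc.at, cues, videoEnd) ? "standard" : "extended";
}
```

With dialogue at 0 to 4 s and 7 to 12 s, a 2.5 s description starting at 4 s fits the 3 s gap and is classified as standard; a 5 s description at the same point does not fit and would need extended description.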
How it’s produced
- Script writing. A trained describer (often the same role as subtitle / caption writer in firms that do both) watches the programme and writes narration for the visual content. The skill lies in choosing, within the limited time available, what to describe so that the essential information comes across without editorialising.
- Recording. A voice talent narrates, or — increasingly — a text-to-speech engine synthesises the narration from the script.
- Mixing. Description is mixed into the audio track at a level that doesn’t compete with the original dialogue.
- Delivery. As a separate audio track on the video player (selectable via a control labelled “AD” or similar), or as “open-described” content with description always on.
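One way to hand a description script to a text-to-speech step or a player is as a timed text file. The sketch below serialises a script into WebVTT, the format the HTML `track` element accepts (including `kind="descriptions"`). This is an assumption for illustration, since the text above does not prescribe a file format, and player support for description tracks varies. `DescriptionCue`, `formatTimestamp` and `toWebVtt` are hypothetical names.

```typescript
interface DescriptionCue {
  start: number; // seconds
  end: number;   // seconds
  text: string;  // narration; must not contain "-->" or blank lines
}

// Formats seconds as a WebVTT timestamp: hh:mm:ss.ttt
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width: number): string => {
    let out = String(n);
    while (out.length < width) {
      out = "0" + out;
    }
    return out;
  };
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Serialises cues into a WebVTT document: header, then blank-line-separated cue blocks.
function toWebVtt(cues: DescriptionCue[]): string {
  const blocks = cues.map(
    (c) => `${formatTimestamp(c.start)} --> ${formatTimestamp(c.end)}\n${c.text}`
  );
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}
```

Keeping the script as timed text means the same file can drive a recorded voice session, a synthesised narration, or a selectable description track, depending on what the player supports.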
Common failures in production
- No AD at all — the most common failure. Many sites that diligently ship captions skip audio description because it’s perceived as more complex. WCAG AA requires both for pre-recorded video.
- Description only over silence. A scene with continuous music or ambient noise has no real “pause”, so the describer has to either talk over the music (not ideal) or pause the video (extended description).
- Decorative content over-described. Filler scenes (long establishing shots, cutaways) don’t need narration that just lists what’s on screen. Useful description conveys narratively relevant information.
When the AAA threshold matters
For content with very dense dialogue (legal proceedings, fast-talking documentaries, interview shows), standard AD often isn’t enough. Extended AD (1.2.7, Level AAA) with programmatic pausing is then the only way to convey enough visual context. Some streaming platforms now support this natively; many do not, so the video has to be pre-edited with the extended description already burned in.
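Programmatic pausing can be sketched as a pause plan: for each description, hold the video for however long the narration overruns the available gap. This is a minimal sketch under a stated assumption, namely that narration starts in the gap and the pause covers only the overflow; real players may instead pause for the whole description. `Slot`, `PausePlan` and `planPauses` are hypothetical names.

```typescript
interface Slot {
  at: number;                  // video time where the description starts, in seconds
  gap: number;                 // dialogue-free time available at that point, in seconds
  descriptionDuration: number; // spoken length of the description, in seconds
}

interface PausePlan {
  pauses: { at: number; pauseFor: number }[];
  addedRuntime: number; // total extra seconds added to the viewing time
}

// Computes where the video must be held, and by how much, so each description can finish.
function planPauses(slots: Slot[]): PausePlan {
  const pauses = slots
    .map((s) => ({ at: s.at, pauseFor: Math.max(0, s.descriptionDuration - s.gap) }))
    .filter((p) => p.pauseFor > 0); // descriptions that fit need no pause
  const addedRuntime = pauses.reduce((sum, p) => sum + p.pauseFor, 0);
  return { pauses, addedRuntime };
}
```

The `addedRuntime` total is the figure to watch: it is the extra viewing time that either the player adds at playback or a pre-edited version carries permanently.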