SC 1.2.3 · Level A · WCAG 2.0

Audio Description or Media Alternative (Prerecorded)

Prerecorded video with audio needs either an audio description track or a full text alternative that covers any visual information the soundtrack doesn’t convey, so blind users get the same content as sighted viewers.

What it asks

If a video shows information the soundtrack doesn’t already describe (on-screen text, a chart, a silent action, a facial expression that changes the meaning), that visual information must be available to people who can’t see the screen. You get two options at Level A: an audio description track that narrates the visuals during pauses in the dialogue, or a complete text alternative that captures both the audio and the visual track in writing. The one exception is a video that is itself a clearly labeled media alternative for text already on the page.

How to meet it

  • Write a clean, time-coded transcript that includes visual descriptions in brackets or as separate paragraphs.
  • For training videos with slides, include the slide text inline in the transcript at the timecode it appears.
  • Record an audio description track when there are gaps in the dialogue long enough to fit narration without overlap.
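  • Use <track kind="descriptions"> to attach a WebVTT text description track to an HTML5 <video>. Browser support for announcing this track kind is limited, so test it with your player or offer the described version as a separate audio or video file.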
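  • When the soundtrack already conveys everything important on screen (a talking-head interview, for example), the audio is often self-describing, so no extra description is needed and a plain transcript may already be enough.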
  • For tutorials that show keyboard shortcuts or UI clicks, narrate the action: “now press Cmd+K to open the search.”
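
The transcript bullets above can be sketched in code. This is a minimal Python sketch, not a prescribed workflow: it assumes dialogue and visual descriptions are kept as two separate lists of timed cues, and the names Cue, timecode and build_transcript are hypothetical. It merges the lists in time order and wraps each visual description in brackets.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float          # seconds from the start of the video
    text: str
    visual: bool = False  # True when the cue describes something seen, not heard


def timecode(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def build_transcript(dialogue: list[Cue], visuals: list[Cue]) -> str:
    """Interleave spoken cues and visual descriptions in time order.

    Visual descriptions are wrapped in brackets so readers can tell
    them apart from speech.
    """
    lines = []
    for cue in sorted(dialogue + visuals, key=lambda c: c.start):
        body = f"[{cue.text}]" if cue.visual else cue.text
        lines.append(f"{timecode(cue.start)} {body}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical cues for a slide-based training video.
    dialogue = [
        Cue(0, "Welcome to the quarterly review."),
        Cue(12, "As you can see, revenue grew."),
    ]
    visuals = [
        Cue(8, "Slide: Q3 revenue by region, bar chart", visual=True),
    ]
    print(build_transcript(dialogue, visuals))
```

Keeping the visual cues in their own list means the same descriptions could later feed a description track as well as the transcript.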

Common failures

  • Product demo videos where the presenter says “click here” and “this button” with no verbal context — blind users hear pronouns with no antecedents.
  • Animated explainer videos with on-screen text and a music-only soundtrack, no description track and no transcript.
  • A “transcript” that is the captions file copy-pasted: captions cover speech and important sounds but don’t describe visuals.
  • Slide-based webinars where the slides contain unique information not spoken aloud.
  • Conference talks where the speaker gestures at a chart without describing the data.
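
The caption-copy failure in the list above can be caught with a rough automated check. The sketch below is a heuristic under stated assumptions, not a conformance test: it assumes the captions are a simple WebVTT file without NOTE or STYLE blocks, and the function names are hypothetical. It flags a transcript whose every word already appears in the captions, which suggests no visual descriptions were added.

```python
import re


def caption_text(vtt: str) -> list[str]:
    """Return the text lines of a simple WebVTT file.

    Skips the header, blank lines, numeric cue identifiers and timing lines.
    Assumes there are no NOTE or STYLE blocks.
    """
    lines = []
    for raw in vtt.splitlines():
        line = raw.strip()
        if not line or line.startswith("WEBVTT") or "-->" in line or line.isdigit():
            continue
        lines.append(line)
    return lines


def normalize(text: str) -> str:
    """Lowercase the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"\W+", " ", text).lower().strip()


def looks_like_caption_copy(transcript: str, vtt: str) -> bool:
    """True when the transcript contains no word that is absent from the captions.

    Purely numeric tokens are ignored so that time codes in the transcript
    do not count as added content.
    """
    caption_words = set(normalize(" ".join(caption_text(vtt))).split())
    extra = [
        word
        for word in normalize(transcript).split()
        if not word.isdigit() and word not in caption_words
    ]
    return not extra
```

A result of False only shows that the transcript contains words the captions lack. Whether those words describe the visuals adequately still needs human review.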

Why it matters

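This is the lowest bar in the audio-description family: Level A accepts either a description track or a full text alternative, while 1.2.5 at Level AA requires the audio description itself. Many teams produce neither, then discover the gap only when a blind user complains. Expanding the text of the 1.2.2 captions with visual descriptions is usually the cheapest way to get a transcript that satisfies 1.2.3; the synchronized captions are still required for 1.2.2.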