Voice control
Also: Dragon NaturallySpeaking, Voice Control, Windows Speech Recognition, Voice Access
The class of assistive technology that lets users operate a computer through spoken commands. Dragon NaturallySpeaking (Windows), macOS / iOS Voice Control, Windows Speech Recognition — all rely on accessible names matching the spoken command.
Voice control is the assistive-technology class that lets users operate a computer through spoken commands. The major systems are Dragon NaturallySpeaking (Windows, the long-running market leader), macOS Voice Control and iOS Voice Control (Apple’s built-in, free with the OS), Windows Speech Recognition (Microsoft’s built-in), and Google Voice Access (Android).
Voice control serves users with motor disabilities that make keyboard / mouse / touchscreen use difficult or impossible — RSI, arthritis, tremor, paralysis at various levels. It’s also widely used in workflow-efficiency contexts (lawyers and clinicians dictating notes) where the user has no documented disability but benefits from hands-free operation.
How voice control resolves commands
A voice-control system listens for two kinds of commands:
- Dictation — the user is speaking text to be entered into a focused field. The system transcribes the speech.
- Commands — the user is naming an action or a control. The system matches the spoken phrase against a registry of available commands and accessible names of on-screen controls.
The second kind is where web accessibility intersects: when the user says “Click Submit,” the voice-control software looks for an element with the accessible name “Submit” (or a close match) and dispatches a click event to it. If your Submit button has no accessible name, because it’s an unlabelled <button>, a custom <div> without a role, or an icon-only button with no text alternative, the voice-control user cannot activate it by name and has to fall back on slower methods such as a numbered overlay or a mouse grid.
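The matching step described above can be sketched roughly as follows. This is a simplified TypeScript illustration, not how any particular product works: the VoiceTarget shape, the resolveClickCommand function, and the exact-then-substring matching order are all assumptions made for the example. Real systems read the platform accessibility tree and use fuzzier speech matching.

```typescript
// Hypothetical model of an on-screen control as a voice-control tool might see it.
interface VoiceTarget {
  id: string;             // illustrative identifier, not a real API field
  accessibleName: string; // empty string when the control has no accessible name
}

type Resolution =
  | { kind: "none" }
  | { kind: "unique"; target: VoiceTarget }
  | { kind: "ambiguous"; candidates: VoiceTarget[] };

// Lowercase, trim, and collapse whitespace so "Click  Submit" and "click submit" compare equal.
function normalizePhrase(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

// Resolve a spoken "Click <name>" command against the available controls.
function resolveClickCommand(spoken: string, targets: VoiceTarget[]): Resolution {
  const phrase = normalizePhrase(spoken).replace(/^click /, "");
  if (phrase === "") return { kind: "none" };

  // Controls without an accessible name can never be matched by name.
  const named = targets.filter((t) => normalizePhrase(t.accessibleName) !== "");

  // Assumption: prefer exact matches, then fall back to names containing the phrase.
  const exact = named.filter((t) => normalizePhrase(t.accessibleName) === phrase);
  const candidates =
    exact.length > 0
      ? exact
      : named.filter((t) => normalizePhrase(t.accessibleName).indexOf(phrase) !== -1);

  const first = candidates[0];
  if (first === undefined) return { kind: "none" };
  if (candidates.length === 1) return { kind: "unique", target: first };
  return { kind: "ambiguous", candidates };
}
```

In this sketch, a control with an empty accessible name is filtered out before matching, which mirrors why an unlabelled button cannot be reached by saying its name. An "ambiguous" result corresponds to the case where the software has to ask the user to choose between several candidates.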
What this means for web developers
The single most important voice-control accessibility rule:
Every interactive element must have a text-based accessible name that matches its visible label.
In practice:
- Buttons need visible text. <button>Submit</button> works. Icon-only buttons need aria-label matching what the user is likely to say: aria-label="Search" rather than aria-label="Magnifying glass".
- Visible label and accessible name must match. If a button’s visible text is “Submit” but its aria-label is “Send form,” voice control may not find it when the user says “Click Submit.” WCAG 2.5.3 Label in Name (Level A) requires that the accessible name contain the visible label text.
- Custom controls expose their role + name. A <div role="button" aria-label="Submit"> works. A <div onclick> without role or name does not.
- No phantom controls. Voice-control overlays (numbered overlays that show every focusable element with a number to call out) work better when the number of focusable elements is manageable. Hidden-but-still-focusable elements clutter the overlay.
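The label-and-name rule above lends itself to a simple automated check. The TypeScript sketch below is one simplified reading of WCAG 2.5.3, assuming you have already extracted each control's visible label and computed accessible name as strings; the labelInName and normalizeText names are hypothetical, and the real success criterion has nuances (punctuation, symbols used as labels) that this ignores.

```typescript
// Lowercase, trim, and collapse whitespace before comparing.
function normalizeText(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

// Returns true when the accessible name contains the visible label text.
// Assumption: for icon-only controls (empty visible label) we only require
// that some non-empty accessible name exists.
function labelInName(visibleLabel: string, accessibleName: string): boolean {
  const label = normalizeText(visibleLabel);
  const name = normalizeText(accessibleName);
  if (label === "") return name !== "";
  return name.indexOf(label) !== -1;
}
```

Under these assumptions, labelInName("Submit", "Submit order") passes because the name contains the visible text, while labelInName("Submit", "Send form") fails, which is the mismatch that can stop “Click Submit” from working.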
Where voice control overlaps with screen-reader accessibility
The same accessible-name and semantic-HTML discipline that makes a site screen-reader-accessible also makes it voice-control-accessible. Both technologies route through the accessibility tree and depend on accessible names being present and correct.
The major exception: voice control doesn’t need ARIA live regions (it’s not a screen reader). It does, however, depend on names the user can see and say: a control without a visible text label can only be reached through a numbered overlay, which is why text labels matter disproportionately for voice users.
What goes wrong specifically for voice control
- Visible icon-only labels. A heart button (favourite) with no visible text. The user says “Click favourite,” but the accessible name is “Save” or missing entirely. Mismatch.
- Two controls with the same accessible name. Two “Read more” links on the page. The user says “Click Read more”; the voice-control system shows a numbered disambiguation overlay. This isn’t broken, just slow.
- Mismatched localised labels. The user speaks commands in English, but the page is in French. The voice-control software needs to recognise the right language for the element’s name. Setting the lang attribute correctly on the <html> element helps.
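Duplicate accessible names, one of the failure modes listed above, are easy to flag once the names have been collected. The TypeScript sketch below assumes the names are already available as an array of strings; findDuplicateNames is a hypothetical helper, and treating names as equal after lowercasing and whitespace collapsing is an assumption about how forgiving the matching is.

```typescript
// Returns each accessible name (normalized) that is shared by two or more controls.
function findDuplicateNames(accessibleNames: string[]): string[] {
  const names = accessibleNames
    .map((n) => n.trim().toLowerCase().replace(/\s+/g, " "))
    .filter((n) => n !== ""); // unnamed controls are a separate problem

  // Keep a name once: at its first occurrence, and only if it occurs again later.
  return names.filter(
    (n, i) => names.indexOf(n) === i && names.lastIndexOf(n) !== i
  );
}
```

A non-empty result points at places where a voice user is likely to hit a disambiguation prompt; giving each link a more specific name (for example, including the article title) usually removes it.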
The fastest manual audit: turn on macOS Voice Control or Windows Speech Recognition, and try to use your site by voice. The unactivatable controls become obvious within five minutes.