When AI Gets Accessibility Wrong: Why Developers Still Need Manual Testing

AI tools are quickly becoming a staple in modern development workflows. They generate snippets, scaffold entire components, and even offer “accessibility tips” on demand. But there’s a catch: much of what AI produces in this space is incomplete, outdated, or wrong, because accessibility simply doesn’t have the same depth of training data as mainstream technologies like JavaScript or React. That imbalance leads to hallucinations and half-correct patterns that look plausible but fail in real use.

AI is incredibly useful for developers, but it cannot be trusted as the final word on accessibility. Every suggestion should be tested, verified, and judged through the lens of real-world user experience.

It’s worth noting that most everyday code gets plenty of “free” validation just by being executed. If Copilot generates a SPA component or inserts code into an existing project, a developer usually has multiple quality checks in place as part of the normal workflow:

  • Compiling and linting: catches typos in variable or function names, missing imports, and syntax errors.
  • Dogfooding: simply running the app quickly exposes obvious defects or broken flows as you click through features.
  • Unit testing: well-written unit tests validate core logic and edge cases before they ship.
  • Integration testing: ensures that new code plays well with existing modules and APIs.
  • Type systems: in languages like TypeScript or with strict type hints in Python, misused values get flagged immediately.

Accessibility, however, doesn’t benefit from these same safety nets. A screen reader won’t throw a compiler error if headings are out of order. A CI pipeline won’t fail just because a decorative icon is exposed to assistive technologies. Those problems require deliberate validation – both manual testing and specialized tooling – which is why relying on AI’s output without verification is risky.
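
To make that gap concrete, here is a minimal TypeScript/React sketch; the component and prop names are invented for illustration, not taken from any real project.

    // Illustrative only: the component and prop names are invented for this sketch.
    function ReportHeader({ title }: { title: string }) {
      return (
        <header>
          <h1>{title}</h1>
          {/* Nothing in the toolchain objects to this h1 -> h5 jump, even though
              it breaks heading navigation for screen reader users. */}
          <h5>Quarterly summary</h5>
        </header>
      );
    }

    // By contrast, a simple typo is caught before the code ever runs:
    // <ReportHeader titel="Q3 report" />   // compile-time error: unknown prop 'titel'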

How This Code Sneaks into Projects

Most developers don’t intentionally write inaccessible code. Instead, it arrives quietly through AI-driven workflows. You might accept a Copilot suggestion, or copy a polished-looking snippet from ChatGPT.

Because these snippets look authoritative, they’re easy to trust. The problem is that they often carry subtle accessibility flaws. Without proper checks, those flaws ship to production and impact users who rely on assistive technologies. What feels like a small shortcut in development can become a real usability barrier for those users.

Why AI Misses Accessibility Details

It’s worth asking: why does AI struggle so much with accessibility compared to other coding tasks? Several reasons stand out:

  • Training imbalance: The web is flooded with layout and JavaScript examples, but good accessibility examples are fewer and often inconsistent.
  • Conflicting sources: AI absorbs outdated ARIA patterns, redundant attributes, and “hacks” that were never best practice.
  • Context dependence: Accessibility choices depend heavily on intent and user needs, something AI can’t infer from a bare code snippet.
  • No runtime evaluation: Unlike a human tester, AI doesn’t tab through interfaces or listen with a screen reader. It can only guess what will work.

This combination means AI can sound confident while giving advice that, when put into practice, actively harms the user experience.

Examples and Fixes

Here are a few common issues that crop up in AI-generated code. These aren’t hypothetical; they’ve shown up repeatedly in suggestions I’ve tested myself.

1. Heading levels out of sequence

Jumping from an h1 straight to an h5 is confusing for screen reader users who rely on headings to navigate content. A proper hierarchy matters more than how a heading looks visually.
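
Here’s a small sketch of the anti-pattern next to the fix (the component names and the looks-smaller class are invented for illustration):

    // Anti-pattern: the heading level is chosen for its visual size.
    function PricingBad() {
      return (
        <article>
          <h1>Pricing</h1>
          <h5>Monthly plans</h5> {/* jumps straight from h1 to h5 */}
        </article>
      );
    }

    // Fix: keep the hierarchy sequential and adjust the size with CSS instead.
    function PricingGood() {
      return (
        <article>
          <h1>Pricing</h1>
          <h2 className="looks-smaller">Monthly plans</h2>
        </article>
      );
    }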

2. Redundant or incorrect region names

AI loves to sprinkle aria-label="navigation region" onto a <nav> element. This adds noise without clarity. Labels are only useful when there are multiple navigation sections with different purposes. The same applies to other landmarks and regions.
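
A quick sketch of the difference, with made-up labels and component names:

    // Anti-pattern: the role is already "navigation", so this label only adds
    // noise; screen readers typically announce the role as well, so the word
    // "navigation" gets repeated.
    const Noisy = () => <nav aria-label="navigation region">{/* links */}</nav>;

    // Fine: a page with a single nav needs no label at all.
    const Single = () => <nav>{/* links */}</nav>;

    // Useful: labels distinguish multiple navigation landmarks on one page.
    const Multiple = () => (
      <>
        <nav aria-label="Primary">{/* links */}</nav>
        <nav aria-label="Breadcrumb">{/* links */}</nav>
      </>
    );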

3. Decorative icons not hidden

Decorative icons often end up exposed to assistive technologies because AI omits aria-hidden="true". This clutters what a screen reader announces and distracts from meaningful content.
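
A sketch of the fix, using a hypothetical inline SVG icon:

    // Anti-pattern: the purely decorative icon is exposed, so assistive tech
    // may announce it (or an unhelpful "image") alongside the visible text.
    const SaveButtonBad = () => (
      <button>
        <svg viewBox="0 0 16 16">{/* icon paths */}</svg>
        Save
      </button>
    );

    // Fix: hide decorative graphics from assistive technologies.
    const SaveButtonGood = () => (
      <button>
        <svg viewBox="0 0 16 16" aria-hidden="true" focusable="false">
          {/* icon paths */}
        </svg>
        Save
      </button>
    );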

4. Skip links that are only visible to screen readers

Skip links should not be permanently hidden: they should become visible the moment they receive keyboard focus, because sighted keyboard users who don’t use a screen reader still need them.
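
One common approach is to park the link off-screen until it receives keyboard focus. This sketch inlines the CSS for brevity; the class name and target id are invented for illustration:

    const skipLinkCss = `
      .skip-link {
        position: absolute;
        left: -999px;          /* off-screen by default, but still focusable */
      }
      .skip-link:focus {
        left: 1rem;            /* visible the moment it receives keyboard focus */
        top: 1rem;
        background: #fff;
        padding: 0.5rem 1rem;
      }
    `;

    function SkipLink() {
      return (
        <>
          <style>{skipLinkCss}</style>
          <a className="skip-link" href="#main-content">
            Skip to main content
          </a>
        </>
      );
    }

The anti-pattern to avoid is a permanent visually-hidden class with no :focus rule, which leaves the link reachable by screen readers but invisible to sighted keyboard users.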

5. Event-driven flows that misuse semantics

A common anti-pattern is attaching onclick to a <div> styled like a button. This breaks keyboard accessibility and removes built-in semantics. Native elements like <button> already handle roles, states, and keyboard events correctly.
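
A sketch of the anti-pattern next to the native fix (the prop and class names are made up):

    // Anti-pattern: looks like a button, but it isn't focusable, exposes no
    // role, and ignores Enter/Space, so keyboard users can't activate it.
    const FakeButton = ({ onSave }: { onSave: () => void }) => (
      <div className="btn" onClick={onSave}>
        Save
      </div>
    );

    // Fix: the native element provides focus, role, and key handling for free.
    const RealButton = ({ onSave }: { onSave: () => void }) => (
      <button type="button" className="btn" onClick={onSave}>
        Save
      </button>
    );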

Other recurring issues include unlabeled form fields, missing error messages, improper aria-live usage, modals without focus trapping, and insufficient color contrast. The list goes on, and it will evolve as models change over time.
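
Two of those, field labeling and error messaging, come together in a pattern like this (the field, ids, and copy are invented for illustration):

    // The label is tied to the input, and any error text is programmatically
    // associated with it via aria-describedby, so assistive tech announces both.
    function EmailField({ error }: { error?: string }) {
      return (
        <div>
          <label htmlFor="email">Email address</label>
          <input
            id="email"
            type="email"
            aria-invalid={error ? true : undefined}
            aria-describedby={error ? "email-error" : undefined}
          />
          {error && (
            <p id="email-error" role="alert">
              {error}
            </p>
          )}
        </div>
      );
    }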

Testing and Tooling That Catch These Issues

So what do we do about it? The answer is straightforward: test your code like real users would, and back it up with automated checks.

Manual Testing

Manual testing remains the best way to discover many accessibility issues. Ideally, it should involve screen reader users who are fluent with their technology. But even without that, you can get started. I wrote A Keyboard Guide for Accessibility Testing to help developers run basic validation checks with nothing more than a browser and a keyboard.

Automated Tools

Automation can’t cover everything, but it’s an excellent safety net. Combined with manual checks, these tools can catch common issues early in the development cycle.

  • eslint-plugin-jsx-a11y for linting React code.
  • axe DevTools or Lighthouse for browser-based scanning.
  • jest-axe and Playwright for automated test coverage (a minimal jest-axe example follows this list).
  • CI/CD gates to block merges on critical accessibility violations.
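
As one example of what the test-coverage piece can look like, here’s a minimal jest-axe sketch; the imported SkipLink component is hypothetical, so substitute whatever component you want to audit:

    import { render } from "@testing-library/react";
    import { axe, toHaveNoViolations } from "jest-axe";
    import { SkipLink } from "./SkipLink"; // hypothetical component under test

    expect.extend(toHaveNoViolations);

    test("SkipLink has no detectable accessibility violations", async () => {
      const { container } = render(<SkipLink />);
      expect(await axe(container)).toHaveNoViolations();
    });

Running a test like this in CI is one straightforward way to implement the merge-blocking gate from the last bullet.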

Closing Thoughts

AI is here to stay, and it can be a powerful ally in development. But when it comes to accessibility, blind trust in its output is risky. Accessibility requires nuance, context, and validation, none of which AI can provide on its own.

If you take away one thing, let it be this: use AI as an assistant, not an authority. Test thoroughly, validate with both manual and automated tools, and always put real users at the center of your process. That’s how you turn AI-generated code into something genuinely usable.

Disclaimer


This article is not meant to suggest that developers should avoid AI tools, or that they cannot provide real value in our workflows. AI has its place in scaffolding, exploration, speeding up routine tasks, and even accessibility. My focus here is on the developer experience—how we, as builders, can ensure the applications and services we create remain accessible and usable for everyone. What makes this issue concerning is the sheer prevalence of ChatGPT and similar tools as “authoritative” sources. Incorrect or incomplete accessibility guidance is often not obvious to most developers, and that poses a real risk when accessibility barriers can silently exclude people. I write this as both a developer and a blind accessibility advocate: the stakes are higher than code style or syntax. These details directly impact whether someone can use what we build.
