AI-generated accessibility, an update — frontier models still fail, but skills change the game
The latest report on AI-generated accessibility finds that frontier models still fail accessibility checks by default. Skills, however, have been shown to significantly improve accessibility pass rates. Custom instructions remain a cost-effective way to improve accessibility in generated UI code.
- Frontier models like GPT-5.5 and Claude Opus 4.7 still fail accessibility checks by default.
- The Building Accessible UI skill achieved an 86% pass rate, significantly higher than the average of 12% for default models.
- Custom instructions can increase pass rates by over 48 percentage points, demonstrating their effectiveness.
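The figures above mix pass rates (percentages) with gains (percentage points), which are easy to confuse. As a minimal sketch, the snippet below works through the arithmetic for the two rates quoted in the summary (86% and a 12% default average). The function names are hypothetical, chosen for illustration only, and are not part of the A11y LLM Eval project.

```python
def percentage_point_gain(baseline_pct: float, improved_pct: float) -> float:
    """Absolute difference between two pass rates, in percentage points."""
    return improved_pct - baseline_pct


def relative_multiple(baseline_pct: float, improved_pct: float) -> float:
    """How many times larger the improved pass rate is than the baseline."""
    if baseline_pct <= 0:
        raise ValueError("baseline pass rate must be positive")
    return improved_pct / baseline_pct


# Pass rates quoted in the summary above (in percent).
DEFAULT_AVERAGE_PCT = 12.0
SKILL_PCT = 86.0

gain = percentage_point_gain(DEFAULT_AVERAGE_PCT, SKILL_PCT)
multiple = relative_multiple(DEFAULT_AVERAGE_PCT, SKILL_PCT)
print(f"{gain:.0f} percentage points, about {multiple:.1f}x the default average")
```

Under these numbers, the skill sits 74 percentage points above the default average, or roughly seven times its pass rate. The "over 48 percentage points" figure for custom instructions is likewise an absolute difference, not a relative increase.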
Opening excerpt (first ~120 words)
Michael Fairchild
Posted on May 21
#a11y #llm #ai #benchmark

A few months ago I shared early results from the A11y LLM Eval project, a benchmark that measures how accessibly LLMs generate UI code. The previous post showed that LLMs default to inaccessible code, explicit accessibility instructions can dramatically change that, and manual testing is still essential.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to.