The last six months in LLMs in five minutes

Simon Willison · May 19, 2026 · 1:09 AM UTC · 1 min read

#technology #artificial intelligence #coding #OpenAI #Anthropic #Codex #Claude

⚡ TL;DR · AI summary

Coding agents have advanced significantly over the past six months. OpenAI and Anthropic spent most of 2025 applying Reinforcement Learning from Verifiable Rewards to improve the quality of the code their models write. In November the results became apparent: the agents had improved enough to be used for real work without most of the time going to fixing their mistakes.

Key facts

▪ OpenAI and Anthropic spent most of 2025 improving their coding agents with Reinforcement Learning from Verifiable Rewards.
▪ The results became apparent in November, when the quality of the code these models produce improved markedly.
▪ Coding agents went from often working to mostly working, making them reliable enough for daily use on real work.

Original article

Simon Willison's Weblog · Simon Willison

Opening excerpt (first ~120 words)

It took a little while for this to become clear, but the real news from November was that the coding agents got good. OpenAI and Anthropic had spent most of 2025 running Reinforcement Learning from Verifiable Rewards to increase the quality of code written by their models, especially when paired up with their Codex and Claude Code agent harnesses. In November the results of this work became apparent. Coding agents went from often-work to mostly-work, crossing a quality barrier where you could use them as a daily-driver to get real work done, without needing to spend most of your time fixing their stupid mistakes.

Excerpt limited to ~120 words for fair-use compliance. The full article is at Simon Willison's Weblog.
