
The last six months in LLMs in five minutes

Simon Willison
#technology #artificial intelligence #coding #OpenAI #Anthropic #Codex #Claude
⚡ TL;DR · AI summary

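Coding agents improved markedly over the past six months. OpenAI and Anthropic spent most of 2025 applying Reinforcement Learning from Verifiable Rewards to raise the quality of the code their models write, especially when paired with the Codex and Claude Code agent harnesses. By November the results were apparent: the agents had gone from often working to mostly working, reliable enough for daily real work without most of the time going to fixing their mistakes.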

Original article
Simon Willison's Weblog · Simon Willison
Opening excerpt (first ~120 words)

It took a little while for this to become clear, but the real news from November was that the coding agents got good. OpenAI and Anthropic had spent most of 2025 running Reinforcement Learning from Verifiable Rewards to increase the quality of code written by their models, especially when paired up with their Codex and Claude Code agent harnesses. In November the results of this work became apparent. Coding agents went from often-work to mostly-work, crossing a quality barrier where you could use them as a daily-driver to get real work done, without needing to spend most of your time fixing their stupid mistakes.

Excerpt limited to ~120 words for fair-use compliance. The full article is at Simon Willison's Weblog.
