WeSearch

Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

Simon Willison· ·3 min read · 0 reactions · 0 comments · 3 views
#chatgpt images 2.0#ai image generation#raccoon ham radio#openai#where's waldo ai test
Where's the raccoon with the ham radio? (ChatGPT Images 2.0)
⚡ TL;DR · AI summary

OpenAI has released ChatGPT Images 2.0, its latest image generation model, which shows significant improvement over its predecessor in generating complex, detailed scenes. The model was tested with a 'Where's Waldo'-style prompt featuring a raccoon holding a ham radio, a task that challenged both older models and competing systems. While earlier versions and some rival models failed or produced unclear results, the high-resolution output from ChatGPT Images 2.0 successfully included the raccoon, making it visible and identifiable. However, attempts to use AI models to solve their own generated puzzles revealed limitations in reliability.

Original article
Simon Willison's Weblog · Simon Willison
Read full at Simon Willison's Weblog →
Full article excerpt tap to expand

Where’s the raccoon with the ham radio? (ChatGPT Images 2.0) 21st April 2026 OpenAI released ChatGPT Images 2.0 today, their latest image generation model. On the livestream Sam Altman said that the leap from gpt-image-1 to gpt-image-2 was equivalent to jumping from GPT-3 to GPT-5. Here’s how I put it to the test. My prompt: Do a where's Waldo style image but it's where is the raccoon holding a ham radio gpt-image-1 First as a baseline here’s what I got from the older gpt-image-1 using ChatGPT directly: I wasn’t able to spot the raccoon—I quickly realized that testing image generation models on Where’s Waldo style images (Where’s Wally in the UK) can be pretty frustrating! I tried getting Claude Opus 4.7 with its new higher resolution inputs to solve it but it was convinced there was a raccoon it couldn’t find thanks to the instruction card at the top left of the image: Yes — there’s at least one raccoon in the picture, but it’s very well hidden. In my careful sweep through zoomed-in sections, honestly, I couldn’t definitively spot a raccoon holding a ham radio. [...] Nano Banana 2 and Pro Next I tried Google’s Nano Banana 2, via Gemini: That one was pretty obvious, the raccoon is in the “Amateur Radio Club” booth in the center of the image! Claude said: Honestly, this one wasn’t really hiding — he’s the star of the booth. Feels like the illustrator took pity on us after that last impossible scene. The little “W6HAM” callsign pun on the booth sign is a nice touch too. I also tried Nano Banana Pro in AI Studio and got this, by far the worst result from any model. Not sure what went wrong here! gpt-image-2 With the baseline established, let’s try out the new model. I used an updated version of my openai_image.py script, which is a thin wrapper around the OpenAI Python client library. Their client library hasn’t yet been updated to include gpt-image-2 but thankfully it doesn’t validate the model ID so you can use it anyway. Here’s how I ran that: OPENAI_API_KEY="$(llm keys get openai)" \ uv run https://tools.simonwillison.net/python/openai_image.py \ -m gpt-image-2 \ "Do a where's Waldo style image but it's where is the raccoon holding a ham radio" Here’s what I got back. I don’t think there’s a raccoon in there—I couldn’t spot one, and neither could Claude. The OpenAI image generation cookbook has been updated with notes on gpt-image-2, including the outputQuality setting and available sizes. I tried setting outputQuality to high and the dimensions to 3840x2160—I believe that’s the maximum—and got this—a 17MB PNG which I converted to a 5MB WEBP: OPENAI_API_KEY="$(llm keys get openai)" \ uv run 'https://raw.githubusercontent.com/simonw/tools/refs/heads/main/python/openai_image.py' \ -m gpt-image-2 "Do a where's Waldo style image but it's where is the raccoon holding a ham radio" \ --quality high --size 3840x2160 That’s pretty great! There’s a raccoon with a ham radio in there (bottom left, quite easy to spot). The image used 13,342 output tokens, which are charged at $30/million so a total cost of around 40 cents. Takeaways I think this new ChatGPT image generation model takes the crown from Gemini, at least for the moment. Where’s Waldo style images are an infuriating and somewhat foolish way to test these models, but they do help illustrate how good they are getting at complex illustrations combining both text and details. Update: asking models to solve this is risky rizaco on Hacker News asked ChatGPT to draw a red circle around the…

This excerpt is published under fair use for community discussion. Read the full article at Simon Willison's Weblog.

Anonymous · no account needed
Share 𝕏 Facebook Reddit LinkedIn Email

Discussion

0 comments

More from Simon Willison's Weblog