I just got a much better version using this command instead, which uses the maximum image size according to https://github.com/openai/openai-cookbook/blob/main/examples...
OPENAI_API_KEY="$(llm keys get openai)" \
uv run 'https://raw.githubusercontent.com/simonw/tools/refs/heads/main/python/openai_image.py' \
-m gpt-image-2 \
"Do a where's Waldo style image but it's where is the raccoon holding a ham radio" \
--quality high --size 3840x2160
https://gist.github.com/simonw/88eecc65698a725d8a9c1c918478a... - I found the raccoon!I think that image cost 40 cents.
Fed into a clear Claude Code max effort session with : "Inspect waldo2.png, and give me the pixel location of a raccoon holding a ham radio.". It sliced the image into small sections and gave:
"Found the raccoon holding a ham radio in waldo2.png (3840×2160).
- Raccoon center: roughly (460, 1680)
- Ham radio (walkie-talkie) center: roughly (505, 1650) — antenna tip around (510, 1585)
- Bounding box (raccoon + radio): approx x: 370–540, y: 1550–1780
It's in the lower-left area of the image, just right of the red-and-white striped souvenir umbrella, wearing a green vest. "
Which is correct!The faces...that's nice that it turned a kid's book into an abomination
I tried it on the ChatGPT web UI and it also worked, although the ham radio looks like a handbag to me.
The people in this image remind me of early this person does not exist, in the best way
I found it on the 2nd image! On the 1st one not yet...
A startling number of people either have no arms, one arm, a half of an arm, or a shrunken arm; how odd!