Hacker News

bgirard, yesterday at 7:49 PM

> Using the develop web game skill and preselected, generic follow-up prompts like "fix the bug" or "improve the game", GPT‑5.3-Codex iterated on the games autonomously over millions of tokens.

I wish they would share the full conversation, token counts, and more. I'd like to have a better sense of how they normalize these comparisons across versions. Is this a 3-prompt, 10M-token game? A 30-prompt, 100M-token game? Are both models using similar prompts and token counts?

I vibe-coded a small Factorio web clone [1] that got pretty far using the models from last summer. I'd love to compare against this.

[1] https://factory-gpt.vercel.app/


Replies

veb, yesterday at 7:54 PM

I just wanted to say that's a pretty cool demo! I hadn't realised people were using it for things like this.
