logoalt Hacker News

stopachkatoday at 1:00 AM0 repliesview on HN

For those curious about the question: "how well does GPT 5.2 build Counter Strike?"

We tried the same prompts we asked previous models today, and found out [1].

The TL:DR: Claude is still better on the frontend, but 5.2 is comparable to Gemini 3 Pro on the backend. At the very least 5.2 did better on just about every prompt compared to 5.1 Codex Max.

The two surprises with the GPT models when it comes to coding: 1. They often use REPLs rather than read docs 2. In this instance 5.2 was more sheepish about running CLI commands. It would instead ask me to run the commands.

Since this isn't a codex fine-tuned model, I'm definitely excited to see what that looks like.

[1] The full video and some details in the tweet here: https://x.com/instant_db/status/1999278134504620363