Hacker News

theturtletalks last Thursday at 5:50 AM

All of this seems like manufactured hype for Gemini. I use GPT-5.2, Opus 4.5, and Gemini 3 Flash and Pro with Droid CLI, and Gemini is consistently the worst. It gets stuck in loops, wants to wipe projects when it can’t figure out the problem, and still fails to call tools consistently (sometimes the whole thread is corrupted and you can’t rewind and use another model).

Terminal Bench supports my findings: GPT-5.2 and Opus 4.5 are consistently ahead. Only Junie CLI (JetBrains exclusive) with Gemini 3 Flash scores somewhat close to the others.

It’s also why Ampcode made Gemini the default model and quickly backtracked when all of these issues came to light.


Replies

petesergeant last Thursday at 6:07 AM

Claude for writing the code, Codex for checking the code, Gemini for when you want to look at a pretty terminal UI.

alex1138 last Thursday at 5:56 AM

I'm pretty high on Claude, though I'm not an expert on coding or LLMs at all.

I'm naturally inclined to dislike Google because of what they censor, what they consider misinformation, and just, I don't know, some of the projects they run (many good things, but also many dead projects and lying to people).