Hacker News

jchw last Thursday at 10:48 AM

I dunno about Gemini CLI, but I have tried Google Antigravity with Gemini 3 Pro and found it far better at debugging than the other frontier models. If I threw it at a really, really hard problem, I always expected it to eventually give up, get stuck in loops, delete a bunch of code, fake the results, etc., like every other model and every other version of Gemini always did. Except it did not. It actually would eventually break out of loops and make genuine progress. (And I let it run for long periods of time. Like, hours, on some tricky debugging problems. It used gdb in batch mode to debug crashes, and did some really neat things to try to debug hangs.)

As for wit, well, not sure how to measure it. I've mainly been messing around with Gemini 3 Pro to see how it can work on Rust codebases, so far. I messed around with some quick'n'dirty web codebases, and I do still think Anthropic has the edge on that. I have no idea where GPT 5.2 excels.

If you could really compare Opus 4.5 and GPT 5.2 directly on your professional work, are you really sure they would work much better than Gemini 3 Pro? I.e., is your professional work comparable to your private usage? I ask this because I've found LLMs to be extremely variable and spotty, in ways that I think we struggle to really quantify.


Replies

Xmd5a last Thursday at 12:35 PM

Is Gemini 3 Pro better in Antigravity than in gemini-cli?
