I don't get the Gemini 3 hype... yes it's their first usable model, but it's not even close...

vedmakk • last Wednesday at 8:31 PM • 39 replies • view on HN

I don't get the Gemini 3 hype... yes it's their first usable model, but it's not even close to what Opus 4.5 and GPT 5.2 can do.

Maybe on benchmarks... but I'm forced to use Gemini at work every day, while I use Opus 4.5 / GPT 5.2 privately every day... and Gemini is just lacking so much wit, creativity and multi-step problem solving skills compared to Opus.

Not to mention that Gemini CLI is a pain to use - after getting used to the smoothness of Claude Code.

Am I alone with this?

Replies

svara • last Thursday at 6:47 AM

I cancelled my ChatGPT subscription because of Gemini 3, so obviously I'm having a different experience.

That said, I use Opus 4.5 for coding through Cursor.

Gemini is for planning / rubber ducking / analysis / search.

I seriously find it a LOT better for these things.

ChatGPT has this issue where, when it doesn't know the explanation for something, it often won't hallucinate outright but will create some long-winded, confusing word salad that sounds like it could be right but you can't quite tell.

Gemini mostly doesn't do that and just gives solid scientifically/technically grounded explanations with sources much of the time.

That said, it's a bit of a double-edged sword, since it also tends to make confident statements extrapolating from the sources in ways that aren't entirely supported but tend to be plausible.

ulfw • yesterday at 9:52 AM

It's proof of what investors have been fearing: that LLMs are a dime a dozen, that there is no real moat and that the products are hence becoming commoditised. If you can replace one model with another without noticing a huge difference, there can only be a pricing race to the bottom for market share, with much lower potential profits than the AI bubble has priced in.

mythz • last Thursday at 7:41 AM

Full time Antigravity user here, IMO best value coding assistant by far, not even including all the other AI Pro sub perks.

Still using Claude Pro / GitHub Copilot subs for general terminal/VS Code access to Claude. I consider them all top-tier models, but I prefer the full IDE UX of Antigravity over the VS Code CC sidebar or CC terminal.

Opus 4.5 is obviously great at all things code, tho a lot of times I prefer Gemini 3 Pro (High) UIs. In the last month I've primarily used it on a Python / Vue project, which it excels at. I thought I would've needed to switch to Opus at some point if I wasn't happy with a particular implementation, but I haven't yet. The few times it didn't generate the right result were due to prompt misunderstanding, which I was able to fix by reprompting.

I'm still using Claude/GPT 5.2 for docs as IMO they have a more sophisticated command over the English language. But for pure coding assistance, I'm a happy Antigravity user.

cvhc • last Thursday at 7:58 AM

For general research/chatbot use, I don't feel one of them is much better than the other. But since I'm already on a Google One plan, upgrading the plan costs less than paying $20/mo to OpenAI, so I ended up cancelling ChatGPT Plus. Plus my Google One is shared with my family so they can also use advanced Gemini models.

falloutx • last Wednesday at 8:33 PM

Don't use it on gemini.google.com, but instead try it on aistudio.google.com.

The model may be the same, but the agent on aistudio makes it much better when it comes to generating code.

Still, jules.google.com is far behind in terms of actual coding agents which you can run on the command line.

Google, as always, has over-engineered their stuff to the point of making it confusing for end users.

intalentive • last Thursday at 4:42 AM

No, not alone. I find GPT far preferable when it comes to fleshing out ideas. It is much deeper conceptually, it understands intent and can cross-pollinate disparate ideas well. Gemini is a little more autistic and gets bogged down in details. The API is useful for high-volume extraction jobs, though: Gemini API reliability has improved a lot and it has a lower failure rate than OpenAI IME.

SilverSlash • last Thursday at 8:34 AM

That may be your personal experience, but for me Gemini always answers my questions better than Claude Opus 4.5 and often better than GPT 5.2. I'm not talking about coding agents, but rather the web-based AI systems.

This has happened enough times now (I run every query on all 3) that I'm fairly confident that Gemini suits me better now. Whereas it used to be consistently dead last and just plain bad not so long ago. Hence the hype.

jchw • last Thursday at 10:48 AM

I dunno about Gemini CLI, but I have tried Google Antigravity with Gemini 3 Pro and found it extremely superior at debugging versus the other frontier models. If I threw it at a really, really hard problem, I always expected it to eventually give up, get stuck in loops, delete a bunch of code, fake the results, etc. like every other model and every other version of Gemini always did. Except it did not. It actually would eventually break out of loops and make genuine progress. (And I let it run for long periods of time. Like, hours, on some tricky debugging problems. It used gdb in batch mode to debug crashes, and did some really neat things to try to debug hangs.)

As for wit, well, not sure how to measure it. I've mainly been messing around with Gemini 3 Pro to see how it can work on Rust codebases, so far. I messed around with some quick'n'dirty web codebases, and I do still think Anthropic has the edge on that. I have no idea where GPT 5.2 excels.

If you could really compare Opus 4.5 and GPT 5.2 directly on your professional work, are you really sure it would work much better than Gemini 3 Pro? i.e. is your professional work comparable to your private usage? I ask this because I've really found LLMs to be extremely variable and spotty, in ways that I think we struggle to really quantify.

afro88 • last Thursday at 5:50 PM

This may sound backwards, but Gemini 3 Flash is quite good when given very specific tasks. It's very fast (much faster than Opus and GPT-5.2), follows instructions very well and spits out working code (in contrast to other fast models: Flash, Haiku, etc.).

It does need a solid test suite to keep it in check. But you can move very fast if you have well-defined small tasks to give it. I start with a PRD, then break it down into epics, stories and finally tasks with Pro first. Works very well.

barrkel • last Thursday at 6:48 AM

When I had a problem with video handoff between one Linux kernel and the next with a zfsbootmenu system, only Gemini was helpful. ChatGPT led me on a merry chase of random kernel flags that didn't have the right effect.

What worked was rebuilding the Ubuntu kernel with a disabled flag enabled, but it took too long to get that far.

avazhi • last Thursday at 4:10 AM

I mean, I'm the exact opposite. Ask ChatGPT to write a simple (but novel) script for AutoHotKey, for example, and it can't do it. Gemini can do it perfectly on the first try.

ChatGPT has been atrocious for me over the past year, as in its actual performance has deteriorated. Gemini has improved with time. As for the comment about lacking wit, I mean, sure I guess, but I use AI to either help me write code to save me time or to give me information - I expect wit out of actual humans. That shit just annoys me with AI, and neither ChatGPT nor Gemini bots are good at not being obnoxious with metaphors and flowery speech.

websiteapi • last Wednesday at 8:44 PM

I find them all comparable, but Gemini is cheaper

tempestn • last Thursday at 9:08 AM

I've been using both GPT 5.2 and Gemini 3 Pro a lot. I was very impressed with 3 Pro when it came out, and thought I'd cancel my OAI Plus, but I've since found that for important tasks it's been beneficial to compare the results from both, or even bounce between them. They're different enough that it's like collaborating with a team.

james2doyle • last Thursday at 5:23 PM

Maybe try out some of the alternative CLI options? Like https://opencode.ai? I also like https://github.com/charmbracelet/crush and https://github.com/mistralai/mistral-vibe

Mistletoe • last Thursday at 5:11 AM

I love Gemini. Why would I want my AI agent to be witty? That's the exact opposite of what I am looking for. I just want the correct answer with as little fluff and nonsense as possible.

tezza • last Thursday at 1:32 PM

You’re not alone. I run a small blog reviewing LLMs and have detailed comparisons that go beyond personal anecdotes. Gemini struggles in many use cases.

Everyone has to find what works for them and the switching cost and evaluation cost are very low.

I see a lot of comments generally with the same pattern “i cancelled my LEADER subscription and switched to COMPETITOR”… reminiscent of astroturf. However I scanned all the posters in this particular thread and the cancellers do seem like legit HN profiles.

Workaccount2 • last Thursday at 2:39 PM

People get used to a model and then work best with that model.

If you hand an iPhone user an Android phone, they will complain that Android is awful and useless. The same is true vice versa.

This is in large part why we get so many conflicting reports of model behavior. As you become more and more familiar with a model, especially if it is in fact a good model, other good models will feel janky and broken.

dahcryn • last Thursday at 8:28 AM

Claude Code > Gemini CLI, fair enough

But I actually find Gemini Pro (not the free one) extremely capable, especially since you can throw any conversation into NotebookLM and deep thinking mode to go in depth.

Opus is great, especially for coding and writing, but for actual productivity outside of that (e.g. working with PDFs, images, screenshots, design stuff like marketing, t-shirts, ...) I prefer Gemini. It's also the fastest.

Nowhere do I feel like GPT 5.2 is as capable as these two, although admittedly I just stopped using it frequently around November.

dave771 • last Thursday at 9:10 AM

Yeah, you are. You're limiting your view to personal use and just the text modality. If you're a builder or running a startup, the price-performance on Gemini 3 Pro and Flash is unmatched, especially when you factor in the quotas needed for scaled use cases. It’s also the only stack that handles text, live voice, and gen-media together. The Workspace/Gmail integration really doesn't represent the raw model's actual power.

HarHarVeryFunny • last Thursday at 4:03 PM

> Not to mention that Gemini CLI is a pain to use - after getting used to the smoothness of Claude Code.

Are you talking strictly about the respective command line tools as opposed to differences in the models they talk to?

If so, could you list the major pain points of Gemini CLI where Claude Code does better?

azuanrb • last Thursday at 10:39 AM

Opus > GPT 5.2 | Gemini 3 Pro to me. But they are pretty close lately. The gap is smaller now. I'm using it via CLI. For Gemini, their CLI is pretty bad imo. I'm using it via Opencode and pretty happy with it so far. Unfortunately Gemini often throws rate limit errors at me, and occasionally hangs. Their infra is not really reliable, ironically. But other than that, it's been great so far.

verelo • last Thursday at 1:58 PM

Claude Opus is absurdly amazing. I now spend around $100-200 a day using it. Gemini and all the OpenAI models can’t keep up right now.

Having said that, Google are killing it at image editing right now. Makes me wonder if that’s because of some library of content, and once Anthropic acquires the same they’ll blow us away there too.

pwagland • last Thursday at 9:51 AM

In my experience, Gemini is great for "one-shot" work, and is my go-to for "web" AI usage. Claude Code beats gemini-cli though. Gemini-cli isn't bad, but it's also not good.

I would love to try Antigravity out some more, but last I checked I don't think it was out of the playground stage yet, and it can't be used for anything remotely serious AFAIK.

mindcrime • last Thursday at 12:54 PM

I haven't straight up cancelled my ChatGPT subscription, but I find that I use Gemini about 95% of the time these days. I never bother with any of Anthropic's stuff, but as far as OpenAI models vs Gemini, they strike me as more or less equivalent.

qaq • last Thursday at 2:52 PM

Nope. Granted, this is in a coding context, but Claude and Codex are a combo that really shines, and Gemini is pretty useless. The only thing I actually use it for is to triple-check the specifications sometimes, and that's pretty much it.

Galaxeblaffer • last Thursday at 4:00 AM

Gemini really only shines when using it in planning mode in the VS Code fork Antigravity. It also supports Opus, so it's easy to compare.

broochcoach • last Thursday at 6:14 PM

The Gemini voice app on iOS is unimpressive. They force the answers to be so terse to save cost that it’s almost useless. It quickly goes in circles and needs context pruning. I haven’t tried a paid subscription for Gemini CLI or whatever their new shiny is, but Codex and Claude Code have become so good in the last few months that I’m more focused on using them than exploring options.

OsrsNeedsf2P • last Thursday at 3:19 AM

You're not alone, I feel like sometimes I'm on crazy pills. I have benchmarks at work where the top models are plugged into agents, and Gemini 3 is behind Sonnet 4. This aligns closely with my personal usage as well, where Gemini fails to effectively call MCP tools.

But hey, it's cheapish, and competition is competition

joshvm • last Thursday at 9:13 PM

I've found that for any sort of reasonable task, the free models are garbage and the low-tier paid models aren't much better. I'm not talking about coding, just general "help me" usage. It makes me very wary of using these models for anything that I don't fully understand, because I continually get easily falsifiable hallucinations.

Today, I asked Gemini 3 to find me a power supply with some spec; AC/DC +/- 15V/3A. It did a good job of spec extraction from the PDF datasheets I provided, including looking up how the device performance would degrade using a linear vs switch-mode PSU. But then it comes back with two models from Traco that don't exist, including broken URLs to Mouser. It did suggest running two Meanwell power supplies in series (valid), but 2/3 suggestions were BS. This sort of failure is particularly frustrating because it should be easy and the outputs are also very easy to test against.

Perhaps this is where you need a second agent to verify and report back, so a human doesn't waste the time?

RandallBrown • last Thursday at 5:33 AM

I've only used AI pretty sparingly, and I just use it from their websites, but last time I tried all 3 only the code Google generated actually compiled.

No idea which version of their models I was using.

zapnuk • last Thursday at 2:43 PM

Gemini 2.0 Flash is and was a godsend for many small tasks and OCR.

There needs to be a greater distinction between models used for human chat, programming agents, and software integration, where at least we benefitted from Gemini Flash models.

murdy • last Thursday at 2:11 PM

I also get weirdly agitated by this. In my mind Gemini 3 is a clear case of benchmaxing and an overall massive flop.

I am currently testing different IDEs, including Antigravity, and I avoid that model at all costs. I would rather pay to use a different model than use Gemini 3.

It sucks at coding compared to OpenAI and Anthropic models, and it is not clearly better as a chatbot (I like the context window). The images are the best part of it, as it is very steerable and fast.

But WTF? This was supposed to be the OpenAI killer model? Please.

netdur • last Thursday at 11:10 AM

AI Studio with my custom prompting is much better than the Gemini app and Opus

pjjpo • last Thursday at 11:20 AM

I'm with you - the most disappointing was when I asked Gemini (technically Nano Banana) for a PNG with a transparent background and it just approximated what a transparent PNG would look like in an image viewer, as an opaque background. ChatGPT has no problem. I also appreciate when it can use content like Disney characters. And as far as actual LLMs go, the text is just formatted more readably in GPT to me, with fairly useful application of emojis. I also had an experience asking for tax-reporting advice, same prompt to both. GPT gave the correct response; Gemini suggested cutting corners in a grey way and eventually agreed that GPT's response is safer and better to go with.

It just feels like OpenAI puts a lot of effort into creating an actually useful product while Gemini just targets benchmarks. Targeting benchmarks to me is meaningless since every model (GPT, Gemini, Claude) constantly hallucinates in real workloads anyway.

outside1234 • last Thursday at 3:32 PM

Have you used it as a consumer would? Aka in google search results or as a replacement for ChatGPT? Because in my hands it is better than ChatGPT.

LightBug1 • last Thursday at 2:50 PM

I've started using Gem 3 while things are still in flux in the AI world. Pleasantly surprised by how good it is.

Most of my projects are on GPT at the moment, but I'm nowhere near so far gone that I can't move to others.

And considering just the general nonsense of Altman vs Musk, I might go to Gemini as a safe harbour (yes, I know how ridiculous that sounds).

So far, I've also noticed less ass-kissing by the Gemini robot ... a good thing.

PunchTornado • last Thursday at 12:40 PM

I am the opposite. I find GPT 5.2 much worse. Sticking only with Gemini and Claude.

littlestymaar • last Thursday at 10:59 AM

> Not to mention that Gemini CLI is a pain to use - after getting used to the smoothness of Claude Code.

Claude Code isn't actually tied to Claude, I've seen people use Claude Code with gpt-oss-120b or Qwen3-30b, why couldn't you use Gemini with Claude Code?

retinaros • last Thursday at 11:24 AM

No, you are not. I tried all Gemini models. They are slop.
