Hacker News

827a · yesterday at 7:07 PM · 17 replies

I've played around with Gemini 3 Pro in Cursor, and honestly: I find it to be significantly worse than Sonnet 4.5. I've also had some problems that only Claude Code has been able to really solve; Sonnet 4.5 in there consistently performs better than Sonnet 4.5 anywhere else.

I think Anthropic is making the right decisions with their models. Given that software engineering is probably one of the very few domains of AI usage driving real, serious revenue, I have far better feelings about Anthropic going into 2026 than about any other foundation model. Excited to put Opus 4.5 through its paces.


Replies

mritchie712 · yesterday at 7:28 PM

> only Claude Code has been able to really solve; Sonnet 4.5 in there consistently performs better than Sonnet 4.5 anywhere else.

I think part of it is this[0] and I expect it will become more of a problem.

Claude models have built-in tools (e.g. `str_replace_editor`) which they've been trained to use. These tools don't exist in Cursor, but Claude really wants to use them.

0 - https://x.com/thisritchie/status/1944038132665454841?s=20

vunderba · yesterday at 7:20 PM

My workflow was usually to use Gemini 2.5 Pro (now 3.0) for high-level architecture and design. Then I would take the finished "spec" and have Sonnet 4.5 perform the actual implementation.

lxgr · yesterday at 10:41 PM

> I've played around with Gemini 3 Pro in Cursor, and honestly: I find it to be significantly worse than Sonnet 4.5.

That's my experience too. It's weirdly bad at keeping track of its various output channels (internal scratchpad, user-visible "chain of thought", and code output), not only in Cursor but also on gemini.google.com.

lvl155 · yesterday at 7:25 PM

I really don’t understand the hype around Gemini. Opus/Sonnet/GPT are much better for agentic workflows. Seems people get hyped for the first few days. It also has a lot to do with Claude code and Codex.

chinathrow · yesterday at 7:22 PM

I gave Sonnet 4.5 a base64-encoded PHP serialize() JSON of an object dump and told it to extract the URL within.

It gave me the YouTube URL for Rick Astley.
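For what it's worth, that extraction is fully deterministic and needs no LLM. A quick sketch, using a made-up stand-in payload rather than the commenter's actual dump:

```python
import base64
import re

# Stand-in payload: a PHP serialize()-style string wrapping a URL, then
# base64-encoded. This is illustrative; the original dump isn't shown.
php_serialized = 's:43:"https://www.youtube.com/watch?v=dQw4w9WgXcQ";'
payload = base64.b64encode(php_serialized.encode()).decode()

# Decode and pull out any URLs, stopping at quotes/semicolons/whitespace.
decoded = base64.b64decode(payload).decode()
urls = re.findall(r'https?://[^\s";]+', decoded)
print(urls)  # ['https://www.youtube.com/watch?v=dQw4w9WgXcQ']
```

Tasks with an exact mechanical answer like this are exactly where asking a model instead of writing three lines of code invites hallucinated output.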

emodendroket · yesterday at 7:50 PM

Yeah, I think Sonnet is still the best in my experience, but the limits are so stingy that I find it hard to recommend for personal use.

visioninmyblood · yesterday at 7:09 PM

The model is great: it is able to code up some interesting visual tasks (I guess they have pretty strong tool-calling capabilities), like orchestrating prompt -> image generation -> segmentation -> 3D reconstruction. Check out the results here: https://chat.vlm.run/c/3fcd6b33-266f-4796-9d10-cfc152e945b7. Note that the model was only used to orchestrate the pipeline; the tasks are done by other models in an agentic framework. They must have improved the tool-calling framework with all the MCP usage. Gemini 3 was able to orchestrate the same, but Claude 4.5 is much faster.

verdverm · yesterday at 8:26 PM

> played around with

You'll never get an accurate comparison if you only play

We know by now that it takes time to "get to know a model and its quirks".

So if you don't use a model and cannot get equivalent outputs to your daily driver, that's expected and uninteresting

rishabhaiover · yesterday at 7:10 PM

I suspect Cursor is not the right platform to write code on. IMO, humans are lazy and would never really code on Cursor; they default to code generation via prompt, which is sub-optimal.

Squarex · yesterday at 7:10 PM

I have heard that Gemini 3 is not that great in Cursor, but excellent in Antigravity. I don't have time to personally verify all that, though.

jjcm · yesterday at 7:33 PM

Tangential observation - I've noticed Gemini 3 Pro's train of thought feels very unique. It has kind of an emotive personality to it, where it's surprised or excited by what it finds. It feels like a senior developer looking through legacy code and being like, "wtf is this??".

I'm curious if this was a deliberate effort on their part, and if they found in testing it provided better output. It's still behind other models clearly, but nonetheless it's fascinating.

rustystump · yesterday at 7:22 PM

Gemini 3 was awful when I gave it a spin. It was worse than Cursor's Composer model.

Claude is still a go-to, but I have found that Composer was "good enough" in practice.

screye · yesterday at 7:47 PM

Gemini being terrible in Cursor is a well known problem.

Unfortunately, for all its engineers, Google seems the most incompetent at product work.

behnamoh · yesterday at 7:14 PM

I've tried Gemini in Google AI Studio as well and was very disappointed by the superficial responses it provided. It seems to be at the level of GPT-5-low or even lower.

On the other hand, it's a truly multimodal model, whereas Claude remains specifically targeted at coding tasks and is therefore only a text model.

UltraSane · yesterday at 7:33 PM

I've had Gemini 3 Pro solve issues that Claude Code failed to solve after 10 tries. It even insulted some code that Sonnet 4.5 generated.

poszlem · yesterday at 7:16 PM

I’ve trashed Gemini non-stop (seriously, check my history on this site), but 3 Pro is the one that finally made me switch from OpenAI. It’s still hot garbage at coding next to Claude, but for general stuff, it’s legit fantastic.

enraged_camel · yesterday at 7:19 PM

My testing of Gemini 3 Pro in Cursor yielded mixed results. Sometimes it's phenomenal. At other times I either get the "provider overloaded" message (after like 5 mins or whatever the timeout is), or the model's internal monologue starts spilling out to the chat window, which becomes really messy and unreadable. It'll do things like:

>> I'll execute.

>> I'll execute.

>> Wait, what if...?

>> I'll execute.

Suffice it to say I've switched back to Sonnet as my daily driver. Excited to give Opus a try.
