logoalt Hacker News

maccardtoday at 8:59 AM6 repliesview on HN

I’m not OP but every time I post a comment with this sentiment I get told “the latest models are what you need”. If every 3 months you are saying “it’s ready as long as you use the latest model”, then it wasn’t ready 3 months ago and it’s not likely to be ready now.

To answer your question, I’ve tried both Claude code and Antigravity in the last 2 weeks and I’m still finding them struggling. AG with Gemini regularly gets stuck on simple issues and loops until I run out of requests, and Claude still just regularly goes on wild tangents not actually solving the problem.


Replies

anon7000today at 10:12 AM

I don’t think that’s true. Claude Opus 4.5/4.6 in Cursor have marked the big shift for me. Before that, agentic development mostly made me want to just do it myself, because it was getting stuck or going on tangents.

I think it can (and is) shifting very rapidly. Everyone is different, and I’m sure models are better at different types of work (or styles of working), but it doesn’t take much to make it too frustrating to use. Which also means it doesn’t take much to make it super useful.

show 1 reply
fendy3002today at 10:40 AM

It depends on what you're handling. Frontend (not css), swagger, mundane CRUD is where it shines. Something more complex that need a bit harder calculation usually make the agents struggling.

Especially good to navigate the code if you're unfamiliar with it (the code). If you have known the code for good, you'll find it's usually faster to debug and code by yourself.

Opus 4.6 with claude code vscode extension

edgyquanttoday at 12:08 PM

I thought this too and then I discovered plan mode. If you just prompt agent mode it will be terrible, but coming up with a plan first has really made a big difference and I rarely write code at all now

sergiosgctoday at 9:57 AM

Have you tried it with something like OpenSpec? Strangely, taking the time to lay out the steps in a large task helps immensely. It's the difference between the behavior you describe and just letting it run productively for segments of ten or fifteen minutes.

show 1 reply
techpressiontoday at 10:14 AM

Agree, it’s strange, I will just assume that the people who say this are building react apps. I still have so much ”certainly, I should not do this in a completely insane way, let me fix that” … -400+2. It’s not always, and it is better than it was, but that’s it.

show 1 reply
fragmedetoday at 11:36 AM

At this point though, after Claude C Compiler, you've got to give us more details to better understand the dichotomy. What do you consider simple issues?

show 2 replies