Hacker News

Claude Opus 4.5

683 points by adocomplete yesterday at 6:53 PM | 311 comments

https://platform.claude.com/docs/en/about-claude/models/what...


Comments

xkbarkar yesterday at 9:05 PM

This is great. Sonnet 4.5 has degraded terribly.

I can get some useful stuff from a clean context in the web ui but the cli is just useless.

Opus is far superior.

Today Sonnet 4.5 suggested verifying remote state file presence by creating an empty one locally and copying it to the remote backend. Da fuq? University level programmer my a$$.

And it seems like it has degraded this last month.

I keep getting braindead suggestions and code that looks like it came from a random word generator.

I swear it was not that awful a couple of months ago.

Opus cap has been an issue, happy to change, and I really hope the nerf rumours are just that: unfounded rumours, and that the degradation has a valid root cause.

But honestly sonnet 4.5 has started to act like a smoking pile of sh**t

tschellenbach yesterday at 7:56 PM

Ok, but can it play Factorio?

cyrusradfar yesterday at 7:40 PM

I'm curious if others are finding that there's a comfort in staying within the Claude ecosystem because when it makes a mistake, we get used to spotting the pattern. I'm finding that when I try new models, their "stupid" moments are more surprising and infuriating.

Given this tech is new, the experience of how we relate to their mistakes is something I think a bit about.

Am I alone here, or are others finding themselves more forgiving of "their preferred" model provider?

GodelNumbering yesterday at 7:08 PM

The fact that the post singled out SWE-bench at the top gives the opposite impression from the one they probably intended.

CuriouslyC yesterday at 8:07 PM

I hate on Anthropic a fair bit, but the cost reduction, quota increases and solid "focused" model approach are real wins. If they can get their infrastructure game solid, improve Claude Code performance consistency and maintain high levels of transparency, I will officially have to start saying nice things about them.

lerp-io yesterday at 11:07 PM

80% vs. 77% is not that much of a difference lol

gsibble yesterday at 9:18 PM

They lowered the price because this is a massive land grab and is basically winner take all.

I love that Anthropic is focused on coding. I've found their models to be significantly better at producing code similar to what I would write, meaning it's easy to debug and grok.

Gemini does weird stuff and while Codex is good, I prefer Sonnet 4.5 and Claude Code.

AJRF yesterday at 8:29 PM

that chart at the start is egregious

0x79de yesterday at 7:30 PM

this is quite a good

zb3 yesterday at 7:22 PM

The first chart is straight from "how to lie in charts"..

fragmede yesterday at 8:19 PM

Got the river crossing one:

https://claude.ai/chat/0c583303-6d3e-47ae-97c9-085cefe14c21

Still fucked up one about the boy and the surgeon though:

https://claude.ai/chat/d2c63190-059f-43ef-af3d-67e7ca1707a4