XCSme • today at 5:53 PM • 3 replies

In my tests[0] it does a bit worse, and it's almost 2x more expensive than Opus 4.7...

I was surprised to see that it failed a data extraction test (it gets it right 2 out of 3 times, but one time it randomly returned null for a value instead).
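As a rough illustration of how an intermittent null like this can show up in a pass rate, here is a minimal sketch. The helper names, field names, and sample outputs are all hypothetical and are not taken from the actual benchmark harness.

```python
import json

def has_null_field(raw_json: str, required: list[str]) -> bool:
    """Return True if any required key is missing or null in the model's JSON output."""
    data = json.loads(raw_json)
    return any(data.get(key) is None for key in required)

def pass_rate(outputs: list[str], required: list[str]) -> float:
    """Fraction of runs in which every required field came back non-null."""
    passed = sum(not has_null_field(out, required) for out in outputs)
    return passed / len(outputs)

# Hypothetical outputs from three runs of the same prompt; the third has a null.
runs = [
    '{"name": "Ada", "year": 1843}',
    '{"name": "Ada", "year": 1843}',
    '{"name": "Ada", "year": null}',
]
rate = pass_rate(runs, ["name", "year"])
print(f"{rate:.2f}")
```

Running the same prompt several times and reporting the fraction of clean runs separates a flaky failure from a consistent one.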

It makes some sense that it fails more trivia/domain-specific knowledge tasks (I think models are increasingly trained for agentic use cases rather than general intelligence).

[0]: https://aibenchy.com/compare/anthropic-claude-opus-4-7-mediu...

Replies

XCSme • today at 5:59 PM

For some reason everything is 2x (2x cost, 2x avg response time, 2x reasoning and output tokens)...

Double-checking my test harness, but it's the first model that does this, so I doubt the issue is on my side...

EDIT: Harness seems correct; on straight coding tasks they perform identically: https://i.snipboard.io/5xbpzY.jpg

SupLockDef • today at 6:14 PM

Releasing a new model is the new way to jack up the price hehe.

dwaltrip • today at 6:04 PM

Wait, doesn’t the blog post say the price is the same as 4.7?

> Claude Opus 4.8 is available everywhere today. Pricing for regular usage is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. Pricing for fast mode is $10 per million input tokens and $50 per million output tokens.

Where do you see the 2x cost?
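One possible reconciliation, sketched under assumptions: with the per-token prices quoted above unchanged, a model that emits about twice as many reasoning and output tokens per task would still cost close to twice as much per request. The token counts below are hypothetical, and the sketch assumes reasoning tokens are billed at the output rate.

```python
# Prices quoted in the thread for regular usage (USD per million tokens).
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted per-token prices."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000

# Hypothetical token counts: same prompt, but twice the output
# (assumption: reasoning tokens are counted as output tokens).
baseline = request_cost(input_tokens=1_000, output_tokens=2_000)
doubled = request_cost(input_tokens=1_000, output_tokens=4_000)

ratio = doubled / baseline
print(f"{ratio:.2f}")
```

Because output tokens dominate the bill at these prices, the per-request cost ratio tracks the output-token ratio closely even though the price list itself did not change.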
