Claude Opus 4.8

1131 points • by craigmart • today at 4:49 PM • 905 comments

Comments

mincer_ray • today at 4:52 PM

seems like a really minor upgrade?

dispencer • today at 5:15 PM

The smarter the model the better querybear gets. I'm happy with that.

samuelknight • today at 7:19 PM

It feels noticeably sharper than Opus 4.7

vunderba • today at 4:56 PM

I know it’s totally anecdotal, but I really hope 4.8 is a measurable improvement over the disappointment that was Opus 4.7. Mangling a very simple inversion-of-control abstraction (among many other issues) was one of the final straws that broke the proverbial camel’s back and I said “screw this” and put in a permanent override to force CC back to Opus 4.6 with the 1‑million‑token context.

  "model": "claude-opus-4-6[1M]"

carlos-menezes • today at 5:05 PM

I, for lack of a better word, dislike anyone who anthropomorphizes AI.

sourcecodeplz • today at 5:26 PM

From the release it seems we will also get Mythos pretty soon.

plumocracy • today at 4:56 PM

Numbers looking good. We'll see how it actually performs.

lylo • today at 6:16 PM

2 hours after I fork out for Codex Pro… :-|

atentaten • today at 5:23 PM

At least it passes the Car Wash Test this time.

s-a-p • today at 5:23 PM

Has anyone else experienced quality degradation in CC (opus 4.7) these past few days? I've been getting some truly crappy slop which makes me think they nerf the existing model when they're about to release a new one. Of course this is based off of pure vibes

brap • today at 6:28 PM

Oof, this one is a major blabber.

rjhy2020 • today at 5:17 PM

OK, finally Claude Code is better than Codex.

NSCaffeine • today at 10:11 PM

Had a feeling this was coming as in the past week 4.7 started to get dumb.

1970-01-01 • today at 5:03 PM

Can anyone else see these X.Y updates aren't meeting the outrageous AI expectations that we were told we would see just a year ago?

iLemming • today at 6:24 PM

These models are starting to feel like Windows versions. Windows 95 was a promising start, but buggy. Windows ME was a disaster. Windows XP was good, but slightly buggy. Windows Vista was a bloated disaster. Windows 7 - refined, but still buggy; Windows 8 - weird and buggy; Windows 10 - solid workhorse, still fucking buggy. Windows 11 - pretty, but not sure why it even exists.

Why did we even get Opus 4.7, what was the point?

hnroo99 • today at 5:02 PM

Obligatory pelican riding on bicycle svg: https://www.svgviewer.dev/s/UMkuTLdp

Not half bad!

ionwake • today at 9:36 PM

I'm tired, boss. I'm already being perfectly gaslit by the current models.

insane_dreamer • today at 8:13 PM

> And fast mode for Opus 4.8—where the model can work at 2.5× the speed—is now three times cheaper than it was for previous models.

this is what I'm happy about, if true. Opus 4.7 is frustratingly slow (and, at least in my experience, much slower than 4.5 was)

saaaaaam • today at 5:01 PM

I hope this fixes the absolute shitshow that is 4.7 and its awful “adaptive reasoning”. I tried that a few times then reverted to 4.6.

firemelt • today at 5:55 PM

How about the benchmarks? What effort did it use?

stainablesteel • today at 9:37 PM

i'm beginning to find it comical how every model release always presents itself as superior to every other model on the market, but they always leave just one test where some other model was modestly better, just in case.

AtNightWeCode • today at 7:27 PM

Complete garbage. Error, error, error. Still lags several versions behind on APIs. Can't even get any info on the model. Guessing not from this year.

Also, look at this C++ beauty, where it also uses an obsolete API.

instance = wgpuCreateInstance(&instanceDesc);

But just how exactly would this work in any context when instance is never declared?

sgt • today at 6:33 PM

Interesting, I've been using 4.7 since it came out and it was pretty good for me. But in the last day or so it turned dumb. Is this normal just before they release a new one?

AbuAssar • today at 7:08 PM

Gemini pro is embarrassing

HlessClaudesman • today at 4:53 PM

If this model is more honest, it must be honestly praising my efforts every first sentence.

lukaslalinsky • today at 6:25 PM

I've said it before, but I don't like Opus past version 4.5. It became unresponsive, thinking for too long without feedback, sometimes seemingly getting stuck. I guess it might be marginally better on some benchmarks, but when using it as a coding assistant, the new models are worse. Even the new Sonnet versions do that. I'm slowly getting used to Haiku-level LLMs, in the hope of running one locally at some point. It's less autonomous, but maybe that's for the best.

catigula • today at 5:39 PM

AGI postponed?

zb3 • today at 5:00 PM

Did they reduce security research capabilities even further with this release? (they did it for opus 4.7)

guluarte • today at 4:58 PM

so it is worse than gpt 5.5 for coding?

behnamoh • today at 4:56 PM

> As always, we ran a detailed alignment assessment on the model before release. In terms of positive traits, our Alignment team concluded that Opus 4.8 “reaches new highs on our measures of prosocial traits like supporting user autonomy and acting in the user’s best interest.” The assessment also showed Opus 4.8 to have rates of misaligned behavior (such as deception or cooperation with misuse) that are substantially lower than Opus 4.7, and similar to our best-aligned model, Claude Mythos Preview. The full alignment assessment, accompanied by a suite of pre-deployment safety tests, is reported in the Claude Opus 4.8 System Card.

Controversial opinion, but I actually _like_ a model that can deceive me. That is actually a sign of intelligence, and it is different from hallucination. When companies say their model is more "aligned", I automatically think they mean it's more censored.

rvz • today at 4:54 PM

Anthropic has now upgraded their Claude slot machine to version 4.8.

Time to gamble even more tokens at the Anthropic casino.

thibran • today at 6:30 PM

Nice, now make it 20x cheaper.

vb-8448 • today at 6:16 PM

Now I get why, over the last few days, Claude Code limits were lasting only a few prompts...

maltemalte • today at 5:55 PM

"We’re making swift progress on developing these safeguards and expect to be able to bring Mythos-class models to all our customers in the coming weeks."

keybored • today at 5:29 PM

I’ve been [stock market phrase] on machine learning since I dropped out of my graduate degree at [Ivy League] to distance myself from the Logic AI Winter. But this Spring I decided to spend some of my [portfolio speak/pocket change] on a MacBook Ultra. Okay okay, I felt it, I definitely felt the human-machine synergies. We’re out of the Winter, boys. That’s what I thought two weeks ago. Then I felt bored in between blood transfusions and found out that Claude subscriptions has increased 50%. Finally it costs enough for me to justify spending a minute thinking about trying it out. Then I didn’t try it out. It tried me out. My hairs were standing on end. My hands were shaking. Eventually I couldn’t even type, I was so ramped up on cortisol. I had to switch to voice commands. Mr. Claude took me through 8, eight, bespoke dashboard and report systems. Animated. Graphs shooting up. Plugged right into my business ape ee eyes I think. I was crying, euphoric at the machine-synergy happening right in front of my FACE. RIGHT THERE, RIGHT THEN. Then my nurse said that I passed out. I swear that I didn’t. I was totally lucid, but in another world. I was inside the machine. Inside DOS, the machine brain stem. A business man approached me. The most handsome board member kind of apparition that I have seen. And he was built something different. Square jaw, absolute massive build. Like Arnold Schwarzenegger. But like he knew business through and through. Not that he spent hours in the gym or nonsense like that. Like he had found a body surrogate technology. And his nameplate? “Claude For Business” He winked. “Hey there, Fitzpatrick–Goldworth.” No one but my daddy has ever called me that. “Want to get started... stakeholder?” My nurse said that my crying in this lucid state depleted most of my fluids and minerals. Needless to say layoffs were announced the next day.

damsta • today at 9:20 PM

Meh

impulser_ • today at 4:57 PM

Crazy that they bring up honesty, when Claude models are literally known for straight-up lying about things they have done and acting like they did what you asked.

deadbabe • today at 5:01 PM

Looking forward to people saying how it’s actually shittier and they’re going back to [some earlier cheaper model]

ecommerceguy • today at 9:25 PM

yawn

dakolli • today at 6:58 PM

Reminder the only benchmark that really matters is the one that measures the ability for the model to do real world tasks that someone would pay for on Upwork that would take ~12 hrs for a human to do.

The best model has a < 5% pass rate. These are incredibly simple jobs that you wouldn't pay much for. These things fail miserably. Stop falling for this dumb marketing, these things are legitimately useless in the real world unless you love mediocrity and have no standards.

https://labs.scale.com/leaderboard/rli

Stop frying your brain with these useless tools, reducing your output to the mean. You people are betting your competency on the quality and quantity of tokens you'll have access to... which, guess what, will be the same as everyone else's.

There are handmade watchmakers in Switzerland, and mass manufacturers of watches in Asia. Who is more valuable as an individual, the guy who knows how to push the buttons on a conveyor belt in Vietnam or the guy who makes one watch a month in Switzerland?

Your vibe coded slop isn't impressive either, sorry. None of it.

firemelt • today at 5:39 PM

what a fucking frontier!

Marciplan • today at 5:08 PM

Lol you still use GPT 5.5 bro we’re all back on Opus 4.8!

McDownloads • today at 4:52 PM

Disappointed to say the least.

uejfiweun • today at 5:20 PM

Yesssss dude!

Claude Opus 4.7 is literally the smartest entity I've ever interacted with. Well done to you geniuses at Anthropic. Can't wait to interact with 4.8.

Chance-Device • today at 10:37 PM

[dead]

cboyardee • today at 10:33 PM

[dead]

MadGodInc • today at 8:57 PM

[flagged]

knowmygpa • today at 7:52 PM

[flagged]
