bob1029 • yesterday at 5:35 PM • 1 reply

> We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8. To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months...

This sounds suspiciously like a capacity story masquerading as a safety story.

Replies

azan_ • yesterday at 7:26 PM

Approx. 5% of sessions? That's insanely high.
