logoalt Hacker News

lebovictoday at 2:19 AM0 repliesview on HN

The post is light on details, and I agree with the sentiment that it reads like marketing. That said, Opus 4.6 is actually a legitimate step up in capability for security research, and the red team at Anthropic – who wrote this post – are sincere in their efforts to demonstrate frontier risks.

Opus 4.6 is a very eager model that doesn't give up easily. Yesterday, Opus 4.6 took the initiative to aggressively fuzz a public API of a frontier lab I was investigating, and it found a real vulnerability after 100+ uninterrupted tool calls. That would have required lots of of prodding with previous models.

If you want to experience this directly, I'd recommend recording network traffic while using a web app, and then pointing Claude Code at the results (in Chrome, this is Dev Tools > Network > Export HAR). It makes for hours of fun, but it's also a bit scary.