isn't this insane? why aren't people freaking out? the jump in capability is outrageous. anyone?
Anthropic needs to show that its models continually get better. If the model showed minimal to no improvement, it would cause significant damage to their valuation. We have no way of validating any of this; there are no independent researchers who can back any of the assertions made by Anthropic.
I don’t doubt they have found interesting security holes, the question is how they actually found them.
This System Card is just a sales whitepaper and just confirms what that “leak” from a week or so ago implied.
It's going to be expensive to serve (also not generally available), considering they said it's the largest model they've ever trained.
I suspect it's going to be used to train/distill lighter models. The exciting part for me is the improvement in those lighter models.
I am freaking out. The world is going to get very messy extremely quickly in one or two further jumps in capability like this.
"some model I don't get to use is much better at benchmarks"
pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit
I think there's no SOTA advance on this one worthy of "freaking out".
Looks like they just built a way larger model, with the same quirks as Claude 4. Seems like a super expensive "Claude 4.7" model.
I have no doubt that Google and OpenAI have already done that for internal (or even government) usage.
Freak out about what? I read the announcement and thought "that's a dumb name, they sure are full of themselves" – then I went back to using Claude as a glorified commit message writer. For all its supposed leaps, AI hasn't affected my life much in the real world except to make HN stories more predictable.
Well for one, it’s a PDF
the time to freak out was 2 years ago.
Wait until you see real usage. Benchmark numbers do not necessarily translate to real world performance (at least not by the same amount).
I've been increasingly "freaking out" for about 3-4 years now, and it seems that the pessimistic scenario is materializing. It looks like it will be over for software engineers in the not-so-distant future. In January 2025 I said that I expected software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.