oliver236 • today at 6:36 PM • 10 replies

isn't this insane? why aren't people freaking out? the jump in capability is outrageous. anyone?

Replies

I've been increasingly "freaking out" for about 3 to 4 years now, and it seems that the pessimistic scenario is materializing. It looks like it will be over for software engineers in the not-so-distant future. In January 2025 I said that I expected software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.

Eufrat • today at 7:32 PM

Anthropic needs to show that its models continually get better. If the model showed minimal to no improvement, it would cause significant damage to their valuation. We have no way of validating any of this; there are no independent researchers who can back any of the assertions made by Anthropic.

I don’t doubt they have found interesting security holes, the question is how they actually found them.

This System Card is just a sales whitepaper and just confirms what that “leak” from a week or so ago implied.

nsingh2 • today at 6:41 PM

It's going to be expensive to serve (also not generally available), considering they said it's the largest model they've ever trained.

I suspect it's going to be used to train/distill lighter models. The exciting part for me is the improvement in those lighter models.

mofeien • today at 7:13 PM

I am freaking out. The world is going to get very messy extremely quickly after one or two further jumps in capability like this.

anuramat • today at 6:49 PM

"some model I don't get to use is much better at benchmarks"

pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit

yrds96 • today at 7:51 PM

I don't think there's any SOTA advance in this one worthy of "freaking out".

Looks like they just built a way larger model, with the same quirks as Claude 4. Seems like a super expensive "Claude 4.7" model.

I have no doubt that Google and OpenAI have already done that for internal (or even government) usage.

nozzlegear • today at 7:35 PM

Freak out about what? I read the announcement and thought "that's a dumb name, they sure are full of themselves" – then I went back to using Claude as a glorified commit message writer. For all its supposed leaps, AI hasn't affected my life much in the real except to make HN stories more predictable.

RobertDeNiro • today at 8:03 PM

Well for one, it’s a PDF

risyachka • today at 8:46 PM

the time to freak out was 2 years ago.

dysoco • today at 6:56 PM

Wait until you see real usage. Benchmark numbers do not necessarily translate to real world performance (at least not by the same amount).
