Hacker News

Reaktornano yesterday at 10:02 PM · 4 replies

Author here. Spent the last few weeks chasing down the AI-attributed attack cases that made the rounds this year, including the Mexican government breach, the "vibe hacking" story, and the Algerian amateur. Basically trying to work out whether hacking is impacted by broader AI adoption or whether the press was running ahead of the evidence.

On one side, Daniel Stenberg ran the gated Anthropic frontier model against curl on May 11. Five "confirmed" findings, one low-severity CVE after triage. His words: "the big hype around this model so far was primarily marketing." Stenberg is not a guy who hedges, and curl is not a toy codebase.

On the other side, there's SCONE — Anthropic's own December 2025 benchmark. Agents exploited 19 of 34 post-cutoff smart contracts, 55.8% success, $4.6M in simulated funds at an average API cost of $1.22 per contract. The comparable number 12 months earlier was about 2%.

Looks like agents are getting genuinely good at narrow, well-scoped vulnerability classes (Solidity, post-cutoff, bounded targets) and still bad at messy real-world codebases. But that's a guess and I'd rather hear pushback. Happy to get into methodology, the spots where Chainalysis, Immunefi, and Web3IsGoingJustGreat don't line up, or specific cases. 28 references at the end of the piece.


Replies

nozzlegear yesterday at 11:19 PM

> On the other side, there's SCONE — Anthropic's own December 2025 benchmark. Agents exploited 19 of 34 post-cutoff smart contracts, 55.8% success, $4.6M in simulated funds at an average API cost of $1.22 per contract. The comparable number 12 months earlier was about 2%.

Anthropic has a vested interest in making their LLMs look advanced, powerful and dangerous. This is the company that is explicitly pro-regulation, that has donated $20M to a PAC for pro-regulation candidates, and that its own competitors accuse of pushing regulatory capture. We should take their benchmarks and their "Mythos is too dangerous for you mere mortals" statements with a big ass grain of salt, because it plays directly into that regulation angle they're playing. Anthropic wants frontier model development locked up, with only a few select stewards of humanity holding the keys.

Barbing yesterday at 11:08 PM

Do you have a dictation app? Hit us with your train of thought on this, how you’ve spent the last few weeks and the impact. Will be glad to read.

adampunk yesterday at 11:46 PM

>Beep-boop, I am a robot.

refulgentis yesterday at 10:51 PM

You wrote the blog and this comment with Claude Opus.

I'm sure you meant well and only used it for editing, etc. etc., and I agree AI is good.

In any case, I can't trust AI on AI, especially with such a stark headline from someone outside Anthropic. (how do you know it was a solo user with Claude?)

This is either breaking news that you for some reason delegated to an overly verbose post written by AI, or it's an almost-true-but-not-quite clickbait title, and I don't have the domain chops to know. Impossible spot to be in as a reader.
