Hacker News

Oarch, yesterday at 7:43 PM, 4 replies

It would probably depend on the target audience.

I was very impressed by Anthropic's swarm of agents building a C compiler earlier this year with 1000 PRs per hour. Easy to nitpick that it wasn't perfect, but it sure was impressive.


Replies

pron, yesterday at 8:02 PM

You mean trying and failing to build a C compiler. This isn't a very hard task to begin with (assuming you know compilers, and the models do), but it was made unrealistically easy by giving the agents thousands of tests written by humans over years (on top of a spec and a reference implementation, both of which the models were trained on), and the agents still failed to converge. I was actually surprised that they failed, as this was the purest possible example of "just do the coding" (something that isn't achievable in real or more complex cases), and when I read the description I thought they had made it too easy, and in a way that isn't representative of real software. My thought at that failure was that if agents can't even build a C compiler with so much preparation effort put into the test, then we have some way to go. Indeed, once you have worked a lot with agents you see that coding isn't really their strong suit (although they are impressive at debugging).

AlienRobot, yesterday at 7:48 PM

How many C compilers do we need...

refulgentis, yesterday at 7:47 PM

Right. Pretty impressive.

What percentage of people will think that’s life changing?

Because then we’re not talking about “can everyone up their demos to life changing, please?”, we’re talking about “can everyone use demos Oarch thinks are life changing, please?” - and “can build an MVP C compiler draft that barely works for $XXK” isn’t really that compelling to me, and we’re both software engineers, and my whole day job has been agentic coding for…2.5 years?…now. My incentive structure and demographics are lined up perfectly to agree with you, but I don’t :/

queenkjuul, yesterday at 8:34 PM

You're too easily impressed