logoalt Hacker News

mikeceyesterday at 4:51 PM3 repliesview on HN

In my experience Claude is like a "good junior developer" -- can do some things really well, FUBARS other things, but on the whole something to which tasks can be delegated if things are well explained. If/when it gets to the ability level of a mid-level engineer it will be revolutionary. Typically a mid-level engineer can be relied upon to do the right thing with no/minimal oversight, can figure out incomplete instructions, and deliver quality results (and even train up the juniors on some things). At that point the only reason to have human junior engineers is so they can learn their way up the ladder to being an architect and responsible coordinating swarms of Claude Agents to develop whole applications and complete complex tasks and initiatives.

Beyond that what can Claude do... analyze the business and market as a whole and decide on product features, industry inefficiencies, gap analysis, and then define projects to address those and coordinate fleets of agents to change or even radically pivot an entire business?

I don't think we'll get to the point where all you have is a CEO and a massive Claude account but it's not completely science fiction the more I think about it.


Replies

0x457today at 9:52 PM

My experience with Claude (and other agents, but mostly Claude) is such a mixed bag. Sometimes it takes a minimal prompt and 20 minutes later produce a neat PR and all is good, sometimes it doesn't. Sometimes it takes in a large prompt (be it your own prompt, created by another LLM or by plan mode) and also either succeed and fail.

For me, most of the failure cases are where Claude couldn't figure something out due to conflicting information in context and instead of just stopping and telling me that it tries to solve in entirely wrong way. Doesn't help that it often makes the same assumptions as I would, so when I read the plan it looks fine.

Level of effort also hard to gauge because it can finish things that would take me a week in an hour or take an hour to do something I can in 20 minutes.

It's almost like you have to enforce two level of compliance: does the code do what business demands and is the code align with codebase. First one is relatively easy, but just doing that will produce odd results where claude generated +1KLOC because it didn't look at some_file.{your favorite language extension} during exploration.

Or it creates 5 versions of legacy code on the same feature branch. My brother in Christ, what are you trying to stay compatible with? A commit that about to be squashed and forgotten? Then it's going to do a compaction, forget which one of these 5 versions is "live" and update the wrong one.

It might do a good junior dev work, but it must be reviewed as if it's from junior dev that got hired today and this is his first PR.

alfalfasprouttoday at 6:20 PM

> I don't think we'll get to the point where all you have is a CEO and a massive Claude account but it's not completely science fiction the more I think about it.

At that point, why do you even need the CEO?

show 7 replies
imirictoday at 9:35 PM

> In my experience Claude is like a "good junior developer"

We've been saying this for years at this point. I don't disagree with you[1], but when will these tools graduate to "great senior developer", at the very least?

Where are the "superhuman coders by end of 2025" that Sam Altman has promised us? Why is there such a large disconnect between the benchmarks these companies keep promoting, and the actual real world performance of these tools? I mean, I know why, but the grift and gaslighting are exhausting.

[1]: Actually, I wouldn't describe them as "good" junior either. I've worked with good junior developers, and they're far more capable than any "AI" system.

show 1 reply