LLMs are just really good search. Ask it to create something and it's searching within the pretrained weights. Ask it to find something and it's semantically searching within your codebase. Ask it to modify something and it will do both. Once you understand its just search, you can get really good results.
In my experience Claude is like a "good junior developer" -- can do some things really well, FUBARS other things, but on the whole something to which tasks can be delegated if things are well explained. If/when it gets to the ability level of a mid-level engineer it will be revolutionary. Typically a mid-level engineer can be relied upon to do the right thing with no/minimal oversight, can figure out incomplete instructions, and deliver quality results (and even train up the juniors on some things). At that point the only reason to have human junior engineers is so they can learn their way up the ladder to being an architect and responsible coordinating swarms of Claude Agents to develop whole applications and complete complex tasks and initiatives.
Beyond that what can Claude do... analyze the business and market as a whole and decide on product features, industry inefficiencies, gap analysis, and then define projects to address those and coordinate fleets of agents to change or even radically pivot an entire business?
I don't think we'll get to the point where all you have is a CEO and a massive Claude account but it's not completely science fiction the more I think about it.
This article resonates exactly how I think about it as well. For example, at minfx.ai (a Neptune/wandb alternative), we cache time series that can contain millions of floats for fast access. Any engineer worth their title would never make a copy of these and would pass around pointers for access. Opus, when stuck in a place where passing the pointer was a bit more difficult (due to async and Rust lifetimes), would just make the copy, rather than rearchitect or at least stop and notify user. Many such examples of ‘lazy’ and thus bad design.
I'm not entirely convinced by the anecdote here where Claude wrote "bad" React code:
> But in context, this was obviously insane. I knew that key and id came from the same upstream source. So the correct solution was to have the upstream source also pass id to the code that had key, to let it do a fast lookup.
I've seen Claude make mistakes like that too, but then the moment you say "you can modify the calling code as well" or even ask "any way we could do this better?" it suggests the optimal solution.
My guess is that Claude is trained to bias towards making minimal edits to solve problems. This is a desirable property, because six months ago a common complaint about LLMs is that you'd ask for a small change and they would rewrite dozens of additional lines of code.
I expect that adding a CLAUDE.md rule saying "always look for more efficient implementations that might involve larger changes and propose those to the user for their confirmation if appropriate" might solve the author's complaint here.
IDK I've been using opus 4.5 to create a UI library and it's been doing pretty well: https://simsies.xyz/ (still early days)
Granted it was building ontop of tailwind (shifting over to radix after the layoff news). Begs the question? What is a lego?
Eventually you can show Claude how you solve problems, and explain the thought process behind it. It can apply these learnings but it will encounter new challenges in doing so. It would be nice if Claude could instigate a conversation to go over the issues in depth. Now it wants quick confirmation to plough ahead.
Did the author ask it to make new abstractions? In my experience when I produces output that I don't like I ask it to refactor it. These models have and understanding of all modern design patterns. Just ask it to adopt one.
Wow, what an excellent blog. Highly suggest trying out creator's tool (stardrift.ai) too!
By and large, I agree with the article. Claude is great and fast at doing low level dev work. Getting the syntax right in some complicated mechanism, executing an edit-execute-readlog loop, making multi file edits.
This is exactly why I love it. It's smart enough to do my donkey work.
I've revisited the idea that typing speed doesn't matter for programmers. I think it's still an odd thing to judge a candidate on, but appreciate it in another way now. Being able to type quickly and accurately reduces frustration, and people who foresee less frustration are more likely to try the thing they are thinking about.
With LLMs, I have been able to try so many things that I never tried before. I feel that I'm learning faster because I'm not tripping over silly little things.