I'm either in a minority or a silent majority. Claude Code surpasses all my expectations. When it makes a mistake like over-editing, I explain the mistake, it fixes it, and I ask it to record what it learned in the relevant project-specific skills. It rarely makes that mistake again. When the skill file gets big, I ask Claude to clean and compact it. It does a great job.
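For anyone curious what that record-what-you-learned loop might look like in practice, here is a hypothetical project skill file (the path, frontmatter, and rules are all illustrative, not the poster's actual setup):

```markdown
<!-- .claude/skills/editing-discipline/SKILL.md (illustrative path) -->
---
name: editing-discipline
description: Rules learned from past over-editing mistakes in this repo
---

- Change only the lines the task requires; never reformat untouched code.
- If a fix seems to need a wider refactor, propose it first instead of doing it.
- After a correction from the reviewer, append the lesson here, then ask to
  compact this list when it grows unwieldy.
```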
It doesn't really make sense economically for me to write software for work anymore. I'm a teacher, architect, and infrastructure maintainer now. I hand over most development to my experienced team of Claude sessions. I review everything, but so does Claude (because Claude also writes thorough tests). It has no problem handling a large project these days.
I don't mean for this post to be an ad for Claude. (Who knows what Anthropic will do to Claude tomorrow?) I intend for this post to be a question: what am I doing that makes Claude profoundly effective?
Also, I'm never running out of tokens anymore. I really only use the Opus model and I find it very efficient with tokens. Just last week I landed over 150 non-trivial commits, all with Claude's help, and used only 1/3 of the tokens allotted for the week. The most commits I could do before Claude was 25-30 per week.
(Gosh, it's hard to write that without coming across as an ad for Anthropic. Sorry.)
When I see people talking about Claude Code becoming "unusable" for them recently, I believe them, but I don't understand it. It's a deeply flawed and buggy piece of software, but it's very effective. One of the strangest things about AI to me is that everyone seems to have a radically different experience.
The article has a benchmark, and Opus has the best score in two categories and the second-best in the third (there are only three categories). Opus is probably the best choice for producing readable code right now; GPT (for example) lags way behind.
> I intend for this post to be a question: what am I doing that makes Claude profoundly effective?
I'm fascinated by this question.
I think the first two sections of this article point towards an answer: https://aphyr.com/posts/412-the-future-of-everything-is-lies...
I've personally had radically different experiences working on different projects, different features within the same project, etc.
Are you writing code that gets reviewed by other people? Were code reviews hard in the past? Do your coworkers care about "code quality" (I mean this in scare quotes because that means different things to different people).
Are you working more on operational stuff or on "long-running product" stuff?
My personal headcanon: this tooling works well when the codebase is built on simple patterns, and it can then handle complex work. It has also not been great at coming up with new patterns, and if left unsupervised it will happily invent new patterns that go south very quickly. With that lens, I find myself just rewriting what Claude gives me in a good number of cases.
I sometimes race the robot at making a change and beat it. I'm "cheating," I guess, because in many cases I already know what I want while it has to find things first, but... I think the futzing fraction[0] is underestimated by some people.
And like in the "perils of laziness lost" essay[1]... I think that sometimes the machine trying too hard just offends my sensibilities. Why are you doing three things instead of just the one thing?
One might say "but it fixes it once corrected"... but I already go through this annoying "no, don't do A, B, and C; just do A; yes, just that, it's fine" flow when working with coworkers, and it's annoying there too!
"Claude writes thorough tests" is also its own micro-mess here, because while guided test creation works very well for me, giving it any leeway in creativity leads to so many "test that foo + bar == bar + foo" tests. Applying skepticism to utility of tests is important, because it's part of the feedback loop. And I'm finding lots of the test to be mainly useful as a way to get all the imports I need in.
If we have all these machines doing this work for us, average code quality should in theory go up. After all, we're more capable! But I think a lot of people have been using it in a "most of the time it lands near the average" way, and depending on how you work, that can drag your average down.
[0]: https://blog.glyph.im/2025/08/futzing-fraction.html [1]: https://bcantrill.dtrace.org/2026/04/12/the-peril-of-lazines...
I feel the same way. It doesn't make sense, economically or even in good faith, for me to use company-paid time writing code for line-of-business apps anymore, and I'm 28 years into this kind of work.
I think a lot of us have implemented our own ad hoc self-improvement checks into our agentic workflows. My observations match yours.
Are your claude.md, skills, or other settings that you've honed public anywhere?
> I'm either in a minority or a silent majority. Claude Code surpasses all my expectations.
I looked at some stats yesterday and was surprised to learn Cursor AI now writes 97% of my code at work, mostly through cloud agents (watching it work is too distracting for me).
My approach is very simple: Just Talk To It
People way overthink this stuff. It works pretty well. Sharing .md files and hyperfocusing on the various orchestrations and prompt hacks of the week feels about as interesting as going deep on vim shortcuts and IDE skins.
Just ask for what you want, be clear, give good feedback. That's it.