I've been increasingly removing myself from the typing part since August. For the last few months, I haven't written a single line of code, despite producing a lot more.
I'm using Claude Code. I've been building software as a solo freelancer for the last 20+ years.
My latest workflow:
- I work on "regular" web apps, C#/.NET on the backend, React on the web.
- I'm using 3-8 sessions in parallel, depending on the tasks and the mental bandwidth I have, all visible on an external display.
- I have markdown rule files & documentation, 30k lines in total. Some of them describe how I want the agent to work (rule files), some of them describe the features/systems of the app.
- Depending on what I'm working on, I load relevant rule files selectively into the context via commands. I have a /fullstack command that loads @backend.md, @frontend.md and a few more. I have similar /frontend, /backend, /test commands with a few variants. These are the load-bearing columns of my workflow. Agents take a lot more time and produce more slop without these. Each one is also written by agents, with my guidance. They evolve based on what we encounter.
- Every feature in the app, and every system, has a markdown document that's created by the implementing agent, describing how it works, what it does, where it's used, why it was created, main entry points, main logic, gotchas specific to this feature/system etc. After every session, I use my /write-system and /write-feature commands to make the agent create/update those, with specific guidance on verbosity, complexity, length.
- Each session I select a specific task for a single system. I reference the relevant rule files and feature/system doc, and describe what I want it to achieve and start plan mode. If there are existing similar features, I ask the agent to explore and build something similar.
- Each task is specifically tuned to be planned and worked in a single session. Scoping tasks this way is my most crucial role.
- For work that would span multiple sessions, I use a single session to create the initial plan, then plan each phase in depth in separate sessions.
- After it creates the plan, I examine it, do a bit of back and forth, then approve.
- I watch it while it builds. Usually I have 1-2 main tasks and a few subtasks going in parallel. I pay close attention to main tasks and intervene when required. Subtasks rarely require intervention due to their scope.
- After the building part is done, I go through the code via editor, test manually via UI, while the agent creates tests for the thing we built, again with specific guidance on what needs to be tested and how. Since the plan is pre-approved by me, this step usually goes without a hitch.
- Then I make the agent create/update the relevant documents.
- Last week I built another system to enhance that flow. I created a /devlog command. With the help of some CLI tools and Claude log parsing, it creates a devlog file with some metadata (tokens, length, files updated, docs updated etc) and the agent fills it with a title, summary of work, key decisions, lessons learned. The first prompt is also copied there. These also get added to the relevant feature/system document automatically as changelog entries. So, for every session, I have a clear document about what got done, how long it took, what the gotchas were, what went right, what went wrong etc. This proved to be invaluable even with a week's worth of devlogs, and allows me to further refine my workflows.
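To make the /devlog step a bit more concrete, here is a minimal sketch of what such an entry and its changelog line could look like. It is not my actual implementation: every name in it (`DevlogEntry`, `render_devlog`, `changelog_line`, the field names, the section headings) is hypothetical, the sample values are placeholders, and it assumes the metadata has already been extracted from the session logs by some other tool.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DevlogEntry:
    """One session's devlog. All field names are hypothetical."""
    day: date
    title: str
    summary: str
    first_prompt: str
    tokens: int
    duration_minutes: int
    files_updated: list[str] = field(default_factory=list)
    docs_updated: list[str] = field(default_factory=list)
    key_decisions: list[str] = field(default_factory=list)
    lessons_learned: list[str] = field(default_factory=list)


def bullets(items: list[str]) -> str:
    """Render a list as markdown bullets, with a placeholder when empty."""
    return "\n".join(f"- {item}" for item in items) if items else "- (none)"


def render_devlog(entry: DevlogEntry) -> str:
    """Render the whole devlog file as markdown text."""
    lines = [
        f"# {entry.day.isoformat()}: {entry.title}",
        "",
        "## Metadata",
        f"- Tokens: {entry.tokens}",
        f"- Duration: {entry.duration_minutes} min",
        f"- Files updated: {len(entry.files_updated)}",
        f"- Docs updated: {len(entry.docs_updated)}",
        "",
        "## First prompt",
        entry.first_prompt,
        "",
        "## Summary",
        entry.summary,
        "",
        "## Key decisions",
        bullets(entry.key_decisions),
        "",
        "## Lessons learned",
        bullets(entry.lessons_learned),
    ]
    return "\n".join(lines) + "\n"


def changelog_line(entry: DevlogEntry) -> str:
    """One-line changelog entry to append to a feature/system document."""
    return f"- {entry.day.isoformat()}: {entry.title}. {entry.summary}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = DevlogEntry(
        day=date(2025, 1, 15),
        title="Example task",
        summary="Short summary written by the agent.",
        first_prompt="The first prompt of the session goes here.",
        tokens=1000,
        duration_minutes=30,
        key_decisions=["Reuse the existing pattern"],
    )
    print(render_devlog(example))
    print(changelog_line(example))
```

Keeping the metadata in a structured record and rendering it separately means the same entry can feed both the devlog file and the changelog section of the feature/system document.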
This workflow looks convoluted at first glance, but it has evolved over the months and works great. The code quality is almost the same as what I would have written myself, thanks to the existing code that serves as examples and the rule files guiding the agents. I was already a fast builder before, but with agents it's a whole new level.
And this flow really unlocked with Opus 4.5. Sonnet 3.5/4/4.5 also worked OK, but required a lot more handholding, steering and correction. Parallel sessions weren't really possible without producing slop. Opus 4.5 is significantly better.
More technical/close-to-hardware work will most likely require a different set of guidance & flow to create non-slop code. I don't have any experience there.
You need to invest in improving the workflow. The capacity is there in the models. The results all depend on how you use them.