5 years ago, I would agree with you. But when you go ALL IN on LLM development, and use annealing wi...

kordlessagain • today at 7:20 PM • 0 replies • view on HN

5 years ago, I would agree with you. But when you go ALL IN on LLM development, and use annealing with multi-agent harnesses, these issues disappear. One caveat: I build everything off other things that originated with my own hand written code. Auth for my site, for example. Also, most of my current projects are packed with advice I've rendered to the LLM on how git commits go down and cadence of those commits into deployments. Claude Code rarely fucks this up, and has memories and plan files that it updates if we find a hole. So, I'm comfortable with an occasional hiccup in the process. It'll get caught, eventually. Maybe. ;)

A recent analysis on my Claude Code prompts showed 1.5B input tokens over the last few months. I use 4-5 provider agents (all CLI) DAILY, so this is a small subset. I spend a lot of time using transcription services to drone on about how some agent fucked things up and how I want it fixed and how to do it.

To assist with that process, I'm currently building out a search engine that is exposed via MCP to allow auditing of the dev runs. I already have the foundation of file changes (ala Splunk style) that let me keep an eye on the agents, and an agentic terminal that allows one agent to keep an eye on what the other agent is whacking on. Combined with my constant badgering for proper systems development, these things are improving the process at an acclerated rate.

Look, I get being an "engineer" on these types of things, and I think there is an absolute purity in pushing LLM generated code out of a codebase you control. That said, that's not the ONLY way to do things, and your milage will vary based on your systems thinking hat. I prefer to push hard on getting the outcomes and sacrifice the exhaustive process of reviewing every single line of code.

Consider frameworks. They make things easier to do, if they are complete and stable. There's an argument here that LLM harnesses should probably not ALSO be maintained by LLMs (something I'm completely ignoring so probably ironic I'm mentioning it). But the point being is the harnesses SHOULD have eyes on most lines of code. Eyes on every package though? Hard to say. I've settled on doing most stuff in Rust nowadays, just because it keeps the LLM more honest. And, we can build most "packages" by hand so we can change them to match our outcomes without code bloat. By bitching at it about code refactoring constantly, annealing the codebase by high level overview, not exhaustive review, I've found things get easier to work on as I go and still stay sane.

I do catch the LLMs occasionally hard coding things that belong in their own file or configs, and am a hardass about that and file length. I do read some code and hate it being overly long (and it sucks for burning tokens).

FWIW, I typed all this out on my keyboard myself. However, if I ran it through an LLM for cleanup or whatever, the very wall of text itself helps FORCE the LLM to stick to the substantive argument and steers it away from slop prompts. The same applies to code, if you are careful.

alt Hacker News