There’s too much in this article to comment on it all, but if we zoom into the first claim:
> An increase in the complexity of the surrounding systems to mitigate the increased ambiguity of AI's non-determinism.
My question is: why doesn't the author make any effort to mitigate the insane things that LLMs do? For example, I set up a hexagonal design pattern for our backend. When I asked Claude Code to riff off the canonical example, it printed out code that was directionally OK but actually nonsensical.
Then I built linters specific to the conventions I want. For example, all hexagonal features share the same directory structure, and each feature's port.py file has a Protocol class whose name is suffixed with "Port".
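For concreteness, one of those checks looks roughly like this. A minimal sketch, not my actual linter: the required file names and `check_feature` are illustrative.

```python
import ast
from pathlib import Path

# Illustrative layout: every feature dir must contain exactly these files.
REQUIRED = {"port.py", "adapter.py", "use_case.py"}

def check_feature(feature_dir: Path) -> list[str]:
    errors = []
    # Convention 1: all hexagonal features share the same directory structure.
    missing = REQUIRED - {p.name for p in feature_dir.iterdir()}
    if missing:
        errors.append(f"{feature_dir}: missing {sorted(missing)}")
    # Convention 2: port.py defines a Protocol class suffixed with "Port".
    port = feature_dir / "port.py"
    if port.exists():
        tree = ast.parse(port.read_text())
        has_port = any(
            isinstance(node, ast.ClassDef)
            and node.name.endswith("Port")
            and any(getattr(base, "id", getattr(base, "attr", None)) == "Protocol"
                    for base in node.bases)
            for node in ast.walk(tree))
        if not has_port:
            errors.append(f"{port}: no Protocol class suffixed with 'Port'")
    return errors
```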
That was better, but there was still a bunch of wheel-spinning, so I built a scaffolder into the linter that prints out templated code for whatever I'm trying to do.
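The scaffolder is basically "stamp out the convention-correct skeleton so the agent fills in blanks instead of inventing structure." A sketch under the same illustrative layout as above:

```python
from pathlib import Path

PORT_TEMPLATE = '''\
from typing import Protocol


class {name}Port(Protocol):
    """Contract that the {feature} adapter must satisfy."""
'''

def scaffold(root: Path, feature: str, name: str) -> None:
    feature_dir = root / feature
    feature_dir.mkdir(parents=True, exist_ok=True)
    # Emit the templated port so the structure is right before any LLM touches it.
    (feature_dir / "port.py").write_text(
        PORT_TEMPLATE.format(name=name, feature=feature))
    for filename in ("adapter.py", "use_case.py"):
        (feature_dir / filename).touch()
```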
Then I worried it was hallucinating test data, so I wrote a fixture generator that reads from our db and creates accurate fixtures for our adapters.
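A sketch of the idea, assuming a DB-API-style connection; the table name handling and the `redact()` scrubber are placeholders for whatever your schema needs:

```python
import json
from pathlib import Path

def redact(row: dict) -> dict:
    # Placeholder scrubber: blank anything sensitive before the fixture lands in git.
    return {k: "<redacted>" if k in {"email", "name"} else v for k, v in row.items()}

def dump_fixture(conn, table: str, out_dir: Path, limit: int = 20) -> Path:
    cur = conn.cursor()
    # Table name comes from our own config, never user input.
    cur.execute(f"SELECT * FROM {table} LIMIT {limit}")
    cols = [d[0] for d in cur.description]
    rows = [redact(dict(zip(cols, r))) for r in cur.fetchall()]
    path = out_dir / f"{table}.json"
    path.write_text(json.dumps(rows, indent=2, default=str))
    return path
```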
Since good code has never been "explained for itself 100%, without comments", I employ BDD so the LLM can print out, in a human-readable way, what the expected logical flow is. And, for example, any disable of a custom rule I wrote requires an explanation of why as a comment.
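The "explain your disables" rule takes a few lines to enforce. This sketch assumes a made-up `# custom: disable=<rule>` marker, not any real pylint/ruff directive:

```python
import re

# Matches e.g. "# custom: disable=port-naming <reason>" -- the marker is invented.
DISABLE = re.compile(r"#\s*custom:\s*disable=(\S+)\s*(.*)")

def check_disables(source: str, filename: str) -> list[str]:
    errors = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = DISABLE.search(line)
        if m and not m.group(2).strip():
            errors.append(f"{filename}:{lineno}: disable of {m.group(1)} needs a reason")
    return errors
```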
Meanwhile, I'm collecting feedback from the agents along the way about where they get tripped up and what could improve in the architecture, so we can build more trust in the output. Like, I only have a fixture generator because an agent called out that real data (redacted, yes) would be a better source of truth than any mocks I made.
Finally, code review is now less focused on the boilerplate and much more on the control flow in the use_case.
The stakes of having shitty code in these in-house tools are almost zero, since new rules and rule-version bumps are enforced with a ratchet pattern. Let the world fail on the first pass.
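If "ratchet pattern" is unfamiliar: a new rule doesn't have to pass everywhere on day one; the violation count just can't go up, and the bar drops whenever the count does. A minimal sketch, with an illustrative JSON baseline file:

```python
import json
from pathlib import Path

BASELINE = Path("lint_baseline.json")  # illustrative location

def ratchet(rule: str, current: int) -> None:
    baseline = json.loads(BASELINE.read_text()) if BASELINE.exists() else {}
    allowed = baseline.get(rule, current)  # first run sets the bar where the world is
    if current > allowed:
        raise SystemExit(f"{rule}: {current} violations > ratchet of {allowed}")
    if rule not in baseline or current < allowed:
        # Tighten the ratchet so regressions can't creep back in.
        baseline[rule] = current
        BASELINE.write_text(json.dumps(baseline, indent=2))
```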
Anyway, it seems to me that with some investment you can slap rails on your code and stay sharp along the way. I have a strong vision for what works, I can prove it deterministically with my homespun linters, and the LLMs challenge me daily with new ideas to bolt on.
So I don't know, it seems like the issue comes down to choosing to mistrust instead of slapping on rails.
Edit: I wanted to ask if anyone is taking this approach or something similar, or has thought about things like writing linters for popular packages that would encourage a canonical implementation (I have seen some crazy, crazy modeling with ORMs just from folks not reading the docs). HMU, would love to chat: youngii.jc@gmail