The failure mode split nobody's naming: Claude gets regexes right about 95% of the time, which is annoying but catchable. If it gets auth logic or state management right 95% of the time, you've got silent data corruption showing up 3 months later on an edge case nobody tested.
Vibe coders treating those as the same category is what actually worries me. Even in regular software there's a feedback mechanism - unit tests go red, CI breaks. Vibe coding skips that too. You get working code that passes the happy path and nothing that tells you which 5% failure rate is the dangerous one. That judgment about problem category severity is the thing that's hard to develop without breaking things first.
A fault in a regex could be really bad news depending on where it’s used.
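To make that concrete, here is a rough sketch of a regex that passes the happy path and still fails in a way that matters. Everything in it is made up for illustration: the redirect allow-list scenario, the host names, and the function names are assumptions, not anything from a real codebase.

```python
import re

# Hypothetical allow-list check for redirect targets (illustrative only).
# LOOSE has an unescaped dot and is only anchored at the start by re.match.
LOOSE = re.compile(r"https://example.com")
# STRICT escapes the dot and is applied with fullmatch, so the whole URL must fit.
STRICT = re.compile(r"https://example\.com(/.*)?")


def is_allowed_loose(url: str) -> bool:
    # re.match checks the beginning of the string only, so any suffix is accepted.
    return LOOSE.match(url) is not None


def is_allowed_strict(url: str) -> bool:
    # fullmatch requires the entire string to match the pattern.
    return STRICT.fullmatch(url) is not None


if __name__ == "__main__":
    for url in (
        "https://example.com/home",        # happy path
        "https://example.com.evil.test/",  # attacker-controlled host with a matching prefix
        "https://exampleXcom/",            # the unescaped dot matches any character
    ):
        print(url, is_allowed_loose(url), is_allowed_strict(url))
```

Both versions accept the happy-path URL, so a quick manual check would not tell them apart. Only the loose one lets the other two through, and if that check guards a redirect or an auth callback, it is not the "annoying but catchable" kind of bug.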
> you've got silent data corruption showing up 3 months later on an edge case nobody tested
I mean this happens in normal development?
This is an interesting take, and the "tooling" around pure LLM-based code generation is what really matters.
AFAIK Replit and Claude Code have ways to reduce the rate of these kinds of errors, but I haven't done a deep dive into how.