The compounding booboos bit is the key insight here. Humans are a bottleneck, and that bottleneck is actually load-bearing. You feel the pain of bad decisions slowly enough to course-correct.
I've been building the same AI product for months: a coaching loop that persists across sessions. Every few weeks someone ships a "competitor" in a weekend. The feature list looks similar. The difference is everything that breaks when a real user comes back for session 3 or 4: context drifts, scores stop calibrating, plans don't adapt. None of that shows up in a demo. You only find it after sitting in the same codebase for weeks, running real sessions, and getting confused by your own data. That's the friction the post is talking about, and I don't think you can skip it.
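To make "context drift" concrete, here's a toy sketch (all names and numbers hypothetical, nothing like our actual code) of a session loop that persists state. The naive version appends every summary forever and never recalibrates its score baseline, which is roughly what breaks by session 3 or 4:

    import json
    from pathlib import Path

    STATE = Path("coach_state.json")

    def load_state() -> dict:
        # State survives across runs; an empty baseline means session 1.
        if STATE.exists():
            return json.loads(STATE.read_text())
        return {"summaries": [], "score_baseline": None}

    def build_context(state: dict, max_summaries: int = 3) -> str:
        # Joining state["summaries"] in full lets stale early sessions keep
        # steering new ones -- the drift. A recency window is one crude guard.
        return "\n".join(state["summaries"][-max_summaries:])

    def end_session(state: dict, summary: str, score: float) -> None:
        state["summaries"].append(summary)
        # Recalibrate the baseline against recent sessions (a simple EMA)
        # instead of trusting session 1 forever; otherwise scores stop
        # meaning the same thing a few sessions in.
        prev = state["score_baseline"]
        state["score_baseline"] = score if prev is None else 0.8 * prev + 0.2 * score
        STATE.write_text(json.dumps(state, indent=2))

The window and the EMA are crude, but that's the point: you only find out you need them by running real sessions, not from a weekend demo.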
I like the framing of "context drift": it describes the problem in LLM/agent terms, similar to how "tech debt" describes the same mechanism in business terms.