This mirrors my experience trying to integrate LLMs into production pipelines.
The issue seems to be that LLMs treat code as a literary exercise rather than a graph problem. Claude is fantastic at the syntax and local logic ('assembling blocks'), but it lacks the persistent global state required to understand how a change in module A implicitly breaks a constraint in module Z.
Until we stop treating coding agents as 'text predictors' and start grounding them in an actual AST (Abstract Syntax Tree) or dependency graph, they will remain helpful juniors rather than architects.