If you saw the Claude Code leak, you’d know the harness is anything but simple. It’s a sprawling, labyrinthine mess, but it’s required to make LLMs somewhat deterministic and useful as tools.
That’s also because of how Claude Code was written. It doesn’t have to be that way per se.
Hypothesis: it's a sprawling, labyrinthine mess because it was grown at high speed using Claude Code.
It's pretty easy to get determinism with a simple harness for a well-defined set of tasks with the recent models that are post-trained for tool use. CC probably gets some bloat because it tries to do a LOT more; and some bloat because it's grown organically.