Very cool tool. As the "moar tokens" era starts to wind down, I think people are going to realize just how crappy these harnesses really are, especially Claude Code.
I have gone back and forth between Claude and Cursor, and it is clear Claude just throws the kitchen sink at problems to get an edge. I write MCP tools and I see these exact problems: when the inputs and outputs aren't clearly defined, the LLM just guesses and retries.
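To make the "clearly defined inputs and outputs" point concrete, here is a rough sketch of what I mean. It is plain Python and does not use any real MCP SDK; the tool name `search_issues`, its fields, and the `ToolResult` shape are all made up for illustration. The idea is just that the contract is explicit and a bad call gets a specific error back instead of something the model has to guess at.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical tool contract. Field names are illustrative only and are
# not taken from any real MCP server or SDK.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "Free-text search terms"},
        "limit": {"type": "integer", "minimum": 1, "maximum": 50},
    },
    "required": ["query"],
    "additionalProperties": False,
}


@dataclass
class ToolResult:
    """Fixed output shape: the caller always gets the same three keys."""
    ok: bool
    items: list
    error: Optional[str] = None


def validate_args(args: dict) -> Optional[str]:
    """Return a specific error message, or None if args match the contract."""
    unknown = set(args) - set(INPUT_SCHEMA["properties"])
    if unknown:
        return f"unknown argument(s): {sorted(unknown)}"
    query = args.get("query")
    if not isinstance(query, str) or not query.strip():
        return "'query' is required and must be a non-empty string"
    limit = args.get("limit", 10)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 50:
        return "'limit' must be an integer between 1 and 50"
    return None


def search_issues(args: dict) -> dict:
    """Hypothetical tool handler: validate first, then return a fixed shape."""
    error = validate_args(args)
    if error:
        return asdict(ToolResult(ok=False, items=[], error=error))
    limit = args.get("limit", 10)
    # Placeholder data standing in for a real backend call.
    items = [{"id": i, "title": f"{args['query']} #{i}"} for i in range(1, limit + 1)]
    return asdict(ToolResult(ok=True, items=items))
```

With something like this, a call such as `search_issues({"q": "bug"})` comes back with `ok: False` and an error naming the unknown argument, so the model has something concrete to correct on the next attempt rather than retrying blind.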