This is easily solved with good error messages.
Claude always gets the syntax wrong on my tool calls.
So I did a revolutionary thing and made the error output print helpful guidance on how to correctly call the tool.
The agent tries again and always gets it right. Total time “wasted”: 1-2 seconds. It happens every session, but it only happens once per context window. After that the agent holds on to the lesson.
To do this for your own tool calls, imagine what you’d do in the agent’s place - what info you’d need so you can correct your mistake. Assume the agent wants to achieve the goal so it’ll try again. These are probabilistic systems, so we need to give them an extra loop to get the deterministic bits right.
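Here is a minimal sketch of that idea in Python. Everything in it is an assumption for illustration: the `read_file` tool, its `path` and `max_lines` arguments, and the helper names are hypothetical and not taken from any real agent framework. The point is only that a failed call returns what was wrong plus an example of a correct call, so the agent can fix it on the next try.

```python
import json

# Hypothetical tool schema: the tool name and its fields are illustrative only.
SCHEMA = {
    "path": str,       # required
    "max_lines": int,  # required
}
EXAMPLE_CALL = {"path": "src/main.py", "max_lines": 50}


def validate_call(args: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the call is valid."""
    problems = []
    for name, expected in SCHEMA.items():
        if name not in args:
            problems.append(f"missing required argument '{name}' ({expected.__name__})")
        elif not isinstance(args[name], expected):
            got = type(args[name]).__name__
            problems.append(f"argument '{name}' must be {expected.__name__}, got {got}")
    for name in args:
        if name not in SCHEMA:
            problems.append(f"unknown argument '{name}'; valid arguments: {', '.join(SCHEMA)}")
    return problems


def call_tool(args: dict) -> str:
    """Run the (pretend) tool, or return an error that explains how to call it correctly."""
    problems = validate_call(args)
    if problems:
        lines = ["Error: invalid call to read_file."]
        lines += [f"- {p}" for p in problems]
        # Show a known-good call so the agent can copy the shape on its retry.
        lines.append("Correct usage example: " + json.dumps(EXAMPLE_CALL))
        return "\n".join(lines)
    return f"(would read up to {args['max_lines']} lines from {args['path']})"


if __name__ == "__main__":
    print(call_tool({"file": "a.txt", "max_lines": "10"}))
```

The error text goes back to the agent as the tool result, so the retry costs one extra turn and the correct call shape stays in the context window afterwards.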
I've been trying to push for this perspective about the error messages of jj vcs. There's some pushback from people who don't see that making tools work well with LLMs also makes them work well with humans. (Obviously there's more nuance to the arguments than this one-sided perspective.)
This will cause an extra round trip to the LLM, which means more $ spent.
So, are you saying that skills are not such a good way for agents to learn, and that they still need the tool trial-and-error dance after the skills are injected? (I'm assuming each tool comes with its own skill.)
LSPs and linters serve the same purpose. I use the latter in git hooks.
I've built a library that makes creating rich feedback systems easier. Check it out:
https://tool2agent.org/