Nono, you of course CAN teach tool use at inference, but it's different than doing so at traini...

TZubiri • today at 2:39 AM • 1 reply • view on HN

Nono, you of course CAN teach tool use at inference, but it's different than doing so at training time, and the models are trained to call specific tools right now.

Also MCP is not an Agent protocol, it's used in a different category. MCP is used when the user has a chatbot, sends a message, gets a response. Here we are talking about the category of products we would describe as Code Agents, including Claude Code, ChatGPT Codex, and the specific models that are trained for use in such contexts.

The idea is that of course you can tell it about certain tools in inference, but in code production tasks the LLM is trained to use string based tools such as grep, and not language specific tools like Go To Definition.

My source on this is Dax who is developing an Open Source clone of Claude Code called OpenCode

Replies

oofbey • today at 6:03 AM

Claude code and cursor agent and all the coding agents can and do run MCP just fine. MCP is effectively just a prompt that says “if you want to convert a binary to hex call the ‘hexdump’ tool passing in the filename” and then a promise to treat specially formatted responses differently. Any modern LLM that can reason and solve math problems will understand and use the tools you give it. Heck I’ve even seen LLMs that were never trained to reason make tool calls.

You say they’re better with the tools they’re trained on. Maybe? But if so not much. And maybe not. Because custom tools are passed as part of the prompt and prompts go a long way to override training.

LLMs reason in text. (Except for the ones that reason in latent space.) But they can work with data in any file format as long as they’re given tools to do so.

alt Hacker News

Replies