+1 thanks for mentioning MCP!
re: different tools (APIs vs MCPs). in my mind, it shouldn't really matter what kind of tool is being called, since for now I model tool selection as a softmax over a label set of tools.
that said, an idea I want to investigate is whether tools can live in a learned embedding space, where selection isn’t a softmax over discrete labels but a nearest-neighbor or attention mechanism over continuous vectors.
this is the intuition I'm developing as we speak, both here and in some of my other comments on this thread (see the differentiable state machine comment).
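
to make the contrast concrete, here's a rough sketch of the two selection styles side by side. everything in it is made up for illustration: the tool names, the 3-d embeddings, the `temperature` value, and the function names are all hypothetical, and the vectors are hand-written rather than learned. it's a toy under those assumptions, not how I've actually implemented anything.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical tools with hand-written 3-d embeddings (assumption: in a real
# system these vectors would be learned, not typed in by hand).
TOOL_EMBEDDINGS = {
    "weather_api": [1.0, 0.0, 0.0],
    "mcp_file_search": [0.0, 1.0, 0.0],
    "calculator": [0.0, 0.0, 1.0],
}

def select_by_label(logits, tool_names):
    """Discrete view: one logit per tool label, softmax, take the argmax.
    Whether a label is an API or an MCP tool makes no difference here."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return tool_names[best], probs

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def select_by_embedding(query, tool_embeddings, temperature=0.1):
    """Continuous view: score each tool by cosine similarity to a query
    vector. The argmax is a nearest-neighbor lookup; the softmax over the
    scaled similarities is an attention-style soft weighting."""
    names = list(tool_embeddings)
    sims = [cosine(query, tool_embeddings[n]) for n in names]
    weights = softmax([s / temperature for s in sims])
    best = max(range(len(names)), key=sims.__getitem__)
    return names[best], dict(zip(names, weights))

if __name__ == "__main__":
    names = list(TOOL_EMBEDDINGS)
    print(select_by_label([0.1, 2.0, -1.0], names)[0])
    print(select_by_embedding([0.1, 0.9, 0.2], TOOL_EMBEDDINGS)[0])
```

one thing the toy does show: in the embedding version, adding a tool is just adding a vector to the dict, whereas the label version needs a new output logit for every new tool.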