"Do you think you need to do high/medium/low amount of thinking to answer X?" seems well within an LLMs wheelhouse if the goal is to build an optimized routing engine.
How do you think an LLM could come by that information? Do you think LLM vendors are logging performance and feeding that back into the model, or is there some other mechanism?