> do you mean they have branching early on to shortcut certain prompts?
Putting a classifier in front of a fleet of different models is a great way to provide higher quality results and spend less energy. Classification is significantly cheaper than generation and it is the very first thing you would do here.
A default, catch-all model is very expensive, but handles most queries reasonably well. The game from that point is to aggressively intercept prompts that would hit the catch-all model with cheaper, more targeted models. I have a suspicion that OAI employs different black boxes depending on things like the programming language you are asking it to use.
Aren't you describing why they use mixture of experts? Where a sub-set of weights are activated depending on the query?