Seems plausible, but the overall architecture is still the same: your request has to be "routed" by some NN, and if that gets stuck picking a node/"expert" (regardless of "tools" & "veracity" scoring) that keeps incorrectly refusing the request, then getting unstuck is highly non-trivial b/c users are not given a choice in what weights are assigned to the "experts". It's magic that OpenAI is performing behind the scenes, and no one has any visibility into it.
I think maybe you mean something else when you say MoE. I interpret that as “Mixture of Experts”, which is a model type where a learned router is applied per layer (and per token) to pick which block of that layer's weights actually gets used in the matmul. The “experts” are the weight blocks that get selected, but calling them experts kinda muddies the waters IMO; it’s really just a sparsification strategy. With that kind of MoE you would almost certainly get different routing behavior as you added to the context.
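To make the "it's just sparsification" point concrete, here is a rough toy sketch of top-k routing for a single token in one layer. Everything in it is made up for illustration (the `moe_layer` name, the 4 experts, the 2-dim hidden state, the hand-picked weights); it is not how any particular production model is implemented, just the general shape of the idea.

```python
import math

def matvec(m, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(xs):
    peak = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_layer(x, router, experts, k=2):
    """Toy top-k MoE layer for one token vector x (hypothetical helper)."""
    logits = matvec(router, x)  # one score per expert
    # Indices of the k highest-scoring experts for this token.
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalize the gate weights over the selected experts only.
    gates = softmax([logits[i] for i in chosen])
    out = [0.0] * len(experts[0])
    for gate, idx in zip(gates, chosen):
        y = matvec(experts[idx], x)  # only the selected experts do any work
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen

# Assumed toy weights: 4 experts, 2-dim hidden state.
router = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
# Expert i is just "scale the input by i + 1".
experts = [[[s, 0.0], [0.0, s]] for s in (1.0, 2.0, 3.0, 4.0)]

_, picked_a = moe_layer([2.0, 1.0], router, experts)
_, picked_b = moe_layer([-1.0, -2.0], router, experts)
print(picked_a, picked_b)  # different hidden states pick different experts
```

The selection depends on the token's hidden state at that layer, which is why the routing shifts as the context changes. Nothing here is a separate "model" that could refuse anything on its own.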
I might be misunderstanding you, but it seems like you think there are multiple models, with one dispatching to the others? I’m not sure what that sort of multi-agent architecture is called, but I think those would be modeled as tool calls (and I do believe the image-related stuff is handled by specialized models).
In any case, I am saying that GPT5 (or whichever) is the one actually refusing the request. It is making that decision, and it only updates its behavior after getting higher-trust data in its context that confirms the user’s words.