In a month or three we’ll have the sensible approach: smaller, cheaper, fast models optimized for looking at a query and identifying which skills / context to provide in full to the main model.
It’s really silly to waste big-model tokens on throat-clearing steps.
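To make the idea concrete, here’s a minimal sketch of that routing pattern. Everything here is hypothetical (the skill names, the `route` function, the keyword matching standing in for a small model); a real router would be a fast classifier or small LLM, not string matching.

```python
# Hypothetical sketch: a cheap "router" step inspects the query and picks
# which skill docs to load in full before the expensive model sees anything.

SKILLS = {
    "sql": "Full SQL skill doc: schema conventions, query style guide...",
    "plotting": "Full plotting skill doc: chart defaults, color rules...",
    "deploy": "Full deploy skill doc: CI steps, rollback procedure...",
}

def route(query: str) -> list[str]:
    """Stand-in for the small, cheap model: return the skill keys relevant
    to the query. Keyword matching here is just a placeholder."""
    keywords = {
        "sql": ["sql", "query", "table", "join"],
        "plotting": ["plot", "chart", "graph"],
        "deploy": ["deploy", "release", "rollback"],
    }
    q = query.lower()
    return [skill for skill, words in keywords.items()
            if any(w in q for w in words)]

def build_prompt(query: str) -> str:
    """Only the selected skills are expanded into the big model's context,
    so big-model tokens aren't spent deciding what's relevant."""
    selected = route(query)
    context = "\n\n".join(SKILLS[s] for s in selected)
    return f"{context}\n\nUser query: {query}"

print(route("Write a SQL join for the orders table"))  # ['sql']
```

The point is the shape, not the matcher: the selection step runs on something cheap, and the main model only ever sees the subset it needs.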
I thought most of the major AI programming tools were already doing this. Isn’t this what subagents are in Claude Code?