I'm starting to think that the next step in this process will (or at least should) be local and/or self-hosted inference.
The latest Qwen models are already very useful, and the smaller ones run locally on my laptop. They're obviously not as good as the latest frontier models, and the gap is especially noticeable in a development workflow, but in a year or two they may well be competitive with today's proprietary models, which are already remarkably capable. I also expect inference compute to keep getting cheaper.
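For concreteness, here's a minimal sketch of what local inference with a small Qwen checkpoint looks like using Hugging Face transformers. The specific model id, prompt, and generation settings are just illustrative; any small instruct-tuned checkpoint that fits in laptop memory works the same way.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# The model id is an example; swap in whatever fits your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-1.5B-Instruct"  # illustrative small checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "Write a Python function that reverses a string."}
]
# Build the chat-formatted prompt the model was instruction-tuned on.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate and print only the newly produced tokens.
outputs = model.generate(**inputs, max_new_tokens=256)
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
)
```

Nothing fancy, and on a laptop it's slow compared to a hosted API, but it runs entirely offline, which is the whole point.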
The current lock-in for me is the UX of Claude Code / Codex CLI, but that's a very small moat that will definitely be commoditized soon.