Nice experiment!
I'm using a similar approach in an app I'm building. Seeing how well it works, I now really believe that in the coming years we'll see a lot of "just-in-time generation" for software.
If you haven't already, you should try using qwen-coder on Cerebras (or kimi-k2 on Groq). They are _really_ fast, possibly fast enough to make the whole thing actually viable.