If you added a few more tools that let the LLM write and modify the code files that directly serve requests, future responses would be much faster and more consistent. The code would act like memory, and a request that has to go all the way to the LLM would be like a cache miss. You could still keep the feedback mechanism as a bypass that triggers an update to the code. Perhaps code just becomes a store of consistency for LLMs over time.
Create instructions and add boundaries on how it can grow, and you end up with a seed.
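To make the code-as-cache idea concrete, here is a rough sketch of what I mean, not a claim about how the original experiment works. Everything in it is made up for illustration: `fake_llm_generate` stands in for a real LLM call, and `CodeCache`, `serve` and `bypass` are hypothetical names. A stored handler is a cache hit, a missing one is a cache miss that goes to the LLM, and the feedback path is the bypass that forces the code to be regenerated.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def fake_llm_generate(route: str) -> str:
    """Hypothetical stand-in for an LLM call; returns Python source for a handler."""
    return (
        "def handle(request):\n"
        f"    return 'response for {route} with ' + str(sorted(request.items()))\n"
    )


class CodeCache:
    """Serves requests from generated code, falling back to the LLM on a miss."""

    def __init__(self, generate: Callable[[str], str] = fake_llm_generate) -> None:
        self.generate = generate
        self.handlers: Dict[str, Handler] = {}
        self.llm_calls = 0  # counts cache misses and forced regenerations

    def _regenerate(self, route: str) -> Handler:
        # Cache miss (or bypass): ask the LLM for handler source and store it.
        self.llm_calls += 1
        source = self.generate(route)
        namespace: dict = {}
        # Assumption: generated code is trusted. A real system would sandbox this.
        exec(source, namespace)
        handler = namespace["handle"]
        self.handlers[route] = handler
        return handler

    def serve(self, route: str, request: dict, bypass: bool = False) -> str:
        # bypass=True models the feedback mechanism: skip the stored code
        # and have the LLM rewrite it.
        handler = None if bypass else self.handlers.get(route)
        if handler is None:
            handler = self._regenerate(route)
        return handler(request)
```

Calling `serve("/users", {"id": 1})` twice on one `CodeCache` hits the LLM stub only once, and the second call runs the stored code, which is where the speed and consistency would come from. Passing `bypass=True` regenerates the handler.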
This was an unserious experiment meant to illustrate the gaps and bottlenecks that are still there. I agree that there's a lot that could be done to optimize this kind of approach. But even if you did, I'm not sure the results would be viable, and I'm pretty sure classic coding (with LLM assistance and all) would still outperform such a product.