Just look at the code quality produced by these loops. That's all you need to know about it.
It's complete garbage, and since it runs in a loop, the amount of garbage multiplies over time.
You probably wouldn't use it for anything serious, but I've Ralphed a couple of personal tools: Mac menu bar apps mostly. It works reasonably well so long as you do the prep upfront and prepare a decent spec and plan. No idea of the code quality because I wouldn't know good swift code from a hole in the head, but the apps work and scratch the itch that motivated them.
I do not understand where this Ralph hype is coming from. Back when Claude 4.0 came out and it began to become actually useful, I already tried something like this. Every time it was a complete and utter failure.
And this dream of "having Claude implement an entire project from start to finish without intervention" came crashing down with this realization: Coding assistants 100% need human guidance.
I don’t think anyone serious would recommend it for serious production systems. I respect the Ralph technique as a fascinating learning exercise in understanding llm context windows and how to squeeze more performance (read: quality) from today’s models
Even if in the absolute the ceiling remains low, it’s interesting the degree to which good context engineering raises it