This feels less like "agents" and more like a controlled generate → execute → fix loop.
Works great when you have a clear verification signal (tests passing), but what drives convergence when that signal isn’t well-defined?