I dunno. I just let Claude build a python script that calls Claude code though subprocess.run().
I recently made a sort of Autoresearch with that approach. The script calls Claude Code to create a hyphotesis, then code based on that, evaluate- rinse and repeat. I am still trying to figure out if I am actually on to something or just burning tokens. Jury is still out.
I think you're onto something, but I would add that it's sort of like a live REPL that has an integrated agent but with extra steps.
I haven't used python much but I wouldn't be surprised if you can set up a sufficiently powerful REPL with it. I know Julia can do it very well and it's a very similar language. Obviously there are powerful Lisps that do this very well as well.
That's totally a valid approach! Especially for a very specific workflow you are looking for. For the cases I cover in cook, I had done those patterns enough times that I figured it was time to build a tool/skill for Claude so that I didn't have to explain it as much and also not have to wait for claude to code it up, and possibly interpret me wrong. Now ask claude to "/cook race 3 of foo plan with review, pick the best" and it knows what to do.
[dead]
I went quite far down this approach last year; you're welcome to take what you want from my repo -
https://github.com/riazarbi/way