I have converged on a workflow that is basically what I was doing before, except I use LLMs only for the boring or tedious parts. General guidelines: only a single agent at a time, small targeted queries, understand what you are building, and if something seems like a fun task, do it yourself. I use LLMs for bug hunting, to trace the flow, to build quick visualizations (paste a CSV, ask it to generate a visualization), to search the code of dependencies using the GitHub MCP, and to write 100-line scripts (deno + ts + zx was a game changer for me). Even "dumber" open-source models are good for this kind of workflow; more tokens per second is generally more beneficial than raw intelligence. How much I use LLMs varies, sometimes up to full vibecoding if the task is something like a quick tooling web app and the flow is just firing off the next LLM query every 30 seconds. But depending on the type of task or the domain you work in, YMMV.
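
To give a sense of scale for the "paste CSV, get a visualization" kind of task, here is a minimal sketch in plain TypeScript. The function name `csvToBars` and the two-column `label,value` layout with a header row are assumptions made for illustration, not something from a real project. It has no quoting or escaping support, which is usually fine for a throwaway script.

```typescript
// Hypothetical helper: turn pasted "label,value" CSV text into text bars.
// Assumes a header row and no quoted or escaped commas.
function csvToBars(csv: string, width = 20): string[] {
  const rows = csv
    .trim()
    .split("\n")
    .slice(1) // skip the header row
    .map((line) => {
      const [label = "", raw] = line.split(",");
      return { label: label.trim(), value: Number(raw) };
    })
    .filter((row) => Number.isFinite(row.value)); // drop blank or malformed lines

  // Scale every bar against the largest value; pad labels to a common width.
  const max = Math.max(0, ...rows.map((r) => r.value));
  const pad = Math.max(0, ...rows.map((r) => r.label.length));

  return rows.map((r) => {
    const len = max > 0 ? Math.max(0, Math.round((r.value / max) * width)) : 0;
    return `${r.label.padEnd(pad)} ${"#".repeat(len)} ${r.value}`;
  });
}

console.log(csvToBars("name,count\nfoo,10\nbar,5", 10).join("\n"));
```

Saved as a `.ts` file, this should run directly with `deno run`. Something this small is easy to read in full before trusting it, which fits the "understand what you are building" guideline.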