I have been experimenting with multi-agent LLMs for the last month. As I put in the writeup for my repo and in the video, the biggest value I have found is in running a bunch of different agentic strategies in parallel and then having a judge review the variance between them. So far that has uncovered interesting insights. The rest of it is so-so. It's been fun but also expensive!
Repo with video: https://github.com/monkeydust/rightmind
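
For anyone wondering what "parallel strategies plus a judge" looks like in practice, here is a rough sketch of the shape of it. This is not the code from the repo. The strategy functions (`direct_answer`, `step_by_step`, `devils_advocate`) and the `judge` and `review` helpers are hypothetical names, the strategies return canned answers instead of calling a model, and the judge is a plain vote count standing in for an LLM that would actually read and compare the outputs.

```python
import asyncio
from collections import Counter

# Hypothetical stand-ins for real agent strategies. In practice each one
# would call an LLM with a different prompt, toolset, or agent loop.
async def direct_answer(question: str) -> str:
    return "yes"

async def step_by_step(question: str) -> str:
    return "yes"

async def devils_advocate(question: str) -> str:
    return "no"

STRATEGIES = {
    "direct": direct_answer,
    "step_by_step": step_by_step,
    "devils_advocate": devils_advocate,
}

async def run_all(question: str) -> dict[str, str]:
    """Run every strategy concurrently and collect answers by strategy name."""
    names = list(STRATEGIES)
    answers = await asyncio.gather(*(STRATEGIES[n](question) for n in names))
    return dict(zip(names, answers))

def judge(answers: dict[str, str]) -> dict:
    """Toy judge: report the majority answer and which strategies disagree.

    Assumption: a real judge would be another model reviewing the
    differences, not an exact-match vote count.
    """
    counts = Counter(answers.values())
    majority, votes = counts.most_common(1)[0]
    dissenters = sorted(n for n, a in answers.items() if a != majority)
    return {
        "majority": majority,
        "agreement": votes / len(answers),
        "dissenters": dissenters,
    }

def review(question: str) -> dict:
    """Convenience wrapper: run all strategies, then judge the spread."""
    return judge(asyncio.run(run_all(question)))
```

Calling `review("Should we ship?")` returns the majority answer, the fraction of strategies that agreed, and the names of the ones that did not. The dissenters are the interesting part, since they are where a real judge would dig in. Running everything in parallel is also why the cost adds up: every question is paid for once per strategy, plus the judge.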