Hacker News

monkeydust, last Thursday at 11:09 AM (1 reply)

I have been thinking about this a bit. Rather than rely on one model, have an agentic setup that takes a question, runs it against the top 3 models, and then has another model judge the responses and decide what to give back.

Is anyone doing this for high-stakes questions / research?

The argument against is that the models are fairly 'similar', as outlined in one of the awarded papers from NeurIPS '25: https://neurips.cc/virtual/2025/loc/san-diego/poster/121421
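For concreteness, here is a minimal sketch of the fan-out-and-judge setup described in this comment. Everything in it is an assumption for illustration: a "model" is just a Python callable from prompt to text (a stand-in for whatever API client you use), and the names `fan_out`, `build_judge_prompt`, and `ask_with_judge` are hypothetical. The judge is asked to reply with an answer number, and the first answer is used if its reply cannot be parsed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical stand-in: a "model" is any function from prompt text to answer text.
Model = Callable[[str], str]


def fan_out(question: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Ask every candidate model the same question in parallel."""
    if not models:
        raise ValueError("need at least one model")
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in models.items()}
        return {name: fut.result() for name, fut in futures.items()}


def build_judge_prompt(question: str, answers: Dict[str, str]) -> str:
    """Number the answers instead of naming models, so the judge sees no model names."""
    lines = [f"Question: {question}", ""]
    for i, name in enumerate(sorted(answers), start=1):
        lines.append(f"Answer {i}: {answers[name]}")
    lines.append("")
    lines.append("Reply with only the number of the best answer.")
    return "\n".join(lines)


def ask_with_judge(question: str, models: Dict[str, Model], judge: Model) -> str:
    """Fan the question out, let the judge pick, and return the chosen answer."""
    answers = fan_out(question, models)
    names = sorted(answers)  # same order used when numbering the answers
    verdict = judge(build_judge_prompt(question, answers))
    try:
        index = int(verdict.strip()) - 1
    except ValueError:
        index = 0  # assumption: fall back to the first answer on an unparseable verdict
    if not 0 <= index < len(names):
        index = 0
    return answers[names[index]]
```

Note that this does nothing about the 'similarity' objection: if the three models share the same blind spots, the judge is choosing among correlated answers.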


Replies

Workaccount2, last Thursday at 2:46 PM

I often put the models in direct conversation with each other to work out a framework or solution. It works pretty well, but they do tend to glaze each other a bit.
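A rough sketch of what "direct conversation" could look like in code, assuming each model is a plain callable from prompt to text (a hypothetical stand-in for a real API client). The function name `converse` and the instruction asking each side to point out flaws before agreeing are my own additions, the latter as one possible way to damp the mutual flattery.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in: a "model" is any function from prompt text to reply text.
Model = Callable[[str], str]


def converse(topic: str, model_a: Model, model_b: Model,
             rounds: int = 3) -> List[Tuple[str, str]]:
    """Alternate turns between two models, feeding each the transcript so far."""
    transcript: List[Tuple[str, str]] = []
    speakers = [("A", model_a), ("B", model_b)]
    for turn in range(rounds * 2):  # one round = one turn for each model
        name, model = speakers[turn % 2]
        history = "\n".join(f"{who}: {text}" for who, text in transcript)
        prompt = (
            f"Topic: {topic}\n"
            f"{history}\n"
            # Assumed instruction, not from the thread: push back before agreeing.
            "Respond to the other participant. Point out concrete flaws before agreeing."
        )
        transcript.append((name, model(prompt)))
    return transcript
```

The returned transcript is a list of `(speaker, text)` pairs, so the final turn or the whole exchange can be handed to a person or another model for review.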