logoalt Hacker News

stavrosyesterday at 4:27 PM1 replyview on HN

Yeah, maybe, but then it would make much more sense to run a big model than hope one of the small ones randomly stumbles upon the solution, just because the possibility space is so much larger than the number of dumb LLMs you can run.


Replies

abeppuyesterday at 4:59 PM

I don't work this way, so this is all a hypothetical to me, but the possibility space is larger than _any_ model can handle; models are effectively applying a really complex prior over a giant combinatorial space. I think the idea behind a swarm of small models (probably with higher temperature?) on a well-defined problem is akin to e.g. multi-chain MCMC.