Hacker News

gwern, today at 4:47 AM (1 reply)

Ensembling is neither compute- nor parameter-efficient, so compression per se is a terrible application. (This is related to why people train ever-larger LLMs, such as a single 10T-parameter LLM, rather than 100 GPT-3-scale LLMs.)


Replies

SubiculumCode, today at 6:43 AM

Yeah.