gwern • today at 4:47 AM

Ensembling is neither compute- nor parameter-efficient, so compression per se is a terrible application. (This is related to why people train ever-larger LLMs, such as a single 10T-parameter LLM, rather than 100 GPT-3-scale LLMs.)

Replies

SubiculumCode • today at 6:43 AM

Yeah.
