Hacker News

fishpham, yesterday at 4:16 PM

Those won’t be sufficient to run SOTA/trillion-parameter models.


Replies

Zambyte, yesterday at 4:24 PM

And most tasks don't demand that.

general1465, yesterday at 4:28 PM

Distilled models are good enough.