i’ve often thought that less than one second is all you need. One of my fun superpowers, when someone asks what i’d like to have, is being 1 second ahead of everyone else — that’s all i need. i honestly don’t know where the distillation conversation is at. is it real, is it ongoing? i think that aspect would be a big one. Your point is valid if it’s valid. i’m not a great global citizen, you know, lots going on out and about.
A lot of distillation happens. E.g. the OLMo models have a completely open dataset and they are heavily distilled. It only makes sense to try to absorb behaviors from the best models out there. That said, I think the open-weight juggernauts are doing genuinely great work with RL, training environments, architectural innovations, etc.