Distilling does not require the models' weights to be released. They simply use the APIs.
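A minimal sketch of what API-based distillation looks like: the teacher is only reachable through an endpoint, and its outputs become the student's training data. `query_teacher` is a hypothetical stand-in for a real API client, faked here so the sketch runs offline.

```python
def query_teacher(prompt: str) -> str:
    # In practice this would be an HTTP call to the provider's API;
    # here we fake a deterministic reply so the sketch is runnable.
    return f"teacher answer to: {prompt}"

def build_distillation_set(prompts):
    # The student never sees the teacher's weights, only its outputs:
    # (prompt, teacher_response) pairs become supervised training data.
    return [(p, query_teacher(p)) for p in prompts]

prompts = ["What is overfitting?", "Explain gradient descent."]
dataset = build_distillation_set(prompts)
```

Each pair is then used to fine-tune the smaller student model, which is why no access to the teacher's weights is needed.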
They have been a source of innovation, but probably not in training models from scratch.
They just lack performant hardware; they have enough expertise. So they chose the more effective strategy instead of wasting resources on training from scratch.