Hacker News

epolanski, last Thursday at 6:00 PM

Frontier models aren't released, they are closed source.

And the Chinese have been a huge source of innovation in the field.


Replies

jdmoreira, last Thursday at 6:04 PM

Distilling does not require the models to be released. They simply use the APIs.

They have been a source of innovation but probably not in training them.


Tostino, last Thursday at 6:06 PM

You can do a rough distill through the APIs. You don't need the weights.

It was much easier when companies exposed models through /completion-style APIs, because you could actually get the logits for each generation step and use them as a dataset to fit your model to.
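To make the "fit your model to the logits" idea concrete, here is a rough sketch of a per-step distillation loss. It is not anyone's actual pipeline. It assumes the API hands back a small set of top-k token log-probabilities per generation step, that teacher and student share a vocabulary, and that restricting the loss to the returned tokens is acceptable. The function names and the toy values are made up for illustration.

```python
import math


def softmax(logits):
    """Turn a {token: logit} dict into a {token: probability} dict."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}


def distill_step_loss(teacher_top_logprobs, student_logits):
    """KL(teacher || student) for one generation step.

    teacher_top_logprobs: {token: logprob} for the top-k tokens the API
        returned (assumption: only a truncated top-k is available).
    student_logits: {token: logit} over the student's vocabulary
        (assumption: it contains every token the teacher returned).
    """
    # Renormalize the teacher over the tokens we actually observed.
    probs = {tok: math.exp(lp) for tok, lp in teacher_top_logprobs.items()}
    z = sum(probs.values())
    teacher = {tok: p / z for tok, p in probs.items()}

    # Compare against the student restricted to the same tokens.
    student = softmax({tok: student_logits[tok] for tok in teacher})

    return sum(
        p * (math.log(p) - math.log(student[tok]))
        for tok, p in teacher.items()
        if p > 0.0
    )


# Toy values, purely hypothetical.
teacher_step = {"cat": math.log(0.7), "dog": math.log(0.2), "cow": math.log(0.05)}
student_step = {"cat": 1.0, "dog": 1.5, "cow": -0.5, "emu": 0.0}
loss = distill_step_loss(teacher_step, student_step)
```

Averaging this loss over every step of every sampled completion gives the training objective. Without per-step log-probabilities you are left with only the sampled tokens, which carry far less signal per request.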

That isn't to diminish the efforts of the Chinese developers though, they are great.