logoalt Hacker News

Tostinolast Thursday at 6:06 PM0 repliesview on HN

You can do a rough distill through the APIs. You don't need the weights.

It was much easier when companies had models on the /completion style APIs, because you could actually get the logits for each generation step, and use that as a dataset to fit your model to.

That isn't to diminish the efforts of the Chinese developers though, they are great.