Hacker News

nl · today at 4:26 AM

Model distillation is very useful!

Put it like this: Reinforcement Learning from Human Feedback (RLHF) produces useful results with only hundreds of examples, and LLM distillation (training one model on another model's outputs) is essentially the same kind of fine-tuning.
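For anyone unfamiliar with the mechanics: the classic formulation of distillation trains the student to match the teacher's temperature-softened output distribution via a KL-divergence loss. A minimal NumPy sketch of that loss (function names and the temperature value are my own choices, not from any particular library):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    A higher temperature exposes more of the teacher's 'dark knowledge'
    (relative probabilities among non-top classes).
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s))))

# The loss is zero when the student exactly matches the teacher,
# and grows as the student's distribution diverges.
teacher = np.array([2.0, 1.0, 0.1])
student = np.array([0.1, 1.0, 2.0])
print(distillation_loss(teacher, teacher))  # ~0.0
print(distillation_loss(student, teacher))  # > 0
```

In the LLM setting, "distillation" often just means supervised fine-tuning on text the teacher generated, which is why it needs so little data: each teacher output is a dense, high-quality training signal.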