Does anyone have hints on what kinds of prompts are most used for a distillation like this—SWE-Bench sorts of things?
Is reconstructing the compressed knowledge in the model like reconstructing a lossy JPG or MP3 a reasonable analogy?
There are some Claude datasets (of indeterminate provenance) floating around on Hugging Face that you can look at (or at least there used to be; they might've been taken down).
RLAIF (reinforcement learning from AI feedback) is a good place to start reading.
Claude will also help you (with mostly good advice) if you ask something like “Research and help me make the most effective plan to train a smaller student model to be better from a teacher model”.
I was actually doing a GLM->Gemma E4B experiment for fun, and Claude kept suggesting I also add Claude Opus as a teacher lol. It also suggested techniques I hadn’t heard of, like thinking inversion (training a small model to expand summarised thinking into the detailed native thinking format of the student).
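A rough sketch of what the data prep for that "thinking inversion" idea might look like, going only on the one-line description above. Assumptions, none of which come from Claude's actual suggestion: the pairs are made by summarising traces you already have in the student's native thinking format, and the helper names, the instruction wording, and the prompt/completion record layout are all hypothetical.

```python
import json

# Hypothetical instruction text; the wording is an assumption, not a known recipe.
INSTRUCTION = "Expand this reasoning summary into full step-by-step thinking:\n"


def build_inversion_example(summary: str, detailed: str) -> dict:
    """Build one supervised pair: summarised thinking in, detailed thinking out."""
    return {"prompt": INSTRUCTION + summary, "completion": detailed}


def to_jsonl(pairs) -> str:
    """Serialise (summary, detailed) pairs as one JSON record per line."""
    return "\n".join(
        json.dumps(build_inversion_example(summary, detailed))
        for summary, detailed in pairs
    )


if __name__ == "__main__":
    # Toy pair: assumes the detailed trace is already in the student's format.
    pairs = [
        (
            "Factor the quadratic, then take the larger root.",
            "x^2 - 5x + 6 factors as (x - 2)(x - 3), so the roots are 2 and 3. "
            "The larger root is 3.",
        )
    ]
    print(to_jsonl(pairs))
```

The resulting JSONL would feed whatever fine-tuning setup you already use; the trained "inverter" could then turn a teacher's summarised thinking into detailed traces for the student, if that reading of the technique is right.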