Why do you need logits to distill? Logits are tokenizer-dependent at the very least, since each one is a score for a specific vocabulary entry, and different models use different tokenizers.
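
To make the concern concrete, here is a rough sketch of the usual logit-matching loss for a single token position, assuming the common KL(teacher || student) formulation. The function names and the toy logits are made up for illustration. The loss is a sum over vocabulary indices, so it is only defined when both models score the same vocabulary.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_kl(teacher_logits, student_logits):
    """KL(teacher || student) over the vocabulary at one token position.

    Assumption: index i refers to the same token in both models.
    The length check below is necessary but not sufficient for that.
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("vocabulary sizes differ; logits are not comparable")
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical toy logits over a shared 3-token vocabulary.
same_vocab_loss = distill_kl([2.0, 0.5, -1.0], [1.5, 0.7, -0.5])
```

With mismatched tokenizers the two models may not even split the text into the same positions, so there is no index-by-index pairing to sum over without some extra alignment step.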