Hacker News
programjames
•
yesterday at 8:24 PM
Don't they add a KL loss term that penalizes divergence from the frozen model's outputs?
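For anyone unsure what that would look like, here is a minimal sketch of a KL penalty against a frozen reference model. The comment doesn't say which method is meant, so the KL direction (trained relative to frozen), the weight `beta`, and the function names are all assumptions for illustration, not anyone's actual training code.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(trained_logits, frozen_logits):
    # KL(trained || frozen) for one categorical distribution.
    # The direction is an assumption; some setups use the reverse.
    log_p = log_softmax(trained_logits)
    log_q = log_softmax(frozen_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def penalized_loss(task_loss, trained_logits, frozen_logits, beta=0.1):
    # Total loss = task loss + beta * KL to the frozen model.
    # beta is a hypothetical weight chosen only for this example.
    return task_loss + beta * kl_divergence(trained_logits, frozen_logits)
```

When the trained model's logits match the frozen model's, the penalty is zero and only the task loss remains. The further the trained distribution drifts, the larger the added term gets.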