Tiberium, today at 6:57 AM

The fake claim here is compression. The results in the repo are likely real, but they're done by running the full transformer teacher model every time. This doesn't achieve anything novel.