rahimnathwani, today at 5:10 AM

Looking forward to next time, hoping you mention speculative decoding and MTP :)

It would support your point about the performance of 20GB local models.