syntaxing, today at 5:07 PM

Is it worth running speculative decoding on models with a small active parameter count like this one? Or does MTP (multi-token prediction) make speculative decoding unnecessary?