Hacker News

xnx, yesterday at 2:33 PM (2 replies)

Is this newer/better than the speculative decoding from 2022? https://arxiv.org/abs/2211.17192


Replies

alok-g, yesterday at 6:23 PM

That paper is cited in the 'Introduction' and 'Background' sections. This paper improves on it by removing some bottlenecks.

tiahura, yesterday at 5:44 PM

Seems like they focus on improving the drafter and the verification policy, so that speculation keeps producing net speedups rather than wasted verification work at DeepSeek scale.
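
To put rough numbers on "net speedup vs. wasted verification work", here is a back-of-the-envelope sketch using the idealized analysis from the 2022 paper linked above (arXiv:2211.17192), not anything from the newer paper. The function and parameter names are my own. `alpha` is the average per-token acceptance rate of the drafter, `gamma` is the number of drafted tokens per step, and `c` is the cost of one drafter pass relative to one target pass. It assumes verifying all drafted tokens costs about one target forward pass, which may not hold once serving is compute-bound at large batch sizes.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model pass under the idealized
    model: (1 - alpha^(gamma+1)) / (1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every draft token is accepted, plus one token from the target.
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def walltime_speedup(alpha: float, gamma: int, c: float) -> float:
    """Idealized speedup over plain autoregressive decoding.

    Each step costs gamma drafter passes (gamma * c) plus one target pass.
    Assumption: the target verifies all gamma + 1 positions in parallel
    at the cost of a single pass.
    """
    return expected_tokens_per_step(alpha, gamma) / (gamma * c + 1.0)


if __name__ == "__main__":
    # Illustrative values only, not measurements from either paper.
    good = walltime_speedup(alpha=0.8, gamma=4, c=0.05)
    bad = walltime_speedup(alpha=0.2, gamma=4, c=0.3)
    print(f"good drafter: {good:.2f}x")
    print(f"weak, expensive drafter: {bad:.2f}x")
```

With these made-up inputs the first case comes out well above 1x and the second below 1x, i.e. speculation becomes a net slowdown when the drafter is both inaccurate and costly. That is the failure mode a better drafter and verification policy would be trying to avoid.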