cma • yesterday at 12:42 PM

When the small model predicts the next few tokens, the big model can verify them all together as one batch, which is closer in speed to processing input tokens than to generating one token at a time. That only pays off if the predictions are good, so the work doesn't have to be redone.
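
To make that concrete, here is a rough sketch of the idea with toy stand-ins for both models. Everything in it is an assumption for illustration: `draft`, `target`, `target_batch`, `speculative_step`, and `generate` are hypothetical names, the "models" are simple functions over digits, and verification is greedy (exact match) rather than the probabilistic acceptance rule real systems may use. In a real transformer, `target_batch` would be a single forward pass over all drafted positions.

```python
from typing import List, Tuple

def target(seq: List[int]) -> int:
    # Toy "big model" (assumption): always counts up modulo 10.
    return (seq[-1] + 1) % 10

def draft(seq: List[int]) -> int:
    # Toy "small model" (assumption): agrees with the big model,
    # except that it guesses wrong whenever the last token is 2.
    return 0 if seq[-1] == 2 else (seq[-1] + 1) % 10

def target_batch(prefix: List[int], proposal: List[int]) -> List[int]:
    # Stand-in for ONE batched pass of the big model: its next-token
    # choice after each drafted position, k + 1 outputs in total.
    return [target(prefix + proposal[:i]) for i in range(len(proposal) + 1)]

def speculative_step(prefix: List[int], k: int) -> List[int]:
    # 1. The small model drafts k tokens one at a time (cheap).
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft(prefix + proposal))
    # 2. The big model checks every drafted position in one batch.
    verified = target_batch(prefix, proposal)
    # 3. Keep the longest prefix of the draft that the big model agrees with.
    accepted: List[int] = []
    for guess, truth in zip(proposal, verified):
        if guess != truth:
            break
        accepted.append(guess)
    # 4. Add the big model's own token at the first mismatch
    #    (or one extra token if the whole draft was accepted).
    accepted.append(verified[len(accepted)])
    return accepted

def generate(prefix: List[int], n_new: int, k: int) -> Tuple[List[int], int]:
    # Returns the generated sequence and the number of big-model passes.
    seq = list(prefix)
    passes = 0
    while len(seq) - len(prefix) < n_new:
        seq += speculative_step(seq, k)
        passes += 1
    return seq[: len(prefix) + n_new], passes

def greedy(prefix: List[int], n_new: int) -> List[int]:
    # Baseline: the big model alone, one pass per new token.
    seq = list(prefix)
    for _ in range(n_new):
        seq.append(target(seq))
    return seq
```

With these toy models, `generate([0], 12, 4)` yields the same tokens as `greedy([0], 12)` but needs only 3 big-model passes instead of 12. The one bad guess (after token 2) costs the rest of that draft, which is the "has to be redone" case.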
