tatef, today at 6:25 PM

Yes, definitely agree. It's more of a POC than a functional use case. However, for many smaller MoE models this method can actually be useful, reaching multiple tokens/sec.