petu, yesterday at 5:14 PM

> Qwen3.5-27b 8-bit quant 20 to 25 tok/sec

Is that with some kind of speculative decoding? Or is it total throughput across parallel requests?