Hacker News

tough, yesterday at 4:25 PM

Did you see nano-vllm [1] yesterday? It's from a DeepSeek employee, about 1,200 lines of code, and faster than vanilla vLLM.

[1] https://github.com/GeeeekExplorer/nano-vllm


Replies

Gracana, yesterday at 4:49 PM

Is it faster for large models, or are the optimizations more noticeable with small models? Seeing that the benchmark uses a 0.6B model made me wonder about that.
