skhameneh, today at 7:10 AM

ik_llama is almost always faster when tuned. Untuned, though, I've found them very similar in performance, with no consistent winner.

But vLLM and SGLang tend to be faster than both of those.