Hacker News

ranger_danger, today at 5:03 AM

With regular llama.cpp on a 3070 Ti I get 60 tok/s text generation (TG) with the 9B model. It's quite impressive.