
pixelmelt, yesterday at 5:25 PM

I would look into running a 4-bit quant using llama.cpp (or any of its wrappers).
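
For a rough sense of why a 4-bit quant helps, here is a back-of-the-envelope sketch of weight memory at different precisions. The function name and the 7B parameter count are made up for illustration, and it only counts the weights: real 4-bit formats store per-block scales, so they land a bit above 4 bits per weight, and the KV cache and runtime overhead come on top.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB.

    Hypothetical helper for illustration; ignores KV cache, activations,
    and any per-block metadata a real quantization format adds.
    """
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30                    # bytes -> GiB


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumed model size, not from the thread
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit: {weight_memory_gib(n_params, bits):.2f} GiB")
```

So under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint to about a quarter, which is often the difference between fitting in consumer RAM/VRAM or not.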