Hacker News

joefourier, yesterday at 10:13 PM

What quant? You should have no problem running it at Q4 with 256K context, or even at Q5 or Q6, although maybe not at full context. I can run Q4 on a 4090 with just 24GB of VRAM.