Hacker News
pixelmelt
•
yesterday at 5:25 PM
I would look into running a 4-bit quant using llama.cpp (or any of its wrappers).
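To get a rough feel for why a 4-bit quant helps, here is a back-of-the-envelope sketch of weight memory only. The 7B parameter count is a hypothetical example and `weight_gib` is just an illustrative helper, not part of llama.cpp. The 4.5 bits per weight figure assumes a Q4_0-style layout (blocks of 32 4-bit weights plus one fp16 scale); other 4-bit formats come out somewhat different, and the KV cache and activations are ignored.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Hypothetical model size, chosen only for illustration.
N_PARAMS = 7e9

# Unquantized half-precision weights: 16 bits each.
fp16_gib = weight_gib(N_PARAMS, 16.0)

# Assumed Q4_0-style layout: 32 weights * 4 bits + one 16-bit scale
# = 18 bytes per 32 weights = 4.5 bits per weight.
q4_gib = weight_gib(N_PARAMS, 4.5)

print(f"fp16: {fp16_gib:.1f} GiB, 4-bit: {q4_gib:.1f} GiB")
```

Under those assumptions the weights shrink by roughly 3.5x, which is often the difference between a model fitting in consumer RAM or VRAM and not fitting at all.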