Can you elaborate? You can use a quantized version. Would context still be an issue with it?

pdyc • yesterday at 2:12 PM • 2 replies

A usable quant, Q5_K_M imo, takes up ~26GB[0], which leaves only ~6-7GB for context and for running other programs. That is not much.
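To put a rough number on "not much", here is a back-of-the-envelope sketch of how many tokens of KV cache fit in the leftover memory. The model dimensions (64 layers, 8 KV heads, head size 128, fp16 cache) are hypothetical placeholders, not the specs of the model under discussion, and the function names are made up for illustration. It also treats 6GB as 6 GiB and assumes all of it goes to the cache, which is optimistic since other programs need memory too.

```python
GIB = 1024 ** 3


def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache needed per token of context.

    Each layer stores one key and one value vector per KV head,
    hence the leading factor of 2. bytes_per_elem=2 assumes an fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def max_context_tokens(free_bytes, per_token_bytes):
    """Upper bound on context length if all free memory went to the KV cache."""
    return free_bytes // per_token_bytes


# Hypothetical model dimensions (assumptions, not taken from the thread).
per_token = kv_cache_bytes_per_token(n_layers=64, n_kv_heads=8, head_dim=128)

# Low end of the ~6-7GB left over after loading the ~26GB quant.
free = 6 * GIB
tokens = max_context_tokens(free, per_token)

print(f"{per_token / 1024:.0f} KiB per token, at most {tokens} tokens of context")
```

With these made-up dimensions it works out to 256 KiB per token and a ceiling of 24576 tokens. Plug in the real layer count, KV head count, and head size from the model's config to get a figure for your setup. Compute buffers and anything else running on the machine will push the usable number lower.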

nickthegreek • yesterday at 2:15 PM

Context is always an issue with local models on consumer hardware.

