tannhaeuser • yesterday at 7:34 PM • 0 replies • view on HN

Tested the gemma4 26 MoE 4-bit quantized GGUF on llama.cpp, following these guides, with mmap'd I/O on a 16GB MBP, and it was unbearably slow (0.0 t/s).
