It has an extra KV cache on SSD and a lot more options to tweak. There's also experimental support for TurboQuant and multi-token prediction.