Yep, that's what it does. It only works with NVIDIA, though.
The difference is that it uses safetensors rather than GGUFs. But it does dynamically requantize to int4, int8, or bf16.
Wow, that's actually sick as hell, somehow I hadn't heard of this. Maybe I will go and blow $700 on a new RAM kit... thanks for sharing!