I haven't tried this myself yet, but you would still need enough system RAM (as opposed to VRAM) to hold whatever gets offloaded to the CPU, right? This is a complete novice question.
You're correct. If you don't have enough RAM for the model, it can still run, but most of it will run on the CPU and the weights that don't fit in RAM will be continuously re-read from the SSD (through mmap).
Mind you, a medium-sized MoE model (around 35B parameters) can still achieve usable speeds in that setup, depending on what you're doing, since only a fraction of its experts are active for each token.
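
If it helps to see what "through mmap" means in practice, here is a rough Python sketch. It is not how any particular inference engine is implemented; the file is a throwaway stand-in for a weights file, and `read_slice_via_mmap` and `weights_path` are names made up for this illustration. The point is only that mapping a file does not read it up front: the OS pulls pages in from disk when they are touched, and can drop them again when RAM is tight.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a memory map.

    Mapping the file does not load it into RAM; the OS pages in only
    the regions that are actually accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


# Hypothetical stand-in for a model file: 1 MiB of a repeating byte pattern.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 4096)
    weights_path = tmp.name

# Touch a small region in the middle of the file; only the pages that
# cover it need to come off the disk.
chunk = read_slice_via_mmap(weights_path, 256 * 10 + 5, 4)
print(list(chunk))

os.remove(weights_path)
```

When the mapped file is larger than free RAM, pages that were dropped have to be read from the SSD again the next time they are touched, which is where the slowdown comes from.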