How are you using that RAM with the GPU?

DrBenCarson • today at 12:08 AM

canpan • today at 12:12 AM

llama.cpp with automatic offload to main memory. You can also use Ollama, which is easier to set up but slower.
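
For anyone wondering what the offload amounts to, here is a rough sketch of the arithmetic: put as many layers as fit in VRAM on the GPU and leave the rest in main memory. This is not llama.cpp's actual logic; `split_layers` and every number below are hypothetical, and it assumes all layers are the same size. The GPU count it returns is the kind of value you would otherwise pass by hand with llama.cpp's `-ngl` / `--n-gpu-layers` option.

```python
GIB = 1024 ** 3


def split_layers(n_layers, layer_bytes, vram_bytes, reserve_bytes=0):
    """Return (gpu_layers, cpu_layers) for a model with equally sized layers.

    Hypothetical helper for illustration only; real runtimes also account
    for the KV cache, compute buffers, and layers of different sizes.
    """
    if layer_bytes <= 0:
        raise ValueError("layer_bytes must be positive")
    # VRAM left for weights after setting aside a reserve (e.g. for the KV cache).
    usable = max(vram_bytes - reserve_bytes, 0)
    # As many whole layers as fit, capped at the model's layer count.
    gpu_layers = min(n_layers, usable // layer_bytes)
    return gpu_layers, n_layers - gpu_layers


# Made-up example: 48 layers of 0.5 GiB each, 16 GiB of VRAM, 2 GiB held back.
gpu, cpu = split_layers(48, GIB // 2, 16 * GIB, reserve_bytes=2 * GIB)
print(f"GPU layers: {gpu}, layers left in main memory: {cpu}")
```

The layers left in main memory run on the CPU, so the more of the model that spills over, the slower generation gets.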


alt Hacker News