Hacker News

xvv · last Sunday at 4:42 AM · 3 replies

As of today, what is the best local model that can be run on a system with 32gb of ram and 24gb of vram?


Replies

fwystup · last Sunday at 6:28 PM

Qwen3-Coder-30B-A3B-Instruct-FP8 is a good choice ('qwen3-coder:30b' when using ollama). I have also had good experiences with Devstral (https://mistral.ai/news/devstral), built in a collaboration between Mistral AI and All Hands AI.
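As a rough sanity check on whether a given quantization fits in 24 GB of VRAM: weight memory is roughly parameter count times bits per weight, plus some runtime overhead. The helper below is a back-of-envelope sketch, and the 20% overhead factor and the ~4.85 bits/weight figure for Q4_K_M are illustrative assumptions, not exact numbers:

```python
def model_memory_gb(params_billion, bits_per_weight, overhead=1.2):
    """Rough footprint: weights plus ~20% for KV cache and buffers (assumed overhead)."""
    return params_billion * bits_per_weight / 8 * overhead

# A 30B model at common precision/quantization levels
for name, bits in [("FP16", 16), ("FP8", 8), ("Q4_K_M", 4.85)]:
    print(f"{name}: ~{model_memory_gb(30, bits):.0f} GB")
```

By this estimate the FP8 build of a 30B model overflows a 24 GB card, so ollama would spill some layers to system RAM, while a Q4-level quant should fit entirely on the GPU.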

ethan_smith · last Sunday at 10:33 AM

DeepSeek Coder 33B or Llama 3 70B with GGUF quantization (Q4_K_M) would be optimal for your specs, with Mistral Large 2 providing the best balance of performance and resource usage.
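A 70B model at Q4_K_M is larger than 24 GB of VRAM, so it has to be split between GPU and CPU RAM. One rough way to pick a GPU layer count (e.g. llama.cpp's `-ngl` value), assuming roughly uniform layer sizes; the ~40 GB model size, 80-layer count, and 2 GB reserve are approximations, not measured figures:

```python
def gpu_layers(total_layers, model_gb, vram_gb, reserve_gb=2.0):
    """Estimate how many transformer layers fit in VRAM, assuming uniform layer size
    and reserving some VRAM for the KV cache and compute buffers (assumptions)."""
    per_layer_gb = model_gb / total_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(total_layers, int(usable_gb / per_layer_gb))

# Llama 3 70B at Q4_K_M is roughly 40 GB spread over about 80 layers
print(gpu_layers(80, 40.0, 24.0))
```

With these assumed numbers, a little over half the layers land on the GPU and the rest run from the 32 GB of system RAM, which is workable but noticeably slower than a fully-offloaded model.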

v5v3 · last Sunday at 7:03 AM

Start with a Qwen model sized to fit in the VRAM.
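To turn "a size that fits in the VRAM" into numbers, one can screen candidate model sizes against a 24 GB budget at a Q4-level quantization. The parameter counts below are the approximate Qwen3 dense lineup, and the bits-per-weight and overhead figures are the same rough assumptions as any back-of-envelope sizing:

```python
# Approximate Qwen3 dense parameter counts, in billions (assumed lineup)
QWEN3_DENSE_B = [0.6, 1.7, 4, 8, 14, 32]

def fits_in_vram(params_b, vram_gb=24, bits=4.85, overhead=1.2):
    """True if the estimated Q4_K_M weight footprint stays under the VRAM budget."""
    return params_b * bits / 8 * overhead <= vram_gb

print([size for size in QWEN3_DENSE_B if fits_in_vram(size)])
```

By this estimate even the 32B fits at Q4, though with little headroom left for context; dropping to 14B leaves room for a much longer KV cache.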