No need. You can run the Gemma 4 and Qwen3.5 MoE models with as little as 12 GB of VRAM at 30-40 tokens/s (Q4/Q5 quants), and they both blow GPT-4o and DeepSeek R1 out of the water.
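
If you want to sanity-check whether a given quant fits your card, here's a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 30B parameter count is a made-up example (not the actual size of either model), the bits-per-weight figures are approximate averages for Q4/Q5-style quants, and `reserve_gib` is a guess at what the KV cache and runtime need. It only estimates weight size and how much would have to live outside VRAM (e.g. in system RAM); it says nothing about speed.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def offloaded_gib(params_billion: float, bits_per_weight: float,
                  vram_gib: float, reserve_gib: float = 2.0) -> float:
    """Weights that would not fit in VRAM after setting aside reserve_gib
    (an assumed allowance for KV cache, activations, and runtime overhead)."""
    usable = max(vram_gib - reserve_gib, 0.0)
    return max(weight_gib(params_billion, bits_per_weight) - usable, 0.0)


if __name__ == "__main__":
    # Hypothetical 30B-parameter MoE on a 12 GB card; swap in real numbers.
    for bpw in (4.5, 5.5):  # rough averages for Q4/Q5-style quants (assumed)
        size = weight_gib(30, bpw)
        spill = offloaded_gib(30, bpw, vram_gib=12)
        print(f"{bpw} bpw: ~{size:.1f} GiB weights, ~{spill:.1f} GiB outside VRAM")
```

Since an MoE only activates a subset of its experts per token, keeping part of the weights outside VRAM tends to hurt less than it would for a dense model of the same total size, but actual throughput depends on your runtime and memory bandwidth.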