Hacker News

KronisLV · today at 9:03 AM

I had an annoying issue in a setup with two Nvidia L4 cards: trying to run the MoE versions to get decent performance just didn't work with Ollama. It seems to be the same problem as these:

https://github.com/ollama/ollama/issues/14419

https://github.com/ollama/ollama/issues/14503

So for now I'm back to Qwen 3 30B A3B, which is kind of a bummer: that model is pretty fast but kinda dumb, even for simple tasks like on-prem code review!