Nemotron-3-Nano-30B-A3B[0][1] is a very impressive local model. It handles tool calling well and works great with llama.cpp, Visual Studio Code, and Roo Code for local development.
It doesn't get a ton of attention on /r/LocalLLaMA, but it is worth trying out, even if you have a relatively modest machine.
[0] https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B...
[1] https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF
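If you want to try it with Roo Code, a minimal llama-server invocation looks something like this (the GGUF filename, context size, and port are illustrative, so adjust for your quant and hardware; --jinja turns on the chat-template handling that tool calling relies on):

    # Serve an OpenAI-compatible API on localhost:8080.
    # -ngl 99 puts all layers on the GPU; -c sets the context window.
    llama-server -m Nemotron-3-Nano-30B-A3B-Q8_0.gguf \
      -ngl 99 -c 16384 --jinja --port 8080

Then point Roo Code at http://localhost:8080/v1 as an OpenAI-compatible provider and tool calls go through llama-server's chat endpoint.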
It was good for, like, one month. Qwen3 30B had dominated for half a year before that, and GLM-4.7 Flash 30B took the crown soon after Nemotron 3 Nano came out. There was basically no window for it to shine.
I find the Q8 quant runs a bit more than twice as fast as gpt-oss-120b, since I don't have to offload as many MoE layers to the CPU, but it is just about as capable, if not better.
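For anyone curious how that partial offload works: llama.cpp can keep the attention and shared weights on the GPU and push only the MoE expert tensors to system RAM, which is why a smaller MoE that fits mostly in VRAM pulls ahead. A rough sketch, with an illustrative filename and tensor regex (check the flags in your build; --n-cpu-moe is a newer shorthand that may not exist in older ones):

    # Offload everything to the GPU, then override the expert tensors back to CPU.
    llama-server -m Nemotron-3-Nano-30B-A3B-Q8_0.gguf -ngl 99 \
      -ot "\.ffn_(up|down|gate)_exps\.=CPU"

    # Newer builds have a shorthand that keeps the experts of the
    # first N layers on the CPU:
    llama-server -m Nemotron-3-Nano-30B-A3B-Q8_0.gguf -ngl 99 --n-cpu-moe 20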
Some of NVIDIA's models also have interesting architectures. Nemotron, for example, uses Mamba state-space layers rather than a purely transformer stack: https://developer.nvidia.com/blog/inside-nvidia-nemotron-3-t...