logoalt Hacker News

kcbyesterday at 6:49 PM2 repliesview on HN

Nemotron 3 Super was released recently. That's a direct competitor to gpt-oss-120b. https://developer.nvidia.com/blog/introducing-nemotron-3-sup...


Replies

evilduckyesterday at 9:16 PM

In terms of ability, maybe, in terms of speed, it's not even close. Check out the Prompt Processing speeds between them: https://kyuz0.github.io/amd-strix-halo-toolboxes/

gpt-oss-120b is over 600 tokens/s PP for all but one backend.

nemotron-3-super is at best 260 tokens/s PP.

Comparing token generation, it's again like 50 tokens/sec vs 15 tokens/sec

That really bogs down agentic tooling. Something needs to be categorically better to justify halving output speed, not just playing in the margins.

show 1 reply
coder68yesterday at 7:36 PM

I gave it a whirl but was unenthused. I'll try it again, but so far have not really enjoyed any of the nvidia models, though they are best in class for execution speed.

show 1 reply