Hacker News

MattRix yesterday at 8:08 PM

It’s not a 3B model, it has 3B active parameters. The full model is much larger.


Replies

neosat yesterday at 8:38 PM

That's true, I should have said 3B active. The total parameter count is likely closer to 12B-14B, given the 40GB VRAM usage.
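For anyone wanting to check that kind of estimate, here is a rough back-of-the-envelope sketch. The thread does not say what precision the weights are stored in or how much of the 40GB goes to non-weight memory (KV cache, activations, runtime buffers), so `bytes_per_param` and `overhead_fraction` are assumptions for illustration, and `max_params_billions` is a made-up helper name.

```python
def max_params_billions(vram_gb: float, bytes_per_param: float,
                        overhead_fraction: float = 0.0) -> float:
    """Rough upper bound on total parameters (in billions) that fit in vram_gb.

    overhead_fraction is the assumed share of VRAM not used for weights
    (KV cache, activations, buffers). It is a guess, not a measured value.
    """
    usable_bytes = vram_gb * 1e9 * (1.0 - overhead_fraction)
    return usable_bytes / bytes_per_param / 1e9


# Assumed storage precisions; the thread does not state which one applies.
for label, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    bound = max_params_billions(40, bytes_per_param)
    print(f"{label}: up to ~{bound:.0f}B params with zero overhead")

# With 16-bit weights and an assumed 30-40% of VRAM spent on non-weight
# memory, the bound lands in the 12B-14B range.
print(round(max_params_billions(40, 2, overhead_fraction=0.4), 1))
print(round(max_params_billions(40, 2, overhead_fraction=0.3), 1))
```

Under those assumptions, 40GB is consistent with roughly 12B-14B total parameters at 16-bit precision, but a different precision or overhead share would shift the number a lot, so treat it as a sanity check rather than a measurement.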