Hacker News

rhdunn, yesterday at 8:36 PM

Mistral have small variants (3B, 8B, 14B, etc.), as do others like IBM Granite and Qwen. Then there are finetunes based on these models, depending on your workflow/requirements.


Replies

karmasimida, yesterday at 11:23 PM

True, but anything remotely useful is 300B and above