Hacker News

Granite 4.1: IBM's 8B Model Matching 32B MoE

124 points by steveharing1 today at 10:31 AM | 64 comments

Comments

dash2 today at 12:38 PM

Nah, I ain't reading that. If they can't be bothered to get a human to write it, it can't be that important. I'm glad for them though. Or sorry that happened.

2ndorderthought today at 10:54 AM

I test drove it yesterday. It's pretty impressive at 8b. Runs on commodity hardware quickly.

Qwen3.6 35b a3b is still my local champion, but I may use this for autocomplete and small tasks. Granite has recent training data, which is nice. If the other small models got fine-tuned on recent data I don't know if I would use this at all, but that alone makes it pretty decent.

The 4b they released was not good for my needs but could probably handle tool calls or something

cbg0 today at 11:21 AM

The real "sleeper" might be https://huggingface.co/ibm-granite/granite-vision-4.1-4b if the benchmarks hold up for such a small model against frontier models for table & semantic k:v extraction.

theblazehen today at 12:42 PM

> models are judged by GPT-4

An interesting choice

Havoc today at 10:55 AM

Interesting to see a pivot away from MoE by both IBM and Mistral, while the larger classes of SOTA models all seem to be sticking with it.

Quick vibe check of it (8B @ Q6): seems promising. Bit of a clinical tone, but I can see that being useful for data processing and similar. You don't really want an LLM that spams you with emojis...

agunapal today at 11:34 AM

If you really think about why MoE came into existence, it's to save significant cost during training; I don't think there was any concrete evidence of performance gains for comparable MoE vs. dense models. Over the years, I believe all the new techniques employed in post-training have made the models better.
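The cost argument here can be made concrete with back-of-the-envelope arithmetic. Below is a toy sketch using illustrative parameter counts (not official Granite or Qwen figures) and the common rule of thumb that a transformer forward pass costs roughly 2 FLOPs per active weight per token:

```python
# Toy comparison of per-token compute for dense vs. MoE models.
# Rule of thumb: forward-pass FLOPs per token ~= 2 * (active parameters).
# All parameter counts below are illustrative, not measured figures.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token (~2 FLOPs per active weight)."""
    return 2 * active_params

dense_8b = flops_per_token(8e9)     # dense model: all 8B weights active every token
moe_30b_a3b = flops_per_token(3e9)  # MoE: 30B total weights, only ~3B active per token

print(f"dense 8B:    {dense_8b:.1e} FLOPs/token")
print(f"MoE 30B-A3B: {moe_30b_a3b:.1e} FLOPs/token")
print(f"MoE is {dense_8b / moe_30b_a3b:.1f}x cheaper per token, despite ~4x the total weights")
```

The same proportionality applies per training step, which is the cost saving the comment refers to: a sparse model buys a large total parameter count without paying for it in per-token compute.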

100ms today at 11:13 AM

> Full stop.

Why do people not edit out obvious sloppification and still expect to have readers left?

dissahc today at 12:26 PM

Qwen3.5 9B outperforms Granite 4.1 30B by a huge amount (32 vs. 15 on the Artificial Analysis benchmark)... I have no idea what made the writer of this article say so many demonstrably incorrect things.

robotmaxtron today at 12:35 PM

"open source"

show me.

RugnirViking today at 10:51 AM

Sounds interesting. Here's hoping they release a 32B model; that's a pretty good sweet spot for feasibility of home setups.

edit: I just realised they do actually have a 30b release alongside this. Haven't tried it yet.

mdp2021 today at 10:50 AM

Wish they also released an embedding model, in line with their previous ones: compact (while good)...
