Hacker News

Granite 4.1: IBM's 8B Model Matching 32B MoE

124 points by steveharing1 today at 10:31 AM | 64 comments

Comments

dash2 today at 12:38 PM

Nah, I ain't reading that. If they can't be bothered to get a human to write it, it can't be that important. I'm glad for them though. Or sorry that happened.

2ndorderthought today at 10:54 AM

I test drove it yesterday. It's pretty impressive at 8b. Runs on commodity hardware quickly.

Qwen3.6 35b a3b is still my local champion, but I may use this for autocomplete and small tasks. Granite has recent training data, which is nice. If the other small models got fine-tuned on recent data I don't know if I would use this at all, but that alone makes it pretty decent.

The 4b they released was not good for my needs but could probably handle tool calls or something

cbg0 today at 11:21 AM

The real "sleeper" might be https://huggingface.co/ibm-granite/granite-vision-4.1-4b if the benchmarks hold up for such a small model against frontier models for table & semantic k:v extraction.

theblazehen today at 12:42 PM

> models are judged by GPT-4

An interesting choice

Havoc today at 10:55 AM

Interesting to see a pivot away from MoE by both IBM and Mistral, while the larger classes of SOTA models all seem to be sticking with it.

Quick vibe check of it (8B @ Q6): seems promising. Bit of a clinical tone, but I can see that being useful for data processing and similar. You don't really want an LLM that spams you with emojis...

agunapal today at 11:34 AM

If you really think about why MoE came into existence, it's to save significant cost during training; I don't think there was any concrete evidence of performance gains for comparable MoE vs. dense models. Over the years, I believe all the new techniques employed in post-training have made the models better.
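The cost argument here can be made concrete with back-of-the-envelope arithmetic. Below is a toy sketch using illustrative parameter counts (not official Granite or Qwen figures) and the common rule of thumb that a transformer forward pass costs roughly 2 FLOPs per active weight per token:

```python
# Toy comparison of per-token compute for dense vs. MoE models.
# Rule of thumb: forward-pass FLOPs per token ~= 2 * (active parameters).
# All parameter counts below are illustrative, not measured figures.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token (~2 FLOPs per active weight)."""
    return 2 * active_params

dense_8b = flops_per_token(8e9)     # dense model: all 8B weights active every token
moe_30b_a3b = flops_per_token(3e9)  # MoE: 30B total weights, only ~3B active per token

print(f"dense 8B:    {dense_8b:.1e} FLOPs/token")
print(f"MoE 30B-A3B: {moe_30b_a3b:.1e} FLOPs/token")
print(f"MoE is {dense_8b / moe_30b_a3b:.1f}x cheaper per token, despite ~4x the total weights")
```

The same proportionality applies per training step, which is the cost saving the comment refers to: a sparse model buys a large total parameter count without paying for it in per-token compute.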

100ms today at 11:13 AM

> Full stop.

Why do people not edit out obvious sloppification and still expect to have readers left?

dissahc today at 12:26 PM

Qwen3.5 9B outperforms Granite 4.1 30B by a huge amount (32 vs. 15 on the Artificial Analysis benchmark)... I have no idea what made the writer of this article say so many demonstrably incorrect things.

robotmaxtron today at 12:35 PM

"open source"

show me.

RugnirViking today at 10:51 AM

Sounds interesting. Here's hoping they release a 32B model; that's a pretty good sweet spot for feasibility of home setups.

edit: I just realised they do actually have a 30b release alongside this. Haven't tried it yet.

mdp2021 today at 10:50 AM

Wish they also released an embedding model, in line with their previous ones: compact (while good)...
