Hacker News

zozbot234 today at 2:40 PM

It's not just LLM-sourced, though. Folks have literally tried this after the release with the 26A4B model, and it wasn't very good. Maybe the dense ~31B model is worthwhile, though.


Replies

Aurornis today at 2:46 PM

Many Gemma implementations are or were broken on launch day. The first attempts to fix llama.cpp’s tokenizer were merged hours ago.

Everyone hated Qwen3.5 at launch too because so many implementations were broken and couldn’t do tool calling.

You need to ignore social media “I tried this and it sucks” echo chambers for new model releases.
