Hacker News

zozbot234, today at 5:24 PM

The Llama4 series was one of the earliest large MoEs to be made publicly available. People just ignored it because they were focused on running smaller and denser models at the time; we should know better these days.


Replies

dilap, today at 5:55 PM

DeepSeek R1 was a publicly available MoE model that was getting a ton of attention before Llama4. Llama4 didn't get much attention because it wasn't good.

prodigycorp, today at 5:31 PM

The models were objectively horrible.
