Hacker News

babelfish, today at 4:25 PM

Wow, 30B parameters as capable as a 1T parameter model?


Replies

mhitza, today at 5:53 PM

On the benchmarks compared above, it is closer to other, larger open-weights models, and on par with GPT-OSS 120B, for which I also have a frame of reference.