
maxloh · yesterday at 5:15 PM

Is the training cost really that high, though?

The Allen Institute (a non-profit) just released the Molmo 2 and Olmo 3 models, trained from scratch on public datasets, and they are competitive with Gemini on several benchmarks [0] [1].

AMD also successfully trained an older version of OLMo on its own hardware using the published code, data, and recipe [2].

If a non-profit and a chip vendor (training for marketing purposes) can do this, it clearly doesn't require "burning 10 years of cash flow" or a Google-scale TPU farm.

[0]: https://allenai.org/blog/molmo2

[1]: https://allenai.org/blog/olmo3

[2]: https://huggingface.co/amd/AMD-OLMo


Replies

turtlesdown11 · yesterday at 5:36 PM

No, of course the training costs aren't that high. Apple's free cash flow over the next ten years adds up to more than a trillion dollars (it's above $100B per year). Obviously, training costs are a trivial amount compared to that figure.

lostmsu · yesterday at 7:58 PM

No, it doesn't beat Gemini in any benchmark. It beats Gemma, which isn't SoTA even among open models of that size; that would be Nemotron 3 or GPT-OSS 20B.

PunchyHamster · yesterday at 10:25 PM

My prediction is that they might switch once the AI craze simmers down to a more reasonable level.