I am still trying to figure out the business model of open weights. Like... it's wonderful that there are open LLMs, super happy about it, good for everyone, but why do they exist? What is the advantage to their companies to release them?
Downward pressure on proprietary model pricing until a lab can catch up. Also good for hiring talent (who love OSS).
Cultural influence is another benefit. China is securing its sphere of influence as well as keeping US AI in check.
It's analogous to open-source software, which never had an obvious economic incentive either, although training an LLM necessarily costs money whereas developing an OSS project might only cost time, which people are probably more likely to give up.
Balaji's "AI OVERPRODUCTION" post is the most compelling thesis that I've come across
They are making the hardware and commoditizing the complement.
Big AI labs are losing money. Open models are making the pricing equation a lot trickier for them.
There are some short-term advantages, but I doubt this will continue, especially for the more powerful models.
I mean, this is straight out of China's playbook; it should not be surprising that China is making an inferior derivative product at an artificially lower price point. State subsidies massively drive up internal scale and supply chains, leading to artificially low-priced goods that suffocate the competition, which has led to *gestures vaguely at everything* being made in China.
Right now it’s so the Chinese can undermine the frontier models in the US. In areas where they’re doing well, like video generation (e.g. Seedance), they won’t open source anything.
> What is the advantage to their companies to release them?
It's a distribution strategy. It costs something to serve the models - let's say $5/1M tokens.
If Qwen required $5 from anyone who was curious before they could even begin to test it out, a lot of people just wouldn't.
Now Qwen could offer a "free" tier, but it's far cheaper to provide the weights and let people run it themselves, which also opens up the ability for anyone else on the planet to test it against other (open weight) models.
The costs to build the open weight models are sunk, but the costs to serve them and get them tested are not.
It's also precisely why the .NET SDK or the ESP32 SDK is free: they sell more Microsoft or ESP32 products.
People use their model who otherwise would not.
The majority are released by socialists, and by socialists I mean the People's Republic of China, which everyone seems to forget is a socialist country working towards world communism.
They are a prestige propaganda tool on par with the space race. On top of that they insert a subtle pro-socialist bias in everything they touch.
Ask DeepSeek about the US economic system for a blatant example.
Now think what something as innocent-seeming as the Qwen retrieval models is doing in the background of every request.
IMHO this is only temporary: China is buying itself some time and wants to make sure no US models get entrenched in their position in the next few years (while also putting pressure on US AI companies and bleeding them).
The same way Windows got entrenched everywhere even though desktop Linux is free and pretty good, even for non-tech-savvy people.