profsummergig • yesterday at 9:45 PM • 13 replies

IMHO, the biggest problem with the future of open weights models is that currently, open weights models are the result of philanthropy by some private org. (e.g. DeepSeek).

The spigot can be turned off at any time.

Until there's some sort of "community owned hardware", open weights models are always at risk of being discontinued.

Replies

NitpickLawyer • yesterday at 9:53 PM

Yeah, but the biggest plus for open models is that they can never be taken away. In other words, whatever capabilities they reach (even if there will never be another model), those stay forever. That can't be said for API-based models, where a provider can sunset models whenever they feel like it (e.g. gpt5-mini will soon be gone, and replaced by a more expensive 5.4-mini, same for goog, etc).

And there will always be incentivised parties that release models. Nvda for one has every incentive to keep the nemotron line going, as they're directly profiting from people running this. And the models aren't really far from open SotA anyway.

Goog will probably continue to release the small models, since they'll use them for browser stuff anyway, and know that they'll leak. So for them it's a win-win to release the small models and gain some dev market share.

And the Chinese labs also have incentives to keep releasing models, and will likely continue to get government support to do so (yay commercial wars between nations).

40four • today at 4:16 AM

We should address the elephant in the room. The problem with the future of open weight models is not that they are created as a result of philanthropy by some private org. All of the top contenders are created by the Chinese government.

I don’t think we should describe these companies as simply releasing these highly capable open weight models out of the goodness of their hearts.

fridder • yesterday at 9:59 PM

We need a SETI@Home but for model training

throwawayffffas • yesterday at 11:45 PM

I don't think that's the case. It's not philanthropy; they are getting something out of it. The labs are learning from one another through the shared models.

Plus I am certain it makes financial sense. I am guessing here, but fully utilizing a subscription's limits probably costs the operator more money than the subscription revenue; that is why Anthropic is making such a big stink about the Chinese data harvesting. By releasing the weights, you relieve yourself of that burden, because the competition does not need to hammer your subscription service: they can just download your model, analyze it, and run it all day.

Also, for the largest models it makes no sense to run them yourself unless you are a major player. Renting the hardware is ludicrously more expensive than their subscription: tens of thousands of dollars. And buying the hardware to run them costs hundreds of thousands of dollars.

gwerbin • today at 3:44 AM

Isn't this also true of a lot of FOSS software and libraries? TensorFlow and PyTorch for example, among many others.

Shitty-kitty • yesterday at 9:53 PM

It's just a smart business decision that allows their models to compete and gain market share against much pricier private models. No philanthropy there.

notnullorvoid • yesterday at 10:17 PM

> Until there's some sort of "community owned hardware"

Or until some bright people figure out drastically more efficient means of training.

UncleOxidant • yesterday at 10:38 PM

> The spigot can be turned off at any time.

True. And it's possible that this has already happened at Alibaba Qwen - at least for the smaller models that people had a chance of running at home (122B and smaller).

recursive • yesterday at 10:04 PM

This seems backwards. Access to Fable can be removed. I don't see how an open weight model can ever be put back into the bag though.

slashdave • yesterday at 11:13 PM

Training these models is not a "hardware" problem.

ForHackernews • yesterday at 10:15 PM

It's not pure philanthropy: https://gwern.net/complement

jmyeet • yesterday at 11:17 PM

How is this a complaint? Once you have the model, you have the model. Download DeepSeek-R1 671B and you have it. You might not get improvements in the future, just like you may not ever get a future release of an open source project. Is that an indictment of open source?

But consider the alternative. OpenAI and Anthropic can shut off your account or API key at any time for any reason. How is this better? You have way more security when you're running your own model.
