Hacker News

profsummergig yesterday at 9:45 PM (13 replies)

IMHO, the biggest problem with the future of open weights models is that, currently, they are the result of philanthropy by some private org (e.g. DeepSeek).

The spigot can be turned off at any time.

Until there's some sort of "community owned hardware", open weights models are always at risk of being discontinued.


Replies

NitpickLawyer yesterday at 9:53 PM

Yeah, but the biggest plus for open models is that they can never be taken away. In other words, whatever capabilities they reach stay forever, even if there is never another model. That can't be said for API-based models, where a provider can sunset models whenever they feel like it (e.g. gpt5-mini will soon be gone, replaced by a more expensive 5.4-mini; same for goog, etc.).

And there will always be incentivised parties that release models. Nvda for one has every incentive to keep the nemotron line going, as they're directly profiting from people running this. And the models aren't really far from open SotA anyway.

Goog will probably continue to release the small models, since they'll use them for browser stuff anyway, and know that they'll leak. So for them it's a win-win to release the small models and gain some dev market share.

And the Chinese labs also have incentives to keep releasing models, and will likely continue to get gov support to do so (yay commercial wars between nations).

40four today at 4:16 AM

We should address the elephant in the room. The problem with the future of open weight models is not that they are created as a result of philanthropy by some private org. All of the top contenders are created by the Chinese government.

I don’t think we should describe these companies as simply releasing these highly capable open weight models out of the goodness of their hearts.

fridder yesterday at 9:59 PM

We need a SETI@Home but for model training

throwawayffffas yesterday at 11:45 PM

I don't think that's the case. It's not philanthropy; they are getting something out of it. The labs are learning from one another through the shared models.

Plus I am certain it makes financial sense. I am guessing here, but fully utilizing a subscription's limits probably costs the operator more than the subscription brings in, which is why Anthropic is making such a big stink about the Chinese data harvesting. By releasing the weights, you relieve yourself of that burden: the competition does not need to hammer your subscription service; they can just download your model, analyze it, and run it all day.

Also, for the largest models it makes no sense to run them yourself unless you are a major player. Renting the hardware is ludicrously more expensive than their subscription: tens of thousands of dollars. And buying the hardware to run them costs hundreds of thousands of dollars.

gwerbin today at 3:44 AM

Isn't this also true of a lot of FOSS projects and libraries? TensorFlow and PyTorch, for example, among many others.

Shitty-kitty yesterday at 9:53 PM

It's just a smart business decision that lets their models compete and gain market share against much pricier private models. No philanthropy there.

notnullorvoid yesterday at 10:17 PM

> Until there's some sort of "community owned hardware"

Or until some bright people figure out drastically more efficient means of training.

UncleOxidant yesterday at 10:38 PM

> The spigot can be turned off at any time.

True. And it's possible that this has already happened at Alibaba Qwen - at least for the smaller models that people had a chance of running at home (122B and smaller).

recursive yesterday at 10:04 PM

This seems backwards. Access to Fable can be removed. I don't see how an open weight model can ever be put back into the bag though.

slashdave yesterday at 11:13 PM

Training these models is not a "hardware" problem.

ForHackernews yesterday at 10:15 PM

It's not pure philanthropy: https://gwern.net/complement

jmyeet yesterday at 11:17 PM

How is this a complaint? Once you have the model, you have the model. Download DeepSeek-R1 671B and you have it. You might not get improvements in the future, just like you may not ever get a future release of an open source project. Is that an indictment of open source?

But consider the alternative. OpenAI and Anthropic can shut off your account or API key at any time for any reason. How is this better? You have way more security when you're running your own model.