Who is going to fund it? Training is unfathomably expensive.
You have either VC-funded models looking for a return on investment, or CCP-funded models looking to solidify an authoritarian "model Chinese society".
Maybe there are some university 4B models, but I doubt those will carry far.
I share your concerns, although we still see similarly large and complex things that remain open source today.
I am astonished on a daily basis that my Linux computer is so close to the same experience as two operating systems put out by trillion dollar companies. It even does things that those commercial alternatives don’t do.
Also, if DeepSeek is truly putting out models at 1/10th the cost of Western competitors, with a fraction of the employee headcount, I think it implies that there will be a market for someone else to be in the space offering an alternative.
I think about how companies like IBM are so willing to contribute to Linux and give away those contributions for free because they are part of a group of corporate sponsors that need an alternative to more dominant commercial players in the market.
Meta “gives away” React for similar reasons: it’s more beneficial for them to have it be a standard and be able to hire people who already know it.
It’s definitely harder to imagine the same ecosystem benefits of an AI model, but maybe it’s out there somewhere.
I could imagine some data center/VPS providers trying to sponsor something like that so that the big AI companies have less leverage over them.
Or maybe all this optimism is a pipe dream?
Ever calculate the cost of a computer in the 1960s, adjusted for inflation? Training is unfathomably expensive right now. What if a bunch of universities pooled their money? Or a bunch of nations pooled their money? Breakthroughs will eventually happen, optimization will occur, etc.
People questioned whether there could ever be a viable open source operating system, yet Linux has been a viable option for a desktop environment for decades now, and that's not to mention its ubiquitous use as a server or phone OS.
It’s expensive, but not unfathomably so, especially in an open source setting where capable people might contribute high-quality data for post-training (worked problems, code reviews, feedback, …) gratis instead of at immense cost.
Anyone who doesn't currently own a piece of whoever is winning under the current model. Basic disruption theory: if the game isn't going your way, change the game.
You have an unhealthy and unreasonable obsession with the idea of CCP models; you should get that checked.
Tbh, there really needs to be some legal precedent set that makes model distillation a legal activity. If the model makers can rip off everyone else's work and launder information as if it's their own without giving credit back to the original creators, I don't see why it should be illegal to distill the models. It's the same thing the frontier model makers are doing to IP everywhere else.