I'm confused and a bit disturbed; honestly having a very difficult time internalizing and processing this information. This announcement is making me wonder if I'm poorly calibrated on the current progress of AI development and the potential path forward. Is the key idea here that current AI development has figured out enough to brute force a path towards AGI? Or I guess the alternative is that they expect to figure it out in the next 4 years...
I don't know how to make sense of this level of investment. I feel that I lack the proper conceptual framework to make sense of the purchasing power of half a trillion USD in this context.
Let me avoid the use of the word AGI here because the term is a little too loaded for me these days.
1) reasoning capabilities in the latest models are rapidly approaching superhuman levels and continue to scale with compute.
2) intelligence at a given level becomes easier to achieve algorithmically as hardware improves; there are also more paths to intelligence, often via simpler mechanisms.
3) most current-generation reasoning models lean on test-time compute and RL in training, both of which can readily make use of more compute: for example, RL on coding against compilers, or on proofs against verifiers (a toy sketch of that setup follows below).
All of this points to compute now being basically the only bottleneck to massively superhuman AI in domains like math and coding. On the rest, no comment (I don't know what "superhuman" means in a domain with no objective evals).
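To make the verifier point concrete, here is a minimal, purely illustrative Python sketch (my own toy construction, not anyone's actual training setup): the reward comes from an automated check, so generating and scoring more rollouts costs only compute, not human labels.

    # Toy sketch of RL-style reward from a verifier: run candidate code
    # against tests and hand back a pass/fail reward. Everything here
    # (function names, the 0/1 reward) is a simplifying assumption.
    import os
    import subprocess
    import sys
    import tempfile

    def verifier_reward(candidate_code: str, test_code: str) -> float:
        """Return 1.0 if the candidate passes the tests, else 0.0."""
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "attempt.py")
            with open(path, "w") as f:
                f.write(candidate_code + "\n\n" + test_code)
            try:
                result = subprocess.run([sys.executable, path],
                                        capture_output=True, timeout=10)
                return 1.0 if result.returncode == 0 else 0.0
            except subprocess.TimeoutExpired:
                return 0.0

    def score_rollouts(sample_fn, prompt: str, tests: str, n: int = 64):
        """Sample n attempts and score them automatically -- the part
        that scales with compute (the policy update itself is omitted)."""
        attempts = [sample_fn(prompt) for _ in range(n)]
        return [(a, verifier_reward(a, tests)) for a in attempts]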
I see it somewhat differently. It is not that technology has reached a level where we are close to AGI and we just need to throw in a few more coins to close the final gap. It is probably the other way around. We can see and feel that human intelligence is being eroded by the widespread use of LLMs for tasks that used to be solved by brain work. Thus, General Human Intelligence is declining and approaching the level of current Artificial Intelligence. If this process can be accelerated by a bit of funding, the point where Big Tech can take over public opinion-making will be reached earlier, which in turn will make many companies and individuals richer faster and bring the return on investment closer.
> Is the key idea here that current AI development has figured out enough to brute force a path towards AGI?
My sense, anecdotally, from within the space is that yes, people are feeling like we most likely have a "straight shot" to AGI now. Progress has been insane over the last few years, but there has been this lurking worry around signs that the pre-training scaling paradigm is hitting diminishing returns.
What recent releases like o1, o3, and DeepSeek-R1 are showing is that that's fine: we now have a new paradigm around test-time compute. For various reasons, people think this is going to be more scalable and won't run into the kind of data issues you'd get with the pre-training paradigm.
You can definitely debate whether that's true, but this is the first time I've really seen people think we've cracked "it", and the rest is scaling, better training, etc.
The largest GPU cluster at the moment is X.ai's 100K H100s, which is ~$2.5B worth of GPUs. So something 10x bigger (1M GPUs) is ~$25B, plus roughly $10B for a 1GW nuclear reactor (rough math sketched below).
This sort of $100-500B budget doesn't sound like training cluster money, more like anticipating massive industry uptake and multiple datacenters running inference (with all of corporate America's data sitting in the cloud).
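For what it's worth, the cluster math above fits in a few lines. The per-GPU price and reactor cost are rough assumptions, but the total still lands an order of magnitude below the headline figure, which is what makes the multi-datacenter-inference reading more plausible than a single training cluster.

    # Back-of-envelope check of the numbers above (all prices assumed).
    h100_price = 25_000            # USD per GPU, rough street price
    gpus = 100_000 * 10            # 10x today's ~100K-GPU cluster
    gpu_cost = gpus * h100_price   # 1M GPUs -> $25B
    power_cost = 10e9              # ~$10B assumed for ~1GW of generation
    total = gpu_cost + power_cost
    print(f"~${total / 1e9:.0f}B vs. a $100-500B headline budget")  # ~$35B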
>AI development has figured out enough to brute force a path towards AGI?
I think what's been going on is that compute/$ has been rising exponentially for decades in a steady way and has recently passed the point where you can get human-brain-level compute for modest money (a toy version of that arithmetic is sketched below). The tendency has been that once the compute is there, lots of bright PhDs get hired to figure out the algorithms to use it, so that bit gets sorted within a few years (as written about by Kurzweil, Wait But Why, and similar).
So it's not so much brute-forcing AGI as that exponential growth makes it inevitable at some point, and that point is probably quite soon. At least that seems to be what they are betting on.
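As a toy version of that argument (every number below is an assumption: brain-compute estimates span several orders of magnitude, and the doubling time is a guess), the point is just that a steady exponential in compute/$ turns a billion-dollar hardware bill into modest money over a few decades:

    # Illustrative only -- assumed figures, not measurements.
    brain_flops = 1e16          # one commonly cited (and contested) estimate
    flops_per_dollar = 4e10     # assumed: ~1e15 FLOP/s of GPU for ~$25k
    doubling_years = 2.5        # assumed price/performance doubling time

    cost_today = brain_flops / flops_per_dollar               # ~$250k of GPUs now
    cost_30y_ago = cost_today * 2 ** (30 / doubling_years)    # same compute, ~30 years back
    print(f"brain-scale hardware today: ~${cost_today:,.0f}")     # ~$250,000
    print(f"same compute 30 years ago:  ~${cost_30y_ago:,.0f}")   # ~$1,000,000,000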
The annual global spend on human labour is ~$100tn, so whether you replace that with AGI or add another ~$100tn of AGI output on top and double GDP, it's quite a lot of money.
This has nothing to do with technology; it is a purely financial and political exercise...
> I don't know how to make sense of this level of investment.
The thing about investments, specifically in the world of tech startups and VC money, is that speculation is not something you merely capitalize on as an investor, it's also something you capitalize on as a business. Investors desperately want to speculate (gamble) on AI to scratch that itch, to the tune of $500 billion, apparently.
So this says less about, 'Are we close to AGI?' or, 'Is it worth it?' and more about, 'Are people really willing to gamble this much?'. Collectively, yes, they are.
> Is the key idea here that current AI development has figured out enough to brute force a path towards AGI? Or I guess the alternative is that they expect to figure it out in the next 4 years...
Can't answer that question, but even if the only thing that changed in the next four years were that generation got cheaper and cheaper, we haven't even begun to understand the transformative power of what we have available today. I think we've felt maybe 5-10% of the effects that integrating today's technology can bring, especially if generation costs come down to maybe 1% of what they currently are and the latency of the big models becomes close to instantaneous.
It's a typical Trump-style announcement -- IT'S GONNA BE HUUUGE!! -- without any real substance or solid commitments
Remember Trump's BIG WIN of Foxconn investing $10B to build a factory in Wisconsin, creating 13000 jobs?
That was in 2017. 7 years later, it's employing about 1000 people if that. Not really clear what, if anything, is being made at the partially-built factory. [0]
And everyone's forgotten about it by now.
I expect this to be something along those lines.
[0] https://www.jsonline.com/story/money/business/2023/03/23/wha...
I think the only way you get to that kind of budget is by assuming that the models are like 5 or 10 times larger than most LLMs, and that you want to be able to do a lot of training runs simultaneously and quickly, AND build the power stations into the facilities at the same time. Maybe they are video or multimodal models that have text and image generation grounded in a ton of video data which eats a lot of VRAM.
> current AI development has figured out enough to brute force a path towards AGI? Or I guess the alternative is that they expect to figure it out in the next 4 years...
Or they think the odds are high enough that the gamble makes sense. Even if they think it's only a 20% chance, with their competitors investing at this scale their only real options are to keep up or drop out.
This announcement is from the same office as the guy that xeeted:
“My NEW Official Trump Meme is HERE! It's time to celebrate everything we stand for: WINNING! Join my very special Trump Community. GET YOUR $TRUMP NOW.”
Your calibration is probably fine. Stargate is not a means to achieve AGI; it's a means to start construction on a few million square feet of datacenters, thereby “reindustrializing America”.
To me it looks like a strategic investment in data center capacity, which should drive domestic hardware production, improvements in the electrical grid, etc. Putting it all under the AI label just makes it look more exciting.
> Is the key idea here that current AI development has figured out enough to brute force a path towards AGI?
It rather means that they see their only chance for substantial progress in Moar Power!
Yes, that is exactly what the big Aha! moment was. It has now been shown that doing these $100MM+ model builds is what it takes to have a top-tier model. The big moat is not just the software, the math, or even the training data; it's the budget to do the giant runs. Of course, having a team that is regularly iterating on all four of those is where the magic is.
"There are maybe a few hundred people in the world who viscerally understand what's coming. Most are at DeepMind / OpenAI / Anthropic / X but some are on the outside. You have to be able to forecast the aggregate effect of rapid algorithmic improvement, aggressive investment in building RL environments for iterative self-improvement, and many tens of billions already committed to building data centers. Either we're all wrong, or everything is about to change." - Vedant Misra, Deepmind Researcher.
Maybe your calibration isn't poor. Maybe they really are all wrong. But there's a tendency here to assume the people behind the scenes are all charlatans, fueling hype without equal substance and hoping to make a quick buck before it all comes crashing down, and I don't think that's true at all. I think these people genuinely believe they're going to get there. And if you genuinely believe that, then this kind of investment isn't so crazy.