Hacker News

shubhamjain · today at 1:47 PM

If you think you need to spend $100B, does using a third-party cloud provider still make sense? It doesn’t matter what sweet deal Amazon is pitching—in that scenario, you’d want to own your stack. Especially in a hyper-competitive field like this, where margins are going to matter a lot soon.

It feels like these hyperscalers are just raising as much as they can by giving extremely rosy projections, because sooner or later the peak is going to be reached (if that hasn’t happened already).


Replies

IMTDb · today at 3:35 PM

The problem is that at that scale, the alternative is building your own data centers. You'd probably want at least 2 in the US, 2 in Europe, 2 in Asia, maybe 1 in Africa and 1 in LATAM. So 8-10, and you need at least half of them ready "on time."

What does "on time" mean? You'll need to negotiate with local authorities, some friendly, some not. Data centers aren't exactly popular neighbors these days. Then negotiate with the local power utility. Fingers crossed the political landscape doesn't shift and your CEO doesn't sign a contract with an army using your product to pick bombing targets, because you'll watch those permits evaporate fast.

Then there's sourcing: CPUs, GPUs, memory, networking. You need all of it. Did you know the lead time for an industrial power transformer is 5+ years? Don't get me started on the water treatment pumps and filters you can't even get permitted without. What will you do in the meantime? You surely aren't going to get preferential treatment from AWS / Google / ... if they know you're moving away anyway. Your competition will.

The risk and complexity are just too big. AI/LLM is already an incredibly complex and brittle environment with huge competition. Getting distracted building data centers isn't enticing for these companies, it's a death sentence.

MeetingsBrowser · today at 2:53 PM

Going from a company with no experience building and operating data centers to a company with $100B worth of compute is a multi-decade, high-risk goal.

dktp · today at 2:12 PM

I think these pledges offload some of the risk onto Amazon/Oracle/etc

If Anthropic/OpenAI miss projections, the infra providers can likely still turn around and sell the capacity to the next guy, or use it themselves. If they have more demand than expected (as Anthropic currently does), VCs will throw money at them and they can outbid the competition.

If they built it themselves and missed projections it's a much more expensive mistake

It's just risk sharing. Infra providers take some of the risk and some of the upside

neya · today at 3:41 PM

I remember seeing this extremely shocking graph of top AI companies on Facebook on how the money just keeps changing hands between a handful of companies. Almost seemed like a scam.

etempleton · today at 3:12 PM

In a rational business, yes. But when everything is basically some form of growth signal to investors, meant to extract even more money from them before the music stops, it doesn’t matter.

JumpCrisscross · today at 3:07 PM

> It doesn’t matter what sweet deal Amazon is pitching

Isn't that almost all that matters when comparing doing something yourself versus paying someone else, in this case Amazon, to do it for you?

nashashmi · today at 3:24 PM

No. I am guessing that this is only a commitment and they will waver on committing.

However, there are certain advantages, like supply chain access, that only established companies have. This is also a commitment to spend up to $100B on an internal approach and research. I would expect them to come up with their own CPU and device designs. This will shift the focus to an internal approach, and it might make Amazon offer better prices later down the line.

LogicFailsMe · today at 2:05 PM

Classic time-value-of-money situation. They get access to the hardware now, so they can continue to grow the business. Of course, if you think AI is just pets.com redux, I can see how you'd think it's already peaked. All those years of very important people insisting Bezos couldn't just pull a switch on reinvesting all the revenue into growing Amazon, and then he did exactly that, come to mind.
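The time-value point can be made concrete with a toy discounted-cash-flow comparison. All figures and the discount rate below are invented for illustration; nothing here comes from the article or the thread:

```python
# Toy illustration of the time-value-of-money argument: revenue earned
# now is worth more than the same revenue earned later, so renting
# compute today can beat owning compute that only arrives in a few
# years. Every number here is made up for the sake of the example.

def present_value(cash_flows, discount_rate):
    """Discount a list of yearly cash flows back to today."""
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))

rate = 0.10  # assumed 10% annual discount rate

# Rent from a cloud: revenue starts immediately, but margins are thinner.
rent_now = present_value([30, 30, 30, 30, 30], rate)

# Build your own: two years of construction with no revenue, then
# fatter margins once the data center is live.
build_later = present_value([0, 0, 45, 45, 45], rate)

print(f"rent now:    {rent_now:.1f}")
print(f"build later: {build_later:.1f}")
```

With these particular (made-up) numbers the rent-now stream is worth more today, even though the build-later stream has higher total undiscounted cash flow.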

samdixon · today at 2:45 PM

From my understanding, if you want to use native Claude in AWS Bedrock, it runs from an AWS data center. I'm guessing that's why, regardless of running your own stack, they still need a footprint in all the major clouds.

bombcar · today at 2:27 PM

If you’re sure it’s going to go gangbusters you want to get it all in-house asap.

If you’re not sure it’s going to blow the socks off, foisting capital investment on partners is a great deal.

See the difference in companies/franchises that always own the land/building and those that always lease.

lubujackson · today at 2:42 PM

Look at GPU and RAM prices and data center rollout. We have quickly reached Earth's capacity for compute - it is a lot like the housing market. Once there is global saturation, the price to buy becomes increasingly high EVERYWHERE. Let's also not forget that Anthropic moves the market with their purchases and usage. They might literally be unable to buy capacity they need (or project to) and are doing this deal to pave a roadmap for the near-term and to keep global prices (somewhat) down.

0xbadcafebee · today at 3:33 PM

There is no money or time left to build a $100B stack. All private capital is tapped and banks know it's too risky. They have no choice but to rent.

nickorlow · today at 3:29 PM

AWS exists and has compute right now, spinning up their own HW would take months (at least). This gets them moving quicker.

bilekas · today at 2:28 PM

I imagine it comes down to whether they want to buy hardware every generation; that gets very expensive and depreciates quickly. You'd then have a whole load of assets on your books that are technically obsolete for the bleeding edge. This way, AWS buys and maintains the hardware, and Anthropic doesn't need to claim it as depreciation?

Just a guess.
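The depreciation point can be sketched with a quick straight-line calculation. The purchase price, salvage value, and three-year useful life below are assumptions for illustration, not figures from the deal:

```python
# Rough sketch of why fast-depreciating hardware weighs on a buyer's
# books: under straight-line depreciation, an asset loses the same
# amount of book value every year of its assumed useful life.
# The cost, salvage value, and lifetime here are made-up numbers.

def straight_line_book_values(cost, salvage, useful_life_years):
    """Book value of an asset at the end of each year, years 0..N."""
    yearly = (cost - salvage) / useful_life_years
    return [cost - yearly * y for y in range(useful_life_years + 1)]

# A $100 (arbitrary units) GPU fleet, assumed worth $10 after 3 years.
values = straight_line_book_values(cost=100.0, salvage=10.0, useful_life_years=3)
print(values)  # book value at years 0, 1, 2, 3
```

Renting flips that schedule into a flat operating expense on someone else's balance sheet, which is the trade-off the comment is gesturing at.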

dgellow · today at 3:27 PM

Anthropic also has their own servers.

credit_guy · today at 2:05 PM

Here’s the answer to your question (from the article):

> The Anthropic deal specifically covers Trainium2 through Trainium4 chips, even though Trainium4 chips are not currently available. The latest chip, Trainium3, was released in December. On top of that, Anthropic has secured the option to buy capacity on future Amazon chips as they become available.

Tepix · today at 1:53 PM

Sure: If you can't get enough compute by ordering it yourself, make deals with anyone who promises to get you more compute.

Culonavirus · today at 2:06 PM

Only Google and xAI build their own, no? I don't think it's that easy to vertically integrate massive datacenters into a software company. Both Google and xAI (Tesla, SpaceX) have a massive wealth of experience when it comes to building factories.

avereveard · today at 3:16 PM

You cannot get Trainium anywhere else, and NVIDIA commands a super-high premium.

DANmode · today at 3:27 PM

> you’d want to own your stack.

Everybody does right now, right?

But: is it your core competency?

Can your firm afford the distraction?

vasco · today at 2:39 PM

That is a project you can work on at any point in the future, and the more you delay it, the more certain you will be about what you really need. But those improvements to the P&L are capped by your costs.

In the meantime, if you work on revenue-generating projects, that side of the P&L is uncapped. So you can either put some engineers on reducing your costs, by at most 100%, or have them work on product ideas that could generate over 9000% more revenue.
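The capped-vs-uncapped asymmetry is simple arithmetic; here is a toy sketch with entirely invented numbers:

```python
# Toy sketch of the asymmetry: engineers working on cost reduction can
# at best erase today's cost base (the 100% cap), while engineers
# working on product can grow revenue with no such ceiling.
# All figures are invented for illustration.

def profit(revenue, costs):
    return revenue - costs

baseline = profit(revenue=150.0, costs=100.0)

# Cost project, best conceivable outcome: cut costs to zero.
best_cost_outcome = profit(revenue=150.0, costs=0.0)

# Product project: no structural cap; suppose it triples revenue.
product_outcome = profit(revenue=450.0, costs=100.0)

print(baseline, best_cost_outcome, product_outcome)
```

Even the perfect cost-cutting outcome adds at most the cost base to profit, while the revenue side can keep growing past it.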

Zababa · today at 1:54 PM

I think it could make sense to not want to own the stack if you think it's going to cost you velocity/focus? Which is probably the play here. But I'm not certain at all.

loveparade · today at 1:57 PM

Good luck getting GPUs.

mitchell_h · today at 2:09 PM

I watched someone explain how DeepSeek got good, and the Chinese approach to LLM training. Really wish I could remember it. The premise was that China thinks of LLMs not as a thing separate from hardware, but gains efficiencies at each layer of the stack. From chips to software, it's all integrated and purpose-built for training.

Wonder if Anthropic is making a mistake by focusing on "consumer" hardware, and not going super specialized.
