Hacker News

electroly today at 3:59 PM

For AI inference you don't need to geographically distribute your data centers. Latency, throughput, and routes don't matter here. When it's 10 seconds to the first token followed by a ~1 KB/sec streamed response, whatever is fine. You can serve Australia from the US and it'll barely matter. You can find a spot far outside populated areas with cheap power, available water, and friendly leadership, then put all of your data centers there. If you're worried about major disasters, you can pick a second city. You definitely don't need a data center on every continent.
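The arithmetic behind this claim can be sanity-checked. A minimal sketch, using the 10 s time-to-first-token and ~1 KB/sec stream rate from the comment; the cross-Pacific round-trip time and total response size are assumptions, not figures from the comment:

```python
# Back-of-envelope: how much does US <-> Australia network latency add to an
# AI inference response that already takes 10 s to produce its first token?

TTFT_S = 10.0               # time to first token (from the comment)
STREAM_BYTES_PER_S = 1024   # ~1 KB/sec streamed response (from the comment)
RESPONSE_BYTES = 4 * 1024   # assumed 4 KB total response
RTT_S = 0.200               # assumed ~200 ms round trip, US <-> Australia

# A persistent streamed connection pays roughly one RTT up front;
# subsequent chunks are pipelined, so per-chunk RTT doesn't stack.
total_local = TTFT_S + RESPONSE_BYTES / STREAM_BYTES_PER_S
total_remote = total_local + RTT_S

overhead_pct = 100 * RTT_S / total_local
print(f"local: {total_local:.1f}s  remote: {total_remote:.1f}s  "
      f"network overhead: {overhead_pct:.1f}%")
```

Under these assumptions the ocean crossing adds on the order of 1% to the end-to-end response time, which is the sense in which "whatever is fine" holds for this workload.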

You're not wrong about the rest, but no AI company would ever build a data center on every continent for this, even if they were prepared to build data centers. AI inference isn't like general-purpose hosting.


Replies

pohl today at 4:57 PM

Sounds like you're betting that the performance users experience today will be the same as the performance they'll expect tomorrow. I wouldn't take that bet.

TSiege today at 4:02 PM

Latency absolutely matters? This is such a weird thing to say. For training, sure, but customers absolutely want low latency.
