logoalt Hacker News

lukeschlatherlast Thursday at 4:01 PM1 replyview on HN

I feel like you're overstating the resources required by a couple orders of magnitude. You do need a GPU farm to do training, but probably only $100M, maybe $1B of GPUs. And yes, that's a lot of GPUs, but they will fit in a single datacenter, and even in dollar terms, there are many individual buildings in NYC that are cheaper.


Replies

fnordpigletyesterday at 5:54 AM

I refer you to the data centers under construction roughly the size of Manhattan to do next generation model training. Granted they’re also to house inference, but my statement wasn’t hyperbole, it’s based on actual reality. To accommodate the next generation of frontier training it’s infeasible for any but the most wealthy organizations on earth to participate. OSS weights are toys. (Mind you i like toys)