Hacker News

kelnos today at 4:10 AM

Is the issue that training with less compute takes more time? Or is it just not possible? I think a collective using distributed training could tolerate the idea that it takes 10x as long as Anthropic to train a model, or whatever.


Replies

mike_hearn today at 10:31 AM

It's possible, but the slowdown isn't linear: a tenth of the compute doesn't simply mean ten times the training time. A modern AI training cluster is a supercomputer whose architecture and hardware are very different from a bunch of small PCs connected via normal networking. The networking advantage alone kills any chance of decentralized training.
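
To put rough numbers on the networking point, here is a back-of-the-envelope sketch of how long one gradient synchronization might take over a home connection versus a cluster interconnect. Everything in it is an assumption for illustration and not from the thread: the model size, the 2-byte gradients, the worker count, both link speeds, and the choice of ring all-reduce as the sync scheme. It is a bandwidth-only lower bound that ignores latency, gradient compression, and overlap with compute.

```python
def ring_allreduce_seconds(param_count, bytes_per_param, workers, link_bytes_per_sec):
    """Bandwidth-only lower bound for one ring all-reduce of the gradients."""
    payload = param_count * bytes_per_param
    # In a ring all-reduce each worker sends (and receives) about
    # 2 * (N - 1) / N times the payload over its own link.
    traffic_per_worker = 2 * (workers - 1) / workers * payload
    return traffic_per_worker / link_bytes_per_sec


# Hypothetical numbers, chosen only to illustrate the scale of the gap.
PARAMS = 7_000_000_000      # assumed model size (parameters)
BYTES_PER_PARAM = 2         # assumed 16-bit gradients
WORKERS = 64                # assumed number of participants

HOME_LINK = 100e6 / 8       # assumed 100 Mbit/s uplink, in bytes/s
CLUSTER_LINK = 400e9 / 8    # assumed 400 Gbit/s interconnect, in bytes/s

home = ring_allreduce_seconds(PARAMS, BYTES_PER_PARAM, WORKERS, HOME_LINK)
cluster = ring_allreduce_seconds(PARAMS, BYTES_PER_PARAM, WORKERS, CLUSTER_LINK)

print(f"home link:    {home / 60:.1f} minutes per sync")
print(f"cluster link: {cluster:.2f} seconds per sync")
print(f"ratio:        {home / cluster:.0f}x")
```

Under these assumptions the gap per synchronization is the ratio of the two link speeds, and that cost is paid on every training step unless the algorithm is changed to communicate less often.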