
miki123211 · yesterday at 9:26 PM

> how impossible is a world where open source base models are collectively trained similar to a proof of work style pool

Current multi-GPU training setups assume much higher bandwidth (and lower latency) between the GPUs than you can get with an internet connection. Even cross-datacenter training isn't really practical.

LLM training isn't embarrassingly parallel the way crypto mining is. You can't just add more nodes to the mix and magically get speedups. Parallelism buys you a lot, certainly, but it's not straightforward and takes real work to fully utilize.
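To see why bandwidth dominates, here's a back-of-envelope sketch. The numbers (a 7B-parameter model, fp16 gradients, a 100 Mbit/s home link, a 50 GB/s datacenter interconnect) are illustrative assumptions, not anything from the thread; the ~2x traffic factor is the standard ring all-reduce cost:

```python
# Rough cost of synchronizing gradients once per training step
# under data parallelism (ring all-reduce).

def allreduce_seconds(n_params, bytes_per_param=2, bandwidth_bytes_per_s=12.5e6):
    """Seconds to all-reduce one gradient update at a given link bandwidth.

    Assumes fp16 gradients (2 bytes/param) and a ring all-reduce, where
    each worker sends and receives roughly 2x the gradient size per step.
    Default bandwidth is a 100 Mbit/s (~12.5 MB/s) home connection.
    """
    grad_bytes = n_params * bytes_per_param
    traffic = 2 * grad_bytes  # ring all-reduce: ~2x gradient bytes per worker
    return traffic / bandwidth_bytes_per_s

# Hypothetical 7B-parameter model:
internet = allreduce_seconds(7e9)                                 # home link
datacenter = allreduce_seconds(7e9, bandwidth_bytes_per_s=50e9)   # fast interconnect

print(f"per-step sync over the internet:  {internet:.0f} s")   # ~2240 s
print(f"per-step sync in a datacenter:    {datacenter:.2f} s") # ~0.56 s
```

With steps normally taking on the order of a second of compute, spending half an hour per step on synchronization is a non-starter, which is why pool-style training over home connections doesn't just fall out of adding nodes.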