Hacker News

PoignardAzur · 10/12/2024

A lot of comments are sneering at various aspects of this press release, and yeah, there's some cringeworthy stuff.

But the technical aspects are pretty cool:

- Fault-tolerant training, where nodes can be added and removed mid-run without interrupting the other nodes.

- Sending quantized gradients during the synchronization phase (see the sketch after this list).

- (In the OpenDiLoCo article) Async synchronization.
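To make the quantized-gradients point concrete: each node compresses its local gradient to low-precision integers before synchronizing, so the averaging step moves roughly 4x less data (for int8) at the cost of a little rounding error, and the average is taken over whichever nodes happen to be alive. Below is a minimal numpy sketch of the general idea, not their actual implementation; the per-tensor int8 scheme and the node count are assumptions on my part:

    import numpy as np

    def quantize_int8(grad):
        # One scale per tensor; values map into [-127, 127].
        scale = float(np.max(np.abs(grad))) / 127.0
        scale = scale if scale > 0 else 1.0
        q = np.clip(np.round(grad / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize_int8(q, scale):
        return q.astype(np.float32) * scale

    # Sync phase: average over whatever set of nodes is currently alive,
    # exchanging int8 payloads instead of raw float32 gradients.
    rng = np.random.default_rng(0)
    live_nodes = [rng.normal(scale=0.01, size=1000).astype(np.float32)
                  for _ in range(4)]
    payloads = [quantize_int8(g) for g in live_nodes]
    averaged = np.mean([dequantize_int8(q, s) for q, s in payloads], axis=0)
    print(np.max(np.abs(averaged - np.mean(live_nodes, axis=0))))  # small error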

They're also mentioning potential trustless systems where everyone can contribute compute, which would make this a truly decentralized open platform. Overall it'll be pretty interesting to see where this goes!


Replies

londons_explore · 10/12/2024

> Sending quantized gradients during the synchronization phase.

I did this 9 years ago, and it works pretty well. I don't understand why all ML training isn't async and quantized like that by now. My project quantizes to 1 bit per weight, and it works so well I didn't even make it configurable.

https://github.com/Hello1024/shared-tensor
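For the curious: the standard way to make 1 bit per weight work is sign-plus-scale with error feedback, i.e. you transmit only the sign of each entry plus a single magnitude, and add whatever you rounded away back into the next step's gradient so nothing is permanently lost. A rough numpy sketch of that classic scheme (the shared-tensor repo's exact details may differ):

    import numpy as np

    class OneBitCompressor:
        def __init__(self, size):
            # Quantization error carried over into the next step.
            self.residual = np.zeros(size, dtype=np.float32)

        def compress(self, grad):
            corrected = grad + self.residual
            scale = float(np.mean(np.abs(corrected)))  # one float per tensor
            signs = np.where(corrected >= 0, 1.0, -1.0).astype(np.float32)
            self.residual = corrected - scale * signs  # error feedback
            return signs, scale  # signs pack to 1 bit/weight on the wire

        @staticmethod
        def decompress(signs, scale):
            return scale * signs

    comp = OneBitCompressor(4)
    g = np.array([0.5, -0.2, 0.1, -0.9], dtype=np.float32)
    signs, scale = comp.compress(g)
    print(OneBitCompressor.decompress(signs, scale))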
