londons_explore • 10/12/2024

This is for keeping the weight vectors in sync between two machines.

The weight vectors themselves are regular floats, but the data exchanged between the machines is 1 bit per weight. Basically, you keep track of the changes to the weight vector that haven't yet been propagated to the other machine. You quantize this to 1 bit per weight (i.e. a sign bit) and send it, together with a single scale factor X. Whatever the quantization failed to convey (the quantization error) stays in the accumulator and goes out in the next sync iteration.

You choose X to be the RMS, or some similar measure of magnitude, of the accumulated error.
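
A rough sketch of one way this could look, in plain Python. The names (`encode`, `decode`, `simulate`), the list-of-floats representation, and the Gaussian stand-in for training updates are all made up for illustration. It assumes X is the RMS and that a zero is sent as a positive sign. A real implementation would pack the signs into actual bits and work on tensors.

```python
import math
import random

def encode(residual):
    """Turn the accumulated unsent change into sign bits plus one scale X.

    Returns (signs, scale, leftover), where leftover is the quantization
    error to carry into the next sync round. Assumes a non-empty vector.
    """
    n = len(residual)
    scale = math.sqrt(sum(r * r for r in residual) / n)  # X = RMS (assumed choice)
    signs = [1 if r >= 0 else -1 for r in residual]      # 1 bit per weight
    leftover = [r - s * scale for r, s in zip(residual, signs)]
    return signs, scale, leftover

def decode(signs, scale):
    """Receiver side: rebuild the coarse update from sign bits and X."""
    return [s * scale for s in signs]

def simulate(rounds=50, n=8, seed=0):
    """Hypothetical two-machine loop: 'local' trains, 'remote' follows."""
    rng = random.Random(seed)
    local = [0.0] * n     # sender's full-precision weights
    remote = [0.0] * n    # receiver's copy
    residual = [0.0] * n  # changes not yet conveyed to the receiver
    for _ in range(rounds):
        step = [rng.gauss(0, 0.1) for _ in range(n)]  # stand-in for a training update
        local = [w + d for w, d in zip(local, step)]
        residual = [r + d for r, d in zip(residual, step)]
        signs, scale, residual = encode(residual)     # this is all that goes on the wire
        remote = [w + u for w, u in zip(remote, decode(signs, scale))]
    return local, remote, residual
```

The thing to notice is that after every round, `local` equals `remote + residual` up to float rounding, so nothing is ever lost, only delayed. How quickly the residual shrinks depends on the choice of X and on how the changes are distributed, so this sketch says nothing about convergence speed.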
