logoalt Hacker News

londons_explore10/12/20240 repliesview on HN

This is for keeping the weight vectors in sync between two machines.

The weight vectors themselves are regular floats. But the data exchanged between the machines is 1 bit. Basically, you keep track of changes to the weight vector which hasn't yet been propagated to the other machine. You quantize this to 1 bit per weight (ie. a sign bit) and send it, together with a single scale factor X, accumulating the quantization error for the next sync iteration.

You choose X to be the RMS or some similar metric of the accumulated error.