Yeah that's odd. It seems like you'd want an n-1 dimensional grid on the surface of the unit sphere rather than an n dimensional grid within which the sphere resides.
Looking at the paper (https://arxiv.org/abs/2504.19874) they cite earlier work that does exactly that. They object that grid projection and binary search perform exceptionally poorly on the GPU.
I don't think they're using a regular grid as depicted on the linked page. Equation 4 from the paper is how they compute centroids for the MSE optimal quantizer.
Why specify MSE optimal you ask? Yeah so it turns out there's actually two quantization steps, a detail also omitted from the linked page. They apply QJL quantization to the residual of the grid quantized data.
My description is almost certainly missing key details; I'm not great at math and this is sufficiently dense to be a slog.
Yeah that's odd. It seems like you'd want an n-1 dimensional grid on the surface of the unit sphere rather than an n dimensional grid within which the sphere resides.
Looking at the paper (https://arxiv.org/abs/2504.19874) they cite earlier work that does exactly that. They object that grid projection and binary search perform exceptionally poorly on the GPU.
I don't think they're using a regular grid as depicted on the linked page. Equation 4 from the paper is how they compute centroids for the MSE optimal quantizer.
Why specify MSE optimal you ask? Yeah so it turns out there's actually two quantization steps, a detail also omitted from the linked page. They apply QJL quantization to the residual of the grid quantized data.
My description is almost certainly missing key details; I'm not great at math and this is sufficiently dense to be a slog.