
blackbear_ · yesterday at 3:52 PM

While the theoretical bottleneck is there, it is far less restrictive than what you are describing, because the number of almost-orthogonal vectors grows exponentially with the ambient dimensionality. And orthogonality is what matters for differentiating between vectors: since any distribution can be approximated arbitrarily well by a mixture of Gaussians, the number of separate concepts that you can encode with such a mixture also grows exponentially.
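The "exponentially many almost-orthogonal vectors" claim can be sketched empirically: by concentration of measure, random unit vectors in high dimensions are nearly orthogonal to each other with overwhelming probability. The sketch below (stdlib-only Python; the dimension and vector count are arbitrary choices for illustration) draws hundreds of random unit vectors in 1000 dimensions and checks that no pair has a large cosine similarity.

```python
# Sketch: random vectors in high dimension are nearly orthogonal.
# Typical pairwise cosine similarity of random unit vectors in R^d
# concentrates around 0 with standard deviation ~ 1/sqrt(d), which is
# why the number of vectors with pairwise |cos| <= eps can grow
# exponentially in d. Dimensions/counts here are illustrative choices.
import math
import random

random.seed(0)
d, n = 1000, 200  # ambient dimension, number of random vectors


def unit_vector(dim):
    """Draw a uniformly random direction via a normalized Gaussian."""
    v = [random.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


vecs = [unit_vector(d) for _ in range(n)]

# Largest |cosine similarity| over all ~20k pairs stays small.
max_cos = max(
    abs(sum(a * b for a, b in zip(vecs[i], vecs[j])))
    for i in range(n)
    for j in range(i + 1, n)
)
print(f"max |cos| over {n * (n - 1) // 2} pairs: {max_cos:.3f}")
```

With d = 1000 the typical pairwise cosine is around 0.03, and even the maximum over ~20,000 pairs stays well below what would be needed to confuse two encoded concepts.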


Replies

Scene_Cast2yesterday at 6:01 PM

I agree that you can encode any single concept, and that the encoding space for a single top pick grows exponentially.

However, I'm talking about the probability distribution of tokens.
