I agree that you can encode any single concept, and that the encoding space for a single top pick grows exponentially.
However, I'm talking about the probability distribution of tokens.
I think that, within the framework of "almost-orthogonal axes", you can still construct a vector that has the desired mix of projections onto any combination of those axes?
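To make the claim concrete, here is a minimal NumPy sketch (all names and dimensions are my own illustrative choices, not from the discussion above): pack more random unit "feature axes" than dimensions, so they are only almost orthogonal, then solve a small linear system for a vector whose projections onto any chosen subset of axes hit prescribed targets.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 200  # 200 "feature axes" packed into a 64-dim space

# Random unit vectors; with n > d they cannot be exactly orthogonal.
A = rng.standard_normal((n, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)

# Interference between distinct axes is small but nonzero.
gram = A @ A.T
max_overlap = np.abs(gram - np.eye(n)).max()
print(f"max |cos| between distinct axes: {max_overlap:.3f}")

# Pick any k <= d axes and target projections, then solve for a vector
# whose dot product with each chosen axis equals its target exactly.
k = 10
idx = rng.choice(n, size=k, replace=False)
targets = rng.uniform(-1, 1, size=k)
x, *_ = np.linalg.lstsq(A[idx], targets, rcond=None)  # min-norm solution

print(np.allclose(A[idx] @ x, targets))  # desired mix of projections achieved
```

As long as the chosen subset has at most d linearly independent axes, the system is solvable exactly; the almost-orthogonality only matters for how much the resulting vector bleeds onto the axes you did not select.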