Hacker News

snoman, last Saturday at 4:27 PM

Take the size of whatever you're indexing and multiply it by 16-20x; that's a good approximation of what the vector DB's total size is going to be.
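
For a rough sense of where a multiplier in that range can come from, here is a back-of-envelope sketch. The 1536-dimension float32 embedding, the 400-byte chunk size, and the `overhead` factor are all illustrative assumptions, not figures from any particular vector DB or embedding model.

```python
def vector_index_ratio(chunk_bytes: int, dims: int,
                       bytes_per_dim: int = 4, overhead: float = 1.0) -> float:
    """Ratio of stored vector bytes to source text bytes for one chunk.

    overhead is a hypothetical multiplier for graph links, IDs, and
    metadata; 1.0 means raw vectors only.
    """
    vector_bytes = dims * bytes_per_dim * overhead
    return vector_bytes / chunk_bytes


# Assumed: one 1536-dim float32 embedding per 400-byte text chunk.
ratio = vector_index_ratio(chunk_bytes=400, dims=1536)
print(f"{ratio:.1f}x")  # prints "15.4x"
```

Smaller chunks, overlapping chunks, or index overhead above 1.0 would push the ratio higher; larger chunks or lower-dimensional embeddings would pull it down.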


Replies

jononor, last Sunday at 1:55 PM

Why is it like that, currently? A vector index adds no information compared to the original text, and the text itself is highly redundant and compressible even with lossless compression. Furthermore, a vector index is already lossy and approximate. So conceptually, shouldn't it at least be possible to have an index that is a fraction of the size of what is indexed?
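
To put rough numbers on the "fraction of the size" idea, here is a minimal sketch of how per-vector storage shrinks if each dimension is stored with fewer bits. The 1536 dimensions and the 400-byte chunk are assumed values for illustration, and the sketch says nothing about how much retrieval quality is lost at each level.

```python
def bytes_per_vector(dims: int, bits_per_dim: int) -> int:
    """Storage for one vector, rounded up to whole bytes."""
    return (dims * bits_per_dim + 7) // 8


DIMS = 1536        # assumed embedding dimension
CHUNK_BYTES = 400  # assumed size of the text chunk each vector represents

for name, bits in [("float32", 32), ("int8", 8), ("binary", 1)]:
    size = bytes_per_vector(DIMS, bits)
    print(f"{name}: {size} bytes per vector, {size / CHUNK_BYTES:.2f}x the chunk")
```

Under these assumptions only the 1-bit-per-dimension case ends up smaller than the text it indexes.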
