Asymmetric Quantization: Near-Lossless Retrieval with 97% Storage Reduction

66 points • by breadislove • last Monday at 7:42 PM • 15 comments

Comments

I would love to see real examples of what reduced quality means in practice. Are you able to recover a document from the vector in a human readable format? If so, what sort of changes come up?

I could imagine a scenario where differences tend to be more substantive than you'd expect because of how less frequent words with fine distinctions in meaning - the very words that make the document special - may be embedded in the vector space.

alfiedotwtf • today at 1:41 PM

If you squint hard enough, it sounds like their storage layer is a bloom filter

purple-leafy • today at 11:00 AM

Hey breadislove; amazing article, I’ll be sending mixedbread an email in the morning that may interest you (email will be <5-characters>@pm.me)

I have also been working in compression and performance engineering, and managed to get a 99+% compression unlock versus conventional approaches (100+KB down to 1KB) in the scenario of 30 minute massive multiplayer game replays for a “game+engine” I’m developing

I think there’s a synergy between these 2 concepts I’d love to chat some more

rq1 • today at 10:28 AM

The Pi compression algorithm is better.

functionmouse • today at 11:29 AM

there is no such thing as "near lossless"

nathan_compton • today at 12:55 PM

" A single document produces more then one embedding, depending on the complexity of the document it can produce hundreds or thousands of vectors."

That typo up there is kind of endearing in the AI slop era.

Ameo • today at 8:48 AM

I can't wait until we get to 100% storage/cost/compute reduction for LLMs. Every thought you could have thought pre-conceived in high-fidelity super-resolution. Every action you could have taken predicted and simulated in advance courtesy of Openthropic and the USA Sovereign Wealth Fund.
