
pi_22by7 11/04/2025

The key insight about Bloom filters lacking synergy is excellent. The ~7K-document crossover point makes sense: an inverted index amortizes dictionary storage across all documents, while per-document Bloom filters must re-encode it for each document, so their cost grows linearly with the document count.
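
To make the amortization argument concrete, here is a rough back-of-the-envelope sketch. It is a toy cost model, not the article's measurement: the fixed vocabulary, the byte costs, and every function name are assumptions for illustration. The only standard piece is the optimal Bloom filter size, m = -n ln(p) / (ln 2)^2 bits for n items at false-positive rate p.

```python
import math


def bloom_bits_per_doc(terms_per_doc: int, fp_rate: float) -> float:
    """Optimal Bloom filter size in bits for one document's distinct terms."""
    return -terms_per_doc * math.log(fp_rate) / (math.log(2) ** 2)


def bloom_total_bytes(num_docs: int, terms_per_doc: int, fp_rate: float) -> float:
    """One filter per document: cost is strictly linear in the document count."""
    return num_docs * bloom_bits_per_doc(terms_per_doc, fp_rate) / 8


def inverted_total_bytes(num_docs: int, terms_per_doc: int, vocab_size: int,
                         bytes_per_term: float, bytes_per_posting: float) -> float:
    """Dictionary is paid once (assumed fixed vocabulary); postings grow with docs."""
    dictionary = vocab_size * bytes_per_term
    postings = num_docs * terms_per_doc * bytes_per_posting
    return dictionary + postings


def crossover_docs(terms_per_doc: int, fp_rate: float, vocab_size: int,
                   bytes_per_term: float, bytes_per_posting: float):
    """Smallest document count at which the inverted index is no larger.

    Returns None if postings cost at least as much per document as a Bloom
    filter, in which case the index never catches up in this model.
    """
    bloom_per_doc = bloom_bits_per_doc(terms_per_doc, fp_rate) / 8
    postings_per_doc = terms_per_doc * bytes_per_posting
    saving_per_doc = bloom_per_doc - postings_per_doc
    if saving_per_doc <= 0:
        return None
    return math.ceil(vocab_size * bytes_per_term / saving_per_doc)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the model.
    n = crossover_docs(terms_per_doc=500, fp_rate=0.01, vocab_size=50_000,
                       bytes_per_term=12, bytes_per_posting=1)
    print("crossover at about", n, "documents under these assumptions")
```

In this model the crossover is just the one-time dictionary cost divided by the per-document saving, so it moves in direct proportion to the dictionary size.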


Replies

hinkley 11/04/2025

But doesn’t that depend on the cardinality of the indexes versus the document count? I’ve seen systems with a stupid number of tag values.
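
One way to probe that question is to let the vocabulary grow with the corpus instead of holding it fixed. The sketch below assumes Heaps'-law growth (vocabulary = k * tokens^beta) and made-up byte costs; all names and defaults are hypothetical, and total tokens are approximated as documents times distinct terms per document. It searches by doubling, so the answer is only accurate to a power of two.

```python
def heaps_vocab(total_tokens: float, k: float, beta: float) -> float:
    """Heaps'-law estimate of distinct terms in a corpus of total_tokens."""
    return k * total_tokens ** beta


def first_doc_count_where_index_wins(terms_per_doc: int, k: float, beta: float,
                                     bytes_per_term: float = 12.0,
                                     bytes_per_posting: float = 1.0,
                                     bloom_bytes_per_term: float = 1.2,
                                     max_docs: int = 10_000_000):
    """Double the document count until the inverted index is no larger.

    Returns None if that never happens up to max_docs, e.g. when nearly
    every document brings its own new tag values (beta close to 1).
    """
    n = 1
    while n <= max_docs:
        vocab = heaps_vocab(n * terms_per_doc, k, beta)
        inverted = vocab * bytes_per_term + n * terms_per_doc * bytes_per_posting
        bloom = n * terms_per_doc * bloom_bytes_per_term
        if inverted <= bloom:
            return n
        n *= 2
    return None


if __name__ == "__main__":
    # Hypothetical corpora: modest vocabulary, large vocabulary, unique tags.
    for k, beta in [(10, 0.5), (100, 0.5), (1, 1.0)]:
        print(k, beta, first_doc_count_where_index_wins(500, k, beta))
```

Under these assumptions a larger or faster-growing vocabulary pushes the crossover out, and when cardinality grows as fast as the document count the dictionary never amortizes at all.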