Hacker News

xfalcox · yesterday at 2:16 PM

> Nobody’s actually run this in production

We do at Discourse, in thousands of databases, and it's leveraged in most of the billions of page views we serve.

> Pre- vs. Post-Filtering (or: why you need to become a query planner expert)

This was fixed in version 0.8.0 via Iterative Scans (https://github.com/pgvector/pgvector?tab=readme-ov-file#iter...)

> Just use a real vector database

If you are running a single service that may be an easier sell, but it's not a silver bullet.


Replies

xfalcox · yesterday at 3:25 PM

Also worth mentioning that we use quantization extensively:

- halfvec (16-bit float) for storage
- bit (binary vectors) for indexes

This makes the storage cost and ongoing performance good enough that we could enable it across all our hosting.
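A minimal sketch of this setup with pgvector, following the quantization pattern from the pgvector README (table and column names are illustrative, not Discourse's actual schema; the `'[...]'` query vectors are placeholders, and if `binary_quantize` doesn't accept `halfvec` directly in your version, cast through `::vector` first):

```sql
-- halfvec halves storage vs. full-precision vectors
CREATE TABLE items (id bigserial PRIMARY KEY, embedding halfvec(1024));

-- index only a binary quantization of the embedding, compared by Hamming distance
CREATE INDEX ON items USING hnsw
  ((binary_quantize(embedding)::bit(1024)) bit_hamming_ops);

-- search the compact binary index for candidates, then rerank
-- the candidates with the full halfvec for accuracy
SELECT id FROM (
  SELECT id, embedding FROM items
  ORDER BY binary_quantize(embedding)::bit(1024)
           <~> binary_quantize('[...]'::halfvec(1024))
  LIMIT 100
) candidates
ORDER BY embedding <=> '[...]'::halfvec(1024)
LIMIT 10;
```

The two-stage query is what makes binary indexes viable: the bit index is tiny and fast but lossy, and the rerank step recovers most of the lost recall.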

tacoooooooo · yesterday at 4:15 PM

for sure people are running pgvector in prod! i was more pointing at every tutorial

iterative scans are more of a bandaid for filtering than a solution. you will still run into issues with highly restrictive filters, and you still need to understand ef_search and max_search_tuples, strict vs relaxed ordering, etc. it's an improvement for sure, but the planner still doesn't deeply understand the cost model of filtered vector search
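The knobs mentioned above are session-level settings in pgvector 0.8.0+; a sketch of what tuning them looks like (the filter and `'[...]'` query vector are illustrative):

```sql
-- relaxed_order trades exact distance ordering for better recall
-- under filters; strict_order preserves ordering; off disables iteration
SET hnsw.iterative_scan = relaxed_order;

-- cap on how many index tuples a scan may visit before giving up,
-- which bounds the cost when a filter is highly restrictive
SET hnsw.max_scan_tuples = 20000;

-- candidate list size per graph traversal
SET hnsw.ef_search = 100;

SELECT id FROM items
WHERE category_id = 123          -- restrictive filter applied post-scan
ORDER BY embedding <=> '[...]'::vector
LIMIT 10;
```

The tension is visible in the settings themselves: too low a tuple cap and a restrictive filter returns too few rows; too high and the scan burns time traversing the graph for rows the filter discards.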

there isn't a general solution to the pre- vs post-filter problem—it comes down to having a smart planner that understands your data distribution. question is whether you have the resources to build and tune that yourself or want to offload it to a service that's able to focus on it directly

whakim · yesterday at 10:14 PM

Interested to hear more about your experience here. At Halcyon, we have trillions of embeddings and found Postgres to be unsuitable at several orders of magnitude less than we currently have.

On the iterative scan side, how do you prevent this from becoming too computationally intensive with a restrictive pre-filter, or simply not working at all? We use Vespa, which means effectively doing a map-reduce across all of our nodes; the effective number of graph traversals to do is smaller, and the computational burden mostly involves scanning posting lists on a per-node basis. I imagine to do something similar in postgres, you'd need sharded tables, and complicated application logic to control what you're actually searching.

How do you deal with re-indexing and/or denormalizing metadata for filtering? Do you simply accept that it'll take hours or days?

I agree with you, however, that vector databases are not a panacea (although they do remove a huge amount of devops work, which is worth a lot!). Vespa supports filtering across parent-child relationships (like a relational database), which means we don't have to reindex a trillion things every time we want to add a new type of filter; with a previous vector database vendor, that reindexing took us almost a week.

jascha_eng · yesterday at 8:37 PM

There are also approaches to doing the filtering while traversing a vector index (not just pre/post), e.g. this paper by Microsoft explains one approach: https://dl.acm.org/doi/10.1145/3543507.3583552, which pgvectorscale implements here: https://github.com/timescale/pgvectorscale?tab=readme-ov-fil...

In theory these can be more efficient than plain pre/post filtering.

dpflan · yesterday at 4:18 PM

What are you using it for? Is it part of a hybrid search system (keyword + vector)?
