
Fripplebubby, last Monday at 2:47 PM

The post is a clear example of YAGNI backfiring: you think "you aren't gonna need it," and then you actually do need it. I had this experience, the author had this experience, and you might too. The things you think you aren't gonna need are actually pretty basic expectations, not luxuries: being able to write vectors in real time without running other processes out of band to keep recall from degrading over time, and being able to write a single query that combines normal SQL filter predicates with similarity search for retrieval. These things matter, and you won't notice that they don't work at scale until later on!
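For anyone unsure what "filter predicates and similarity in one go" looks like, here is a minimal sketch that only builds the SQL text and parameters. It assumes pgvector's `<=>` cosine-distance operator; the table name, the `id` and `embedding` columns, and both helper functions are hypothetical and not taken from the article.

```python
def to_vector_literal(vec):
    """Format a list of floats as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"


def build_filtered_similarity_query(table, filters, query_vec, limit=10):
    """Return (sql, params) for one query that filters and ranks by distance.

    `table` and the keys of `filters` are interpolated directly, so they must
    be trusted identifiers; only the values are passed as parameters.
    """
    where = " AND ".join(f"{col} = %s" for col in filters)
    sql = f"SELECT id FROM {table}"
    if where:
        sql += f" WHERE {where}"
    # `<=>` is pgvector's cosine-distance operator; smaller means closer.
    sql += " ORDER BY embedding <=> %s::vector LIMIT %s"
    params = list(filters.values()) + [to_vector_literal(query_vec), limit]
    return sql, params
```

The resulting pair can be handed to a driver such as psycopg via `cursor.execute(sql, params)`. Whether the filter and the approximate index cooperate well at scale is exactly the question being argued here; the sketch only shows the query shape.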


Replies

simonw, last Monday at 4:29 PM

That's not YAGNI backfiring.

The point of YAGNI is that you shouldn't over-engineer up front until you've proven that you need the added complexity.

If you need vector search against 100,000 vectors and you already have PostgreSQL then pgvector is a great YAGNI solution.

10 million vectors that are changing constantly? Do a bit more research into alternative solutions.

But don't go integrating a separate vector database for 100,000 vectors on the assumption that you'll need it later.

throwway120385, last Monday at 5:30 PM

Many of the concerns in the article could be addressed by standing up a separate PG database that's used exclusively for vector ops and then not using it for your relational data. Then your vector use cases get served from your vector DB and your relational use cases get served from your relational DB. Separating concerns like that doesn't solve the underlying concern but it limits the blast radius so you can operate in a degraded state instead of falling over completely.
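A rough sketch of what "limiting the blast radius" could look like at the application layer, assuming each backend is wrapped in its own handler. The `serve` function and both handler arguments are hypothetical names for illustration, not anything from the article.

```python
def serve(request_kind, relational_handler, vector_handler):
    """Dispatch to one backend; a failure there does not touch the other."""
    handlers = {"relational": relational_handler, "vector": vector_handler}
    try:
        return {"ok": True, "data": handlers[request_kind]()}
    except ConnectionError:
        # Only this kind of request is degraded; the other path keeps working.
        return {"ok": False, "data": None, "degraded": request_kind}
```

With the vector database down, vector requests come back marked as degraded while relational requests are still served normally.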
