Hacker News

hunterpayne, yesterday at 9:17 PM

So it's a longish article, and a point-by-point response is probably too much for a single post. But several of the points are solved by just standing up a dedicated Postgres instance for the vector use cases instead of doing this inside an existing instance.

Most of the rest of his complaints come down to "this is complex stuff." True, but pg_vector isn't a solution; it's a tool used in making a solution. So when using pg_vector directly, you probably need to understand databases to a more significant degree than you would with a custom solution that won't work for you the moment your requirements change. You surely need to understand databases more than the author does. He doesn't point to a single thing that pg_vector doesn't do or doesn't do well. He just complains that it's hard to do.

In summary, pg_vector is a toolkit for building vector-based functionality, not a custom solution for a specific use case. What is best for you comes down to your team's skills and expertise with databases and whether your specific requirements will change. Choose poorly and it could go very badly.


Replies

samus, today at 8:33 AM

> He doesn't point to a single thing that pg_vector doesn't do or doesn't do well. He just complains that it's hard to do.

He very clearly complains that IVFFlat indexes have to be periodically rebuilt, that HNSW has high overhead (both during inserts and rebuilds), and that the query planner is not particularly good at optimizing queries involving these kinds of indexes. None of this is a problem if the dataset is puny enough, but it is deadly if you want to scale up without investing significant engineering effort.
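
To make the IVFFlat rebuild point concrete, here is a toy sketch in plain Python. It is not pgvector's code, and the function names are made up for illustration. It assumes the general IVF design: centroids are fixed when the index is built, and rows inserted later are simply assigned to the nearest existing centroid. The vectors are 1-D, and the "training" step is an equal-sized-chunk split standing in for k-means, so the result is deterministic.

```python
# Toy 1-D stand-in for an IVF-style index. Hypothetical helper names;
# this only illustrates why stale centroids lead to skewed lists.

def train_centroids(values, k):
    """Return k centroids: the means of k equal-sized chunks of the sorted data.

    Assumption: a deterministic stand-in for the k-means step a real index
    would run at build time. Expects len(values) to be divisible by k.
    """
    ordered = sorted(values)
    size = len(ordered) // k
    chunks = [ordered[i * size:(i + 1) * size] for i in range(k)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def list_sizes(centroids, values):
    """Assign each value to its nearest centroid and count rows per list."""
    sizes = [0] * len(centroids)
    for v in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        sizes[nearest] += 1
    return sizes


# Rows present when the index was built, and rows inserted afterwards
# from a different region of the space (distribution drift).
old = [i / 100 for i in range(100)]      # 0.00 .. 0.99
new = [5 + i / 100 for i in range(100)]  # 5.00 .. 5.99

# Index built on the old rows only; the new rows reuse the stale centroids.
stale_centroids = train_centroids(old, 4)
stale_sizes = list_sizes(stale_centroids, old + new)

# Same rows after a rebuild that retrains centroids on all the data.
rebuilt_centroids = train_centroids(old + new, 4)
rebuilt_sizes = list_sizes(rebuilt_centroids, old + new)

print(stale_sizes)    # [25, 25, 25, 125]
print(rebuilt_sizes)  # [50, 50, 50, 50]
```

With stale centroids every new row piles into one list, so a query that probes that list scans far more rows than intended, while the other lists do nothing for queries in the new region. Retraining restores the balance, which is roughly what a periodic rebuild buys you.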