Hacker News

cpursley yesterday at 11:48 PM

It's common to do a hybrid of BM25 with other fuzzy search or pgvector.
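For anyone wondering what such a hybrid can look like in practice: one common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). This is a minimal sketch, not necessarily what any particular stack does. The function name, the document ids, and the constant k=60 are assumptions chosen for illustration; the two input lists stand in for results you would get from a BM25 query and a pgvector nearest-neighbour query.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists, best match first.
bm25_hits = ["doc3", "doc1", "doc7"]    # e.g. from a BM25 keyword query
vector_hits = ["doc1", "doc9", "doc3"]  # e.g. from a pgvector similarity query

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

RRF only looks at ranks, so the BM25 scores and the vector distances never have to be put on a common scale, which is the main reason it is a convenient default for this kind of hybrid.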


Replies

storus yesterday at 11:56 PM

BM25 is quite bad and needs to be retrained for each new corpus. SPLADEv2 is much better, and there are even better sparse embeddings these days.
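To make the corpus dependence concrete, here is a rough sketch of the standard Okapi BM25 formula. BM25 has no learned weights, but its IDF values and average document length are computed from the corpus, so they have to be recomputed whenever the corpus changes. The helper name, the toy corpora, and the parameter values k1=1.5 and b=0.75 are assumptions for illustration, and the tokenizer is just a whitespace split.

```python
import math
from collections import Counter


def bm25_scorer(corpus, k1=1.5, b=0.75):
    """Build BM25 scoring functions from the statistics of one corpus."""
    docs = [doc.lower().split() for doc in corpus]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))

    def idf(term):
        # Smoothed IDF; it depends entirely on this corpus.
        return math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)

    def score(query, index):
        tf = Counter(docs[index])
        dl = len(docs[index])
        total = 0.0
        for term in query.lower().split():
            f = tf[term]
            norm = f + k1 * (1 - b + b * dl / avgdl)
            total += idf(term) * f * (k1 + 1) / norm
        return total

    return score, idf


# Two hypothetical corpora that share the same first document.
corpus_a = ["postgres full text search",
            "vector search with embeddings",
            "sparse retrieval models"]
corpus_b = ["postgres full text search",
            "postgres vector extension",
            "postgres index tuning"]

score_a, idf_a = bm25_scorer(corpus_a)
score_b, idf_b = bm25_scorer(corpus_b)

# Same document, same query, different corpus statistics.
print(score_a("postgres", 0), score_b("postgres", 0))
```

The same document scores lower for "postgres" in the second corpus because the term appears in every document there, so its IDF is much smaller.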