logoalt Hacker News

dmezzettiyesterday at 1:52 PM0 repliesview on HN

Excellent article on BM25!

Author of txtai [1] here. txtai implements a performant BM25 index in Python [2] via the arrays package and storing the term frequency vectors in SQLite.

With txtai, the hybrid index approach [3] supports both convex combination when BM25 scores are normalized and reciprocal rank fusion (RRF) when they aren't [4].

[1] https://github.com/neuml/txtai

[2] https://neuml.hashnode.dev/building-an-efficient-sparse-keyw...

[3] https://neuml.hashnode.dev/benefits-of-hybrid-search

[4] https://github.com/neuml/txtai/blob/master/src/python/txtai/...