Don't use a vector database for code, embeddings are slow and bad for code. Code likes bm25+tri...

CuriouslyC • 01/15/2026 • 7 replies

Don't use a vector database for code; embeddings are slow and bad for code. Code likes BM25 + trigram, which gets better results while keeping search responses snappy.
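For readers who want something concrete, here is a minimal sketch of one possible reading of "BM25 + trigram": standard BM25 scoring applied to character trigrams instead of words. The class name `TrigramBM25`, the `k1`/`b` defaults, and the choice to score trigrams directly are all assumptions made for illustration; other setups keep a word-level BM25 index and a separate trigram index.

```python
import math
from collections import Counter


def trigrams(text):
    """Return the lowercased, overlapping character trigrams of text."""
    t = text.lower()
    return [t[i:i + 3] for i in range(len(t) - 2)]


class TrigramBM25:
    """Hypothetical in-memory index: BM25 over character trigrams."""

    def __init__(self, docs, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.n = len(docs)
        # Term frequencies per document, where a "term" is a trigram.
        self.tfs = [Counter(trigrams(d)) for d in docs]
        self.lens = [sum(tf.values()) for tf in self.tfs]
        # Fall back to 1.0 so very short corpora do not divide by zero.
        self.avg_len = (sum(self.lens) / max(self.n, 1)) or 1.0
        # Document frequency: in how many documents each trigram occurs.
        self.df = Counter(g for tf in self.tfs for g in tf)

    def idf(self, gram):
        df = self.df.get(gram, 0)
        return math.log(1 + (self.n - df + 0.5) / (df + 0.5))

    def score(self, query, i):
        tf, length = self.tfs[i], self.lens[i]
        total = 0.0
        for g in set(trigrams(query)):
            f = tf.get(g, 0)
            if f == 0:
                continue
            norm = f + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf(g) * f * (self.k1 + 1) / norm
        return total

    def search(self, query, top_k=5):
        """Return (score, doc_index) pairs, best first, skipping zero scores."""
        scored = [(self.score(query, i), i) for i in range(self.n)]
        scored = [(s, i) for s, i in scored if s > 0]
        return sorted(scored, key=lambda x: -x[0])[:top_k]
```

Because matching happens on trigrams, a partial identifier such as `parse_conf` can still hit `parse_config` without any stemming or tokenizer tuning, which may be part of why this combination suits code.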

Replies

jankovicsandras • 01/15/2026

You can do hybrid search in Postgres.

Shameless plug: https://github.com/jankovicsandras/plpgsql_bm25 is BM25 search implemented in PL/pgSQL (Unlicense / public domain).

The repo also includes plpgsql_bm25rrf.sql, a PL/pgSQL function for hybrid search (plpgsql_bm25 + pgvector) with Reciprocal Rank Fusion, plus Jupyter notebook examples.
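For anyone unfamiliar with Reciprocal Rank Fusion, the idea can be sketched in a few lines of Python. This is a generic illustration, not the code from plpgsql_bm25rrf.sql: the function name `reciprocal_rank_fusion` is hypothetical, and `k=60` is just the commonly used default for the smoothing constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    Each document gets 1 / (k + rank) from every list it appears in
    (rank starts at 1), and the contributions are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))


bm25_hits = ["a", "b", "c"]    # hypothetical keyword ranking
vector_hits = ["b", "d", "a"]  # hypothetical embedding ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Only ranks are used, never the raw scores, so the BM25 and vector scores do not need to be on comparable scales.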

postalcoder • 01/15/2026

I agree. Someone here posted a drop-in for grep that added the ability to do hybrid text/vector search but the constant need to re-index files was annoying and a drag. Moreover, vector search can add a ton of noise if the model isn't meant for code search and if you're not using a re-ranker.

For all intents and purposes, running gpt-oss 20B in a while loop with access to ripgrep works pretty dang well. gpt-oss is a tool-calling god compared to everything else I've tried, and it's fast.

rao-v • 01/15/2026

Does anybody know of a good service or Docker image that will do BM25 + vector lookup without spinning up half a dozen microservices?

Der_Einzige • 01/15/2026

This is true in general with LLMs, not just for code. LLMs can be told that their RAG tool is using BM25 + n-grams, and they will search accordingly. Keyword search is superior to embedding-based search. The moment Google switched to BERT-based embeddings for search, everyone agreed it was going downhill. Most forms of early enshittification were simply switching from BM25 to embedding-based search.

BM25/TF-IDF and n-grams have always been extremely difficult-to-beat baselines in information retrieval. This is why embeddings still have not led to a "ChatGPT" moment in information retrieval.

lee1012 • 01/15/2026

I'm finding static embedding models quite fast. lee101/gobed (https://github.com/lee101/gobed) is 1ms on GPU :) It would need to be trained for code, though. The bigger code LLM embeddings can be high quality too, so it's really just about where the ideal point on the Pareto frontier is. Often, though, you're right: it tends to be BM25 or rg, even for code. But more complex solutions are possible too if it's really important that the search is high quality.

ehsanu1 • 01/15/2026

I've gotten great results applying it to file paths + signatures. Even better if you also fuse those results with BM25.
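As a rough illustration of what "file paths + signatures" documents could look like, here is a minimal sketch that assumes the code base is Python and that one short string per definition is what gets indexed. The helper name `signature_lines` and the output format are hypothetical; the comment above does not say how the signatures were extracted or which language was involved.

```python
import ast


def signature_lines(path, source):
    """Return one 'path: signature' string per function or class in source.

    These short strings, rather than whole file bodies, would then be fed
    to whichever index is in use (an embedding model, BM25, or both).
    """
    out = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.unparse (Python 3.9+) renders the argument list as source.
            out.append(f"{path}: def {node.name}({ast.unparse(node.args)})")
        elif isinstance(node, ast.ClassDef):
            out.append(f"{path}: class {node.name}")
    return out


example_source = (
    "class Loader:\n"
    "    def load(self, path, strict=False):\n"
    "        pass\n"
    "\n"
    "def main():\n"
    "    pass\n"
)
lines = signature_lines("pkg/loader.py", example_source)
```

Keeping each indexed string to a path plus a signature keeps the documents short and dense with identifiers, which is plausibly why this works better than indexing raw file contents.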

itake • 01/15/2026

With AI needing more access to documentation, WDYT about using RAG for documentation retrieval?
