My advice for building something like this: don't get hung up on a need for vector databases an...

simonw • today at 5:50 PM • 8 replies • view on HN

My advice for building something like this: don't get hung up on a need for vector databases and embedding.

Full text search or even grep/rg are a lot faster and cheaper to work with - no need to maintain a vector database index - and turn out to work really well if you put them in some kind of agentic tool loop.

The big benefit of semantic search was that it could handle fuzzy searching - returning results that mention dogs if someone searches for canines, for example.

Give a good LLM a search tool and it can come up with searches like "dog OR canine" on its own - and refine those queries over multiple rounds of searches.

Plus it means you don't have to solve the chunking problem!

Replies

cwmoore • today at 9:27 PM

I recently came across a “prefer the most common synonym” problem, in Google Maps, while searching for a poolhall—even literally ‘billiards’ returned results for swimming pools and chlorine. I wonder if some more NOTs aren’t necessary…interested in learning about RAGs though I’m a little behind the curve.

mips_avatar • today at 7:29 PM

In my app the best lexical search approaches completely broke my agent. For my rag system the llm would on average take 2.1 lexical searches to get the results it needed. Which wasn’t terrible but it meant sometimes it needed up to 5 searches to find it which blew up user latency. Now that I have a hybrid semantic search + lexical search it only requires 1.1 searches per result.

froobius • today at 6:52 PM

Hmm it can capture more than just single words though, e.g. meaningful phrases or paragraphs that could be written in many ways.

leetrout • today at 6:15 PM

Simon have you ever given a talk or written about this sort of pragmatism? A spin on how to achieve this with Datasette is an easy thing to imagine IMO.

➕ show 1 reply

tra3 • today at 6:26 PM

I built a simple emacs package based on this idea [0]. It works surprisingly well, but I dont know how far it scales. It's likely not as frugal from a token usage perspective.

0: https://github.com/dmitrym0/dm-gptel-simple-org-memory

pstuart • today at 9:09 PM

Perhaps SQLite with FTS5? Or even better, getting DuckDB into the party as it's ecosystem seems ripe for this type of work.

enraged_camel • today at 6:34 PM

Yes, exactly. We have our AI feature configured to use our pre-existing TypeSense integration and it's stunningly competent at figuring out exactly what search queries to use across which collections in order to find relevant results.

➕ show 1 reply

alt Hacker News

Replies