Someone simply assumed at some point that RAG must be based on vector search, and everyone followed.
I don't think this was a simple assumption. LLMs used to be much dumber! GPT-3-era LLMs were not good at grep, they were not good at recovering from errors, and they were not good at making follow-up queries over multiple turns of search. Multiple breakthroughs in code generation, tool use, and reasoning had to happen on the model side before vector-based RAG started to look like unnecessary complexity.
Certainly a lot of blog posts followed. Not sure that “everyone” was so blinkered.
It was the terminology that did that more than anything. The term 'RAG' just has a lot of consequential baggage. Unfortunately.
Doesn't have to be tho, I've had great success letting an agent loose on an Apache Lucene instance. Turns out LLMs are great at building queries.
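To make that concrete: most of the glue you need is on the tool side, not the model side. Here's a minimal sketch (assuming the agent emits structured search terms and your tool just assembles the Lucene query string before sending it to the backend; `build_query`, `escape_term`, and the `body` field name are all hypothetical, not part of any real API):

```python
import re

def escape_term(term: str) -> str:
    """Backslash-escape Lucene query-syntax characters in an agent-supplied term."""
    return re.sub(r'([+\-&|!(){}\[\]^"~*?:\\/])', r'\\\1', term)

def build_query(must: list[str], should: list[str] = (), field: str = "body") -> str:
    """Assemble a boolean Lucene query string: required terms get a leading '+'."""
    parts = [f'+{field}:{escape_term(t)}' for t in must]
    parts += [f'{field}:{escape_term(t)}' for t in should]
    return " ".join(parts)
```

The point is that the LLM only has to produce terms and boolean intent; the escaping and query assembly is a dozen lines of deterministic code, which is why this works so much better now than in the GPT-3 era.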
It’s something of a historical accident
We started with LLMs when everyone in search was building question-answering systems. Those architectures look like the vector DB + chunking pipeline we now associate with RAG.
Agents' ability to call tools, using any retrieval backend, calls that into question.
We really shouldn’t start RAG with the assumption we need that. I’ll be speaking about the subject in a few weeks
https://maven.com/p/7105dc/rag-is-the-what-agentic-search-is...