The real thing I think people are rediscovering with file-system-based search is that there’s a type of semantic search that’s not embedding-based retrieval. One that looks more like how a librarian organizes files into shelves based on the domain.
We’re rediscovering forms of search we’ve known about for decades. And it turns out they’re more interpretable to agents.
https://softwaredoug.com/blog/2026/01/08/semantic-search-wit...
Similar effort with PageIndex [1], which basically builds a table-of-contents-like tree. An LLM then traverses the tree to figure out which chunks are relevant to the context in the prompt.
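To make the tree-traversal idea concrete, here is a rough sketch in Python. It is not PageIndex's actual API: `Node`, `is_relevant`, and `traverse` are hypothetical names, and the keyword-overlap check is only a stand-in for the point where a real system would ask an LLM whether a section title looks relevant.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a table-of-contents-like tree (hypothetical structure)."""
    title: str
    text: str = ""                      # chunk text, only set on leaf nodes
    children: list = field(default_factory=list)


def is_relevant(query: str, title: str) -> bool:
    # Assumption: stand-in for an LLM judgment. Here we just check whether
    # the query and the section title share at least one word.
    return bool(set(query.lower().split()) & set(title.lower().split()))


def traverse(node: Node, query: str) -> list:
    """Walk down the tree, descending only into sections judged relevant."""
    if not node.children:
        return [node.text] if node.text else []
    chunks = []
    for child in node.children:
        if is_relevant(query, child.title):
            chunks.extend(traverse(child, query))
    return chunks


toc = Node("Manual", children=[
    Node("Search index", children=[
        Node("Inverted index", text="Inverted indexes map terms to documents."),
        Node("Vector embeddings", text="Vectors are stored in an ANN structure."),
    ]),
    Node("Billing", children=[
        Node("Invoices", text="Invoices are issued monthly."),
    ]),
])

relevant_chunks = traverse(toc, "inverted index")
```

The appeal is that every step is inspectable: you can see which section titles were considered and which branches were pruned, which is much harder to do with nearest-neighbor lookups over embeddings.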
This kind of circles back to ontological NLP, which used knowledge representation as a primitive for language processing. There is _a ton_ of work in that direction.
Lovely blog post, first thing I've read in a while that feels like it was written by a human.
What do you think about knowledge graphs for RAG?
I think it's cool that LLMs can effectively do this kind of categorization on the fly at relatively large scale. When you give the LLM tools beyond just "search", it really is effectively cheating.
Inverted indexes have the major advantage of supporting Boolean operators.
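For anyone who hasn't seen why that falls out so naturally, a minimal sketch in Python: each term maps to the set of documents containing it, so AND, OR, and NOT become plain set operations. The toy documents and the `build_index` helper are made up for illustration, and the whitespace tokenizer is an assumption (a real engine would do proper analysis and use sorted postings lists rather than Python sets).

```python
from collections import defaultdict


def build_index(docs: dict) -> dict:
    """Map each lowercased term to the set of doc ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():   # naive whitespace tokenizer
            index[term].add(doc_id)
    return index


# Toy corpus, invented for this example.
docs = {
    1: "vector search for rag",
    2: "inverted index with boolean operators",
    3: "boolean search over a vector index",
}
index = build_index(docs)

# boolean AND index
both = index["boolean"] & index["index"]

# vector OR inverted
either = index["vector"] | index["inverted"]

# boolean AND NOT vector
without = index["boolean"] - index["vector"]
```

Here `both` is `{2, 3}`, `either` is `{1, 2, 3}`, and `without` is `{2}`. Getting the same exact-match guarantees out of a pure vector index is not straightforward, since similarity scores don't compose this way.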
Turns out the millions of people in knowledge work aren't librarians, and they wing shit everywhere.
Someone simply assumed at some point that RAG must be based on vector search, and everyone followed.