RAG seems odd when you can just have a coding agent manage memory by managing folders. Multi-agent also feels weird when you have subagents.
I've been leaning towards multi-agent because the subagent approach relies on the main agent having all the power and using it responsibly.
Totally useless indeed.
Interesting.
I guess RAG is faster? But I'm realizing I'm outdated now.
Yeah, vector-embedding-based RAG has fallen out of fashion somewhat.
It was great when LLMs had 4,000 or 8,000 token context windows and the biggest challenge was efficiently figuring out the most likely chunks of text to feed into that window to answer a question.
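To make that chunk-picking problem concrete, here's a rough sketch of selecting chunks under a small token budget. Everything in it is an assumption for illustration: `score` uses plain word overlap as a stand-in for embedding similarity, tokens are approximated by whitespace-separated words, and `select_chunks` is a made-up helper name, not anything from a real RAG library.

```python
def score(question: str, chunk: str) -> int:
    """Stand-in relevance score: distinct words shared by question and chunk.
    A real RAG setup would compare embedding vectors here instead."""
    question_words = set(question.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(question_words & chunk_words)


def select_chunks(question: str, chunks: list[str], budget_tokens: int) -> list[str]:
    """Greedily keep the highest-scoring chunks that still fit in the budget."""
    ranked = sorted(chunks, key=lambda ch: score(question, ch), reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        cost = len(chunk.split())  # crude token estimate (assumption)
        if used + cost <= budget_tokens:
            picked.append(chunk)
            used += cost
    return picked


CHUNKS = [
    "the dog barked at the mailman",
    "cats sleep all day",
    "a dog needs daily walks",
]
QUESTION = "how often should a dog walk"

small_window = select_chunks(QUESTION, CHUNKS, budget_tokens=8)
larger_window = select_chunks(QUESTION, CHUNKS, budget_tokens=11)
```

With the tiny budget only the best-matching chunk fits; loosen the budget and more chunks get in, which is roughly why the selection step matters less as windows grow.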
These days LLMs all have 100,000+ token context windows, which means you don't have to be nearly as selective. They're also exceptionally good at running search tools - give them grep or rg or even `select * from t where body like ...` and they'll almost certainly be able to find the information they need after a few loops.
Vector embeddings give you fuzzy search, so "dog" also matches "puppy" - but a good LLM with a search tool will search for "dog" and then try a second search for "puppy" if the first one doesn't return the results it needs.
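Here's a minimal sketch of that "search, then try another term" loop. It's only an illustration under assumptions: `grep` is a tiny in-memory stand-in for the real grep/rg tools, `search_with_retries` is a hypothetical helper, and the list of queries is hard-coded where a real agent would have the LLM come up with the next term itself.

```python
import re


def grep(pattern: str, docs: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (doc name, line number, line) for each case-insensitive match."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for name, text in docs.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append((name, lineno, line))
    return hits


def search_with_retries(queries: list[str], docs: dict[str, str], min_hits: int = 1):
    """Try each query in order and stop at the first one with enough hits."""
    for query in queries:
        hits = grep(re.escape(query), docs)  # treat the query as literal text
        if len(hits) >= min_hits:
            return query, hits
    return None, []


DOCS = {
    "pets.txt": "Our puppy chewed the sofa.\nThe cat ignored everything.",
    "notes.txt": "Remember to buy cat food.",
}

# "dog" finds nothing, so the loop falls through to "puppy".
query, hits = search_with_retries(["dog", "puppy"], DOCS)
```

The fuzzy matching that embeddings would give you is replaced by a couple of cheap extra searches, which is the trade-off described above.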