eh i'd push back on "just RAG". like yes the retrieval-generation loop is RAG shaped, no one's arguing that. but the interesting bit here is the write loop - the LLM is authoring and maintaining the wiki itself, building backlinks, filing its own outputs back in. that's not retrieval, that's knowledge synthesis. in vanilla RAG your corpus is static, here it isn't
also the linting pass is doing something genuinely different - auditing inconsistencies, imputing missing data, suggesting connections. that's closer to an assistant maintaining a zettelkasten than a search engine returning top-k chunks
cool project btw will check it out
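To make the linting idea above a bit more concrete, here is a rough sketch of the purely mechanical part of such a pass: finding links that point at missing pages and pages nothing links to. Everything here is an assumption for illustration: the `[[page]]` link syntax, the `PAGES` dict, and the `lint` function are hypothetical and not taken from the project, and the actual pass (auditing inconsistencies, imputing data) would need an LLM on top of something like this.

```python
import re

# Hypothetical wiki: page name -> body, using [[page]] style links.
# The real project's storage and link format may differ.
PAGES = {
    "rag": "See [[retrieval]] and [[zettelkasten]].",
    "retrieval": "Part of [[rag]].",
    "zettelkasten": "Linked notes.",
    "orphan": "Nothing links here.",
}

LINK = re.compile(r"\[\[([^\]]+)\]\]")

def lint(pages):
    """Report links to missing pages and pages with no inbound links."""
    # Outbound links per page.
    links = {name: set(LINK.findall(body)) for name, body in pages.items()}
    # (source, target) pairs where the target page does not exist.
    dangling = sorted(
        (src, dst)
        for src, dsts in links.items()
        for dst in dsts
        if dst not in pages
    )
    # Pages that no other page (or itself) links to.
    linked_to = set().union(*links.values())
    orphans = sorted(name for name in pages if name not in linked_to)
    return {"dangling": dangling, "orphans": orphans}

print(lint(PAGES))
```

A check like this is linear in the number of pages, which is part of why the structural side of linting is cheap compared to semantic consistency checks.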
I'm curious how this linting step scales with larger wikis. Looking for an inconsistency across N files requires on the order of N*N pairwise comparisons, and that's assuming each file contains a single idea.
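For a sense of scale, here is a quick back-of-the-envelope sketch of the pairwise count. It assumes every file is compared against every other file exactly once, which gives N*(N-1)/2 unordered pairs (still quadratic); `pair_count` is just an illustrative helper name.

```python
from itertools import combinations

def pair_count(n):
    """Number of unordered file pairs when each file is checked against every other."""
    return n * (n - 1) // 2

# Sanity check against an explicit enumeration of pairs.
assert len(list(combinations(range(10), 2))) == pair_count(10)

for n in (10, 100, 1000):
    print(n, pair_count(n))
# 10 45
# 100 4950
# 1000 499500
```

So a 1,000-page wiki is already about half a million pair checks if nothing narrows down which pages get compared.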
This is just persistent-memory RAG. I have had a setup like this since about a day after I started using Copilot, except it's an MCP server that uses sqlite-vec and has recall endpoints to contextually load the proper data instead of a bunch of extra files polluting context.
OP's example isn't something new or incredibly thoughtful at all - in fact this pattern gets "discovered" every other day here, on Reddit, or on social media in general by people who don't have the foresight to just look around and see what other people are doing.
I agree with you, the linting pass seems valuable and it's something I'm thinking about adding - it's a great idea.
What I'm pushing back on specifically is the insistence that the core loop - retrieving the most relevant pieces of knowledge for wiki synthesis - is not RAG. In order for the LLM to do a good job at this, it needs some way to retrieve the most relevant info. Whether that's via vector DB queries or a structured index/filesystem approach, that fundamental problem - retrieving the best data for the LLM's context - is RAG. It's a problem that has been studied and evaluated for years now.
thanks for checking it out
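To illustrate the point above that the retrieval step is the same problem regardless of backend, here is a toy sketch with two interchangeable retrievers feeding one prompt builder. All of it is hypothetical: the `WIKI` contents, the function names, and the bag-of-words "vectors" are stand-ins, not the project's code or a real embedding model.

```python
import math
from collections import Counter

# Hypothetical toy wiki: page name -> text.
WIKI = {
    "rag.md": "retrieval augmented generation loads relevant chunks into context",
    "zettelkasten.md": "notes linked by backlinks form a zettelkasten",
    "sqlite.md": "sqlite stores vectors for similarity search",
}

def tokens(text):
    return text.lower().split()

def retrieve_by_index(query, k=2):
    """Structured-index style: rank pages by number of shared query words."""
    q = set(tokens(query))
    scored = [(len(q & set(tokens(body))), name) for name, body in WIKI.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [name for score, name in ranked if score > 0][:k]

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve_by_vector(query, k=2):
    """Vector style: rank pages by cosine similarity of word counts."""
    qv = Counter(tokens(query))
    scored = [(cosine(qv, Counter(tokens(body))), name) for name, body in WIKI.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [name for score, name in ranked if score > 0][:k]

def build_prompt(query, retriever):
    """Same generation-side step no matter which retriever picked the pages."""
    context = "\n".join(f"[{p}] {WIKI[p]}" for p in retriever(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("backlinks zettelkasten", retrieve_by_index))
print(build_prompt("backlinks zettelkasten", retrieve_by_vector))
```

Both calls end up putting the same page into the context here; the backend changes how pages are ranked, not what the step is for.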