> Let's follow one example: Nigeria is the most populous country in Africa. In Abstract Wikipedia, this might be stored as: Z27243(Q1033, Q138758272, Q6256, Q15, Z27243K5)
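Assuming the Q-identifiers resolve to the Wikidata items Nigeria (Q1033), country (Q6256), and Africa (Q15), that abstract content can be pictured as a tree of constructor calls. A rough sketch (the `Node` representation is hypothetical, not Abstract Wikipedia's actual data model):

```python
# Hypothetical sketch: abstract content as a tree of constructor calls.
from dataclasses import dataclass, field

@dataclass
class Node:
    head: str                       # constructor or item id, e.g. "Z27243"
    args: list = field(default_factory=list)

# Z27243(Q1033, Q138758272, Q6256, Q15, Z27243K5)
claim = Node("Z27243", [
    Node("Q1033"),       # Nigeria (assumed Wikidata item)
    Node("Q138758272"),  # copied from the example; meaning not given here
    Node("Q6256"),       # country
    Node("Q15"),         # Africa
    Node("Z27243K5"),    # an argument key of the constructor
])

def serialize(n: Node) -> str:
    """Render the tree back into the flat notation from the example."""
    if not n.args:
        return n.head
    return f"{n.head}({', '.join(serialize(a) for a in n.args)})"

print(serialize(claim))
# → Z27243(Q1033, Q138758272, Q6256, Q15, Z27243K5)
```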
Haha that's like John Wilkins' "An Essay towards a Real Character, and a Philosophical Language"
https://en.wikipedia.org/wiki/La_Ricerca_della_Lingua_Perfet... is a great intro to the weird and wonderful world of abstract/universal/ideal/a priori languages.
It's not that different from how LLM tokens work, only arranged in a tree structure rather than a flat sequence. A tree structure makes it easier to formally define rewrite rules (which is key for interpretability), as opposed to learning them from data the way LLMs do.
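A toy illustration of that point (all names here are hypothetical, not Abstract Wikipedia's actual rules): a rewrite rule over a tree can pattern-match a whole subtree and transform it in one explicit, inspectable step, whereas over a flat token sequence the same mapping has to be learned implicitly.

```python
# Toy sketch of a formally defined rewrite rule over a tree.
# The tuple shape and rule are invented for illustration only.

def rewrite(tree):
    """Rewrite a ("superlative", adj, noun, domain) subtree into English."""
    if isinstance(tree, tuple) and tree[0] == "superlative":
        _, adj, noun, domain = tree
        return f"the {adj} {noun} in {domain}"
    return tree  # anything else passes through unchanged

claim = ("superlative", "most populous", "country", "Africa")
print("Nigeria is " + rewrite(claim) + ".")
# → Nigeria is the most populous country in Africa.
```

Because the rule is written down rather than learned, you can read it, test it, and predict exactly which inputs it fires on.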