And they are full of wiki markup, templates, and inconsistent formatting. A human reader can easily understand them, but automated parsing was effectively impossible before LLMs.
https://kaikki.org/index.html