Hacker News

esafak 06/16/2025

First I've heard of it. https://en.wikipedia.org/wiki/Text_Encoding_Initiative


Replies

mgr86 06/16/2025

Understandable. I work in academic publishing, and while the "XML everywhere" crowd is graying, retiring, or even dying :( XML remains an excellent option for document markup. Additionally, a lot of government data produced in the US and EU makes heavy use of XML technologies. I imagine those agencies could be interested consumers of Nanonets-OCR. TEI could be a good choice as well, since well-tested, mature conversions exist from it to other popular, less structured formats.
