quinndupont, today at 1:31 PM

Very helpful analysis that confirms everything I’ve encountered. OCR remains a thorny issue. The author talks about professional workflows struggling with tables and such, but I’ve found it challenging to get clean copies of long documents (books). The hybrid workflow (layout then OCR) sounds promising.