Xotic007 • today at 5:46 PM • 1 reply

Cool, but it's relying on every extractor honoring that replacement-text property, which you said yourself is hit or miss. So it's clean markdown until someone runs it through a tool that ignores the property, quietly gets the messy version, and has no idea that happened.

Replies

SarthakGaud • today at 6:16 PM

From my trials, it fails with OCR but works with popular libs like PyPDF2, etc.
