Hacker News

RodgerTheGreat last Friday at 10:59 PM

It's definitely far easier to emit a controlled, useful subset of PDF than it is to parse PDF documents. I wrote a small PDF library for the Decker ecosystem that just focuses on bitmaps and page layout; roughly 4 KB and 135 LoC.

docs/demos: https://beyondloom.com/decker/pdf.html

browsable source: https://github.com/JohnEarnest/Decker/blob/main/examples/dec...
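
To illustrate why emitting a fixed subset is tractable, here is a minimal sketch in Python. It is not the Decker library's code and does not reproduce its API; the function name `build_pdf`, the single-page layout, the US Letter page size, and the built-in Helvetica font are all assumptions chosen for the example. It writes five objects, records each one's byte offset, and then emits the cross-reference table and trailer.

```python
def build_pdf(text: str = "Hello, PDF") -> bytes:
    """Emit a one-page PDF using only a tiny, fixed subset of the format.

    Hypothetical helper for illustration; assumes `text` is Latin-1 encodable.
    """
    # Escape the three characters that are special inside a PDF literal string.
    safe = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    content = f"BT /F1 24 Tf 72 720 Td ({safe}) Tj ET".encode("latin-1")

    # Object bodies, numbered 1..n in list order.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # mandatory head of the free-object list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # every xref entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Writing the result to disk with `open("hello.pdf", "wb").write(build_pdf())` should give a file that common viewers open. The writer only ever needs to know the offsets of objects it produced itself, whereas a parser has to cope with every variation other producers emit.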


Replies

kuschkufan yesterday at 10:20 AM

This Decker stuff is pretty nifty too.

user3939382 yesterday at 4:48 PM

I’m working on one right now. It takes arbitrary PDFs and builds composable, dynamic pandoc pipelines whose output matches the source byte for byte. It’s very, very complex. But if I can get it finished it will fuck over Adobe, so worth it.