RodgerTheGreat • last Friday at 10:59 PM • 2 replies • view on HN

It's definitely far easier to emit a controlled, useful subset of PDF than it is to parse PDF documents. I wrote a small PDF library for the Decker ecosystem that just focuses on bitmaps and page layout; roughly 4kb and 135 LoC.

docs/demos: https://beyondloom.com/decker/pdf.html

browsable source: https://github.com/JohnEarnest/Decker/blob/main/examples/dec...
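
To illustrate why emitting a fixed subset is tractable, here is a minimal sketch in Python. It is not the Decker library's code; the function name `build_pdf` and its parameters are made up for this example. It assumes an uncompressed 8-bit grayscale bitmap and a single page, and writes only five objects, a cross-reference table, and a trailer.

```python
def build_pdf(pixels, width, height, scale=8):
    """Return the bytes of a one-page PDF showing an 8-bit grayscale bitmap.

    Hypothetical helper for illustration: `pixels` is a row-major sequence
    of width*height values in 0..255, and `scale` is points per pixel.
    """
    assert len(pixels) == width * height
    w, h = width * scale, height * scale  # page size in points

    # Content stream: images occupy the unit square, so scale it to the page.
    content = b"q %d 0 0 %d 0 0 cm /Im0 Do Q" % (w, h)

    # Object bodies, numbered 1..5 in order.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 %d %d] "
        b"/Contents 4 0 R /Resources << /XObject << /Im0 5 0 R >> >> >>" % (w, h),
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(content), content),
        b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
        b"/ColorSpace /DeviceGray /BitsPerComponent 8 /Length %d >>\n"
        b"stream\n%s\nendstream" % (width, height, len(pixels), bytes(pixels)),
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object, for the xref table
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)

    # Cross-reference table: each entry must be exactly 20 bytes long.
    xref_at = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1,
        xref_at,
    )
    return bytes(out)
```

Calling `build_pdf([0, 255, 255, 0], 2, 2)` and writing the result to a file opened in binary mode should give a tiny checkerboard page. A parser, by contrast, has to cope with compressed object streams, incremental updates, encryption, and malformed cross-reference tables, none of which a writer ever needs to produce.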

Replies

kuschkufan • yesterday at 10:20 AM

This decker stuff is pretty nifty too

user3939382 • yesterday at 4:48 PM

I’m working on one right now. It takes arbitrary PDFs and builds composable, dynamic pandoc pipelines whose output matches the source byte for byte. It’s very, very complex. But if I can get it finished it will fuck over Adobe, so worth it.
