Hacker News

PaulKeeble last Saturday at 9:31 PM (4 replies)

About the worst job on any enterprise software project is the PDF output. Teams always end up needing it for emails or something else, and it's a never-ending list of bugs. Text formatting is a never-ending source of problems because it takes a lot of vague inputs and has to produce a relatively strict output. Far too many little details go wrong.


Replies

vbezhenar last Saturday at 11:19 PM

With PDF, my best approach was to go very low level. I've used the PDFKit and PDFBox libraries, and both provide a way to output vector operations directly. That lets you write extremely performant code. The resulting PDF is tiny and looks gorgeous (because it's vector), and you can implement anything. The code will be verbose, but it's worth it.

I even think it's viable to output PDF without any libraries. I've looked into the format a bit and it doesn't seem too complicated, at least for relatively dumb output.
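
For a rough idea of what library-free output can look like, here is a minimal Python sketch. It is not from the comment above: the function name `build_pdf`, the `SAMPLE_CONTENT` drawing, and the page size are all made up for illustration. It assumes a single page, an uncompressed content stream, and the built-in Helvetica font, which is about as "dumb" as output gets. The content stream is where the low-level vector operators (`re`, `f`, `Tj`, and so on) go.

```python
# Hypothetical sample drawing: a filled rectangle plus one line of text,
# written with raw PDF content-stream operators.
SAMPLE_CONTENT = b"\n".join([
    b"0.2 0.4 0.8 rg",                             # set fill colour (RGB)
    b"20 20 160 60 re f",                          # rectangle x y w h, then fill
    b"0 0 0 rg",                                   # back to black for the text
    b"BT /F1 14 Tf 20 120 Td (Hello, PDF) Tj ET",  # text in font F1 at 14 pt
])


def build_pdf(content: bytes, width: int = 200, height: int = 200) -> bytes:
    """Wrap a raw content stream in a single-page PDF (illustrative only)."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 %d %d] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>"
        % (width, height),
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(content), content),
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)

    # Cross-reference table: every entry must be exactly 20 bytes long.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Writing the result of `build_pdf(SAMPLE_CONTENT)` to a file opened in binary mode should give something a viewer can open. The fiddly part is the bookkeeping (byte offsets and stream lengths have to be exact), not the drawing operators. Anything beyond this, such as embedded fonts, compression, or multiple pages, adds more objects but follows the same pattern.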

mikeday yesterday at 1:34 AM

We've spent twenty years working on HTML to PDF conversion and I expect we could easily spend another twenty years, so feel free to give Prince a try if you would rather avoid the headache :)

kbbgl87 yesterday at 12:18 AM

Thinking about phantomjs and rasterize brings back nightmares

huflungdung last Saturday at 11:53 PM

[dead]