PDFs should be only for printing or maybe for keeping scanned versions of things. For anything else ...

Worf • today at 10:46 AM • 6 replies

PDFs should be only for printing or maybe for keeping scanned versions of things. For anything else they're just not the right tool for the job. Not for things meant to be accessed on a computer like books, scientific papers or, for some weird reason, catalogs and price lists from websites.

We have responsive and open standards like HTML and EPUB (zipped XHTML) and they work great. arXiv has HTML papers, and libgen and Anna's Archive often have EPUB versions of books. The issue for me with EPUB is the current lack of good readers.
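To make the "zipped XHTML" remark concrete, here is a minimal Python sketch that builds a tiny EPUB-style archive in memory and lists the XHTML files inside it. The file names (`OEBPS/chapter1.xhtml`, `OEBPS/content.opf`) and both helper functions are made up for illustration, and the package document is left out, so this shows the container layout rather than a complete, valid book.

```python
import io
import zipfile

# Container file that tells a reader where the package document lives.
# The OPF path is a hypothetical example; the OPF itself is omitted here.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# A single chapter written as ordinary XHTML.
CHAPTER_XHTML = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><p>Hello from inside a ZIP file.</p></body>
</html>
"""


def build_minimal_epub() -> bytes:
    """Return the bytes of a tiny EPUB-like ZIP archive (illustrative only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The `mimetype` entry goes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def xhtml_entries(epub_bytes: bytes) -> list[str]:
    """List the XHTML documents contained in an EPUB-style archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return [name for name in zf.namelist() if name.endswith(".xhtml")]


if __name__ == "__main__":
    for name in xhtml_entries(build_minimal_epub()):
        print(name)
```

Pointing `xhtml_entries` at the bytes of a real `.epub` file should work the same way, since the format is an ordinary ZIP archive with a few conventions on top.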

Replies

adrian_b • today at 12:25 PM

HTML and EPUB do not work great; they work very badly for scientific or technical papers and books.

No two readers render them alike, and they typically are much uglier and more difficult to use than books (sometimes even the same book) in PDF, DJVU or ODT formats.

I read a very large quantity of technical documentation and I always avoid EPUB and HTML like the plague. I use such formats only when there is no alternative.

On Linux, mupdf is a decent EPUB reader: it is very fast and usually does a better job of formatting pages than most other EPUB readers that I have tried on Linux.

For fast navigation and searching, especially in technical documentation with hundreds or thousands of pages, it is very useful for the document to be well partitioned into pages and the page layouts to be well designed, like for a printed book, even if this may seem unnecessary for a document stored in a computer.

HTML and EPUB documents are seldom divided into uniform pages, and the position of various elements, like tables or figures, can vary between readers or even with the same reader in different circumstances. So when you search for things you are slowed down in recognizing them, because they may not be in the position where you have seen them previously. Moreover, in HTML and EPUB documents, depending on the reader, the size ratios between various elements may be inappropriate, making the pages ugly and/or hard to understand.

All the defects of HTML and EPUB documents are caused by the fact that the writer of the document normally does not take full responsibility for the appearance of the pages, delegating this to the browsers/readers, which seldom do a good job for scientific/technical documents full of formulae, tables and figures.

This may be fine for normal Web pages, but it is not acceptable for technical and scientific documents.

In theory, one could carefully design HTML pages and the associated CSS files to be rendered deterministically, but I have very rarely encountered such documents.

mcdonje • today at 11:19 AM

EPUB is the ebook standard, outside of Amazon-land, so it has staying power in its space. I think it would be good for the ecosystem if it broke containment and got tooling in enough places to challenge PDF.

jkscm • today at 11:30 AM

Slightly disagree with this. A fixed page layout has its own advantages. The reason we have more high-quality PDF readers than EPUB readers is probably connected to the format itself. PDF readers are usually more feature-complete when it comes to stuff like annotations too.

gf000 • today at 11:10 AM

I don't know, I really love well-typeset books/papers. Especially when they feature figures that are deliberately placed close to the relevant section in the text; it's just not something we can replicate with HTML, which can barely do proper justified text.

Sure, I would like that beautifully designed page to magically become a beautiful single-column document on my phone, but I will take the former over a badly designed text extract where the relevant figure is 10 pages away.

EPUB (=HTML) is good for novels, but there is nothing replacing PDF for science papers. If anything, the LaTeX (or ideally Typst) source would come the closest, if properly written (no absolute offsets). That could be used to produce versions for different page sizes.

FailMore • today at 10:52 AM

Interesting point. How do you feel about the "business world"'s heavy use of PDFs? There is something to be said for the file format being so trusted and dominant now... probably some random sequence of events led to this happening... but it is perhaps hard to shift.

danhor • today at 11:15 AM

A PDF of a long document such as a standard or reference manual is almost always preferable to an HTML version. HTML versions have issues with formatting, searching (as browsers struggle with multi-thousand-page documents and non-native document search implementations almost always suck), indexing, correct behavior on window size change (especially a side-by-side view like PDF viewers offer is almost unheard of for webpages), and so on. Some vendors have switched to online-only for some documents and it always annoys me.
