Hacker News

aidenn0, today at 7:53 PM

Note that PDF:

1. Supports JPEG2000 compression, which is very similar to what DjVu uses for images

2. Supports JPEGs compressed with jpegli, which is competitive with DjVu at higher quality settings

3. Supports JBIG2 for bi-level images, which is very similar to what DjVu uses for bi-level layers.


Replies

jbaber, today at 9:23 PM

Is there any combination of Ghostscript flags (or some other tool) that turns an arbitrary PDF into one that uses these codecs, so the result is as fast and small as a DjVu?

rahimnathwani, today at 8:37 PM

Right, if you look at PDF files from Internet Archive, they're usually compressed with MRC (Mixed Raster Content).

IIRC each page has three layers:

- background (jpeg, color)

- foreground (jbig2, monochrome maybe?)

- mask (indicating whether foreground or background should be shown at this point)

https://github.com/internetarchive/archive-pdf-tools
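To make the three-layer idea concrete, here is a rough sketch of how a mask picks between foreground and background for each pixel. This is not how archive-pdf-tools or any PDF viewer is implemented: the function name `compose_mrc_page` and the pixel values are made up for illustration, and it assumes all three layers have the same dimensions, whereas real MRC files usually store the layers at different resolutions and leave the compositing to the viewer.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255


def compose_mrc_page(
    background: List[List[Pixel]],
    foreground: List[List[Pixel]],
    mask: List[List[int]],
) -> List[List[Pixel]]:
    """Hypothetical helper: combine three same-sized layers into one page.

    Where the mask is 1 the foreground pixel is shown,
    where it is 0 the background pixel is shown.
    """
    if not (len(background) == len(foreground) == len(mask)):
        raise ValueError("all layers must have the same number of rows")

    page: List[List[Pixel]] = []
    for bg_row, fg_row, mask_row in zip(background, foreground, mask):
        if not (len(bg_row) == len(fg_row) == len(mask_row)):
            raise ValueError("all layers must have the same row width")
        # Per-pixel selection driven by the bi-level mask.
        page.append([fg if m else bg for bg, fg, m in zip(bg_row, fg_row, mask_row)])
    return page


# Illustrative 2x3 page: cream "paper" behind, dark "ink" in front.
PAPER: Pixel = (250, 245, 230)
INK: Pixel = (20, 20, 20)

background = [[PAPER] * 3 for _ in range(2)]
foreground = [[INK] * 3 for _ in range(2)]
mask = [
    [0, 1, 0],
    [1, 1, 0],
]

page = compose_mrc_page(background, foreground, mask)
for row in page:
    print(row)
```

The point of the split is that each layer can use the codec that suits it: the mask carries the sharp text edges as a bi-level image, while the colour layers can be compressed much harder without the text turning blurry.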