Okay now this is weird.
I can reproduce it just fine ... but only when compressing all PDFs simultaneously.
To utilize all cores, I ran:
$ for x in *.pdf; do zstd <"$x" >"$x.zst" --ultra -22 & done; wait
(and similarly for the other formats).
I ran this again and it produced the same 2M file from the 1.1M source file. However, when I run without parallelization:
$ for x in *.pdf; do zstd <"$x" >"$x.zst" --ultra -22; done
That one file becomes 1.1M, and the total size of *.zst is 37M (competitive with Brotli, which is impressive given how much faster it is to decompress).
What's going on here? Surely '-22' disables any adaptive compression stuff based on system resource availability and just uses compression level 22?
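To narrow down which outputs are affected after a parallel run, something like the following might help. It is only a sketch: `report_grown` is a made-up helper name, not part of zstd, and it assumes each output sits next to its source as `<name>.zst`, as in the loops above.

```shell
# Print every source file whose .zst output is larger than the source.
# report_grown is a hypothetical helper, not a zstd command.
report_grown() {
  for src in "$@"; do
    out="$src.zst"
    # Skip sources that have no compressed counterpart yet.
    [ -f "$out" ] || continue
    # $(( ... )) strips the padding some wc implementations add.
    src_size=$(( $(wc -c <"$src") ))
    out_size=$(( $(wc -c <"$out") ))
    if [ "$out_size" -gt "$src_size" ]; then
      printf '%s: %d -> %d bytes\n' "$src" "$src_size" "$out_size"
    fi
  done
}
```

Running `report_grown *.pdf` after each variant of the loop would show whether the same file grows every time or only under the parallel run.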
Yeah, `--adaptive` will enable adaptive compression, but it isn't enabled by default, so shouldn't apply here. But even with `--adaptive`, after compressing each block of 128KB of data, zstd checks that the output size is < 128KB. If it isn't, it emits an uncompressed block that is 128KB + 3B.
So it is very central to zstd that it will never emit a block that is larger than 128KB+3B.
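To put a number on that, here is a rough worst-case estimate for a single frame. It is a sketch under assumptions that are mine, not from the thread: every block is a full 128 KiB raw block with a 3-byte header (the last one may be shorter), and the frame adds at most 22 bytes (4-byte magic number, up to 14 bytes of frame header, optional 4-byte checksum). `zstd_raw_bound` is a made-up name, and this is not the formula zstd itself uses for `ZSTD_compressBound`.

```shell
# Rough upper bound, in bytes, on the size of one zstd frame if every
# block fell back to a raw (uncompressed) block.
# zstd_raw_bound is a hypothetical helper, not a zstd command.
zstd_raw_bound() {
  n=$1
  block=131072                      # 128 KiB maximum block size
  blocks=$(( (n + block - 1) / block ))
  # Even an empty input is assumed to produce one (empty) block.
  [ "$blocks" -lt 1 ] && blocks=1
  # 3 bytes of header per block, plus at most 22 bytes of frame overhead.
  echo $(( n + 3 * blocks + 22 ))
}
```

For a source of about 1.1M (say `zstd_raw_bound 1153434`), the bound is only a few dozen bytes above the input size, so a 2M output is far outside anything the raw-block fallback could explain.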
I will try to reproduce, but I suspect that there is something unrelated to zstd going on.
What version of zstd are you using?