I wonder how much smaller it could get with some compression. You could probably encode "This website hijacks the scrollbar and I don't like it" comments into just a few bits.
That's at least 45%. Then you can leave out all of my comments and you're left with only 5!
It might be a neat experiment to use AI to produce canonicalized paraphrases of HN arguments, so they could be compared directly and would compress well.
Guilty.
The hard-coded dictionary wouldn't be much stranger than Brotli's:
https://news.ycombinator.com/item?id=27160590
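For anyone curious what a hard-coded dictionary buys you, here is a rough sketch using the preset-dictionary support in Python's zlib (the `zdict` argument). The dictionary contents and helper names are made up for illustration. This is DEFLATE rather than Brotli, and nobody's actual archive format.

```python
import zlib

# Hypothetical preset dictionary: stock phrases we expect to see in comments.
# Both the compressor and the decompressor must be given the same bytes.
STOCK_PHRASES = b"This website hijacks the scrollbar and I don't like it"


def compress_with_dict(text: bytes, zdict: bytes) -> bytes:
    """Compress text with a DEFLATE stream primed by a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(text) + compressor.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Reverse compress_with_dict; fails if the dictionary does not match."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


comment = b"This website hijacks the scrollbar and I don't like it"
plain = zlib.compress(comment, 9)                      # no dictionary
primed = compress_with_dict(comment, STOCK_PHRASES)    # with dictionary
restored = decompress_with_dict(primed, STOCK_PHRASES)

print(len(comment), len(plain), len(primed))
```

The primed output still carries zlib's header and checksum, so it is a handful of bytes rather than a few bits, but the comment itself collapses to a single back-reference into the dictionary.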