srean • today at 7:30 AM

I do not know the inner details of Zstandard, but I would expect it to at least do suffix/prefix stats or word-fragment stats, not just words and phrases.

Replies

Jaxan • today at 10:59 AM

The thing is that two English texts on completely different topics will compress better together than, say, an English and a Spanish text on exactly the same topic. So compression really only looks at the form/shape of the text, not its meaning.

duskwuff • today at 9:01 AM

It's not specifically aware of the syntax - it'll match any repeated substrings. That just happens to usually end up meaning words and phrases in English text.
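
A rough way to see this for yourself, as a sketch rather than a statement about Zstandard's internals: the snippet below uses Python's standard-library zlib (DEFLATE, also an LZ77-style matcher) as a stand-in for Zstandard, which is an assumption made only so it runs without extra packages. It repeats a chunk of random bytes, which contains no words or syntax at all, and compares the compressed size with the same amount of non-repeating random bytes. The names `random_bytes`, `chunk`, `repeated` and `unrelated` are made up for illustration.

```python
import random
import zlib

# Fixed seed so the run is reproducible.
rng = random.Random(0)


def random_bytes(n):
    """Return n pseudo-random bytes (no words, no syntax, no structure)."""
    return bytes(rng.getrandbits(8) for _ in range(n))


# A 200-byte chunk of noise, repeated 10 times: pure repeated substrings.
chunk = random_bytes(200)
repeated = chunk * 10

# The same number of bytes, but with no repetition to match against.
unrelated = random_bytes(len(repeated))

# zlib is used here only as a stand-in LZ77-style compressor (assumption).
repeated_size = len(zlib.compress(repeated, 9))
unrelated_size = len(zlib.compress(unrelated, 9))

print("input size:           ", len(repeated))
print("repeated noise:       ", repeated_size)
print("non-repeating noise:  ", unrelated_size)
```

The repeated input should shrink to roughly the size of a single chunk, while the non-repeating input barely shrinks at all, even though neither contains anything resembling a word.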
