
plorkyeran, last Thursday at 1:44 AM

"One byte equals one character" was already wrong in the pre-Unicode days for East Asian languages. UTF-8 is much easier to parse than something like Shift JIS, where splitting a string between the bytes of a multibyte character can leave you with a string that is valid but incorrect.
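
A rough Python sketch of the difference, assuming CPython's built-in `shift_jis` and `utf-8` codecs. The helper `decode_tail` and the sample text are made up for illustration. It cuts the encoded bytes one byte into the first character and decodes whatever follows the cut.

```python
def decode_tail(text: str, encoding: str, cut: int) -> str:
    """Encode text, drop the first `cut` bytes, and decode what is left.

    Hypothetical helper for illustration; uses strict error handling.
    """
    data = text.encode(encoding)
    return data[cut:].decode(encoding)


text = "アイ"  # two katakana characters, two bytes each in Shift JIS

# Shift JIS: the second byte of ア is 0x41, which on its own is ASCII "A",
# so the tail decodes cleanly into text that was never in the input.
sjis_tail = decode_tail(text, "shift_jis", 1)
print(repr(sjis_tail))

# UTF-8: continuation bytes (0x80-0xBF) can never start a character,
# so a strict decoder rejects the tail instead of silently misreading it.
try:
    decode_tail(text, "utf-8", 1)
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc
print(utf8_error)
```

The Shift JIS tail comes back as an ordinary string with an "A" where half of a katakana character used to be, while the UTF-8 tail raises `UnicodeDecodeError`. Because lead and continuation bytes are disjoint in UTF-8, a parser can also resynchronize by skipping forward to the next non-continuation byte.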