Hacker News

pansa2 today at 1:07 AM

> all the encoding/decoding functions default to utf-8

Languages that use UTF-8 natively don't need those functions at all. And the ones in Python aren't trivial - see, for example, `surrogateescape`.
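For readers unfamiliar with `surrogateescape`, here is a minimal sketch of what that error handler does in Python 3. The byte string `raw` is just an illustrative value, not something from the discussion: undecodable bytes are mapped to lone surrogates so that the original bytes can be recovered later.

```python
# Bytes that are not valid UTF-8 (0xff can never appear in UTF-8).
# Illustrative value only.
raw = b"caf\xff.txt"

# Strict decoding (the default) rejects the input.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate in the
# range U+DC80..U+DCFF instead of raising.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# The resulting str is not encodable as strict UTF-8, which is part of
# why these functions are not trivial.
try:
    text.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```

The string that comes back contains a code point that is not a valid Unicode scalar value, so it can round-trip through the same handler but will fail anywhere that expects strictly encodable text.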

As the sibling comment says, the only benefit of all this encoding/decoding is that it allows strings to support constant-time indexing of code points, which isn't something that's commonly needed.


Replies

laurencerowe today at 1:34 AM

They absolutely do, because arbitrary byte strings are not necessarily valid UTF-8. Safe Rust requires validating bytes when converting them to strings because of this.
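To make the validity point concrete, here is a rough sketch in Python (used only for illustration; the comment itself is about Rust). The helper name `is_valid_utf8` and the sample byte strings are hypothetical. It shows several kinds of byte sequences that a strict UTF-8 decoder has to reject.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Illustrative inputs: two valid sequences and three invalid ones.
samples = {
    "ascii": b"hello",
    "two_byte": "\u00e9".encode("utf-8"),  # e with acute accent, 2 bytes
    "truncated": b"\xc3",                  # lead byte, no continuation byte
    "overlong": b"\xc0\xaf",               # overlong encoding of "/"
    "surrogate": b"\xed\xa0\x80",          # encoded UTF-16 surrogate U+D800
}

results = {name: is_valid_utf8(data) for name, data in samples.items()}
```

A language whose strings are guaranteed to be UTF-8 has to run a check of this kind at the boundary where raw bytes become a string, or else offer an explicitly unchecked conversion.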