> Unicode code points are 32 bit
21-bit, actually. The code space was originally meant to be 31-bit (stored in 32-bit units), but UTF-16 caps out at 21 bits, so they lopped ten bits of potential from Unicode (and from UTF-8, so no more six-byte encodings).
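For anyone who wants to check where the UTF-16 cap comes from, here is a rough Python sketch of the surrogate-pair arithmetic. The helper name `decode_surrogate_pair` and the constants are my own, purely for illustration; the formula is the standard one for combining a high and a low surrogate.

```python
# Largest code point reachable with a UTF-16 surrogate pair.
# Names below are illustrative, not taken from any library.
HIGH_MAX = 0xDBFF  # last high (lead) surrogate
LOW_MAX = 0xDFFF   # last low (trail) surrogate


def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


max_code_point = decode_surrogate_pair(HIGH_MAX, LOW_MAX)
print(hex(max_code_point))          # 0x10ffff
print(max_code_point.bit_length())  # 21
```

Each surrogate carries 10 bits of payload, so a pair addresses 2^20 code points above the first 2^16, which is why the ceiling is U+10FFFF rather than a round power of two.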
> at some point before Unicode
No, in the early days of Unicode.
> run length encodes
Um… what? RLE is a data compression technique; UTF-16 has nothing to do with it.
>> Unicode code points are 32 bit
> 21-bit, actually
Less than that. https://en.wikipedia.org/wiki/Code_point#In_character_encodi...:
“The Unicode code space is divided into seventeen planes (the basic multilingual plane, and 16 supplementary planes), each with 65,536 (= 2¹⁶) code points. Thus the total size of the Unicode code space is 17 × 65,536 = 1,114,112”
That makes it log(1,114,112)/log(2) bits, which is about 20.09.
(https://www.unicode.org/versions/Unicode17.0.0/ assigns 159,801 of them to characters)
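If you want to reproduce the arithmetic, a quick Python sketch (variable names are mine) gives both the fractional figure and the whole number of bits you would actually need to store any code point:

```python
import math

PLANES = 17                      # BMP plus 16 supplementary planes
CODE_POINTS_PER_PLANE = 2 ** 16  # 65,536

code_space = PLANES * CODE_POINTS_PER_PLANE
bits_fractional = math.log2(code_space)
# Whole bits needed to represent the largest code point (code_space - 1).
bits_whole = (code_space - 1).bit_length()

print(code_space)                 # 1114112
print(round(bits_fractional, 2))  # 20.09
print(bits_whole)                 # 21
```

So both answers hold depending on what you count: the code space carries about 20.09 bits of information, but a fixed-width integer still needs 21 bits to hold U+10FFFF.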