Because that's not how Unicode works. It's not simply a table mapping numbers to all possible symbols. Unicode is full of special codepoints that have no meaning on their own; they serve as modifiers to other symbols, and a single visible symbol can be formed by an arbitrarily long (in theory) combination of such codepoints. It doesn't matter how you encode it: it simply doesn't work as "codepoint -> symbol", so indexing a Unicode string by visible symbol is never O(1) and cannot be made O(1). Could we use a simple table approach? Maybe. But it wouldn't be Unicode.
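To make that concrete, here is a rough Python sketch using only the standard `unicodedata` module. The helper names `approx_clusters` and `cluster_at` are made up for illustration. The grouping rule (attach every combining mark to the preceding base codepoint) is a deliberate simplification, not real grapheme segmentation, which also has to handle things like emoji joiner sequences.

```python
import unicodedata


def approx_clusters(text):
    """Group each base codepoint with the combining marks that follow it.

    Hypothetical helper, approximation only: full segmentation rules
    (Unicode UAX #29) cover many more cases than combining marks.
    """
    clusters = []
    for ch in text:
        # combining() returns a non-zero class for combining marks
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # modifier: attach to the previous symbol
        else:
            clusters.append(ch)  # base codepoint: start a new symbol
    return clusters


def cluster_at(text, index):
    """Return the index-th visible symbol; needs a linear scan of the text."""
    return approx_clusters(text)[index]


# "cafés" written with 'e' followed by U+0301 COMBINING ACUTE ACCENT
word = "cafe\u0301s"

print(len(word))                   # 6 codepoints
print(len(approx_clusters(word)))  # 5 visible symbols
print(word[3] == "e")              # True: codepoint index 3 is the bare 'e'
print(cluster_at(word, 3) == "e\u0301")  # True: symbol 3 is 'e' plus accent
```

Indexing by codepoint (`word[3]`) is cheap here but gives you half a symbol. Getting the whole symbol means walking the string from the start, because you can't know where symbol N begins without looking at everything before it.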