Hacker News

jjmarr today at 12:50 AM

there is no guarantee `char` is 8 bits, nor that it represents text, or even a particular encoding.

If your codebase has those guarantees, go ahead and use it.
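
One way to make those guarantees explicit is a compile-time check. A minimal sketch, assuming a C++11 or later compiler; `kBitsPerChar` and `is_ascii` are hypothetical names used only for illustration. The `'A' == 0x41` test only probes whether the ordinary literal encoding looks ASCII-compatible. It cannot prove which encoding the data in a `char` buffer uses at run time.

```cpp
#include <climits>  // CHAR_BIT

// Number of bits in a char (and therefore in a byte) on this implementation.
constexpr int kBitsPerChar = CHAR_BIT;

// Refuse to build on platforms where the codebase's assumption does not hold.
static_assert(kBitsPerChar == 8, "this codebase assumes 8-bit char");

// Weak sanity check that ordinary character literals are ASCII-compatible.
static_assert('A' == 0x41, "this codebase assumes an ASCII-compatible literal encoding");

// Hypothetical helper: true if c is a 7-bit ASCII code unit.
constexpr bool is_ascii(char c) {
    return static_cast<unsigned char>(c) < 0x80;
}
```

With checks like these in a common header, a port to an unusual target fails at build time instead of misbehaving silently.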


Replies

hackyhacky today at 7:30 AM

> there is no guarantee `char` is 8 bits, nor that it represents text, or even a particular encoding.

True, but sizeof(char) is defined to be 1. In section 7.6.2.5 [1]:

"The result of sizeof applied to any of the narrow character types is 1"

In fact, char and associated types are the only types in the standard where the size is not implementation-defined.

So the only way that a C++ implementation can conform to the standard and have a char type that is not 8 bits is if the size of a byte is not 8 bits. There are historical systems that meet that constraint but no modern systems that I am aware of.

[1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n49...
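
To illustrate the distinction: `sizeof` counts in units of `char`, while `CHAR_BIT` from `<climits>` says how many bits that unit holds. A small sketch, assuming C++11 or later; `storage_bits` is a hypothetical helper, and its result includes any padding bits a type may have.

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

// Guaranteed by the standard on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(signed char) == 1, "sizeof(signed char) is 1 by definition");
static_assert(sizeof(unsigned char) == 1, "sizeof(unsigned char) is 1 by definition");

// Also guaranteed: a byte has at least 8 bits, but possibly more.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Hypothetical helper: storage size of T in bits (padding bits included).
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}
```

On a typical desktop target `storage_bits<char>()` is 8. On an implementation with 16-bit bytes it would be 16, while `sizeof(char)` would still be 1.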

20k today at 1:02 AM

char8_t also isn't guaranteed to be 8 bits, because sizeof(char) == 1 and sizeof(char8_t) >= 1. On a platform where char is 16 bits, char8_t will be 16 bits as well.

The C++ standard explicitly says that it has the same size, signedness, and alignment as unsigned char, but it's a distinct type. So it's pretty useless, and badly named.
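
Those properties can be checked directly. A rough sketch, assuming a compiler in C++20 mode (the `char8_t` parts are guarded by the `__cpp_char8_t` feature-test macro so the snippet still compiles in older modes); `overload_index` is a hypothetical function that only shows that overload resolution treats `char8_t` as its own type.

```cpp
#include <type_traits>

// Hypothetical overload set: returns a different index per parameter type.
constexpr int overload_index(char) { return 0; }
constexpr int overload_index(unsigned char) { return 1; }

#ifdef __cpp_char8_t  // char8_t is only available from C++20 on
constexpr int overload_index(char8_t) { return 2; }

// Same size, signedness, and alignment as unsigned char...
static_assert(sizeof(char8_t) == sizeof(unsigned char), "same size");
static_assert(sizeof(char8_t) == 1, "so sizeof is 1, whatever CHAR_BIT is");
static_assert(alignof(char8_t) == alignof(unsigned char), "same alignment");
static_assert(std::is_unsigned_v<char8_t>, "same signedness");

// ...but a distinct type, so it never aliases char or unsigned char in overloads.
static_assert(!std::is_same_v<char8_t, unsigned char>, "distinct from unsigned char");
static_assert(!std::is_same_v<char8_t, char>, "distinct from char");
#endif
```

In C++20 mode `overload_index(u8'a')` selects the `char8_t` overload, while `overload_index('a')` still selects the `char` one.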

dataflow today at 12:54 AM

How many non-8-bit-char platforms are there with char8_t support, and how many do we expect in the future?

Maxatar today at 3:25 AM

There's no guarantee char8_t is 8 bits either; it's only guaranteed to be at least 8 bits.