Hacker News

zombot yesterday at 12:17 PM

Does C allow Unicode identifiers now, or is that pseudocode? The code snippets also contain `&amp;`, so something definitely went wrong with the transcoding to HTML.


Replies

pjmlp yesterday at 12:50 PM

Besides the sibling comment on C23, it does work fine on GCC.

https://godbolt.org/z/qKejzc1Kb

Whereas clang loudly complains,

https://godbolt.org/z/qWrccWzYW

qsort yesterday at 12:21 PM

Quoting cppreference:

An identifier is an arbitrarily long sequence of digits, underscores, lowercase and uppercase Latin letters, and Unicode characters specified using \u and \U escape notation (since C99), of class XID_Continue (since C23). A valid identifier must begin with a non-digit character (Latin letter, underscore, or Unicode non-digit character (since C99, until C23), or Unicode character of class XID_Start (since C23)). Identifiers are case-sensitive (lowercase and uppercase letters are distinct). Every identifier must conform to Normalization Form C (since C23).

In practice it depends on the compiler.

Y_Y yesterday at 4:06 PM

Implementation-defined until C99, explicitly possible via UCNs since C99, possible with explicit encoding since C23, but literals are still implementation-defined.
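
For illustration, here is a minimal sketch of the UCN route, assuming a compiler that accepts universal character names in identifiers as C99 and later describe. The function and variable names are made up for this example, and whether the same source written with raw UTF-8 characters compiles is a separate, compiler-dependent question.

```c
/* Sketch only: assumes the compiler accepts universal character names
   (UCNs) in identifiers. All names below are hypothetical. */

/* \u03b1 is GREEK SMALL LETTER ALPHA, \u03b2 is GREEK SMALL LETTER BETA. */
static int sum_greek(int \u03b1, int \u03b2)
{
    return \u03b1 + \u03b2;
}

/* A UCN may also sit in the middle of a name: caf\u00e9 spells "café". */
static int caf\u00e9_count(void)
{
    int \u00e9 = 3;   /* LATIN SMALL LETTER E WITH ACUTE as a whole identifier */
    return sum_greek(\u00e9, 4);
}
```

Calling `caf\u00e9_count()` from `main` exercises both functions. The escapes are chosen from letters that are allowed at the start of an identifier under both the C99 ranges and the C23 XID_Start rule.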

unwind yesterday at 12:26 PM

I can't even view the post, I just get some kind of content-management-system-like view with the page as JSON or something, in pink-on-white. I'm super confused. :|

The answer to your question seems to (still) be "no".