Hacker News

9rx, last Tuesday at 4:35 PM

That's not what your link says:

    The Java language allows source code to express Unicode
    characters in a UTF-16 encoding, and this is unaffected
    by the choice of UTF-8 for the default charset.

Perhaps you pasted the wrong one by mistake?

Not that such a change to the language, even a hypothetical one applying only to future code, would make any difference to this discussion anyway: legacy Java code encoded in other charsets is still Java code.


Replies

lxgr, last Tuesday at 4:44 PM

Just one sentence after the one you quoted:

> However, the javac compiler is affected because it assumes that .java source files are encoded with the default charset, unless configured otherwise by the -encoding option.

Interestingly, on Windows, Java programs were supposedly encoded in CP-1252 before this...?

> In JDK 17 and earlier, the default charset is determined when the Java runtime starts. On macOS, it is UTF-8 except in the POSIX C locale. On other operating systems, it depends upon the user's locale and the default encoding, e.g., on Windows, it is a codepage-based charset such as windows-1252 or windows-31j.
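
To make the practical consequence concrete, here is a rough sketch (not anything from the linked document) of why the charset assumed for a source file matters: the same bytes decode to different characters under UTF-8 and windows-1252. The class name `SourceEncodingDemo` and the helper `decode` are made up for illustration, and it assumes the runtime ships the windows-1252 charset, which standard JDK builds normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name and helper are illustrative only.
class SourceEncodingDemo {

    // Decode raw bytes under a given charset, roughly what a compiler has to do
    // when it turns the bytes of a .java file into characters.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) as it would be stored in a UTF-8 source file:
        // two bytes, 0xC3 0xA9.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Assumption: windows-1252 is available in this runtime.
        Charset cp1252 = Charset.forName("windows-1252");

        String asUtf8 = decode(utf8Bytes, StandardCharsets.UTF_8);
        String asCp1252 = decode(utf8Bytes, cp1252);

        // Decoded as UTF-8 the bytes form one character; decoded as windows-1252
        // each byte becomes its own character, so the text is silently changed.
        System.out.println("UTF-8 length:        " + asUtf8.length());
        System.out.println("windows-1252 length: " + asCp1252.length());

        // The charset used when nothing else is specified for this runtime.
        System.out.println("default charset:     " + Charset.defaultCharset());
    }
}
```

Passing `-encoding` to javac, as the quoted sentence mentions, is the way to avoid depending on whatever the default happens to be on a given machine.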
