Hacker News

adrian_b today at 1:12 PM (3 replies)

The root problem is that the C language allows implicit conversions from an unsigned type to a signed type and from a signed type to an unsigned type, and in certain contexts such implicit conversions are even mandated by the standard, as in the buggy expression from the parent article.

It does not matter which is the relationship between the sizes of such types, there will always be values of the operand that cannot be represented in the result.

Saying that the behavior is sometimes undefined is not acceptable. Any implicit conversion of this kind must be an error. Whenever a conversion between signed and unsigned or unsigned and signed is desired, it must be explicit.
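What "explicit" could look like in practice is sketched below. The helper names `int_to_unsigned` and `unsigned_to_int` are hypothetical, not part of any standard library; the point is only that each conversion is written out and guarded by a range check, so no value is silently reinterpreted.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: convert int to unsigned only when the value is
   representable. Returns false and leaves *out untouched otherwise. */
static bool int_to_unsigned(int in, unsigned *out)
{
    if (in < 0)
        return false;        /* negative values have no unsigned counterpart */
    *out = (unsigned)in;     /* explicit cast; value is preserved here */
    return true;
}

/* Hypothetical helper: convert unsigned to int only when the value fits. */
static bool unsigned_to_int(unsigned in, int *out)
{
    if (in > (unsigned)INT_MAX)
        return false;        /* too large for int */
    *out = (int)in;          /* explicit cast; value is preserved here */
    return true;
}
```

A caller has to handle the `false` case, which is exactly the corner case that the implicit conversion hides.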

This may be the worst mistake that has ever been made in the design of the C language and it has not been corrected even after 50 years.

Making this an error would indeed produce a deluge of error messages in many carelessly written legacy programs, but fixing such programs is trivial, and it is extremely likely that many of the cases where compilers currently signal no error can cause bugs in certain corner cases, as in the parent article.


Replies

zahlman today at 5:02 PM

> It does not matter which is the relationship between the sizes of such types, there will always be values of the operand that cannot be represented in the result.

Hmm? Seems to me that unsigned -> larger signed works, although other conversions may not.
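A small sketch of the "unsigned -> larger signed" case, using the fixed-width types from `<stdint.h>` so the sizes are not platform-dependent. The function names are made up for illustration.

```c
#include <stdint.h>

/* Every uint32_t value is representable in int64_t, so this implicit
   conversion cannot lose information. */
static int64_t widen_u32(uint32_t u)
{
    return u;
}

/* Widening both operands first gives a correctly signed difference,
   instead of the wrapped-around result of a - b in uint32_t. */
static int64_t signed_difference(uint32_t a, uint32_t b)
{
    return (int64_t)a - (int64_t)b;
}
```

With these definitions, `signed_difference(0, 1)` yields -1, whereas `0u - 1u` evaluated in a 32-bit unsigned type wraps around to 4294967295.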

But yes, I generally agree that these are terrible conversions to do implicitly, given that the entire point of those types is to control the interpretation of memory at a bits-and-bytes level. Languages where implicit numeric conversions make sense are generally not languages that care so much about integer size, and the entire point of having unsigned types is to bake that range constraint in.

uecker today at 1:21 PM

You could just use -Wsign-conversion.
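For anyone who has not used it: GCC and Clang both accept `-Wsign-conversion`, and `-Werror=sign-conversion` turns that one warning into a hard error. A minimal sketch follows; the file name `example.c` and both function names are assumptions for illustration.

```c
/* Compile with, for example:
 *   cc -Wsign-conversion -c example.c
 * or, to reject the code outright:
 *   cc -Werror=sign-conversion -c example.c
 */
#include <stddef.h>

/* Implicit int -> size_t conversion: this is the kind of line the
   warning is meant to flag. A negative n becomes a huge size_t. */
static size_t count_bytes(int n)
{
    size_t total = n;
    return total;
}

/* Same conversion written explicitly, with the negative case handled
   (here by clamping to 0, which is just one possible policy). */
static size_t count_bytes_checked(int n)
{
    return n < 0 ? 0 : (size_t)n;
}
```

The explicit cast in the second function silences the warning, so the range check is what actually makes it safe, not the cast itself.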

fonheponho today at 4:44 PM

> It does not matter which is the relationship between the sizes of such types, there will always be values of the operand that cannot be represented in the result.

It's not that bad actually; not "always". The only nontrivial case is when, as a part of the usual arithmetic conversions, you (perhaps unwittingly) convert a signed integer type to an unsigned integer type [*], and the original value was negative.

[*] This can happen in two cases (paraphrasing the standard):

- if the operand that has unsigned integer type has rank greater than or equal to the rank of the signed integer type of the other operand,

- if the operand that has signed integer type has rank greater than or equal to the rank of the unsigned integer type of the other operand, but the signed integer type cannot represent all values of the unsigned integer type.

Examples: (a) "unsigned int" vs. "signed int"; (b) "long signed int" vs. "unsigned int" in a POSIX ILP32 programming environment. Under (a), you get conversion to "unsigned int"; under (b), you get conversion (for both operands) to "long unsigned int".
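Cases (a) and (b) can be sketched as follows. The function names are hypothetical. Case (b) is written so that it reports what the current platform does, since the outcome depends on whether `long` can represent every `unsigned int` value.

```c
#include <limits.h>
#include <stdbool.h>

/* Case (a): int vs. unsigned int. The int operand is converted to
   unsigned int, so -1 becomes UINT_MAX and the comparison is false. */
static bool mixed_compare_int(void)
{
    int s = -1;
    unsigned int u = 1u;
    return s < u;
}

/* Case (b): long vs. unsigned int. If long can hold every unsigned int
   value (e.g. LP64), u is converted to long and the result is true.
   Otherwise (e.g. ILP32) both operands become unsigned long and the
   result is false. */
static bool mixed_compare_long(void)
{
    long s = -1L;
    unsigned int u = 1u;
    return s < u;
}

/* True when long can represent all unsigned int values. Both limits are
   non-negative, so this comparison is safe under either conversion. */
static bool long_holds_all_unsigned(void)
{
    return LONG_MAX >= UINT_MAX;
}

/* One way to compare without surprises: handle the negative case first,
   then compare in the unsigned type. */
static bool less_int_uint(int s, unsigned int u)
{
    return s < 0 || (unsigned int)s < u;
}
```

`mixed_compare_int()` returns false on every conforming implementation, while `less_int_uint(-1, 1u)` returns true, which is the mathematically expected answer.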

Section "3.2 Conversions | 3.2.1 Arithmetic operands | 3.2.1.1 Characters, and integers" in the C89 Rationale <https://www.open-std.org/Jtc1/sc22/WG14/www/C89Rationale.pdf> is worth reading. (An updated version of the same section is included in the C99 Rationale <https://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.1...> under 6.3.1.1.)

It deals precisely with the problem highlighted in the blog post. I'll quote just the beginning and the end:

> Since the publication of K&R, a serious divergence has occurred among implementations of C in the evolution of integral promotion rules. Implementations fall into two major camps, which may be characterized as unsigned preserving and value preserving. [...]

> The unsigned preserving rules greatly increase the number of situations where unsigned int confronts signed int to yield a questionably signed result, whereas the value preserving rules minimize such confrontations. Thus, the value preserving rules were considered to be safer for the novice, or unwary, programmer. After much discussion, the Committee decided in favor of value preserving rules, despite the fact that the UNIX C compilers had evolved in the direction of unsigned preserving.

> QUIET CHANGE -- A program that depends upon unsigned preserving arithmetic conversions will behave differently, probably without complaint. This is considered the most serious semantic change made by the Committee to a widespread current practice.
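The difference between the two camps in the quoted Rationale shows up with operands narrower than `int`. A minimal sketch, with hypothetical function names; it assumes nothing about the platform beyond what the comments state.

```c
#include <limits.h>
#include <stdbool.h>

/* Under the value preserving rules that C89 adopted, unsigned short
   operands promote to int whenever int can represent every unsigned
   short value, so 1 - 2 is -1 and the test below is true. Under
   unsigned preserving rules they would promote to unsigned int, the
   subtraction would wrap around, and the test would be false. */
static bool narrow_difference_is_negative(void)
{
    unsigned short a = 1, b = 2;
    return (a - b) < 0;
}

/* True on platforms where unsigned short promotes to int, which is the
   common case (16-bit short, 32-bit int). */
static bool ushort_promotes_to_int(void)
{
    return USHRT_MAX <= INT_MAX;
}
```

On a platform where `unsigned short` and `int` have the same width, the promotion goes to `unsigned int` even under the value preserving rules, and `narrow_difference_is_negative()` returns false.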