Hacker News

uecker yesterday at 1:34 PM

The standard will not forbid anything that breaks billions of lines of code still being used and maintained.

But it is easy enough to use modern tooling and coding styles to deal with signed overflow. Nowadays, silent unsigned wrap around causing logic errors is the more vexing issue, which indicates the undefined behavior actually helps rather than hurts when used with good tooling.
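To illustrate the kind of tooling and coding style meant here, the following is a minimal sketch, assuming GCC or Clang. The helper name `checked_add_int` is hypothetical; `__builtin_add_overflow` is a compiler extension rather than standard C. Building with `-fsanitize=signed-integer-overflow` is another option that reports signed overflow at run time.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds two ints and reports overflow instead of
   invoking undefined behavior. Relies on the GCC/Clang extension
   __builtin_add_overflow, which returns true when the result does
   not fit in *out. */
static bool checked_add_int(int a, int b, int *out)
{
    return !__builtin_add_overflow(a, b, out);
}
```

A caller would test the return value, for example `if (!checked_add_int(x, y, &sum)) { /* handle overflow */ }`, so the overflow case is visible in the source rather than left to the optimizer.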


Replies

ojeda yesterday at 9:00 PM

> which indicates the undefined behavior actually helps rather than hurts when used with good tooling

No, one doesn't need undefined behavior for that at all (which does hurt).

What actually helps is diagnosing the issue, just like one can diagnose the unsigned case just fine (which is not UB).
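As a rough sketch of diagnosing the unsigned case without any undefined behavior involved, assuming GCC or Clang: the helper `checked_sub_size` is a hypothetical name, and `__builtin_sub_overflow` is a compiler extension. Clang also offers `-fsanitize=unsigned-integer-overflow`, which flags unsigned wraparound at run time even though the wraparound itself is well defined.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: computes len - used and reports wraparound
   instead of silently producing a huge size_t value when used > len. */
static bool checked_sub_size(size_t len, size_t used, size_t *out)
{
    return !__builtin_sub_overflow(len, used, out);
}
```

The unchecked form `size_t remaining = len - used;` is fully defined by the standard, yet yields a very large value when `used > len`, which is the sort of silent logic error under discussion.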

Instead, for this sort of thing, C could have "Erroneous Behavior", like Rust has (C++ also added it, recently).

Of course, existing ambiguous C code will remain tricky. What matters, after all, is having ways to express what we are expecting in the source code, so that a reader (whether tooling, humans or LLMs) can rely on that.

adrian_b yesterday at 1:47 PM

Silent unsigned wrap around is caused by another mistake of the C language (and of all later languages inspired by C): there is only a single unsigned type.

The hardware of modern CPUs actually implements 5 distinct data types that must be declared as "unsigned" in C: non-negative integers, integer residues a.k.a. modular integers, bit strings, binary polynomials and binary polynomial residues.

A modern programming language should ideally have these 5 distinct types, but it must have at least distinct types for non-negative integers and for integer residues. There are several programming languages that provide at least this distinction. The other data types would be more difficult to support in a high-level language, as they use certain machine instructions that compilers typically do not know how to use.

The change in the C standard that was made so that now "unsigned" means integer residue, has left the language without any means to specify a data type for non-negative integers, which is extremely wrong, because there are more programs that use "unsigned" for non-negative integers than programs that use "unsigned" for integer residues.

The hardware of most CPUs implements non-negative integers very well, so non-negative integer overflow is easily detected, but the current standard makes it impossible to use that hardware support.
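The distinction between non-negative integers and integer residues can be approximated in today's C with wrapper structs. This is only a sketch under stated assumptions: the type and function names (`nonneg32`, `residue32`, `nonneg32_add`, `residue32_add`) are hypothetical and not part of any standard or library, and `__builtin_add_overflow` is a GCC/Clang extension. On common targets such compilers typically implement the unsigned overflow check with the carry flag, though that depends on the compiler and architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper types, used only for illustration. */
typedef struct { uint32_t v; } nonneg32;   /* non-negative integer: overflow is an error */
typedef struct { uint32_t v; } residue32;  /* integer modulo 2^32: wraparound is intended */

/* Returns false when the mathematical sum does not fit in 32 bits. */
static bool nonneg32_add(nonneg32 a, nonneg32 b, nonneg32 *out)
{
    return !__builtin_add_overflow(a.v, b.v, &out->v);
}

/* Addition modulo 2^32: wrapping is the defined, expected result. */
static residue32 residue32_add(residue32 a, residue32 b)
{
    return (residue32){ (uint32_t)(a.v + b.v) };
}
```

Because the two structs are distinct types, passing a `residue32` where a `nonneg32` is expected fails to compile, which records the programmer's intent in the source.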

throw_await yesterday at 6:42 PM

Those billions of lines are already broken by definition.
