logoalt Hacker News

thaumasiotesyesterday at 9:10 PM3 repliesview on HN

> It's Undefined Behavior.

Susam's post doesn't make this clear. The quotes from K&R say that the modifications to the variable may take place in any order, but they don't directly say that doing this is Undefined Behavior, which would make it permissible to do anything, including e.g. interpreting the increments as decrements.

The C99 standard is quoted saying this:

>> Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression.

It's possible that something else in the standard defines noncompliance with this clause as Undefined Behavior. But that's not the most intuitive interpretation; what this seems to say, to me, is that the line of code `a = a++ + ++a` should fail to compile, because it's not in compliance with a requirement of the language. Compilers that produce any result at all are suffering from a bug.

(It seems more likely that the actual intent is to specify that, given the line of code `b = a++ + ++a`, with a initially equal to 5, the compiler is required to ensure that the value stored at the address of a is never equal to 6 - that it begins at 5, and at some indefinite point it becomes 7, but that there is no intermediate stage between them. But I find the 'compiler failure on attempt to put multiple modifications between two sequence points' interpretation preferable.)


Replies

vilhelm_syesterday at 10:55 PM

The "shall" in the standard means it's undefined behavior. This is explained in the "Conformance" section,

> 2. If a ‘‘shall’’ or ‘‘shall not’’requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’.

Compilers will not refuse to compile the code, indeed the blog post we are all commenting on reports the results from a bunch of different compilers. Historically the reason the C standard specified a lot of undefined behavior is that the actually existing C compilers at the time compiled the code but disagreed about the output.

show 1 reply
Dylan16807yesterday at 9:19 PM

Compilers are not able to prevent you from violating must/shall in the general case. So they're not held to that bar. Unless the standard says not to compile it, it's not a compiler bug.

Also, imagine a situation where the line of code actually lists three different variables, but all three of them are passed in by address. It quickly becomes impossible for the compiler to know you violated the spec by reusing the same variable. And even optimizations that make sense here could corrupt the value pretty badly and possibly lead to worse errors.

show 1 reply
susamyesterday at 10:30 PM

I searched K&R to see if there is any language that implies a += a++ + a++ to be undefined. I couldn't find anything. I found the following excerpt which is closest to what I claim, in spirit. But still, it does not explicitly spell out that an object must not be modified more than once between sequence points. From § A.7 Expressions:

> The precedence and associativity of operators is fully specified, but the order of evaluation of expressions is, with certain exceptions, undefined, even if the subexpressions involve side effects. That is, unless the definition of the operator guarantees that its operands are evaluated in a particular order, the implementation is free to evaluate operands in any order, or even to interleave their evaluation. However, each operator combines the values produced by its operands in a way compatible with the parsing of the expression in which it appears. This rule revokes the previous freedom to reorder expressions with operators that are mathematically commutative and associative, but can fail to be computationally associative. The change affects only floating-point computations near the limits of their accuracy, and situations where overflow is possible.

So I think, the text in K&R serves as warning against writing such code, at best. The C99 draft has more relevant language. From § 4. Conformance:

> If a "shall" or "shall not" requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words "undefined behavior" or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe "behavior that is undefined".

This along with the § 6.5 excerpt already mentioned in my post implies a += a++ + a++ to be undefined. When I get some more time later, I'll make an update to my post to include the § 4. Conformance language too for completeness.

Thank you for the nice comment!