logoalt Hacker News

HexDecOctBinyesterday at 3:28 PM3 repliesview on HN

Will this create more nasal demons? I always disable strict aliasing, and it's not clear to me after reading the whole article whether provenance is about making sane code illegal, or making previously illegal sane code legal.


Replies

jcranmeryesterday at 4:09 PM

All C compilers have some notion of pointer provenance embedded in them, and this is true going back decades.

The problem is that the documented definitions of pointer provenance (which generally amount to "you must somehow have a data dependency from the original object definition (e.g., malloc)") aren't really upheld by the optimizer, and the effective definition of the optimizer is generally internally inconsistent because people don't think about side effects of pointer-to-integer conversion. The one-past-the-end pointer being equal (but of different provenance) to a different object is a particular vexatious case.

The definition given in TS6010 is generally the closest you'll get to a formal description of the behavior that optimizers are already generally following, except for cases that are clearly agreed to be bugs. The biggest problem is that it makes pointer-to-int an operation with side effects that need to be preserved, and compilers today generally fail to preserve those side effects (especially when pointer-to-int conversion happens more as an implicit operation).

The practical effect of provenance--that you can't magic a pointer to an object out of thin air--has always been true. This is largely trying to clarify what it means to actually magic a pointer out of thin air; it's not a perfect answer, but it's the best answer anyone's come up with to date.

Diggseyyesterday at 4:50 PM

It's standardizing the contract between the programmer and the compiler.

Previously a lot of C code was non-portable because it relied on behaviour that wasn't defined as part of the standard. If you compiled it with the wrong compiler or the wrong flags you might get miscompilations.

The provenance memory model draws a line in the sand and says "all C code on this side of the line should behave in this well defined way". Any optimizations implemented by compiler authors which would miscompile code on that side of the line would need to be disabled.

Assuming the authors of the model have done a good job, the impact on compiler optimizations should be minimized whilst making as much existing C code fall on the "right" side of the line as possible.

For new C code it provides programmers a way to write useful code that is also portable, since we now have a line that we can all hopefully agree on.

layer8yesterday at 4:21 PM

This is basically a formalization of the general understanding one already had when reading the C standard thoroughly 25 years ago. At least I was nodding along throughout the article. It cleans up the parts where the standard was too imprecise and handwavy.