Hacker News

simonask, yesterday at 8:58 AM

Many, many programmers come to C (and C++) with a lower-level understanding that actually gets in the way here. They understand that all types "are" just bytes and that all pointers "are" just register-sized integer addresses, because that's how the hardware works and has worked for decades.

It's perfectly reasonable to expect any load through `int*` to just load 4 bytes from memory, done and done. They are surprised to find that this is far from the whole story, and that the result can be UB.

Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead. But no.
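For reference, the usual way to get "just load 4 bytes from this address" without UB is to copy the bytes into a local object with `memcpy`. A minimal sketch, where `load_u32` is a hypothetical helper name and not something from this thread; mainstream compilers typically turn this into a single load instruction on targets that allow unaligned access, though that is an optimization and not a guarantee:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from any byte address, aligned or not.
   Copying into a local object avoids dereferencing a possibly
   misaligned or wrongly typed pointer. `load_u32` is a hypothetical
   helper name used only for illustration. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);  /* byte-wise copy, defined for any valid source */
    return v;
}
```

Calling `load_u32(buf + 1)` on an `unsigned char` buffer is fine, whereas `*(uint32_t *)(buf + 1)` is not.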


Replies

lelanthran, yesterday at 9:17 AM

> They understand that all types "are" just bytes and that all pointers "are" just register-sized integer addresses, because that's how the hardware works and has worked for decades.

I'd clarify this with "They understand that all values are just bytes".

> Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead.

It's partly the standard's fault here: rather than saying "We don't know how vendors will implement this, so we shall leave it as implementation-defined", they say "We don't know how vendors will implement this, so we will leave it as undefined".

A clear majority of the UB problems with C could be fixed if the standards committee slowly moved all UB into IB. It's not that there isn't any progress (signed two's complement is coming, after all), it's that there is (I believe) much pushback from compiler authors (who dominate the standards committee) who don't want to make UB into IB.

da-alex, yesterday at 11:05 AM

> Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead. But no.

Not if those 4 bytes span a cacheline boundary: that will most likely halve throughput compared to loading values inside a single cacheline. And if it causes cache misses, it takes up twice the L2 or L3 bandwidth.

Even worse, if the int spans two pages, it will need two TLB lookups. If it's a hot variable and the only thing you use from those pages, it even uses up an additional TLB entry that could otherwise be used for better perf elsewhere, etc.

And if you're on embedded (and many C programs are), Cortex-M CPUs either can't handle unaligned accesses (M0, M0+) or take 2-3 times as long (split the load into 2x2 byte or 1x2 + 2x1 byte).
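The cacheline and page cases above come down to the same arithmetic: does the first byte of the access fall in a different block than the last byte? A rough sketch, assuming power-of-two block sizes; `crosses_boundary` is a hypothetical name, and 64 and 4096 are only typical cacheline and page sizes, not universal ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the `size` bytes starting at address `addr` cross a multiple
   of `boundary`. `boundary` must be a power of two, e.g. 64 for a
   typical cacheline or 4096 for a typical page (assumed values). */
static bool crosses_boundary(uintptr_t addr, size_t size, uintptr_t boundary)
{
    assert(size > 0);
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    uintptr_t first = addr & ~(boundary - 1);               /* block holding the first byte */
    uintptr_t last  = (addr + size - 1) & ~(boundary - 1);  /* block holding the last byte */
    return first != last;
}
```

A 4-byte load at address 62 straddles a 64-byte line, while one at address 60 does not.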

tialaramex, yesterday at 12:49 PM

> that all pointers "are" just register-sized integer addresses

And crucially, until DR#260 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm) this was a reasonable guess as to what the pointers are. Probably not a wise guess, because it's not how your C compiler worked even then, but a reasonable guess if you didn't think too hard about this.

One way I like to think about this is that all C's types are just the machine integers wearing crap Halloween costumes. Groucho glasses for bool, maybe a Lincoln hat for char, float and double can be bright orange make-up and a long tie. But the pointers are different, because unlike the other types those have provenance.

5 == 5, 'Z' == 'Z', true == true, 1.5f == 1.5f, but whether two pointers are equivalent does not depend solely on their bit pattern in C.

mafuy, yesterday at 12:34 PM

I'm not sure that's right. For instance, the Pentium 4 spec explicitly says unaligned int32 loads take longer. And x86/x64 is very gentle in that regard; other archs would whip you. So an unaligned int access is rightfully treated differently. It should be IB.

Just creating the pointer, though, should not be UB, even though it apparently is. It should not even be IB.
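Since C makes the conversion itself undefined when the resulting pointer is not correctly aligned for the target type, one defensive pattern is to test the address before casting. A minimal sketch, where `is_aligned` is a hypothetical helper; it assumes `uintptr_t` exists and that converting a pointer to it yields the plain byte address, which is implementation-defined but holds on common flat-address-space targets:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if `p` is suitably aligned for a type whose alignment is `align`
   (a power of two, e.g. _Alignof(int32_t)). Assumes the integer value of
   the pointer is its byte address, which the standard does not promise. */
static bool is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

Only when `is_aligned(p, _Alignof(int32_t))` holds would a cast of `p` to `int32_t *` avoid the alignment problem; otherwise the bytes are better copied out instead.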

TazeTSchnitzel, yesterday at 3:03 PM

> Meanwhile, the actual computers we have been using for decades have no problems actually just loading 4 bytes through any arbitrary pointer with zero overhead.

PCs yes, but there are many other things C is compiled to for which this is not true.

pjc50, yesterday at 9:12 AM

Except ARM32. ARM64 doesn't guarantee it to be valid in all cases either.