Hacker News

atq2119 last Tuesday at 2:39 PM

Looks like a good write up, but I'd caution that some of the statements about memory models aren't completely accurate.

The terms relaxed, acquire, and release refer to how an atomic operation is ordered against other accesses to memory.

Counter to what the article states, a relaxed atomic is still atomic: it cannot tear, and for a read-modify-write (RMW) atomic, no other access to that location can land between the read and the write. What a relaxed atomic does not do is order other memory accesses around it, which can lead to unintuitive outcomes.
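To make the "still atomic" point concrete, here is a minimal C++ sketch, not taken from the article. The helper name `count_relaxed` and the thread and iteration counts are made up for illustration. Several threads bump one counter with relaxed RMW increments, and no increment is lost even though nothing else is ordered by those operations.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` threads each perform `iters` relaxed
// increments on a shared counter, then the final value is returned.
std::uint64_t count_relaxed(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    std::atomic<std::uint64_t> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iters] {
            for (int i = 0; i < iters; ++i) {
                // Relaxed: the read-modify-write is indivisible, but it
                // does not order any surrounding memory accesses.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each thread's completion, so the load
    // below observes every increment.
    for (auto& th : pool) th.join();
    return counter.load(std::memory_order_relaxed);
}
```

Calling `count_relaxed(4, 100000)` should always give exactly 400000. A plain non-atomic `uint64_t` in the same loop would be a data race and could lose updates.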

By contrast, once you've observed another thread's release store with an acquire load, you're guaranteed that your subsequent memory accesses "happen after" all of the other thread's accesses from before that release store. That is what you'd intuitively expect. But modern systems are really highly distributed systems, even on a single chip, so establishing this kind of guarantee has a cost, which is why you can opt out of it with relaxed atomics if you know what you're doing.
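The release/acquire guarantee is easiest to see in the classic message-passing pattern. This is a sketch under my own assumptions: the names `message_passing`, `payload`, and `ready` are illustrative and not from the article.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: a producer publishes a plain (non-atomic) value
// through a release store; a consumer picks it up after an acquire load.
int message_passing() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // the flag that carries the ordering
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // written before the release
        ready.store(true, std::memory_order_release);  // publish
    });
    std::thread consumer([&] {
        // Spin until the release store is observed by an acquire load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // Happens after the producer's write, so this read is race-free.
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(ready.load(std::memory_order_relaxed));
    return seen;
}
```

If both the store and the load were `memory_order_relaxed`, the flag itself would still be atomic, but the read of `payload` would be a data race, since nothing would order it after the producer's write.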


Replies

viega last Tuesday at 2:49 PM

Yes, I meant to clarify the memory model discussion. I had tried to simplify and did a poor job. I got similar feedback after it was published and never remembered to get to it. I will try to fix it soon, though this is about the worst time for the post to have taken off. I'm not sure when I'll be able to sit down for it, but I'll try to get it done in the next day. Hopefully it doesn't wait until the next time it gets some views.

Syzygies last Tuesday at 5:48 PM

Yes. I was surprised there was no mention of "false sharing".

https://en.wikipedia.org/wiki/False_sharing

Rather than incrementing each counter by one, why not dither the counters to reduce cache conflicts? So what if the dequeue becomes a bit fuzzy? Make the queue a bit longer, so everyone survives at least as long as they would have survived before.

Or simply use a prime-length queue and change what +1 means, so that one's stride is longer than the cache conflict concern. For a prime length, any stride that is not a multiple of the length generates the full cyclic group, so every slot is still visited.
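A quick way to check the stride claim is a small C++ sketch. The function name `stride_covers_all` is mine, and this only tests slot coverage, not the cache behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: starting at slot 0 and repeatedly adding `stride`
// modulo `len`, do the first `len` steps touch every slot exactly once?
bool stride_covers_all(std::size_t len, std::size_t stride) {
    assert(len > 0);
    std::vector<bool> seen(len, false);
    std::size_t idx = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (seen[idx]) return false;  // revisited a slot before covering all
        seen[idx] = true;
        idx = (idx + stride) % len;
    }
    return true;
}
```

With a prime length such as 17, a stride of 5 covers all slots. With a length of 16, a stride of 4 only ever touches 4 slots, while a stride of 3 (coprime to 16) still covers everything. A prime length just removes the need to think about which strides are coprime.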
