> If memory gets unified, what is the value proposition of ROCm supposed to be over mesa3d? Why does AMD need to invent some new way to communicate with GPUs? Why would it be faster?
And the memory barriers? How do you sync up the L1/L2 cache of a CPU core with the GPU's cache?
Exactly. That is what a ROCm memory barrier is for: the CPU and GPU keep running in parallel, and the barrier gives you a mechanism for synchronizing them.
GPU and CPU can share memory, but they do not share caches. You need programming effort to make ANY of this work.
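To make the "programming effort" concrete, here is a rough CPU-only sketch of the pattern in plain C++. It is an analogy, not ROCm code: two threads stand in for the CPU and the GPU, and `run_handoff`, `payload`, and `ready` are names made up for this example. The idea is the same one a barrier gives you across devices: write the data, publish a flag with release semantics, and have the other side wait on the flag with acquire semantics before it reads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in: "producer" plays the CPU, "consumer" plays the GPU.
// Both see the same memory, but neither may assume the other's writes are
// visible until the flag handshake has happened.
int run_handoff(int value) {
    int payload = 0;                 // plain shared data, like a shared buffer
    std::atomic<bool> ready{false};  // the flag both sides agree to sync on
    int seen = -1;

    std::thread consumer([&] {
        // Acquire load: once the flag reads true, the earlier write to
        // payload is guaranteed to be visible to this thread.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    std::thread producer([&] {
        payload = value;                               // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    producer.join();
    consumer.join();
    assert(ready.load());  // the handshake completed
    return seen;
}
```

Calling `run_handoff(42)` from a `main` should hand the value across intact. Without the release/acquire pair, the consumer reading `payload` would be a data race. On actual hardware the same role is played by things like `__threadfence_system()` in a HIP kernel and `hipDeviceSynchronize()` on the host, with the details depending on how the memory was allocated.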