Hacker News

bangaladore · 01/21/2025

I feel like in pretty much every case here they still do not need arbitrary access. The point of DMA cheating is to require zero modification of the target computer. The moment a driver has to be installed to, say, allow an IOMMU range for a given device, the target computer has been tainted and you lose much of the benefit of DMA in the first place.

Does a GPU need access to the memory of a usermode application for some reason? Okay, then the GPU driver should orchestrate that.

> We haven't even gotten into exotic hardware that wants to do some kind of shared memory clustering between machines, or cache cards (something like Optane) which are PCIe cards that can be used as system memory via DMA, or dedicated security processors intended to scan memory for malware etc.

Again, opt-in. The driver should specify explicit ranges when initializing the device.
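
For illustration, here is a minimal sketch of what that opt-in model looks like with the standard Linux DMA API (the device name, function name, and buffer size are hypothetical): the driver allocates one specific buffer, and with an IOMMU enabled only that range becomes reachable by the device.

```c
/* Sketch: a driver granting a device access to one explicit buffer.
 * With an IOMMU enabled, only this mapping is visible to the device;
 * every other page of system memory stays unreachable.
 */
#include <linux/dma-mapping.h>

#define MYDEV_BUF_SIZE (64 * 1024)  /* hypothetical buffer size */

static int mydev_init_dma(struct device *dev, void **cpu_buf,
                          dma_addr_t *dma_handle)
{
	/* Allocate a buffer and map exactly this range for the device. */
	*cpu_buf = dma_alloc_coherent(dev, MYDEV_BUF_SIZE, dma_handle,
				      GFP_KERNEL);
	if (!*cpu_buf)
		return -ENOMEM;

	/* The device is told *dma_handle; the IOMMU rejects accesses
	 * outside [*dma_handle, *dma_handle + MYDEV_BUF_SIZE). */
	return 0;
}
```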


Replies

AnthonyMouse · 01/21/2025

> I feel like in pretty much every case here they still do not need arbitrary access.

Several of those cases do indeed need arbitrary access.

> The moment a driver needs to be used to say allow an IOMMU range for a given device, the target computer has been tainted and you lose much of the benefit of DMA in the first place.

The premise there is that the device is doing something suspicious, rather than the same thing it would ordinarily do if it were present in the machine for innocuous reasons.

> Does a GPU need access to memory of a Usermode application for some reason, okay, the GPU driver should orchestrate that.

Okay, so the GPU has some CPU cores on it and if the usermode application is scheduled on any of those cores -- or could be scheduled on any of them -- then it will need access to that application's entire address space. Which is what happens by default, since they're ordinary CPU cores that just happen to be on the other side of a PCIe bus.
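
In kernel terms that's shared virtual addressing (SVA): the device is bound to the process's whole mm, not to an enumerable list of ranges. A hedged sketch follows, assuming a PASID-capable device; iommu_sva_bind_device() is a real kernel API, but its exact signature has changed across kernel versions.

```c
/* Sketch: binding an accelerator's on-card cores to a process's
 * address space via shared virtual addressing (SVA). There is no
 * range list: the device walks the same page tables as the CPU.
 * Assumes a PASID-capable device; the signature of
 * iommu_sva_bind_device() has varied across kernel versions.
 */
#include <linux/iommu.h>
#include <linux/sched/mm.h>

static int mydev_attach_process(struct device *dev, struct mm_struct *mm)
{
	struct iommu_sva *handle;

	/* Grant the device the process's *entire* address space --
	 * any VA the application can touch, the device can touch. */
	handle = iommu_sva_bind_device(dev, mm);
	if (IS_ERR(handle))
		return PTR_ERR(handle);

	/* The PASID obtained from the bind is programmed into the
	 * device so its cores issue accesses as this process. */
	return 0;
}
```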

> Again, opt-in. The driver should specify explicit ranges when initializing the device.

What ranges? The security processor is intended to scan every last memory page. The cache card is storing arbitrary memory pages on itself and would need access to arbitrary other pages, because any given page could be transferred to or from the cache at any time. The cluster card is presenting the entire cluster's combined memory as a single address space to every node and managing which pages are stored on which node.

And just to reiterate, it doesn't have to be anything exotic. The storage controller in a common machine is going to do DMA to arbitrary memory pages for swap.
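
Concretely, block I/O shows why: the pages handed to the driver are whatever the kernel needs written out, so the controller gets scatter-gather mappings to arbitrary physical pages. A sketch using the standard dma_map_sg() API, with the request plumbing and the hypothetical function name elided or invented for illustration:

```c
/* Sketch: a storage controller mapping whichever pages the block
 * layer hands it -- e.g. anonymous pages being swapped out. The
 * driver cannot pre-declare these ranges at init time; any page in
 * the system may show up here. */
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static int mydev_map_io(struct device *dev, struct scatterlist *sgl,
			int nents)
{
	int mapped;

	/* Map the scatter-gather list for a device write to disk.
	 * The pages are arbitrary: page cache, anonymous memory, swap. */
	mapped = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);
	if (mapped == 0)
		return -EIO;

	/* Program the controller with the mapped segments, issue the
	 * command, then dma_unmap_sg() on completion (elided). */
	return mapped;
}
```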
