
girfan · last Thursday at 3:01 PM

Cool post. Have you looked at slicing a single GPU up for multiple VMs? Is there anything other than MIG that you have come across to partition SMs and memory bandwidth within a single GPU?


Replies

namibj · last Thursday at 3:36 PM

Last I checked, MIG was the only one that made hard promises, especially about memory bandwidth. As long as your memory access patterns aren't secret and you trust the other guests not to be highly unfriendly in their cache usage, you can get away with much less strict isolation. Think Docker vs. VMs with dedicated cores.

But I thought MIG did do the job of chopping up a GPU that's too big for most individual users into something that behaves very much like a literal array of smaller GPUs stuffed into the same PCIe card form factor. Think of how a Tesla K80 was pretty much just two GK210 "GPUs" behind a PLX "PCIe switch" that connected them to each other and to the host. Obviously trivial to give one to each of two VMs (at least if the PLX didn't interfere with IOMMU separation or the like); for mere performance isolation it certainly sufficed, once you blocked a heavy user from power-budget-throttling its sibling.
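
For what it's worth, that "array of smaller GPUs" view is visible from the host: NVML hands out a separate device handle per MIG slice, each with its own UUID and memory pool. Rough sketch using the pynvml (nvidia-ml-py) bindings, assuming a MIG-enabled GPU at index 0:

    import pynvml

    pynvml.nvmlInit()
    parent = pynvml.nvmlDeviceGetHandleByIndex(0)

    # MIG mode has to be enabled on the parent first (nvidia-smi -i 0 -mig 1).
    current, _pending = pynvml.nvmlDeviceGetMigMode(parent)
    assert current == pynvml.NVML_DEVICE_MIG_ENABLE

    # Each populated slice shows up as its own device handle with its own
    # UUID and its own memory pool, much like the K80's two GK210s did.
    for i in range(pynvml.nvmlDeviceGetMaxMigDeviceCount(parent)):
        try:
            mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(parent, i)
        except pynvml.NVMLError_NotFound:
            continue  # no GPU/compute instance created in this slot
        mem = pynvml.nvmlDeviceGetMemoryInfo(mig)
        print(pynvml.nvmlDeviceGetUUID(mig), mem.total // 2**20, "MiB")

    pynvml.nvmlShutdown()

Handing one of those MIG UUIDs to a process via CUDA_VISIBLE_DEVICES is enough for it to see only its own slice.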

ben_s · last Thursday at 3:24 PM

Thanks! I haven't looked deeply into slicing up a single GPU. My understanding is that vGPU (which we briefly mention in the post) can partition memory but time-shares compute, while MIG is the only mechanism that provides partitioning of both SMs and memory bandwidth within a single GPU.
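
To make the partitioning concrete, here's a hedged sketch using the pynvml (nvidia-ml-py) bindings; it assumes a slice already exists on GPU 0 (created with e.g. `sudo nvidia-smi mig -cgi 9 -C` on an A100), and that nvmlDeviceGetAttributes behaves as documented (it's only valid on MIG device handles). The point is that a slice reports a fixed SM count and memory size carved out of the parent, rather than a time-shared whole:

    import pynvml

    pynvml.nvmlInit()
    parent = pynvml.nvmlDeviceGetHandleByIndex(0)
    mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(parent, 0)  # first slice

    # The slice's dedicated resources: these SMs and this memory belong to
    # this instance alone, unlike vGPU's time-sliced access to all SMs.
    attrs = pynvml.nvmlDeviceGetAttributes(mig)
    print("SMs:", attrs.multiprocessorCount)
    print("memory:", attrs.memorySizeMB, "MiB")
    pynvml.nvmlShutdown()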
