Hacker News

tdortman, last Wednesday at 12:57 AM

I haven't tested this, but I would be very surprised if the PCIe bus weren't a severe bottleneck in that case, unless you can somehow amortize the cost of the memcpy.
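
For a rough sense of what "amortize" would have to mean, here's a back-of-envelope sketch. Everything in it is an assumption on my part, not a measurement: ~32 GB/s is roughly the theoretical PCIe 4.0 x16 rate in one direction, 1000 GB/s is just a round placeholder for on-device memory bandwidth, the work is modeled as purely bandwidth-bound passes over the copied data, and the function names are made up for illustration.

```python
# Back-of-envelope model: copy a chunk over PCIe once, then make
# `passes` bandwidth-bound sweeps over it in device memory.
# All bandwidth figures are assumed round numbers, not measurements.

PCIE_GB_PER_S = 32.0      # assumed: ~theoretical PCIe 4.0 x16, one direction
DEVICE_GB_PER_S = 1000.0  # assumed: placeholder on-device memory bandwidth


def copy_fraction(pcie_gb_per_s: float, device_gb_per_s: float, passes: float) -> float:
    """Fraction of total time spent on the PCIe copy, per byte transferred."""
    copy_time = 1.0 / pcie_gb_per_s          # time to move one GB over the bus
    compute_time = passes / device_gb_per_s  # time to sweep that GB `passes` times
    return copy_time / (copy_time + compute_time)


def breakeven_passes(pcie_gb_per_s: float, device_gb_per_s: float) -> float:
    """Number of on-device passes at which copy time equals compute time."""
    return device_gb_per_s / pcie_gb_per_s


if __name__ == "__main__":
    one_pass = copy_fraction(PCIE_GB_PER_S, DEVICE_GB_PER_S, passes=1)
    print(f"copy share with a single pass: {one_pass:.0%}")
    print(f"passes needed to break even: {breakeven_passes(PCIE_GB_PER_S, DEVICE_GB_PER_S):.2f}")
```

Under those assumed numbers, a single pass over freshly copied data spends about 97% of the time on the bus, and you'd need to reuse each transferred byte around 31 times on the device before the copy stops dominating. Real workloads that aren't purely bandwidth-bound, or that overlap copies with compute, would shift this.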

That being said, with such massive datasets you'll already be bottlenecked by the necessary communication between GPUs (sadly even with NVLink), since the queried data always lives on the GPU.