Unified memory is only a feature because NVidia so aggressively uses VRAM for market segmentation.
The 5090 ($2k MSRP but realistically $3-3.5k) is almost the same as the RTX 6000 Pro (~$10k). Same memory bandwidth (1800GB/s). Slightly different CUDA cores (21k vs 24k). Big difference? VRAM (32GB vs 96GB).
NVidia ultimately doesn't want to upset this segmentation so the RTX Spark will never undermine their other offerings. This is why I think Apple has a real market opportunity if they choose to embrace it.
I have so many questions… Since Apple already sells unified memory systems, what is the market opportunity you envision? Do you see Nvidia and Apple as competitors, and how? (And I’m not suggesting they’re not, necessarily, but I want to hear where you’re coming from, and they do have very different markets.) Hasn’t Apple used storage size (RAM & disk) for market segmentation for decades? And how does a machine with 128GB unified mem not potentially cut into some people’s reasons for wanting a 96GB GPU?
Even low-VRAM cards are actually very useful for running the comparatively smaller dense layers in large local MoE models. This only requires transfering very small amounts of data across the PCIe bus (similar to pipeline parallelism) so it fits nicely around the existing bottlenecks on that hardware.
> 5090 ($2k MSRP but realistically $3-3.5k)
These days, more like >$4.1K (at least in the US).
To this day I do not get why Intel doesn't just offer massive memory options for their cards. Just charge what it costs to add the extra memory, no upcharge, and they will never be able to keep up with demand. Cheap VRAM is enough to justify a lot of open source investment into challenging CUDA.