But those Thunderbolt links are slower than modern PCIe. If there's actually an M5-based Mac Studio with the same Thunderbolt support, you'll be better off, e.g. for LLM inference, streaming read-only model weights from storage (as we've seen with recent experiments) than pushing the same amount of data over Thunderbolt. The Thunderbolt link only becomes useful if you want to go beyond local memory constraints (e.g. larger contexts).
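To put rough numbers on the bandwidth argument, here is a back-of-the-envelope sketch. It assumes the link is the only bottleneck and that each generated token has to pull a fixed number of bytes across it. The link speeds are nominal line rates I'm assuming for illustration (Thunderbolt 5 at 80 Gbit/s, PCIe 5.0 x16 at roughly 64 GB/s per direction), and the 20 GB-per-token workload is purely hypothetical; none of these figures come from the thread or from measurements.

```python
# Back-of-the-envelope: if each generated token has to pull `bytes_per_token`
# of weights over a link, the link bandwidth caps tokens/second.
# All figures below are nominal, assumed values, not measurements.

def max_tokens_per_second(bytes_per_token: float, link_gb_per_s: float) -> float:
    """Upper bound on tokens/s when the link is the only bottleneck."""
    if bytes_per_token <= 0 or link_gb_per_s <= 0:
        raise ValueError("bytes_per_token and link_gb_per_s must be positive")
    return (link_gb_per_s * 1e9) / bytes_per_token

# Hypothetical link speeds in GB/s (nominal line rates, ignoring protocol overhead).
ASSUMED_LINKS_GB_PER_S = {
    "thunderbolt_5 (80 Gbit/s)": 80 / 8,
    "pcie_5_x16 (~64 GB/s per direction)": 64.0,
}

# Hypothetical workload: 20 GB of weights touched per generated token.
ASSUMED_BYTES_PER_TOKEN = 20e9

for name, gb_per_s in ASSUMED_LINKS_GB_PER_S.items():
    bound = max_tokens_per_second(ASSUMED_BYTES_PER_TOKEN, gb_per_s)
    print(f"{name}: at most {bound:.2f} tokens/s")
```

Swap in your own SSD's sustained read rate as another entry to compare streaming from storage against the other two; real throughput will be lower than any of these ceilings once protocol overhead and compute are counted.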
Wasn't streaming models from storage into limited memory a case where it was impressive that you could make the elephant dance at all?
If you want to get usable speeds on local machines from very large models that haven't been quantized to death, RDMA over Thunderbolt enables that use case.
Consumer PC GPUs don't have enough RAM; enterprise GPUs that can handle the load very well are obscenely expensive; and Strix Halo tops out at 128 GB of RAM and is limited on Thunderbolt ports.
Why everyone wants to live in dongle/external cabling/dock hell is beyond me. PCIe cards are powered internally with no extra cables. They are secure. They do not move or fall off of shit. They do not require cable management or external power supplies. They do not have to talk to the CPU through a stupid USB hub or a Thunderbolt dock. Crappy USB HDMI capture on my Mac led me to running a fucking PC with slots to capture video off of a 50-foot HDMI cable and then stream the feed to my Mac over NDI, because that was more reliable than the elgarbo capture dongle I was using. This shit is bad. It sucks. It's twice the price and half the quality of a Blackmagic Design capture card. But, no slots, so I guess I can go get fucked.