70B dense models are way behind SOTA. Even the aforementioned Kimi 2.5 has fewer active parameters than that, and it's typically run quantized to int4. We're at a point where some near-frontier models may run out of the box on Mac Mini-grade hardware, with perhaps no real need to even upgrade to the Mac Studio.
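For scale, a quick back-of-envelope sketch (the parameter counts and the ~10% overhead factor here are illustrative assumptions; KV cache and activations are excluded):

    # Rough weight-memory footprint at a given quantization width.
    # params_b: parameters in billions; overhead covers quantization
    # scales/zero-points and is an assumed fudge factor.
    def weight_gb(params_b: float, bits: float, overhead: float = 1.1) -> float:
        return params_b * 1e9 * bits / 8 * overhead / 1e9

    print(f"70B dense @ fp16: {weight_gb(70, 16):.0f} GB")   # ~154 GB
    print(f"70B dense @ int4: {weight_gb(70, 4):.0f} GB")    # ~38 GB
    print(f"32B active @ int4: {weight_gb(32, 4):.0f} GB")   # ~18 GB

(Caveat for MoE models: total parameters still set the memory floor, since all expert weights stay resident; active parameters mostly govern speed.)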
>may
I'm completely over these hypotheticals and 'testing-grade' setups.
I know Nvidia VRAM works, not some marketing about 'integrated RAM'. Heck, look at /r/locallama/: there's a reason it's entirely Nvidia.