It’s 128b dense model. Good luck getting more than 3t/s out of a mac. It doesn’t matter if it fits or not.
You could run it on a single Mac Studio with M3 Ultra, or two Mac Studios with M4 Max at higher perf than that. And lightly quantizing this could give us modern dense models in the ~80GB size range, which is a very compelling target.
You could run it on a single Mac Studio with M3 Ultra, or two Mac Studios with M4 Max at higher perf than that. And lightly quantizing this could give us modern dense models in the ~80GB size range, which is a very compelling target.