Why are there so few 32,64,128,256,512 GB models which could run on current consumer hardware? And why is the maximum RAM on Mac studio M4 128 GB??
the only real benefit is privacy which 99.9% of people dont get about. Almost all serving metrics (cost, throughput, ttft) are better with large gpu clusters. Latency is usually hidden by prefill cost.
128 GB should be enough for anybody (just kidding). I hope the M5 Max will have higher RAM limits
the only real benefit is privacy which 99.9% of people dont get about. Almost all serving metrics (cost, throughput, ttft) are better with large gpu clusters. Latency is usually hidden by prefill cost.