Would you mind sharing the napkin maths?
Not OP, but basically take your memory bandwidth in GiB/s and divide by 30 to get a rough tokens-per-second figure. You also need at least 128 GiB of memory to hold the model. Getting 200 GiB/s is expensive, 400 GiB/s is very expensive, and above that you are looking at datacenter-grade GPUs. Multiple, in fact.
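If it helps, here is that rule of thumb as a quick Python sketch. It assumes the divisor of 30 means roughly 30 GiB of weights read per generated token and that the result is tokens per second; that is my reading of the comment, not something it states outright. The function and constant names are made up for illustration.

```python
# Napkin estimate of decode speed from memory bandwidth.
# ASSUMPTION: the "30" is GiB read per generated token, so the result is tokens/s.
# ASSUMPTION: 128 GiB is the minimum memory needed to hold the model, as quoted above.

GIB_PER_TOKEN = 30.0
MIN_MEMORY_GIB = 128.0


def estimate_tokens_per_second(bandwidth_gib_s: float) -> float:
    """Divide memory bandwidth (GiB/s) by the per-token read volume (GiB)."""
    return bandwidth_gib_s / GIB_PER_TOKEN


def fits_in_memory(memory_gib: float) -> bool:
    """Check whether the machine has enough memory to hold the model at all."""
    return memory_gib >= MIN_MEMORY_GIB


if __name__ == "__main__":
    for bandwidth in (200, 400):
        rate = estimate_tokens_per_second(bandwidth)
        print(f"{bandwidth} GiB/s -> ~{rate:.1f} tokens/s")
    # prints:
    # 200 GiB/s -> ~6.7 tokens/s
    # 400 GiB/s -> ~13.3 tokens/s
```

So under those assumptions the "expensive" 200 GiB/s tier lands somewhere around 7 tokens/s, and the "very expensive" 400 GiB/s tier around 13.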