And here I am with a 128GB Strix Halo, longingly eyeing the Blackwell cards that spit out tokens at 10-20x the speed.
The question, I suppose, is what ultimate shape of knowledge compression and bandwidth optimization we arrive at.