> 2. WAY lower bandwidth requirements for inference. This means that approaches like this should run far better on consumer hardware. It apparently requires 1/6th the memory bandwidth of a traditional approach while giving better results.
That should be the headline right there. Giant size 60 font headline.
Some people have PhDs in burying the lede!
Except it's not true.