But much of the latency in a cache comes from getting the signal to and from the cell, not from the actual store threshold. And I can't see that changing much unless you can actually eliminate gates, which would make the array smaller and so put the cells physically closer on average.
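To put rough numbers on that argument, here is a toy model, not a real circuit simulation. It assumes a square array, a wire delay that grows linearly with distance, and an average one-way distance of half the array side. The function name, the parameters, and every figure below are made-up placeholders chosen only to show the shape of the trade-off.

```python
import math

def access_latency_ps(n_cells, cell_area_um2, cell_switch_ps, wire_ps_per_um=0.1):
    """Toy estimate: round-trip wire delay plus cell switching time (picoseconds).

    All parameters are hypothetical; wire_ps_per_um is an illustrative constant,
    not a measured value for any process.
    """
    side_um = math.sqrt(n_cells * cell_area_um2)    # assume a square array
    avg_distance_um = side_um / 2                   # crude average one-way distance
    wire_ps = 2 * avg_distance_um * wire_ps_per_um  # signal travels to the cell and back
    return wire_ps + cell_switch_ps

n_cells = 256 * 1024 * 8  # a 256 KiB array, one bit per cell (illustrative)

baseline = access_latency_ps(n_cells, cell_area_um2=0.10, cell_switch_ps=10.0)
faster_cell = access_latency_ps(n_cells, cell_area_um2=0.10, cell_switch_ps=5.0)
smaller_cell = access_latency_ps(n_cells, cell_area_um2=0.05, cell_switch_ps=10.0)

print(f"baseline:     {baseline:.1f} ps")
print(f"faster cell:  {faster_cell:.1f} ps")
print(f"smaller cell: {smaller_cell:.1f} ps")
```

With these placeholder values, halving the cell's switching time saves less than halving its area does, because the wire term dominates. Different constants would shift the balance, so treat it as an illustration of the reasoning rather than evidence for it.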