You didn't touch on the most important aspect for cost: die area!
How much die space ($) will that circuitry, that's probably statistically near zero chance for you main customers workload (who has model weight of 0 or 1!?), add. And, if you can stomach the cost, what else could you put there instead?
Weights should not be 0 (at least not frequently) but in a ReLU-based neural network, activations are 0 pretty often. You're absolutely right about die area though.