TPUs do include dedicated hardware for sparse operations: SparseCores, which accelerate embedding lookups in workloads such as recommendation models.
https://docs.cloud.google.com/tpu/docs/system-architecture-t...
https://openxla.org/xla/sparsecore