Judging from the paper, it's probably more useful on small-ish models.
What does it cost to train a model like 1-bit Bonsai? Anyone know?