I wonder if this will actually be why the models move to "neuralese" or whatever non-language latent representation people work out. Interpretability disappears but efficiency potentially goes way up. Even without a performance increase that would be pretty huge.