This is one reason. Another is that Dennard scaling has stopped and GPUs have hit a memory wall with DRAM. The only reason AI hardware still sees significant improvements is that the workloads are dominated by big matmuls, and a lot of research has gone into getting lower-precision (now 4-bit) training to work (numerical stability at low precision was always a huge problem with backprop).
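
To make the precision point concrete, here is a minimal sketch of why small gradients are a problem at low precision. It uses fp16 rather than a 4-bit format, because Python's standard library can round to IEEE 754 half precision directly. The gradient value, the scale factor, and the helper name `to_fp16` are all assumptions chosen for illustration. Loss scaling is shown as one common workaround, not as the method any particular hardware or paper uses.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical gradient magnitude, smaller than fp16's smallest subnormal (~6e-8).
grad = 1e-8

# Hypothetical loss scale; a power of two so the scaling itself is exact.
loss_scale = 2.0 ** 16

# Stored directly in fp16, the gradient underflows to zero and the update is lost.
naive = to_fp16(grad)

# Scaled up before rounding, the value lands in fp16's normal range.
scaled = to_fp16(grad * loss_scale)

# Unscale in higher precision (Python floats are 64-bit) before applying the update.
recovered = scaled / loss_scale

print("naive fp16 gradient:", naive)
print("recovered gradient: ", recovered)
```

The narrower the format, the smaller the range in which this kind of trick works, which is part of why getting training to run at very low precision took so much research.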