Then you're assuming an efficiency gain analogous to the one Moore's Law delivered for chips. Same difference. The problem is that how AI scaling plays out in the long term is completely unknown.
Training improvements and Moore's Law are "analogous" but not "same difference." They are far from the same thing: they are governed by completely different factors, and one can happen, and has been happening, independently of the other.