Note that CPUs have also gotten dramatically wider in both execution width and vector capability since you were a teenager. The increased throughput shifts the balance more toward being able to burn operations to reduce dependency chains. It's possible for your idea to have been both non-viable on the CPUs at the time and more viable on CPUs now.