Token costs do go down over time, for sure, thanks to software optimizations (e.g. better attention kernels), but acting like hardware INFLATION isn't happening, and won't keep happening for at least a few more years, is just nonsense. Objectively, an A100 is more expensive to rent today than it was in 2024, and the price is still rising (a 7-year-old GPU; Big Short guy is a turbo idiot). As such, over short time horizons, it's possible to see a limited amount of "price per token goes up" for the same model.
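
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes serving cost per token is just GPU rental price divided by throughput, which ignores utilization, batching, and margins. Every figure is made up for illustration and is not a real A100 price or benchmark, and `cost_per_million_tokens` is a hypothetical helper.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Rental cost in USD to generate one million tokens on one GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# All numbers below are illustrative assumptions, not measured prices.
baseline = cost_per_million_tokens(gpu_hourly_usd=1.00, tokens_per_second=1000)

# Short horizon: rent up 30%, kernels only 10% faster -> cost per token rises.
short_term = cost_per_million_tokens(gpu_hourly_usd=1.30, tokens_per_second=1100)

# Longer horizon: same higher rent, but software doubles throughput -> cost falls.
long_term = cost_per_million_tokens(gpu_hourly_usd=1.30, tokens_per_second=2000)

print(f"baseline:   ${baseline:.3f} per 1M tokens")
print(f"short term: ${short_term:.3f} per 1M tokens")
print(f"long term:  ${long_term:.3f} per 1M tokens")
```

Under these assumptions, price per token only goes up while rental inflation outpaces throughput gains; once the software side catches up, the long-run trend is still down.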