Since previous generations of models get aggressively retired the cost reduction essentially never gets passed down to the customer.
A certain amount of input and output tokens doesn't cost 10x less than before.