The actual problem is that their cache gets invalidated at random, which is why replaying inputs of 200k+ tokens eats up all your usage. This is a bug in their systems that they refuse to acknowledge. My guess is that API clients evict subscription users' cache entries early, which would explain the behavior; if so, it's a feature, not a bug.
They also silently raised how much usage input tokens consume, so it's a double whammy.
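
To put rough numbers on the cache point, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 10% cached-read weight, the turn count, and the pattern of misses are made up, not anyone's published pricing or measured behavior.

```python
# Hypothetical numbers for illustration only; not any provider's real pricing.
CONTEXT_TOKENS = 200_000    # size of the conversation prefix replayed each turn
CACHED_READ_WEIGHT = 0.1    # assumed: a cached token counts as 10% of a fresh one
TURNS = 10


def usage_units(turns, context_tokens, miss_turns, cached_weight):
    """Total usage when the same prefix is re-sent on every turn.

    Turns listed in miss_turns pay full price for the prefix (cache miss);
    all other turns pay the discounted cached rate.
    """
    total = 0.0
    for turn in range(turns):
        weight = 1.0 if turn in miss_turns else cached_weight
        total += context_tokens * weight
    return total


# Cache behaves: only the very first turn is a miss.
stable = usage_units(TURNS, CONTEXT_TOKENS, {0}, CACHED_READ_WEIGHT)

# Cache gets dropped on every other turn.
flaky = usage_units(TURNS, CONTEXT_TOKENS, {0, 2, 4, 6, 8}, CACHED_READ_WEIGHT)

print(f"stable cache: {stable:,.0f} units")
print(f"flaky cache:  {flaky:,.0f} units ({flaky / stable:.1f}x)")
```

With these made-up inputs, the flaky case burns nearly three times the usage of the stable one for the exact same conversation, and the gap widens the cheaper cached reads are supposed to be.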