A 98.6% cache hit rate doesn't distinguish an efficient workflow from an overly chatty linear agent that keeps reusing the same context. Plus, it says nothing directly about how much useful progress the process makes per token.
We are all going to be graded by (tickets closed / tokens burned) soon enough.
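To make that concrete, here is a rough sketch of two hypothetical runs with the same 98.6% cache hit rate but a 10x gap in tickets closed per token. The `Run` class, its field names, and all the numbers are made up for illustration; they don't come from any real tool or billing API.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical token accounting for one agent run (illustrative only)."""
    name: str
    cached_tokens: int    # input tokens served from the prompt cache
    uncached_tokens: int  # input tokens processed fresh
    output_tokens: int    # tokens generated by the model
    tickets_closed: int

    @property
    def cache_hit_rate(self) -> float:
        # Share of input tokens that came from the cache.
        return self.cached_tokens / (self.cached_tokens + self.uncached_tokens)

    @property
    def tickets_per_million_tokens(self) -> float:
        # "Tickets closed / tokens burned", scaled to a million tokens.
        total = self.cached_tokens + self.uncached_tokens + self.output_tokens
        return self.tickets_closed / total * 1_000_000


# Made-up numbers: the chatty run burns 10x the tokens for the same work.
efficient = Run("efficient", 986_000, 14_000, 50_000, tickets_closed=5)
chatty = Run("chatty", 9_860_000, 140_000, 500_000, tickets_closed=5)

for run in (efficient, chatty):
    print(f"{run.name}: hit rate {run.cache_hit_rate:.1%}, "
          f"{run.tickets_per_million_tokens:.2f} tickets per 1M tokens")
```

Both runs report the same hit rate, so only the second number tells them apart.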