That's not an A/B test because it has no way of controlling for broader economic trends over time. How do you figure out if what you're seeing is because of that one thing that changed, or the enormous list of other things that also changed around the same time?
A more valid design would be randomly assigning some cities to institute congestion pricing, and other cities to not have it. Obviously not feasible in practice, but that's at least the kind of thing to strive toward when designing these kinds of studies.
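To make the confounding argument concrete, here is a toy simulation, not anything from a real study. Every number in it is invented, and it assumes thousands of interchangeable units, which real cities are not. The function name `simulate` and its parameters (`true_effect`, `trend`) are hypothetical. It compares a plain before/after estimate on the treated units with a randomized treated-vs-control comparison of the same changes.

```python
import random
import statistics

def simulate(n_units=2000, true_effect=-10.0, trend=-5.0, seed=0):
    """Return (before_after_estimate, randomized_estimate) for made-up traffic data."""
    rng = random.Random(seed)
    # Randomly assign exactly half of the units to the policy.
    treated_flags = [i < n_units // 2 for i in range(n_units)]
    rng.shuffle(treated_flags)

    treated_changes, control_changes = [], []
    for treated in treated_flags:
        baseline = rng.gauss(100.0, 15.0)                # units differ a lot from each other
        before = baseline + rng.gauss(0.0, 2.0)          # measurement noise
        after = baseline + trend + rng.gauss(0.0, 2.0)   # everyone is hit by the same trend
        if treated:
            after += true_effect                         # only treated units get the policy effect
        (treated_changes if treated else control_changes).append(after - before)

    # Before/after on treated units alone: the policy effect and the trend are mixed together.
    before_after = statistics.mean(treated_changes)
    # Subtracting the control group's change removes the shared trend.
    randomized = before_after - statistics.mean(control_changes)
    return before_after, randomized

if __name__ == "__main__":
    naive, randomized = simulate()
    print(f"before/after estimate: {naive:.2f}")
    print(f"randomized estimate:   {randomized:.2f}")
```

Under these assumptions the before/after number lands near `true_effect + trend`, while the randomized comparison lands near `true_effect`. With only a handful of very different units, the second estimate would be much noisier.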
Everyone knows how you can conduct good experiments in a land of frictionless spherical cows.
> randomly assigning some cities to institute congestion pricing, and other cities to not have it
Cities are stupidly heterogeneous. These data wouldn't be more meaningful than comparing cities with congestion pricing to those without. (And comparing them to themselves in their pre-pricing congestion eras.)
That would be a bad design for an A/B study (and NYC congestion pricing is not a “study” anyway), because cities are few and not alike and have an enormous list of other things that are different. What NYC equivalent would you pick?
In any case, not every policy change needs to be an academic exercise.