test A - before
test B - after
What are you talking about?
"Before" and "after" introduce a large axis of noise.
The problem is that for A/B testing to really work you need outcomes from independent groups. As soon as there is any bias in group selection, or any cross-group effect, it's very hard to unpick.
Generally, that's considered to introduce confounding factors on the time axis ("did we see improvement because we changed something or because flu season hit and people stayed home") that you'd prefer to mitigate by running your A and B simultaneously.
But in the absence of the ability to run them simultaneously, "A is before and B is after" can be a fine proxy. Of course, in case B is worse, it'd be nice to subject only, say, 5% of your population to it first, rather than slamming the slider straight to 100% and hitting everyone with it.
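For what it's worth, the "only 5% first" idea is usually done with deterministic bucketing. Here's a minimal sketch, not anyone's actual implementation: the function names, the salt string, and the `user-N` IDs are all made up for illustration. Each user ID is hashed into one of 10,000 buckets, and a user is in B if their bucket falls below the current rollout percentage.

```python
import hashlib


def in_treatment(user_id: str, rollout_percent: float, salt: str = "experiment-b") -> bool:
    """Return True if this user gets B at the given rollout percentage.

    The salt and the bucket count are illustrative assumptions.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in 0..9999
    # (each bucket is one hundredth of a percent).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < rollout_percent * 100


def treatment_share(n_users: int, rollout_percent: float) -> float:
    """Fraction of hypothetical users 'user-0'..'user-(n-1)' that land in B."""
    hits = sum(in_treatment(f"user-{i}", rollout_percent) for i in range(n_users))
    return hits / n_users


if __name__ == "__main__":
    print(treatment_share(20_000, 5))    # roughly 0.05
    print(treatment_share(20_000, 100))  # everyone
```

Because the bucket depends only on the user ID and the salt, a user who is in B at 5% stays in B as the slider moves up, which keeps the two groups stable while you watch the numbers.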
“A/B in time” suffers from the inability to control for other factors that might vary over time. In this case, that could be the economy or other transit policies.
But sometimes it’s the only possible approach.
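To make the time-confounding point concrete, here's a toy simulation. It's a sketch under made-up assumptions: the 30% base rate, the 5-point seasonal drop, and the sample size are invented numbers, not data from anywhere. The change itself has zero true effect, yet the before/after comparison shows a difference while the simultaneous comparison does not.

```python
import random


def simulate(true_effect=0.0, seasonal_shift=-0.05, n=50_000, seed=0):
    """Compare a before/after test with a simultaneous A/B test.

    All rates here are invented for illustration.
    """
    rng = random.Random(seed)
    base_rate = 0.30

    def observed_rate(rate, count):
        # Fraction of `count` simulated people who "convert" at probability `rate`.
        return sum(rng.random() < rate for _ in range(count)) / count

    # Before/after: A runs in period 1, B runs in period 2,
    # and only period 2 is hit by the seasonal shift.
    before = observed_rate(base_rate, n)
    after = observed_rate(base_rate + seasonal_shift + true_effect, n)

    # Simultaneous: A and B both run in period 2, so the shift hits both equally.
    a_now = observed_rate(base_rate + seasonal_shift, n)
    b_now = observed_rate(base_rate + seasonal_shift + true_effect, n)

    return {
        "before_after_diff": after - before,
        "simultaneous_diff": b_now - a_now,
    }


if __name__ == "__main__":
    print(simulate())
```

With `true_effect=0.0`, `before_after_diff` comes out near the seasonal shift and `simultaneous_diff` comes out near zero, which is the whole argument for running A and B at the same time when you can.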