It also modified many of the tests to make them pass in mischievous ways. You can't trust a test suite to catch regressions if the new version doesn't use the same test suite.
I think demonstrating broken behavior in the new build would be interesting, if you have a test from the original suite that fails against it.
Do you have some examples?