But since it is UB, there's no guarantee that your test program produces the same result as the same code running in production, even if you use the same compiler.
That's very unlikely, and in the worst case you've reduced a difficult bug to an easier-to-understand bug.