I believe the theory isn't that one is better than the other, but that different models would make different mistakes, so you can be more confident in the places where the code and tests agree.
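A rough back-of-the-envelope sketch of why agreement helps. It assumes the two models' mistakes are independent, which is the optimistic case. The function name, the error rates, and the `p_same_mistake` parameter are made up for illustration, not measured numbers.

```python
def wrong_given_agreement(p_code: float, p_test: float,
                          p_same_mistake: float = 1.0) -> float:
    """Probability that code and test are both wrong, given that they agree.

    Assumptions (hypothetical, for illustration only):
    - p_code: chance the code gets a given behavior wrong
    - p_test: chance the test gets that same behavior wrong
    - the two errors are independent (different models, different mistakes)
    - p_same_mistake: chance that two wrong answers happen to match;
      1.0 is the worst case, where every pair of mistakes coincides
    """
    # They agree when both are right...
    both_right = (1 - p_code) * (1 - p_test)
    # ...or when both are wrong in the same way.
    both_wrong_matching = p_code * p_test * p_same_mistake
    agree = both_right + both_wrong_matching
    if agree == 0:
        raise ValueError("code and test never agree under these rates")
    return both_wrong_matching / agree


# With a 10% error rate on each side, agreement leaves roughly a 1.2%
# chance that both are wrong, even in the worst case for matching mistakes.
print(wrong_given_agreement(0.1, 0.1))
```

If the two models share blind spots, the independence assumption breaks and the real figure is higher, which is the argument for using different models in the first place.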