What makes the level they chose a “baseline” against which it would be appropriate to run statistical tests?