I smell the bias-variance tradeoff. By underfitting more, they get closer to the degenerate case of a model that only knows one perfect photo.