But why should I care? If you demonstrated that a model can make more accurate diagnoses than a doctor, but it also showed this strange behavior when no image was presented, why should that deter me from using the model?
Because you don’t have any way of telling whether it actually used the image presented or based its conclusions on a different image it made up.