You would pass those hypothetical scenarios to doctors too, and then the results would be analyzed by doctors who don't know whether a given result came from the AI or a doctor.
> Three physicians independently assigned gold-standard triage levels based on cited clinical guidelines and clinical expertise, with high inter-rater agreement
From the paper