Hacker News

jmathai, yesterday at 9:26 PM

I advise a medical nonprofit, and we ran a series of tests against cases that doctors had entered into our system when looking for specialist recommendations.

We found that gpt-5-mini performed better than gpt-5, Sonnet 4, and MedGemma.

I think these studies are very hard to accurately score. But in any case, AI seems to do a very good job compared to humans. Unsurprising, really.