Hacker News

pixel_popping · yesterday at 7:54 PM

I agree with you on this specific study. However, I can't wrap my head around the idea that doctors will be better than AI models in the long run. After all, medicine is all about knowledge, experience and intelligence (maybe "pattern recognition"); on all of those, we must assume that the best AI models (especially ones focused solely on the medical field) would beat the large majority of humans (aka doctors). If we already have this assumption for software engineers, we should have it for this field as well. And let's be realistic: every time I've seen a doctor in the last few months (including the ER twice), they were using ChatGPT (not kidding, it shocked me).

So I’m genuinely curious:

What is the specific capability (or combination of capabilities) that people believe a top medical AI will remain unable to match or exceed in a good human doctor, permanently or at least for decades? Let's put liability and ethics aside and be purely objective about it.


Replies

teleforce · today at 2:36 AM

> What is the specific capability (or combination of capabilities) that people believe a top medical AI will remain unable to match or exceed in a good human doctor, permanently or at least for decades? Let's put liability and ethics aside and be purely objective about it.

You cannot simply put liability and ethics aside; after all, there's the Hippocratic Oath, which is fundamental to the practice of medicine [1][2].

Having said that, there are always two extremes in this debate: those who hate AI, and those who are obsessed with AI in medicine. We would be much better off in the middle, i.e. moderate on this issue.

IMHO, AI should be used as a screening and triage tool with very high sensitivity, preferably 100%; otherwise it will create a "boy who cried wolf" scenario.

At 100% sensitivity we essentially have zero false negatives, but potentially some false positives.
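To make the arithmetic concrete, here is a toy calculation (the counts are invented purely for illustration, not taken from any study):

  # Hypothetical screening counts, chosen only to illustrate the tradeoff.
  tp, fn = 200, 0    # 100% sensitivity: no sick patient is missed
  fp, tn = 300, 500  # the price: many healthy patients get flagged

  sensitivity = tp / (tp + fn)  # 200/200 = 1.00
  specificity = tn / (tn + fp)  # 500/800 = 0.625

Every one of those 300 false positives is extra work for a human reviewer, which is why the next step matters.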

Those false positives can then be checked by a physician in the loop. For example, a suspected CVD case can be reviewed with input from a specialist such as a cardiologist (or, more specifically, a cardiac electrophysiologist). This would help given how few cardiologists are available globally relative to the population at risk of heart disease or CVD, and given the alarmingly low accuracy (sensitivity, specificity) of conventional CVD screening and triage.

Current risk-based screening and triage for CVD, such as SCORE2, has a sensitivity of only around 50% (2025 study) [3].

[1] Hippocratic Oath:

https://en.wikipedia.org/wiki/Hippocratic_Oath

[2] The Hippocratic Oath:

https://pmc.ncbi.nlm.nih.gov/articles/PMC9297488/

[3] Risk stratification for cardiovascular disease: a comparative analysis of cluster analysis and traditional prediction models:

https://academic.oup.com/eurjpc/advance-article/doi/10.1093/...

gherkinnn · yesterday at 8:06 PM

To answer your question: talking to a human.

Medicine is so much more than "knowledge, experience, and pattern matching", as any patient can attest. Why is it so hard for some people to understand that humans need other humans, and that human problems can't be solved with technology?

hyperpape · yesterday at 11:33 PM

> we must assume that the best AI models (especially ones focused solely on the medical field) would beat the large majority of humans (aka doctors). If we already have this assumption for software engineers, we should have it for this field as well

This is a pretty wild leap. Code has a lot of hooks for hill-climbing during post-training: you can literally set up arbitrary scenarios and give the bot more or less real feedback (actual programs, actual tests, actual compiler errors).

It's not impossible we'll get a training regime that does the "same thing" for medicine that we're doing for code, but I don't know that we've envisioned what it looks like.
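To make that concrete, here is a rough sketch of the kind of verifiable reward signal code training gets almost for free (the test suite and the reward helper are hypothetical; no real training framework is shown):

  import os
  import subprocess
  import tempfile

  # A hypothetical test suite the candidate program must pass.
  TESTS = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"

  def reward(candidate_source: str) -> float:
      """Score a candidate program by actually running it against the tests."""
      with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
          f.write(candidate_source + "\n" + TESTS)
          path = f.name
      try:
          result = subprocess.run(["python", path], capture_output=True, timeout=5)
          return 1.0 if result.returncode == 0 else 0.0
      finally:
          os.unlink(path)

  print(reward("def add(a, b):\n    return a + b"))  # 1.0

Medicine has no analogous cheap, objective oracle: you cannot "run" a diagnosis against a patient and read off a return code.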

Terretta · today at 12:49 AM

Humans tend to be very bad at connecting dots, which is why, when we imagine someone who is good at it, we make a show like "House" about them.

IOW, these concept-connecting pattern machines are likely to outstrip the median human at this sort of thing.

That said, from what I've observed in the diagnostic professions, the humans who are exceptional at smelling smoke and connecting dots are likely to beat the best machines for quite a while yet.

throw234234234 · yesterday at 11:22 PM

My personal anecdote from talking to people: when the subject of their job and AI comes up, everyone says "at least I'm not a software engineer!". And this isn't just a US phenomenon; I've seen it in other countries too, where thanks to AI the status of SWE, and of tech as a career generally, has gone down the drain. Then they always go on to defend why their job is different, e.g. "human touch" or "asking the right questions", not knowing that good engineers need to do this as well.

The truth is we just don't know how things will play out, in my view. I expect some job destruction, some jobs to remain in all fields, some jobs to change, etc. We tend to assume AI will either totally destroy a job or leave it untouched, when in reality most fields will land somewhere in between. The exact mix is yet to be determined, and I suspect most fields will blend AI and humans in different ratios. Some fields also have enough unmet demand to absorb the efficiency gains (health care, for example).

dragonwriter · yesterday at 9:52 PM

> After all, medicine is all about knowledge, experience and intelligence (maybe "pattern recognition"); on all of those, we must assume that the best AI models (especially ones focused solely on the medical field) would beat the large majority of humans

No, I don’t see that we must.

> if we already have this assumption for software engineers

No, this doesn’t follow. And even if it did: I am aware that the CEOs of firms with an extraordinarily large vested personal and corporate financial interest in this being perceived to be the case have said it about software engineers, but I don’t think it is warranted there, either.

root_axis · yesterday at 10:01 PM

Diagnosis is just a small part of a doctor's job. In this case we're also talking about an ER, which is a very physical environment. Beyond that, a doctor can examine a patient in ways that won't be feasible for machines any time in the foreseeable future.

More importantly, LLMs regularly hallucinate, so they cannot be relied upon without an expert checking for mistakes. It will be a regular occurrence that the LLM states something obviously wrong, and society will not find it acceptable that their loved ones can die because of vibe medicine.

Like with software though, they are obviously a beneficial tool if used responsibly.

largbae · yesterday at 8:14 PM

But liability and ethics cannot be put aside. If treatments were free of cost and perfectly addressed problems, then a correct diagnosis would always lead to the optimal patient outcome. In that scenario, AI diagnosis would be like code generation, approaching perfection asymptotically as models improve.

But a doctor's job in the real world today is to navigate a total mess of uncertainty: about the expected outcome of treatments given a patient's age and other problems; about the psychological effect of knowing about a problem that they cannot effectively treat; even about what the signals in the chart and X-ray mean with any certainty.

We are very far from having unit test suites for medical problems.

nkrisc · yesterday at 8:02 PM

> What is the specific capability (or combination of capabilities) that people believe a top medical AI will remain unable to match or exceed in a good human doctor, permanently or at least for decades? Let's put liability and ethics aside and be purely objective about it.

Being a human when a patient is experiencing what is potentially one of the worst moments of their life. AI could be a tool doctors use, but let’s not dehumanize health care further; it is one of the most human professions, one that crosses about every division you can think of.

I would not want to receive a cancer diagnosis from a fucking AI doctor.

pianopatrick · yesterday at 10:39 PM

Last time I went to the ER the doctor used a scope to look down my throat and check that everything seemed fine. I don't think a pure AI like ChatGPT will be able to do that any time soon. Maybe a medical robot with AI will one day, but that seems at least a few years off.

fc417fc802 · yesterday at 8:05 PM

> I can't wrap my head around the idea that doctors will be better than AI models in the long run.

Nobody said that though?

If the current trajectory continues, and if advancements are made in automated data collection about patients, and if those advancements are adopted in the clinic, then presumably specialized medical models will exceed human performance at diagnosis at some point in the future. Clearly that hasn't happened yet.

KaiserPro · yesterday at 9:17 PM

There are a few sides to medicine:

1) looking at tests and working out a set of actions

2) following a pathway based on diagnosis

3) pulling out patient history to work out what the fuck is wrong with someone.

Once you have a diagnosis, the treatment path is in a lot of cases quite clear. E.g. a patient comes in with abdominal pain; you distract the patient and press on their belly; if they scream when you release, there's a very high chance of appendicitis, and it's surgery or antibiotics depending on how close you think it is to bursting.

But getting the patient to be honest, and working out which information is relevant, is quite hard and takes a lot of training. Dumping someone in front of a decision tree and letting them answer questions unaided amounts to asking leading questions.

At least in the NHS (well, among GPs) there are often computer systems that help with differential diagnosis (https://en.wikipedia.org/wiki/Differential_diagnosis): you feed in the patient's background and symptoms, then keep asking questions until either something fits or you need to order a test.
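As a toy sketch of how such a system narrows things down (the conditions, findings, and scoring here are invented for illustration, not clinical data):

  # Invented toy knowledge base; real systems use curated clinical data.
  CONDITIONS = {
      "appendicitis": {"abdominal pain", "rebound tenderness", "fever"},
      "gastroenteritis": {"abdominal pain", "vomiting", "diarrhoea"},
      "UTI": {"fever", "dysuria"},
  }

  def rank(findings: set[str]) -> list[tuple[str, float]]:
      """Score each condition by the fraction of its features observed."""
      scores = {
          name: len(features & findings) / len(features)
          for name, features in CONDITIONS.items()
      }
      return sorted(scores.items(), key=lambda kv: -kv[1])

  print(rank({"abdominal pain", "rebound tenderness"}))
  # appendicitis scores highest; more questions or a test would come next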

The issue is getting to the point where you accurately know where to start, or when to start again. This involves people skills, which is why some doctors become surgeons: they don't like talking to people. And the surgeons who don't like talking to people become orthopods. (Me smash, me drill, me do good.)

Where AI actually is probably quite good: note taking, and continuous monitoring of HCU/ICU patients.

godelski · yesterday at 9:55 PM

  > After all, medicine is all about knowledge, experience and intelligence
So is... everything?

LLMs are really really good at knowledge.

But they are really really bad at intelligence [0].

They have no such thing as experience.

Do not fool yourself: intelligence and knowledge are not the same thing. It is extremely easy to conflate the two, and we're strongly biased to, because they typically correlate. But we all have some friend who can ace every test yet whom you'd also consider dumb as bricks. You'd be amazed at what can be done with knowledge alone. Remember, these things are trained on every piece of text these companies can get their hands on (legally or illegally), down to random hyper-niche subreddits. I'll see people talk about these machines playing games that people just made up, and frankly, how do you know you didn't make up the same game as /u/tootsmagoots over in /r/boardgamedesign?

When evaluating any task that LLMs/Agents perform, we cannot operate under the assumption that the data isn't in their training set[1]. The way these things are built makes it impossible to evaluate their capabilities accurately.

[0] Before someone responds "there's no definition of intelligence": don't be stupid. There's no rigorous definition, but that doesn't mean we don't have useful working definitions. People have been working on this problem for a long time and have narrowed down the answer. Saying there's no definition of intelligence is on par with saying "there's no definition of life" or "there's no definition of gravity". Neither life nor gravity is defined with extreme precision; FFS, we don't even know whether the graviton is real.

[1] Nor can you assume that any new or seemingly novel data is meaningfully different from the data it was trained on.

themafia · yesterday at 8:16 PM

This study is based almost entirely on pre-existing "vignettes". In other words, the model did well on tests that are already known and have existed for years, which is precisely what you should expect.

It provides no information on real world outcomes or expectations of performance in such a setting. A simple question might be "how accurate are patient electronic health records typically?"

Finally, if the Internet somehow goes down at my hospital, the Doctor can still think, while LLM services cannot. If the power goes out at the hospital, the Doctor can still operate, while even local LLMs cannot.

You're going to need to improve the power efficiency of these models by at least two orders of magnitude before they're generally useful replacements for anything. As it is now, they're a very expensive, inefficient, and fragile toy.

delfinom · yesterday at 8:55 PM

Medicine is about knowledge, but acquiring knowledge may in fact require breaking out of the box that AI is increasingly kept in to avoid touching "touchy subjects", insulting anyone, and so on.

dominotw · yesterday at 10:12 PM

> What is the specific capability (or combination of capabilities) that people believe a top medical AI will remain unable to match or exceed in a good human doctor, permanently or at least for decades?

Detecting when a patient is lying. "All patients lie" - Dr. House.

xoofoog · yesterday at 10:27 PM

I would love to replace my doctors with AI. Today. Please. I have had Long Covid for over a year now, which is a shitty shitty condition. It’s complicated and not super well understood. But you know who understands it way better than any doctor I’ve ever seen? Every AI I’ve talked to about it. Because there is tons of research going on, and the AI is (with minor prompting) fully up to date on all of it.

I take treatment ideas to real doctors. They are skeptical, don’t have the time to read the actual research, and refuse to act. Or they give me trite advice that has been proven actively harmful, like “you just need to hit the gym.” Umm, my heart rate doubles when I stand up because of POTS. “Then use the rowing machine so you can stay reclined.” If I did what my human doctors told me without doing my own research, I would be way sicker than I am.

I don’t need empathy. I don’t need bedside manner. Or intuition. Or a warm hug. I need somebody who will read all the published research, and reason carefully about what’s going on in my body, and develop a treatment plan. At this, AI beats human doctors today by a long shot.