Hacker News

Ontario auditors find doctors' AI note takers routinely blow basic facts

98 points by sohkamyung, yesterday at 10:37 PM, 32 comments

Comments

zOneLetter, yesterday at 11:24 PM

Anecdotally, we use an LLM note-taker at work for meetings. I had to intervene recently because our CIO was VERY angry at our vendor for something they promised to do and never did. He wasn't at the meeting where the "promise" was made. I was. They never promised anything, and the discussion was significantly more nuanced than what the LLM wrote in the detailed summary.

In other cases, I have seen it miss the mark when the discussion is not very linear. For example, when I'm going back and forth with the SOC team about their response to a recent alert or incident, it'll get the gist right, but if you're relying on it for accuracy, holy hell does it miss the mark.

I can see the LLM taking great notes for that initial nurse visit when you're at the hospital: summarize your main issue, weight, height, recent changes, etc. I would not trust it when it comes to a detailed and technical back-and-forth with the doctor. I would think that for compliance reasons hospitals would not want altered records and would go only by transcripts, but what do I know...

dmix, today at 1:50 AM

> They specifically address the AI Scribe program, the Ontario Ministry of Health initiated for physicians, nurse practitioners, and other healthcare professionals across the broader health sector.

Makes me wonder what quality of software the ministry would push (the vetting is probably mostly qualifications like SOC).

This is apparently the list of approved vendors:

https://www.supplyontario.ca/vor/software/tender-20123-artif...

rainsford, today at 12:18 AM

I have generally moved from bearish to bullish on the future of current AI technology, but the continued inaccuracy on basic facts, even as the models otherwise improve significantly, continues to give me pause.

As an example, creating recipes with Claude Opus based on flavor profiles and preferences feels magical, right up until the point at which it can't accurately convert between tablespoons and teaspoons. It's like the point in the movie where a character seems almost normal, but something is slightly off, and then it turns out they're a zombie who's going to try to eat your brain. This note-taking example feels similar. It nearly works in some pretty impressive ways and then fails at the important details in a way that something allegedly this capable really shouldn't.
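Part of what makes this failure jarring is that the conversion in question is exact, fixed arithmetic (in US customary units, 1 tablespoon is exactly 3 teaspoons), the kind of thing a deterministic one-liner handles trivially. A minimal sketch for contrast:

```python
# US customary units: 1 tablespoon = 3 teaspoons (exact).
TSP_PER_TBSP = 3

def tbsp_to_tsp(tablespoons: float) -> float:
    """Convert tablespoons to teaspoons."""
    return tablespoons * TSP_PER_TBSP

def tsp_to_tbsp(teaspoons: float) -> float:
    """Convert teaspoons to tablespoons."""
    return teaspoons / TSP_PER_TBSP

print(tbsp_to_tsp(2))    # 6
print(tsp_to_tbsp(4.5))  # 1.5
```

This is also why many scribe and assistant products route arithmetic to a calculator tool rather than letting the model do it in-context.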

It's these failures that make me more and more convinced that while current generation AI can do some pretty cool things if you manage it right, we're not actually on the right track to achieve real intelligence. The persistence of these incredibly basic failure modes even as models advance makes it fairly obvious that continued advancement isn't going to actually address those problems.

Hobadee, yesterday at 11:42 PM

The AI note taker we use at work records the meeting as well, and each note it takes about the meeting has a timestamp link that takes you directly there in the recording so you can check it yourself. While I'm sure a solution like this is more complicated in a HIPAA environment, something like this is critical for things as important as healthcare.
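The verification mechanism described above can be sketched as a tiny data model: each generated note carries its offset into the recording, and a deep link lets a human audit the note against the source audio. The class name and the `?t=` query-parameter link format below are hypothetical, not any particular vendor's API:

```python
from dataclasses import dataclass

@dataclass
class MeetingNote:
    text: str        # the AI-generated summary line
    offset_sec: int  # position in the recording where it was said

    def link(self, recording_url: str) -> str:
        # Hypothetical deep-link format: jump straight to the cited
        # moment so a human can verify the note against the recording.
        return f"{recording_url}?t={self.offset_sec}"

note = MeetingNote("Vendor agreed to deliver a fix by Q3", offset_sec=754)
print(note.link("https://example.com/rec/abc123"))
# https://example.com/rec/abc123?t=754
```

The design point is that the note is never the sole record: every claim stays traceable to the primary source.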

ceejayoz, yesterday at 11:25 PM

> 60% of evaluated AI Scribe systems mixed up prescribed drugs in patient notes, auditors say

Not mentioned, as far as I can see: the comparative human mistake rate.

Having seen a lot of medical records, 60% sounds about normal lol.

jqpabc123, today at 1:24 AM

And once again, we have an example of how AI is a liability issue waiting to happen.

nothinkjustai, today at 12:38 AM

People will eventually figure out that LLMs have no capacity for intent and are fundamentally unreliable for tasks such as summarization, note-taking, etc.
