Hacker News

burnte · today at 4:16 PM

I'm a healthcare CIO of 12 years, and I've evaluated 4 of these tools and deployed 2, one of which is currently in use at my current healthcare employer. I am very measured on AI, but the results I've seen from these virtual scribes are HUGE. In every case we have IMMEDIATELY seen improvements in patient NPS scores, provider satisfaction, and note quality. Notes are more standardized as well as more verbose and detailed, which makes it easier for future providers to understand the case. These better notes reduce our claim rejection rate.

And what converted me was direct patient response. Across the board patient feedback is extremely positive, with the most common comment being along the lines of "I really felt like the doctor connected with me better and they were more present in the visit."

These AI scribes really DO improve patient care, I've seen it with my own eyes.


Replies

dsr_ · today at 4:34 PM

Pre-AI voice recognition (2018), followed by 2 human reviews

https://jamanetwork.com/journals/jamanetworkopen/fullarticle...

=> the error rate was 7.4% in the version generated by speech recognition software, 0.4% after transcriptionist review, and 0.3% in the final version signed by physicians. Among the errors at each stage, 15.8%, 26.9%, and 25.9% involved clinical information, and 5.7%, 8.9%, and 6.4% were clinically significant, respectively.

AI "scribes" in a perfectly replicable best-of-all-worlds scenario (2025): https://bmjdigitalhealth.bmj.com/content/1/1/e000092

=> Omissions dominated error counts (83.8%, p<<0.001), with CAISs varying widely in error frequency and severity, and a median of 1–6 omissions per consultation (depending on CAIS). Although less frequent, hallucinations and factual inaccuracies were more often clinically serious. No tested CAIS produced error-free summaries.
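
To make the 2018 study's numbers concrete: composing each stage's error rate with its clinically significant fraction gives a rough per-stage rate of clinically significant errors (a back-of-the-envelope product; the paper doesn't report this composite figure itself):

    # Back-of-the-envelope on the JAMA figures: error rate at each stage
    # times the fraction of those errors that was clinically significant.
    stages = {
        "speech recognition":     (0.074, 0.057),
        "after transcriptionist": (0.004, 0.089),
        "physician-signed":       (0.003, 0.064),
    }
    for name, (err_rate, sig_frac) in stages.items():
        print(f"{name}: ~{err_rate * sig_frac:.4%} clinically significant")

Even the signed version carries a nonzero clinically significant error rate, and that is the baseline any AI scribe needs to beat.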

On the gripping hand, people who work in the management end of the US healthcare industry can't be trusted with healthcare or information security to begin with.

eclark · today at 4:37 PM

Be careful with initial impressions of metrics. We humans have a heavy tendency to anchor on our first judgments or impressions. We see a win and assume the win is long-term, has no downsides, and is attributable to the new information or change.

Combine that with the Hawthorne effect, and new business or health initiatives can look great simply because participants notice the change and notice the increased attention. But many human patterns have a tendency to regress to the mean.

Personally, I have seen this a lot with developer tools and DevOps. A new SEV/incident/disaster happens and everyone rushes to create or onboard a tool that would have helped. Around the office everyone raves about it and is sure it will fix everything. And the number of commits goes up, or the number of SEVs in an area decreases for a while, because people are paying attention. After a while the tool starts to slow down or fall out of use; it has rough edges that weren't seen at first, and scenarios that were supposed to be supported never get fully integrated. Eventually the patterns regress, but now with more tools and more complexity. (A toy simulation of this regression effect is sketched after the links below.)

- https://pmc.ncbi.nlm.nih.gov/articles/PMC1936999/

- https://arxiv.org/abs/2102.12893
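
A toy simulation of the selection-plus-regression effect described above (all numbers invented): flag the worst-looking teams, roll out a tool with zero true effect, and the flagged teams' metric still "improves" the next quarter.

    import random

    random.seed(0)
    TRUE_RATE, TEAMS = 5.0, 1000  # every team shares the same real incident rate

    # Quarter 1: noisy observations around the same true rate.
    q1 = [random.gauss(TRUE_RATE, 2.0) for _ in range(TEAMS)]

    # "Onboard the new tool" only for the worst-looking 10% of teams.
    cutoff = sorted(q1)[-TEAMS // 10]
    flagged = [i for i, v in enumerate(q1) if v >= cutoff]

    # Quarter 2: the tool does nothing; same distribution as before.
    q2 = [random.gauss(TRUE_RATE, 2.0) for _ in range(TEAMS)]

    before = sum(q1[i] for i in flagged) / len(flagged)
    after = sum(q2[i] for i in flagged) / len(flagged)
    print(f"flagged teams: {before:.2f} -> {after:.2f} (tool did nothing)")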

wl · today at 4:31 PM

I got an erroneous Type II diabetes diagnosis dropped into the note by the AI scribe at my last appointment, because my PCP discussed the A1C test he was ordering. Would not recommend. That isn't to say that manually typed notes or speech-to-text dictated notes are perfect (dot phrases have ended up "documenting" plenty of conversations that never happened), but a false diagnosis of a chronic disease seems like a really bad failure.

jubilanti · today at 4:28 PM

I still don't want a fucking audio recorder in my doctor's office or a fucking AI that sits in between me and my doctor.

I am intentionally cursing to express my anger at this casual betrayal of medical trust.

sonofhans · today at 4:36 PM

I’ve been in tech and medicine too. Consider that any “HUGE” effect in this context is likely exaggerated, especially for something as prosaic as a note-taking assistant.

As a patient sitting with a doctor, I don’t care how standardized the notes are. I don’t care about anyone’s NPS score. I do want the doctor to connect with me, but I also remember not too long ago when doctors did this anyway, without any assistance from robots.

ygjb · today at 7:16 PM

Not to be antagonistic, but a healthcare CIO in which country? This is very relevant because many of the people most active on HN are outside the US, in countries with public health care and stronger consumer protection and privacy laws.

The healthcare outcomes are absolutely critical in evaluating the use and value of these tools, but there are second- and third-order effects from using them that need to be contextualized with the specific motivations of the executives endorsing them.

ryandrake · today at 4:46 PM

> improvements in patient NPS scores, provider satisfaction, and note quality

How are note quality improvements measured? Vibe-notes might be more verbose and better-sounding (which would explain the NPS and satisfaction metrics) while still not matching the doctor's actual words or intent. Are the AI-generated notes compared with ground truth to prove they are accurate?
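
For what it's worth, the BMJ study linked upthread does roughly this: clinicians annotate each AI note against the visit transcript and tally omissions, hallucinations, and inaccuracies by clinical severity. A minimal sketch of that kind of scoring (the annotation schema here is invented for illustration):

    from collections import Counter

    # Hypothetical clinician annotations comparing an AI note to the
    # visit transcript: (error_type, clinically_serious) pairs.
    annotations = [
        ("omission", False), ("omission", True),
        ("hallucination", True), ("inaccuracy", False),
    ]

    totals = Counter(kind for kind, _ in annotations)
    serious = Counter(kind for kind, is_serious in annotations if is_serious)
    for kind, n in totals.items():
        print(f"{kind}: {n} total, {serious[kind]} clinically serious")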

carefulfungi · today at 7:12 PM

Once you've had your medical records used against you by a third party, you start being much more careful about what you share with your doctors about yourself.

There is no trust in a doctor's office. What they record gets handed to companies whose interests are adversarial to yours. It's basically like talking to the police. If you, as a patient, think an automated recording is helping you long-term, you are naive.

zaptheimpaler · today at 5:38 PM

Yep, I would agree as a patient. My current doctor types so slowly that 6 of the 10 short minutes in an appointment just disappear while he types. Even with other docs who can touch-type, it will free them up to focus completely on the appointment and reduce the hours they spend charting afterwards.

t-kalinowski · today at 4:20 PM

Counterpoint from a doctor: https://substack.com/inbox/post/189714240

Scribes _feel_ good in the short term, but it's not clear whether they're actually good over longer time horizons.

invalidptr · today at 5:14 PM

How do you control for quality variation between patients? In my experience, AI note-taking tools display a clear bias against participants who are {quieter, ESL, women, ...}. How can you evaluate whether these biases show up in a medical setting?
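
One concrete check: audit a sample of visits per patient group and compare serious-error rates with a standard two-proportion test. A minimal sketch (the group labels and counts are made up):

    from math import sqrt, erfc

    def two_prop_ztest(err_a, n_a, err_b, n_b):
        # Two-sided test: do groups A and B have different error rates?
        p_a, p_b = err_a / n_a, err_b / n_b
        pooled = (err_a + err_b) / (n_a + n_b)
        se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = (p_a - p_b) / se
        return p_a, p_b, erfc(abs(z) / sqrt(2))  # two-sided p-value

    # Hypothetical audit: notes with >=1 serious error, by patient group.
    p_a, p_b, p_val = two_prop_ztest(err_a=18, n_a=120, err_b=7, n_b=130)
    print(f"group A: {p_a:.1%}, group B: {p_b:.1%}, p = {p_val:.3f}")

The same comparison works for ESL vs. native speakers, quiet vs. clear audio, and so on; the hard part is getting honest ground-truth annotations, not the statistics.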

parliament32 · today at 4:50 PM

Good, I'm glad. Now find a way to do it in-house. Shipping our conversation to some random-ass fly-by-night SaaS who pinky-swear-promises they're HIPAA-compliant is a non-starter for a medical professional I'd actually want to give money to.

kakacik · today at 6:39 PM

As the husband of a GP, I would add a general long-term issue: cognitive overload. GPs have to be near-expert in everything. My wife handles everything from preliminary cancer diagnosis and treatment, to heart attack diagnosis, to psychiatric care and much in between that should normally be covered by specialists, but there are simply not enough of them for many of those tasks here (Switzerland). Any mistake can easily, trivially be fatal to the patient.

The amount of self-imposed stress and responsibility compared to puny, insignificant software dev roles like mine is staggering. And it's every single day; no easy day, ever.

On top of that, there are 3-4 hours daily of paperwork for insurers, lawyers, judges, etc. that has to be flawless. LLMs can help massively here, but it would be great if they were opt-in for the patient (who in exchange gets better focus from the doctor, a longer visit, or a lower cost), and if they could be local-only. Absolutely nobody anywhere in Europe wants to send any data to the US, nor to any of their closer servers; that game is closed for good.

yding · today at 4:24 PM

When you evaluated the tools, what stood out as separating the better ones from the worse?

cromka · today at 4:26 PM

But WHY not do this on premises? WHY?
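
For what it's worth, the transcription half can already run fully on-prem with open models; a minimal sketch using the open-source whisper package (summarizing the transcript into a note would still need a local LLM on top, and none of this is a turnkey HIPAA answer):

    # pip install openai-whisper -- inference runs entirely locally
    import whisper

    model = whisper.load_model("medium")          # weights download once, then offline
    result = model.transcribe("visit_audio.wav")  # hypothetical local recording
    print(result["text"])                         # raw transcript, pre-summarization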

Jamesbeam · today at 4:51 PM

This ad was brought to you by the AI scribe industry, Dr. Nick's favorite tool.

mmooss · today at 4:23 PM

If I allow it, is the data from my meeting sent offsite at any stage, for example to an LLM service (e.g., Anthropic, OpenAI, etc.)? Or do the LLM vendors (or any others) have access to the internal data at any stage?