One thing Apple really needs to get right is speech to text transcription. It feels like they're a decade behind on properly transcribing voices. At least half a decade.
Input on the iPhone is so dreadful nowadays. Their palm rejection is definitely worse than before, their text-correction algorithm for typing is worse than before, and STT hasn't improved.
"Vehicle Motion Cues come to visionOS, which can help reduce motion sickness for people who use Apple Vision Pro as a passenger in a moving vehicle. Vision Pro will also support face gestures for performing taps and system actions, plus a new way to select elements with one’s eyes while using Dwell Control."
Maybe just don't wear them in a car?
This looks like a genuinely useful application of LLMs.
I wish more companies focused on how they can help humans instead of replacing us or squeezing us as hard as possible in the name of productivity.
I have difficulty trusting this. There are plenty of videos online of LLMs making up stuff like "I just ate a hot dog, is there mustard around my mouth?" "No, everything is clean" while there is a big yellow stain on the guy's face.
That hikawa phone grip looked neat. I can't stand pop sockets. Then I looked at the price. $100. Not accessible at all. Can anyone recommend alts?
> The total amount due on the bill is $83.89. Please verify this amount with your utility provider or by using Text Detection before making a payment.
1. Use AI to determine how much a bill is for
2. Call up the people who billed you and ask them how much they billed you
3. Pay billed amount
Honestly as a blind person and blind developer myself, most of these features get a shrug at best. For one, there's already a bunch of third-party apps that do most if not all of this (Seeing AI, Envision AI, BeMyEyes, Aira, etc.). So at best, this does what all those apps are doing but faster and on-device, which may or may not mean it is also less accurate; we'll have to see. In the meantime, macOS's screen reader, VoiceOver, has been left to essentially exist in maintenance mode for years, where users have had to build third-party solutions, arguably impressive ones, to add features that comparable screen readers on Windows have had for a really long time.
Through that lens, this all looks a bit performative to me, but again, maybe I'll be pleasantly surprised.
The one thing I'm mildly excited to see is the improvement to Voice Control, as guessing what the programmatic name of a button is or having to constantly use a numbers grid to target elements doesn't sound fun.
To respond to what I see in some of the comments:
- On speech rate: It does take quite a bit of practice to crank up the speech rate and there's a degree of retraining you need to do when you switch voices. A lot of more "human" sounding voices are harder to follow at super high speeds which is why a lot of people prefer more robotic but consistent speech and generally aren't convinced by AI-powered TTS yet; they often fall apart if you raise the speech rate past a certain point.
- Re: actually waiting for the target audience's verdict: This is so important. I see more and more companies, individuals etc. talk about accessibility, build accessibility solutions and evangelize AI for accessibility without EVER talking to the people they claim to help. This will almost certainly mean mistakes will be made, up to and including doing more harm than good. If you want to do accessibility right, that includes AI products of any kind, hire people with lived experience or you'll get the equivalent of machine-translated text, hackerproof security in one click or an AI-powered coffee bar that orders thousands of rubber gloves.
Coincidental note: I have time for new projects right now :P
That hikawa phone grip looked neat. I can't stand pop sockets. Then I looked at the price. Not that accessible. Can anyone recommend alts?
> A new power wheelchair control feature leverages the precision eye-tracking system on Apple Vision Pro to offer a responsive input method for compatible alternative drive systems. [0]
The above caption for Apple Vision Pro is for a video that to me, as an Apple Vision Pro user, is discomforting.
More questions are raised than are answered by the short video: Is the user able to fit the Apple Vision Pro by him/herself? What happens when dwelling on a directional control misregisters? Can the user recalibrate the "Eyes and Hands" setting? Dwelling on a control displaces focus and there may be impeding objects in the path of the power wheelchair. Is this really a good idea?
To my sensibility, the video is unsettling (at best), especially given how cumbersome Apple Vision Pro is.
[0] https://www.apple.com/newsroom/2026/05/apple-unveils-new-acc...
Accessibility features are such a great way to keep technology focused on real-world problems and real-world experiences.
I think the trap in creating anything is doing it for a crowd. Art, software, anything... it turns out better when it is made with a specific, named individual in mind.
Accessibility features are almost always championed and field-tested with one specific loved one in mind and I think that's what keeps the technical solutions personable and grounded.
On-device video subtitle generation is exciting; it should help with watching videos on mute. This seems like low-hanging fruit that should've already been grabbed by an app but I can't find any.
As Apple shifts towards services and fancy software features, I wonder how they expect to stay competitive by only releasing them for a subset of languages.
Most apps have terrible accessibility labels because developers don't bother, which breaks every screen reader pipeline downstream. The Voice Control "say what you see" feature routes around that by letting users describe a button in plain language. That's a real fix for a problem caused by humans being lazy about a11y.
There's my dopamine hit for the year.
Since Apple uses Gemini to power its AI, are those features actually powered by Google Gemini?
I'm super glad that they're doing this, but once again unexcited for another decade of Apple self-privileging on this stuff so they're the only ones allowed to touch or improve any of this surface, or UX outside an app's tiny box.
People talk a lot about how macOS has gone downhill but I feel like it would have been a good start if developers could continue to patch over Apple's shortcomings like they used to be able to.
I imagine that we would be a few years into a spectrum of tools like this if they didn't lock it down like they do.
Totally aware that plenty of HN commenters are very glad that Apple keeps this locked down. I'm just the other opinion, that's all.
Now we know why the new AirPods will have cameras!
I don’t want to discredit more advancements in accessibility, but this feels like accessibility porn.
I have fond memories of an old coworker 10 years ago who is blind. He would use his phone no problem, texting, going about his day, he was even on Tinder (credit to Tinder for making their app so accessible long ago). He would commute on his own, walk to the train station, even transfer to another train during peak rush hour. I’m not saying it was all easy for him, but nothing in this video really stood out to me more than what shirt was on the bed. I know other services/apps have long existed to be the “eyes” for people who need support, but this video feels….uneventful?
I may be cynical about this though, as I often hate how Apple’s marketing makes these emotional bids about how life-critical they are to society - which is fair to a degree..but it just feels cheap to be glamorising “look! we saved this person from pending doom, cool right??”
Fun fact: This video was made accessible to sighted people because no blind person would ever listen to voice at that speed. Honestly if you ever observe a blind person using computers you'd be impressed how they can listen to audio at unimaginable speeds.