
nerdsniper, today at 5:46 PM

On just the transcription side:

Wispr Flow produces much better speech-to-text for long dictated messages than Apple's built-in speech-to-text does. Whisper models in general seem to do a lot better than most models built into an OS or app, which is interesting, because nothing stops those OS and app vendors from just using Whisper models.

I love MacWhisper personally. Also, Gumroad is a fantastic app distribution platform that aligns with my personal values.

https://goodsnooze.gumroad.com/l/macwhisper

As for the "decision tree" side ... there's not much that can be done about that right now. AI agents still go too far "off the rails" to be productionized across the billions of smartphones in the world. I'm working on voice-controlled, agentic-with-rails AI features for my Home Assistant setup, because Alexa and Google Home suck. But that's a hobby project, and rogue AI actions only affect me, not billions of customers.
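
To make "agentic-with-rails" a bit more concrete, here is a minimal sketch of one way such rails could look: the agent only proposes actions, and a plain allowlist check decides whether each one runs. This is not the actual hobby project. The names `ALLOWED_ACTIONS`, `TEMPERATURE_RANGE_C`, `ProposedAction`, and `validate` are hypothetical, as are the entity IDs and the temperature limits, and nothing here calls a real smart-home API.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist: action name -> entities the agent may touch.
ALLOWED_ACTIONS = {
    "light.turn_on": {"light.kitchen", "light.living_room"},
    "light.turn_off": {"light.kitchen", "light.living_room"},
    "climate.set_temperature": {"climate.hallway"},
}

# Hypothetical bounds (degrees Celsius) for any temperature the agent requests.
TEMPERATURE_RANGE_C = (16.0, 26.0)


@dataclass(frozen=True)
class ProposedAction:
    """An action the agent wants to take, before any rails are applied."""
    action: str
    entity_id: str
    params: dict = field(default_factory=dict)


def validate(proposal: ProposedAction) -> tuple[bool, str]:
    """Return (ok, reason). Anything not explicitly allowed is rejected."""
    allowed_entities = ALLOWED_ACTIONS.get(proposal.action)
    if allowed_entities is None:
        return False, f"action not allowlisted: {proposal.action}"
    if proposal.entity_id not in allowed_entities:
        return False, f"entity not allowlisted: {proposal.entity_id}"

    if proposal.action == "climate.set_temperature":
        temp = proposal.params.get("temperature")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(temp, bool) or not isinstance(temp, (int, float)):
            return False, "temperature must be a number"
        low, high = TEMPERATURE_RANGE_C
        if not low <= temp <= high:
            return False, f"temperature out of range: {temp}"

    return True, "ok"


if __name__ == "__main__":
    proposals = [
        ProposedAction("light.turn_on", "light.kitchen"),
        ProposedAction("lock.unlock", "lock.front_door"),
        ProposedAction("climate.set_temperature", "climate.hallway",
                       {"temperature": 40}),
    ]
    for p in proposals:
        ok, reason = validate(p)
        print(p.action, p.entity_id, "->", "run" if ok else "blocked", f"({reason})")
```

The design choice is deny-by-default: a rogue or confused agent can only ever trigger actions that were written into the allowlist ahead of time, which is tolerable for one household but much harder to get right for billions of devices.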