Hacker News

zug_zug yesterday at 6:34 PM (15 replies)

For me the last remaining killer feature of ChatGPT is the quality of the voice chat. Do any of the competitors have something like that?


Replies

hbarka yesterday at 8:49 PM

On the contrary, I think Gemini 3 Live mode is much, much better than ChatGPT. The voices have none of the annoying artificial uptalking intonations that ChatGPT has, and the simplex/duplex interruptibility of Gemini Live seems more responsive. It knows when to break and pause during conversations.

simondotau yesterday at 8:58 PM

I absolutely loathe ChatGPT's voice chat. It spends far too much time being conversational and its eagerness to please becomes fatiguing after the first back-and-forth.

joshmarlow yesterday at 8:25 PM

I think Grok's voice chat is almost there. The only things missing for me:

* it's slower to start up by a couple of seconds
* it's harder to switch between voice and text and back again in the same chat (though ChatGPT isn't perfect at this either)

And of course Grok's unhinged persona is... something else.

Robdel12 yesterday at 6:38 PM

I have found Claude's voice chat to be better. I only recently tried it because I liked ChatGPT's enough, but I think I'm going to use Claude going forward. I find myself getting interrupted by ChatGPT a lot whenever I do use it.

josephwegner yesterday at 9:00 PM

Along with the hordes of other options people are responding with, I'm a big fan of Perplexity's voice chat. It does back-and-forth well in a way that I missed whenever I tried anything besides ChatGPT.

tmaly yesterday at 7:39 PM

I can't keep up with half the new features all the model companies keep rolling out. I wish they would solve that.

websiteapi yesterday at 7:13 PM

Gemini Live is a thing - never tried ChatGPT, are they not similar?

sundarurfriend yesterday at 7:49 PM

Are you saying ChatGPT's voice chat is of good quality? Because for me it's one of its most frustrating weaknesses. I vastly prefer voice input to typing, and would love it if the voice chat mode actually worked well.

But apart from the voices being pretty meh, it's also really bad at detecting and filtering out noise, taking vehicle sounds as breaks to start talking in (even if I'm talking much louder at the same time) or as some random YouTube subtitles (car motor = "Thanks for watching, subscribe!").

The speech-to-text is really unreliable (the single-chat Dictate feature gets about 98% of my words correct, this Voice mode is closer to 75%), and they clearly use an inferior model for the AI backend for this too: with the same question asked in this back-and-forth Voice mode and a normal text chat, the answer quality difference is quite stark: the Voice mode answer is most often close to useless. It seems like they've overoptimized it for speed at the cost of quality, to the extent that it feels like it's a year behind in answer reliability and usefulness.

To your question about competitors, I've recently noticed that Grok seems to be much better at both the speech-to-text part and the noise handling, and the voices are less uncanny-valley sounding too. Grok also doesn't show as stark a difference between text answers and voice mode answers, but unfortunately that's mainly because its text answers are also not great at avoiding hallucinations or following instructions.

So Grok has the voice part figured out, ChatGPT has the backend AI reliability figured out, but neither provides a really usable voice mode right now.

SweetSoftPillow yesterday at 9:44 PM

Gemini's much better, try it

whimsicalism yesterday at 8:03 PM

Gemini does, Grok does, nobody else does (except Alibaba, but it's not there yet)

codybontecou yesterday at 6:44 PM

Their voice agent is handy. Currently trying to build around it.

ivape yesterday at 7:59 PM

I'm a big user of Gemini voice. My sense is that Gemini voice uses very tight system prompts that are designed to give you an answer and kind of get you off the phone as much as possible. It doesn't have large context at all.

That's how I judge quality at least. The quality of the actual voice is roughly the same as ChatGPT, but I notice Gemini will try to match your pitch and tone and way of speaking.

Edit: But it looks like Gemini Voice has been replaced with voice transcription in the mobile app? That was sudden.

bigyabai yesterday at 6:35 PM

Qwen does.

semiinfinitely yesterday at 7:53 PM

Try Gemini voice chat.

FrasiertheLion yesterday at 6:34 PM

Try ElevenLabs.
