logoalt Hacker News

sbrothertoday at 3:07 AM2 repliesview on HN

> …but as a user, I would much rather wait an extra 200ms for my slow/expensive prompt to be accurate

I disagree with this SO strongly. I find the conversational voice mode to be a game changer because you can actually have an almost normal conversation with it. I'd be thrilled if they could shave off another 50-100ms of latency, and I might stop using it if they added 200ms. If I want deep research I'll use text and carefully compose my prompt; when I'm out and about I want to have a conversation with the Star Trek computer.

Interestingly I'm involved with a related effort at a different tech company and when I voiced this opinion it was clear that there was plenty of disagreement. This still surprises me since it seems so obvious to me that conversational fluidity is the number one most important feature.


Replies

kixelatedtoday at 3:58 AM

To clarify, I meant waiting an extra 200ms if the alternative was dropping part of the prompt. During periods of zero congestion, the latency would be the same.

bredrentoday at 3:55 AM

It is very important, the low latency.

I prompt orchestrations most of the day, and am very particular about the fidelity of my context stack.

Yet I’ve used advanced voice mode on ChatGPT via the iOS app a lot. And I have not had a problem with it understanding my requests or my side of the conversation.

I have looked at the dictation of my side and seen it has blatant mistakes, but I think the models have overcome that the same way they do conference audio stt transcripts.

I have had times where the ~sandbox of those conversations and their far more limited ability to build useful corpus of context via web searches or by accessing prior conversation content.

The biggest problem I have had with adv voice was when I accidentally set the personality to some kind of non emotional setting. (The current config seems much more nuanced)

The AI who normally speaks with relative warmth and easy going nature turned into an emotionless and detached entity.

It was unable to explain why it was acting this way. I suspect the low latency did disservice there because when it is paired with something adversarial it was deeply troubling.