Hacker News

ej88 · yesterday at 9:28 PM

Adding some context as someone who works in this space:

1. Most people (average, non-technical people) reach for the phone for easily solvable problems. And if the agent is integrated deeply enough and has tools to interact with CRMs, you can raise the ceiling on the types of problems it can solve.

You're trying to avoid the bad customer experience where human #1 reads off a script, then transfers you to another department that may or may not know how to solve your problem. The entire interaction costs the company more than the value it creates, so the company is disincentivized to help customers.

2. All the companies in this space start with the outsourced BPO (business process outsourcing) market for CX, which is still a multi-billion-dollar market. But the next market will be revenue generation and churn prevention at scale: how do you proactively avoid customer issues? How do you upsell and generate revenue instead of just reducing cost? How do you keep customers happy?

3. On the contrary, I think more companies will pivot to outcome-based pricing: it's much more measurable than seat-based pricing and protects margins better than usage-based pricing. Plus, CX is one of the few industries with very well-known metrics.

4. Kind of? Most companies in this space don't use native voice models, which are noticeably dumber. Instead, they use a cascade: transcription + a stronger text model + TTS. The majority of customers can be handled by the latest SOTA text model, and you need smart context engineering to handle the long tail of more complicated asks.
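The cascade in point 4 can be sketched as three stages wired in sequence. Everything below is a hypothetical stub (the function names, the `context` dict, the refund logic are all invented for illustration); real systems would call an actual transcription service, a hosted text model, and a TTS engine. The point is only the shape of the pipeline and where "context engineering" (deciding which CRM data accompanies each turn) plugs in:

```python
def transcribe(audio: bytes) -> str:
    """Stub STT stage: a real system would call a transcription model."""
    # For this sketch, pretend the audio bytes are already a transcript.
    return audio.decode("utf-8")

def text_model(transcript: str, context: dict) -> str:
    """Stub text-model stage. Context engineering means choosing what goes
    into `context` (CRM records, order history, prior turns) per call."""
    if "refund" in transcript.lower() and context.get("order_id"):
        return f"I can help with a refund for order {context['order_id']}."
    return "Could you tell me more about the issue?"

def synthesize(reply: str) -> bytes:
    """Stub TTS stage: a real system would return playable audio."""
    return reply.encode("utf-8")

def handle_turn(audio: bytes, context: dict) -> bytes:
    """One conversational turn through the STT -> LLM -> TTS cascade."""
    transcript = transcribe(audio)
    reply = text_model(transcript, context)
    return synthesize(reply)
```

In a real deployment each stage streams rather than running turn-by-turn, and the per-stage latency is exactly what the reply below this comment pushes back on.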


Replies

woeirua · yesterday at 11:31 PM

1 & 2 are totally dependent on the company being willing to let agents do things it hasn't traditionally let humans do: for example, issue refunds, or take actions that cost money but generate goodwill. I am skeptical that companies will be OK with their agents doing those things of their own volition.

3. Cool, so the user didn't indicate whether they were satisfied. What then?

4. You can't use a SOTA model for the reasoning step right now; there's too much latency for a conversation. So you're either using an older, significantly less capable model, or you're paying through the nose for a fast tier. If the former, you can't trust the agent to do the right thing (see points 1 & 2). If the latter, there are no cost savings over a human. So which is it?
