Hacker News

wavemode, yesterday at 3:41 AM

I like the concept. Couldn't they have found better text-to-speech voices, though? Or is it meant to be humorous how bad they are?


Replies

tom_0, yesterday at 4:02 AM

It's a stylistic choice for sure. Anything a little better lands straight in the uncanny valley, and human-level voices are too high-latency and too expensive for us. We found that this level of crappy works great in practice, plus it runs on-device! We use Rhasspy Piper to generate the voices.
