Hacker News

sschueller today at 6:19 PM | 4 replies

I'm still looking for the "perfect" setup to clone my voice and use it locally to send voice replies in Telegram via openclaw. Does anyone have such a setup?

I want to be my own personal assistant...

EDIT: I can provide an RTX 3080 Ti for it.


Replies

bdbdbdb today at 9:26 PM

Why not just send text replies? You can already do that.

ilaksh today at 6:48 PM

You need to provide info on your hardware. Pocket-TTS does cloning on CPU, but for me it randomly produces some pretty weird-sounding output mixed in with roughly 90% good output. So it hasn't been stable enough to run without checking the output. But maybe that depends on your voice sample.

Qwen 3 TTS is good for voice cloning but requires a GPU of some sort.

nicpottier today at 7:52 PM

Try training a model with Piper. You will need to record a lot of utterances, but the results are pretty great and the output is a fast TTS model.

justanotherunit today at 6:22 PM

Isn't it just a matter of training a model on your voice recordings and then using that to generate audio clips from text?