I wonder how much leaked or "stolen" data is in the training sets of current text-to-speech ML models? Almost none of the TTS releases say exactly where their training data comes from, for some reason. I also wonder if we'll see an explosion in SOTA TTS ~6 months from now.
Not really: Mozilla Common Voice (the ImageNet of speech) is larger than this. Its English database has 3,814 hours and 1.6 million sentences from 100k speakers.
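For a sense of scale, those quoted figures work out to fairly short clips and only a couple of minutes of audio per speaker. A rough back-of-envelope sketch (using only the numbers above, not official Common Voice stats):

```python
# Back-of-envelope scale of the Common Voice English split,
# using the figures quoted above: 3,814 hours, 1.6M sentences, 100k speakers.
hours = 3814
sentences = 1_600_000
speakers = 100_000

total_seconds = hours * 3600
avg_clip_seconds = total_seconds / sentences          # average length of one recorded sentence
avg_speaker_minutes = total_seconds / speakers / 60   # average audio contributed per speaker

print(f"{avg_clip_seconds:.1f} s/clip, {avg_speaker_minutes:.1f} min/speaker")
```

So each sentence averages under 9 seconds, and a typical speaker contributes only a few minutes, which matters for how much per-speaker voice modeling a TTS system can learn from it.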
It's already there. And keeps moving.
It even has a nice UI on top.
https://voicebox.sh/