Hacker News

warangal, yesterday at 9:29 AM

VITS is such a cool model (and paper): fast, minimal, trainable. Meta took it to the extreme for about 1000 languages.

It seems like you have been working on this application for some time. I will go through your code, but could you provide some context about the upgrades/changes you have made, or a post describing your efforts?

Cool nonetheless!


Replies

ZDisket, yesterday at 4:45 PM

I'll explain in detail once I've got the big release, but everything's been thoroughly modernized. Transformer, HiFi-GAN (now iSTFTNet w/ Snake) vocoder, etc., plus a few additions.