There are a number of recent, good-quality, small TTS models.
If the author doesn't describe some detail about the data, the training, a novel architecture, etc., I can only assume they just took another model, did a little fine-tuning, and repackaged it as a new product.
Any recommendations?