walthamstow • today at 12:54 PM

Seems quite heavy for an STT model. Parakeet and Whisper are much smaller and perform great for quick dictation and transcription of longer files. I guess the extra size buys additional accuracy and speaker diarisation?

The TTS example clip in the repo of 'spontaneous singing' is creepy as fuck
