logoalt Hacker News

satvikpendemtoday at 4:39 PM1 replyview on HN

Looks like this model doesn't do realtime diarization, what model should I use if I want that? So far I've only seen paid models do diarization well. I heard about Nvidia NeMo but haven't tried that or even where to try it out.


Replies

breisatoday at 7:10 PM

Not sure if its "realtime" but the recently released VibeVoice-ASR from Microsoft does do diarization. https://huggingface.co/microsoft/VibeVoice-ASR