The diarization is on Voxtral Mini Transcribe V2, not Voxtral Mini 4B.
Do you have experience with that model for diarization? Does it feel accurate, and what's its realtime factor on a typical GPU? Diarization has been the biggest thorn in my side for a long time.
Ahh, yeah, and it explicitly doesn't work for realtime streams. Good catch!