gwerbret • today at 6:19 PM • 1 reply

I really wish those offering speech-to-text models provided transcription benchmarks specific to particular fields of endeavor. I imagine performance would vary wildly when using jargon peculiar to software development, medicine, physics, and law, as compared to everyday speech. Considering that "enterprise" use is often specialized or sub-specialized, it seems like they're leaving money on Dragon's table by not catering to any of those needs.
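
To make the idea concrete: a field-specific benchmark could be as simple as reporting word error rate (WER) per domain instead of one aggregate number. Below is a minimal sketch, not any vendor's actual methodology. The function names `word_errors` and `wer_by_domain` are hypothetical, and the sample transcripts are made up purely for illustration (they are not real model output).

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Levenshtein distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1], len(ref)


def wer_by_domain(samples):
    """samples: iterable of (domain, reference, hypothesis) triples."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for domain, reference, hypothesis in samples:
        e, n = word_errors(reference, hypothesis)
        errors[domain] += e
        words[domain] += n
    # Pool errors and words per domain so long utterances weigh more.
    return {d: errors[d] / words[d] for d in words if words[d] > 0}


# Made-up (domain, reference, hypothesis) triples, for illustration only.
samples = [
    ("everyday", "please call me back tomorrow",
     "please call me back tomorrow"),
    ("medicine", "patient presents with acute pancreatitis",
     "patient presents with a cute pancreatitis"),
    ("software", "rebase the branch onto main",
     "rebase the branch onto main"),
]

for domain, wer in sorted(wer_by_domain(samples).items()):
    print(f"{domain}: WER = {wer:.2f}")
```

With real data, each domain would need its own set of human-verified reference transcripts, and a real benchmark would likely normalize punctuation and numerals before scoring.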

Replies

consumer451 • today at 7:52 PM

Try it out! I read aloud from various jargon-heavy papers at high speed, and the results are stunning.

https://huggingface.co/spaces/mistralai/Voxtral-Mini-Realtim...
