The 60-minute single-pass transcription is the part that actually matters. Most ASR models chunk aud...

vijgaurav • today at 7:12 PM • 0 replies • view on HN

The 60-minute single-pass transcription is the part that actually matters. Most ASR models chunk audio and you lose speaker continuity across boundaries. If the diarization actually holds up on hour-long recordings without drifting, thats a real solve for podcast and meeting transcription workflows.

alt Hacker News