I created Voibe, which takes a slightly different direction: it uses gpt-4o-transcribe with a configurable custom prompt to get the best accuracy it can (much better than Whisper). It requires your own OpenAI API key.
https://github.com/corlinp/voibe
I do see the name has since been taken by a paid service... shame.
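
For anyone wondering what "a configurable custom prompt" looks like in practice, here is a rough sketch using the official OpenAI Python SDK. It is not Voibe's actual code. The helper names (build_prompt, transcribe) and the prompt layout are made up for illustration, and it assumes OPENAI_API_KEY is set in your environment.

```python
DEFAULT_MODEL = "gpt-4o-transcribe"


def build_prompt(context: str, vocabulary: list[str]) -> str:
    """Join free-form context and a list of expected terms into one prompt.

    Hypothetical helper: the "Vocabulary: ..." layout is just one plausible
    way to hint at spellings, not a format required by the API.
    """
    parts = []
    if context.strip():
        parts.append(context.strip())
    if vocabulary:
        parts.append("Vocabulary: " + ", ".join(vocabulary))
    return " ".join(parts)


def transcribe(audio_path: str, prompt: str, model: str = DEFAULT_MODEL) -> str:
    """Send one audio file to the transcription endpoint and return its text."""
    # Imported lazily so build_prompt works even without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            prompt=prompt,
        )
    return result.text


# Example usage (needs a real audio file and a valid API key):
# prompt = build_prompt(
#     "Dictation about speech-to-text tools.",
#     ["Voibe", "gpt-4o-transcribe", "Whisper"],
# )
# print(transcribe("memo.m4a", prompt))
```

The idea is that the prompt carries names and jargon the model would otherwise misspell, so keeping it user-configurable lets each person tune it to their own vocabulary.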