Hacker News

blazingbanana, last Sunday at 8:41 PM, 10 replies

https://play.google.com/store/apps/details?id=com.blazingban...

Completely free offline voice transcription: no ads, no in-app purchases, and no account or network connection required.

I have also built macOS, Windows, and Linux versions, which I'll make available as free downloads on my site soon (https://blazingbanana.com/).

The iOS version is built and works (extremely well); I'm just waiting for the Apple Developer signup process to complete.

Big shout out to https://github.com/mybigday/whisper.rn and https://huggingface.co/ggerganov/whisper.cpp/tree/main for making this even possible.

Any suggestions are welcome.


Replies

bazzargh, yesterday at 4:32 AM

On the subject of whisper being great... A few weeks ago a co-worker commented on the difficulty he'd had editing a work demo. I pointed him at various jump-cutting tools that had automated what he did in the past (editing out silences). But I'd also wanted to play with whisper for a while...

So a couple of hours later I'd written a script that does transcription-based editing: on the first pass it grabs a timestamped transcript and a plain-text transcript for editing; you edit the words into any order you like, and a second pass reassembles the video (it's just a couple of hundred lines of Python wrapping whisper and ffmpeg). It also speeds up by 4x any detected silences that sit within retained sequences in the video.

Matching up transcripts turns out to be not that hard: I normalise the edited text, split it into words, and then compare it to the sequence of normalised words from the timestamped transcript. I find the longest common sequence, keep that, then recurse on the before/after sections (there's a little more detail, but not much). I also send the transcription to ffmpeg to burn in as captions, because the re-cutting sometimes makes the audio choppy and the captions make it easier to follow.
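The matching step described above could be sketched roughly as follows. This is an illustration only, not the actual script: the function names (`normalise`, `longest_common_run`, `match_runs`, `to_time_ranges`) and the `(word, start, end)` tuple layout are hypothetical, "longest common sequence" is read here as the longest contiguous run of matching words, and each recursive call searches the whole original transcript so that reordered passages can still be found.

```python
import re


def normalise(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def longest_common_run(edited, original):
    """Return (i, j, n) such that edited[i:i+n] == original[j:j+n] with n maximal.

    Standard dynamic programme for the longest common contiguous run.
    """
    best = (0, 0, 0)
    prev = [0] * (len(original) + 1)
    for i, word in enumerate(edited):
        cur = [0] * (len(original) + 1)
        for j, other in enumerate(original):
            if word == other:
                cur[j + 1] = prev[j] + 1
                if cur[j + 1] > best[2]:
                    n = cur[j + 1]
                    best = (i - n + 1, j - n + 1, n)
        prev = cur
    return best


def match_runs(edited, original):
    """Map edited words onto (start, end) index ranges of the original.

    Ranges are returned in the order they appear in the edited text.
    Assumption: the sections before and after the best run are each
    matched against the full original, which allows reordering.
    """
    if not edited:
        return []
    i, j, n = longest_common_run(edited, original)
    if n == 0:
        return []  # nothing in this section matches the original
    before = match_runs(edited[:i], original)
    after = match_runs(edited[i + n:], original)
    return before + [(j, j + n)] + after


def to_time_ranges(runs, timed_words):
    """Convert word-index ranges into (start_time, end_time) pairs.

    timed_words is a hypothetical list of (word, start, end) tuples.
    """
    return [(timed_words[a][1], timed_words[b - 1][2]) for a, b in runs]
```

The resulting time ranges would then be the clips to cut and concatenate in the second pass. A real script would likely also need to handle words the editor added or misspelled, which this sketch simply skips.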

I know, tools have been doing this for years now. I just didn't have one to hand, and now I do, and I couldn't have done this without whisper.

seinecle, last Sunday at 8:58 PM

I couldn't find it on the Play Store by searching for the app's name or the developer's name. If it is not just me, then your app is very hard to discover.

So I am installing it through the link you provided, which directed me to an "install success" page saying "your purchase is successful" even though your app is free. Another obstacle to adoption :-)

Lastly, the page did not tell me the app's size. Seeing what it does and the time it takes to download, I am afraid it could be huge? Third obstacle :-)

firefoxd, last Sunday at 9:41 PM

Pretty cool. I've downloaded it and lightly tested it. Works great.

I love the "free forever, no ads..." part, but it obscures what the app is for. Maybe start with "Speech to text transcription" to make it clearer.

Either way, that's just semantics. Great job.

figmert, last Sunday at 10:31 PM

It'd be nice to keep the voice recording too, as I noticed at least one thing that it transcribed wrong.

This way one can listen to the recording again, and correct such issues.

wosc, yesterday at 7:41 AM

That's very cool, I've been looking for a fully offline transcription app for quite a while. Thanks for building this! And thanks so much for providing an "import audio file" function, not just "record from mic" -- transcribing voice notes from various messenger apps is my main use case here.

Do you have any plans to support languages other than English?

figmert, yesterday at 10:29 AM

I just tried running this on a 30-minute meeting with some 10 people in it. It got to the end, then just bailed without transcribing. I also did not get any errors or anything.

buildcaptive, yesterday at 11:41 AM

@blazingbanana

We have a similar product in the construction space. Would love to talk to you about some of our challenges and possibly work together. Interested?

mysfi, last Sunday at 11:48 PM

I really liked Wispr Flow on my Mac, but my daily driver is Manjaro KDE. I have stitched together a bash script that copies the transcription (right now I am using Parakeet TDT 0.6B) to my clipboard. I would give this a try on Linux when it becomes available.

abdullahkhalids, last Sunday at 11:24 PM

Would you consider adding it to F-Droid?

twaldecker, yesterday at 9:35 AM

Nice app!

If I speak German, it translates the text into English. I didn't expect that.
