Hacker News

jspizziri yesterday at 4:09 PM

There's a really awesome site called Librivox [1], where volunteers narrate books that are in the public domain. Those recordings are also in the public domain (this is just part of the Librivox thing). The quality of those recordings (both the narration and the actual recording quality) varies quite a bit, and most of them aren't at a quality I'd expect people to pay for and thus aren't usable for me. I've spent hours and hours sorting through those recordings, finding the best ones (from a narration perspective) and then improving the recording/audio quality on them. Those recordings have mostly been made in the last 20 years, so they're not old recordings of the books. So, the value I add to the Librivox recordings is: curation/selection, audio enhancement, and a much better delivery mechanism (IMHO).

I'm also simultaneously building out our own library of original audio content by working with voice actors to get them recorded and proofread (this is a very expensive and time-consuming process, but also very fun). One of the hardest parts is honestly the proofing process. Once I get finished narration files, I have to compare the result with the actual script (as there are always mistakes) and request edits. I use whisper.cpp to transcribe them and then git and a few other scripts to compare the transcript with the actual book text.
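
To give a rough idea of what that comparison step can look like, here is a minimal sketch. It is not the actual scripts: it assumes the whisper.cpp transcript has already been saved as plain text, and it swaps in Python's difflib for git as the diffing tool. The function names (normalize, find_mismatches) are made up for illustration.

```python
import difflib
import re


def normalize(text):
    """Lowercase and split into words, dropping punctuation, so that
    formatting differences are not reported as narration mistakes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_mismatches(script_text, transcript_text):
    """Return (tag, script_words, transcript_words) for each span where
    the transcript differs from the script.

    tag is one of difflib's opcodes: 'replace', 'delete' (words in the
    script that are missing from the transcript), or 'insert' (extra
    words in the transcript).
    """
    script_words = normalize(script_text)
    transcript_words = normalize(transcript_text)
    # autojunk=False keeps very common words from being ignored
    # when comparing long texts.
    matcher = difflib.SequenceMatcher(
        None, script_words, transcript_words, autojunk=False
    )
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            mismatches.append(
                (tag, script_words[i1:i2], transcript_words[j1:j2])
            )
    return mismatches


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real recording.
    script = "It was the best of times, it was the worst of times."
    transcript = "It was the best of times it was the worse of times"
    for tag, expected, heard in find_mismatches(script, transcript):
        print(tag, "script:", expected, "transcript:", heard)
```

Comparing word by word after normalizing keeps punctuation and capitalization in the transcript from drowning out real misreads. Each reported mismatch still needs a human listen, since the transcription itself can be wrong.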

I'll also add that I _do not_ use AI Audio narration because it just doesn't sound good IMHO, and I personally hate listening to it. I regularly run experiments to see what the current state of the tech is and it's still pretty far from where it needs to be IMO. I also don't love the idea of AI swallowing absolutely everything.

I appreciate the feedback and compliment!

[1] https://librivox.org/


Replies

t0mk today at 8:58 AM

Thank you for describing the process and best of luck with soundreads!