Hacker News

wonger_ today at 2:47 AM

I've had this on my back burner! Glad to see someone else wants this as well. The Internet Archive already has transcripts generated by whisper.cpp, so it's (just) a matter of gathering them into one place and making a good search feature.

Example transcript (JSON): https://archive.org/download/episode_1109/episode_1109_whisp...
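For the "good search feature" part, here is a rough sketch of a word index over transcript segments. It assumes each episode has already been parsed into a list of segments with a `start` time and a `text` field; that shape, the function names, and the episode IDs are hypothetical, and the real whisper.cpp JSON would need to be mapped into it first.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(episodes):
    """Map each token to the set of (episode_id, segment_position) pairs containing it.

    `episodes` is assumed to look like:
        {episode_id: [{"start": seconds, "text": "..."}, ...]}
    """
    index = defaultdict(set)
    for episode_id, segments in episodes.items():
        for pos, segment in enumerate(segments):
            for token in tokenize(segment["text"]):
                index[token].add((episode_id, pos))
    return index


def search(index, episodes, query):
    """Return (episode_id, start, text) for segments containing every query word."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # A segment matches only if it appears in the posting set of every token.
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return [
        (ep, episodes[ep][pos]["start"], episodes[ep][pos]["text"])
        for ep, pos in sorted(hits)
    ]
```

This only matches exact words within a single segment. Phrase search, stemming, or ranking would need more work, or an off-the-shelf full-text engine.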

Screenshot from a month ago: https://wonger.dev/assets/chronicles-screenshot.png

I can reach out to their project to see if they're interested.


Replies

karpour today at 12:15 PM

The issue with Whisper is that it messes up the names of both products and guests. What we aim to do is generate new subtitles, but pass in all the product names and guest names as hints, so Whisper gets them right.
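One way the hinting idea could look, as a sketch only: collect the known names for an episode into a single prompt string and hand it to Whisper. The helper name and the `max_chars` cap are assumptions for illustration, not anything from the project. Whisper only uses a limited amount of prompt text, so the cap is a placeholder value.

```python
def build_hint_prompt(guests, products, max_chars=600):
    """Join known names into one comma-separated prompt string.

    Duplicates are dropped case-insensitively, input order is kept,
    and names are never cut in half: once the next name would exceed
    `max_chars` (a hypothetical budget), the remaining names are skipped.
    """
    seen = set()
    kept = []
    used = 0
    for name in [*guests, *products]:
        name = name.strip()
        key = name.casefold()
        if not name or key in seen:
            continue
        # Account for the ", " separator before every name except the first.
        cost = len(name) + (2 if kept else 0)
        if used + cost > max_chars:
            break
        seen.add(key)
        kept.append(name)
        used += cost
    return ", ".join(kept)
```

With the openai-whisper Python package, the resulting string could be passed as `initial_prompt` to `model.transcribe(...)`; whisper.cpp's CLI takes a similar string through `--prompt`. A prompt nudges the decoder toward these spellings, but it does not guarantee them.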