
frumiousirc last Sunday at 10:59 AM

This needs to be run from an environment where `pip` is available, since the abogen app calls that tool while it runs. Using `uv tool run abogen` gets you started, but the app then hangs at model install time. `uv venv && uv pip install pip && source .venv/bin/activate && abogen` lets it run properly.

Otherwise, it's a nicely packaged GUI. Well done!

I tried a PDF and the UI to select pages or sections is good and generation is fast on my laptop's GTX 1650.

The result is an .ogg audio file and an .ass subtitle file. Playing them with mpv lets you listen and read along in the terminal. The only issue I have with the result is that visual line breaks from the PDF are preserved, resulting in long pauses "randomly" in the middle of sentences. This greatly interrupts understanding of the audio.

Edit: enabling the skipping of single newlines helps!
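For anyone wondering what skipping single newlines amounts to, here is a minimal sketch of the idea in Python. It is not abogen's actual code; the function name `skip_single_newlines` is made up for illustration, and it assumes paragraphs are separated by blank lines while a lone newline is just a visual wrap from the PDF.

```python
import re


def skip_single_newlines(text: str) -> str:
    """Join visually wrapped lines; keep blank-line paragraph breaks.

    Hypothetical helper for illustration only, not abogen's implementation.
    """
    # Normalise Windows line endings so the regex only has to handle "\n".
    text = text.replace("\r\n", "\n")
    # A newline with no newline directly before or after it is treated as
    # a visual wrap and replaced with a space.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs that the join may have produced.
    return re.sub(r"[ \t]{2,}", " ", text)


sample = "This sentence was wrapped\nby the PDF layout.\n\nThis is a new paragraph."
print(skip_single_newlines(sample))
```

Note that this does not rejoin words hyphenated across a line break; that would need separate handling.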


Replies

nnashrat last Sunday at 12:51 PM

I just converted a 110-page book to wav in about an hour with an RTX 4060.

I didn't have the skipping of single newlines enabled though, so it was pretty useless.

Enabling it makes this pretty awesome.

af_heart is a great voice to me, while I find af_jessica annoying. That is the main issue I have with audiobooks: whether or not I happen to like the voice actor matters almost as much to me as what the book says.

I knew this day was coming soon and I really am blown away. I have gotten so used to audiobooks that it is hard for me to actually sit and read a full book. I have about 20 books to convert that would never have enough of a market for someone to bother reading them aloud, and now I can have them in a voice I really like. Incredible.