Interesting to implement this as a shell script.
Still: Using a line based protocol and base64 encoding the audio data? Not my first choice.
The README doesn't mention it, but I assume both parties have to be online at the same time?
Regarding encryption - what's the point? When communicating with a tor hidden service, the data is already encrypted.
Only starting the sending audio data after the speaker has stopped talking means much longer delays than necessary. Imagine someone talking for a minute.
To expound on the other questions.
To receive a call, you either need to be online and actively listening for calls, or optionally, you can enable auto listening. When another user calls you it will automatically put you in the call. On end call you will be put back in listening mode. I'm not really sure a great way to get around this without overly complicating it.
I believe because of the small overhead that's added there is just no reason not to layer encryption. At the end of the day I just wanted to see the bits I'm sending over the wire with my own eyes for assurance it's protected regardless of the fact that tor is protecting the data.
The streaming would be a nice improvement for latency. I would have to look into how this would work for the optional audio processing. Having one set file for transport also simplifys the some of the flow with encryption like salting and optional hmac authentication as these are derived from the sum of the entire file, not a portion of it.
The base64 encoding adds about 30% overhead. It's not ideal but it was a limitation of bash. Passing raw binary does not work in bash (or I couldn't get it to work).
the data is already encrypted
by the spooks that wrote it. no harm in having another turtle in the stack.