> You also don't want to send faster than real-time. If the user interrupts the model you just wasted a bunch of bandwidth sending 3 minutes of audio (but only played 10 seconds)
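You only need to send ~1 second at a time. There's no reason to send 20 ms or 10 minutes at a time. Both are stupid: tiny chunks just pile on per-message overhead, and huge ones throw away everything the user never hears after an interrupt.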
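To put rough numbers on it, here's a minimal sketch of pacing at about real time. The function names, the 1 second chunk size, and the 1 second of lead (how far ahead of playback you're willing to send) are all assumptions for illustration, not anything from a real API.

```python
def chunk_schedule(total_s, chunk_s=1.0, lead_s=1.0):
    """Return (send_at, start, end) tuples, all in seconds.

    Each chunk covers audio [start, end) and is sent lead_s before its
    playback start, so the sender never runs far ahead of the listener.
    """
    schedule = []
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)
        send_at = max(0.0, start - lead_s)  # hypothetical pacing rule
        schedule.append((send_at, start, end))
        start = end
    return schedule


def wasted_seconds(schedule, interrupt_at):
    """Seconds of audio already sent but never played if playback stops at interrupt_at."""
    sent_until = max(
        (end for send_at, _, end in schedule if send_at <= interrupt_at),
        default=0.0,
    )
    return max(0.0, sent_until - interrupt_at)


paced = chunk_schedule(180.0)                 # 3 minutes in 1 s chunks
dumped = chunk_schedule(180.0, chunk_s=180.0) # everything in one message

print(wasted_seconds(paced, 10.0))   # 2.0
print(wasted_seconds(dumped, 10.0))  # 170.0
```

With the user interrupting at 10 seconds, the paced sender has about 2 seconds of audio in flight to throw away, while the send-it-all-at-once version has 170.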