Low-latency inference is very useful in voice-to-voice applications. You say it's a waste of power, but their claim, at least, is that it's 10x more efficient. We'll see, but if that holds up it will definitely find applications.
This is not voice-to-voice, though; end-to-end voice chat models (the Her UX) are completely different.