Anthropic doesn't have any real-time multimodal audio models available; they just use STT and TTS models slapped on top of Claude. So they're currently the worst provider if you actually want to use voice communication.