> If you don't speak the local language anyway, you can't decode pronounced spoken local language names anyway
This is plainly not true.
> Multilingual doesn't mean language agnostic. We humans are always monolingual, just multi-language hot-swappable if trained
This and the analogy make no sense to me. Mind you, I am trilingual.
I also did not imply that the model itself needs to be multilingual. I implied that the software that uses the model to generate speech must be multilingual and support language change detection and switching mid-sentence.
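
To make the "detect and switch mid-sentence" part concrete, here is a rough sketch of what I mean, not how any particular TTS product does it. It splits a sentence into runs by Unicode script so that each run could be handed to a different voice. The function names `script_of` and `split_by_script` are made up for illustration, and script detection is only a crude stand-in for real language identification: it cannot tell apart languages that share a script (English and French, say), which would need an actual language-ID step.

```python
import unicodedata

def script_of(ch):
    """Return a coarse script label for a letter, or None for non-letters."""
    if not ch.isalpha():
        return None
    # unicodedata.name() yields e.g. "CYRILLIC SMALL LETTER A";
    # the first word serves as a rough script label (an approximation).
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def split_by_script(text):
    """Split text into (script, segment) runs.

    Spaces, digits, and punctuation carry no script of their own,
    so they stay attached to the run that is currently open.
    """
    runs = []
    current_script, buf = None, []
    for ch in text:
        s = script_of(ch)
        # A letter in a different script closes the current run.
        if s is not None and current_script is not None and s != current_script:
            runs.append((current_script, "".join(buf)))
            buf = []
        if s is not None:
            current_script = s
        buf.append(ch)
    if buf:
        runs.append((current_script, "".join(buf)))
    return runs

if __name__ == "__main__":
    for script, segment in split_by_script("Take the train to Москва today"):
        # A real pipeline would pick a voice per segment here (hypothetical step).
        print(script, repr(segment))
```

The software around the model would then route each segment to whichever voice or model handles that language, which is why the model itself does not have to be multilingual.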