It's a stylistic choice for sure. Anything a little better than that lands straight in the uncanny valley, and human-level is too high-latency and too expensive for us. We found that this level of crappy works great in practice, plus it runs on-device! We use Rhasspy Piper to generate them.
I would personally avoid voices that skew too close to the common TikTok AI TTS. Currently the heavy robots with the lower, bassier voices sell that clunky robot vibe much better, but some of the more generic voices immediately take me out of it.