Hello, Gabriel from Kyutai here. Maybe it's related to the way we chunk the text? Can you open an issue on GitHub with the exact text and voice? I'll take a look.