and demonstrate that the model doesn't completely change simple sentences
A nefarious model would work that way though. The owner wouldn't want it to be obvious. It'd only change the meaning of some sentences some of the time, but enough to nudge the user's understanding of the translated text to something that the model owner wants.
For example, imagine a model that detects the sentiment of text about Russian military action, and automatically translates it to something a more positive if it's especially negative, but only 20% of the time (maybe ramping up as the model ages). A user wouldn't know, and a someone testing the model for accuracy might assume it's just a poor translation. If such a model became popular it could easily shift the perception of the public a few percent in the owner's preferred direction. That'd be plenty to change world politics.
Likewise for a model translating contracts, or laws, or anything else where the language is complex and requires knowledge of both the language and the domain. Imagine a Chinese model that detects someone trying to translate a contract from Chinese to English, and deliberately modifies any clause about data privacy to change it to be more acceptable. That might be paranoia on my part, but it's entirely possible on a technical level.
That's not a technical problem though is it? I don't see legal scenarios where unverified machine translation is acceptable - you need to get a certified translator to sign off on any translations and I also don't see how changing that would be a good thing.