I have no doubt that the writer is better at translating than AI, but I have to say that AI translation has gotten so good that I'm not sure how much longer translation work will be there, or rather it might end up being more about auditing.
For example, I just read the Lawrence Ellsworth translation of The Three Musketeers, which I very thoroughly enjoyed. I don't speak or read French, but from my understanding Ellsworth's translation is considered one of the more accurate translations of the work.
Out of curiosity, I sic'd Claude Fable on the original French version of The Three Musketeers and told it to translate accurately, but also try and keep the same jovial tone as the original and do not censor anything. After it was done, I didn't read the entire output, but I did compare a few individual chapters between the Ellsworth translation and the Fable translation.
They were honestly remarkably similar. As far as I could tell, nothing was substantially different from the Ellsworth translation and the Fable translation. I do think that the prose for the Ellsworth translation was a bit better, but the prose for the Fable one was actually perfectly readable. Again, I don't speak French so I cannot say for sure, but I do not believe that I would have gotten a significantly different experience had I read the Fable version instead of the Ellsworth version.
Now, it's possible (and likely) that this is somewhat self-fulfilling; Fable might have been trained using Ellsworth's translation and as such it's very directly able to crib from it. Sadly, since I do not speak any language other than English, there's sort of a catch-22: the only way I can judge the accuracy of a translation is to compare it against other translations, but if other translations exist then they will likely influence the results, and if a translation doesn't already exist then I have no way of auditing it.
I'm still going to continue reading through Ellsworth's translations for the subsequent stories simply because that feels more canonical, and as I said I do think the prose was a bit better.
As somebody who regularly reads translated works, including the occasional machine translation (MTL), I can say that they (MTL) suck. You got a hugely biased result, which you recognize.
Translation is hard. If you're familiar with reading translations from specific languages, MTL works have a very specific smell to them. It's a bit hard to describe, but it's there. A good translation is miles (kilometers, for those outside of the US) above MTL.
That's not to say that the latest LLMs won't have better translation abilities, but they are generally crap currently. Maybe they are fine for something very short, but absolutely not for longer content.
> I did compare a few individual chapters between the Ellsworth translation and the Fable translation.
I'm pretty sure the Ellsworth translation is in the corpus. You basically instructed Claude to regurgitate it.
The LLMs all have the more famous books memorized. You can trick them into reciting them more or less word for word.
> Fable might have been trained using Ellsworth's translation and as such it's very directly able to crib from it
The `cp` program on my computer also has the remarkable ability to produce a faithful translation of The Three Musketeers when provided one as input.
> As far as I could tell, nothing was substantially different from the Ellsworth translation and the Fable translation.
Crucially the full translation was part of ChatGPT’s training set. Recall is a pretty solved problem in machine learning.
How well does it translate a French novel published yesterday? Where neither the original novel nor any translations are in the training set yet? Or might not even exist!
I tried asking ChatGPT to translate a letter I wrote in Slovenian this weekend. It got the general gist but missed a lot of the nuance. Completely missed several of the little touches of tone where the right choice of synonym conveys a whole bunch of information.
> Again, I don't speak French so I cannot say for sure
This reminds me of the adage that ChatGPT is really great at everything except my own work.
I see the difficulties more in other areas, such as technical translations, specialist books, user manuals, and translating UIs, where contextual information and a back and forth with the client is needed to clarify details, and (for user manuals and UIs) the translator has to put themselves in the mind of the user and has to consider the possible contexts and use cases.
Honestly, translations of fiction are themselves creative works, and the translator needs to really understand both cultures and needs to write cohesively throughout the work. I'm not sure this is even really a question of "can it translate" so much as "can it create a good work of fiction" which is a much higher bar. So maybe the model can mimic the style (especially given that it was probably trained on existing translations) but could it really do so from scratch in a way that is actually compelling? I'm not so sure.
Of course as for the poor OP... is this a majority of what working translators are paid to do?
I suspect a lot of translation is just grunt work - technical and business documents. The lack of a cohesive voice with considered style is perhaps not really much of an issue in those. The expectations are just much lower; text that conveys the basic meaning is a much lower bar to clear.
She's probably better than a bot at that stuff, at least for now, but my concern is that it won't be "enough" better for businesses to justify her continued employment. And this is my general feeling about this stuff across society, in basically all domains.
You're very likely to get a somewhat circular reference; the key (for me) is that for 90% of the usages, "standard translation LLMs" are just fine - I still recommend a translator but they're more of a proof-reader for both languages, catching where something slipped through.
This is sort of missing the point: people who don't deal with linguistics don't understand that there are multiple types of translation. There's word for word (which is what you're talking about) and sense for sense. If you let an LLM do all of your translation, you're letting it interpret huge amounts of intent and context it doesn't (and probably can't) access. The ways in which this impacts the translation will forever be unknown to you and in the worst case lost forever.
So I guess in the end it just matters how important the work is.
LLMs are now being aggressively manipulated for propaganda purposes. Powerful people have realized that people believe LLMs, and treat them as authoritative sources of fact.
The number of lies, lies by omission, deceptive distortions, and fallacious argument tactics they generate is absurd, and increasing rapidly. Translation, when done as a service you are paid for, can't rely on propaganda bots.
> … considered one of the more accurate translations of the work.
I think you’re missing a big point of translating literary works. A purely “accurate”, phrase-by-phrase translation is often not very good; the actual literary style, the feeling and the allusions and references, often get lost that way. A good translation of literary work requires a lot of deliberate choices by the translator to deviate from literal translations in ways that convey the style of the original, or an extra layer of meaning that would be lost by an “accurate” translation of a phrase. Also, being consistent with these choices matters a lot, which OP claims LLMs are less good at.
An interesting counter-example: https://xcancel.com/ValerioCapraro/status/206506665753442336...
> Out of curiosity, I sic'd Claude Fable on the original French version of The Three Musketeers and told it to translate accurately, but also try and keep the same jovial tone as the original and do not censor anything. After it was done, I didn't read the entire output, but I did compare a few individual chapters between the Ellsworth translation and the Fable translation.
This isn’t a great test, because Claude almost certainly has multiple translations of The Three Musketeers in its training data.