Read the last two paragraphs :)

tombert • yesterday at 6:40 PM • 3 replies

Replies

The thing is, this is almost certainly what's happening.

You can (or could; maybe they've 'fixed' it by now) get SOTA LLMs to reproduce entire novels near verbatim.

The idea of giving it parallel texts of those novels in different languages, to train it on translation, is so obvious it'd just be strange if the AI labs didn't do it.

In fact, DeepL was doing basically that more than 10 years ago.

Wowfunhappy • yesterday at 7:23 PM

Oops, I legitimately missed the second-to-last paragraph.

I still think there are better tests you could do. Ideally, you would choose a recently published book, one that came out after the model’s cut-off date, whose translation is considered good. But even something like The Girl With the Dragon Tattoo, which is not particularly new and by no means obscure, would be better than a famous work of literature like The Three Musketeers, which has many translations.


card_zero • yesterday at 7:19 PM

They say "yes, I admit it, this is all invalid".

