
wizzwizz4 | yesterday at 7:23 PM

> The core algorithm behind modern generative AI was developed specifically for translation

Indeed! And yet, generative AI systems wire it up as a lossy-compression / predictive-text model, which discreetly confabulates whatever it doesn't understand. Why not use a transformer-based architecture actually designed for translation? I'd much rather the model take its best guess (which might be useful, or might be nonsense, but will at least be conspicuous nonsense) than quietly substitute a different, less obviously nonsensical meaning entirely.

Bonus: purpose-built translation models are much smaller, can tractably be run on a CPU, and (since they require less data) can be built from corpora whose authors consented to this use. There's no compelling reason to throw an LLM at the problem, introducing multiple ethical issues and generally pissing off your audience, for a worse result.
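
To make the "small and CPU-friendly" point concrete, here's a minimal sketch, assuming Python with the Hugging Face transformers package and one of the Helsinki-NLP OPUS-MT Marian checkpoints (the model name is an illustrative pick, not a claim about its training data or licensing):

    # Minimal sketch: a purpose-built Marian translation model
    # (on the order of a few hundred MB on disk), running entirely on CPU.
    from transformers import MarianMTModel, MarianTokenizer

    model_name = "Helsinki-NLP/opus-mt-de-en"  # illustrative German->English checkpoint
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    batch = tokenizer(["Maschinelle Übersetzung ist ein altes Problem."], return_tensors="pt")
    output = model.generate(**batch)
    print(tokenizer.decode(output[0], skip_special_tokens=True))

Something in this class typically translates a sentence in a fraction of a second on an ordinary laptop CPU, with no GPU and no API call, which is a footprint a general-purpose LLM can't match.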