Hacker News

scrollaway 10/01/2024

On this, I just want to share my take-away from my translation engineering days: I fully believe the "right way to do it" is to have two string types: A regular string type and a "user-visible string" type, in a similar way that some frameworks expose a "safe string" vs "unsafe string" (for escaping purposes).
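
A minimal sketch of the two-type idea in TypeScript, assuming a branded type is an acceptable stand-in for a real second string type. All names here (`UserVisibleString`, `userVisible`, `render`, `catalog`) are hypothetical and only for illustration, and the catalog is toy data:

```typescript
// Hypothetical names throughout: UserVisibleString, userVisible, render, catalog.
// The "brand" makes this type distinct from plain string at compile time only.
type UserVisibleString = string & { readonly __brand: "UserVisibleString" };

// Toy catalog: locale -> (source text -> translation). Illustrative data only.
const catalog: Record<string, Record<string, string>> = {
  fr: { "Save changes": "Enregistrer les modifications" },
};

// The only sanctioned way to create a user-visible string.
// Falls back to the source text when no translation exists.
function userVisible(source: string, locale: string): UserVisibleString {
  const translated = catalog[locale]?.[source] ?? source;
  return translated as UserVisibleString;
}

// UI sinks accept only the branded type, never a plain string.
function render(label: UserVisibleString): string {
  return `<button>${label}</button>`;
}

const ok = render(userVisible("Save changes", "fr"));
// render("Save changes"); // compile error: string is not assignable to UserVisibleString
```

Because `render` rejects plain strings, any literal that reaches the UI without going through `userVisible` shows up as a type error rather than as an untranslated label in production.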

User-visible strings are consistently translatable, and the translation mechanism needs deep access to the language for this. I think in TypeScript this is fairly doable given the AST access you yourself make use of. I'll gladly dig into how you do this on your end; I'm guessing it's somewhere along those lines, but not quite?

Incidentally, when you have two string types, it becomes fairly straightforward to detect strings that probably should be translated but aren't tagged as such. Most strings are like this, in fact, because string constants tend to be a bad idea in general (vs. numeric constants), and string operations tend to be a bad idea in the context of i18n (you want to use templated strings instead). So you tag until you only have a few left over.
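
To illustrate the point about templated strings versus string operations, here is a rough sketch using a tagged template. The names `Message`, `msg`, and `format` are hypothetical, and the German translation is sample data. The idea is that the whole sentence stays one translatable unit with numbered placeholders, so a translator can reorder the arguments, which concatenation does not allow:

```typescript
// Hypothetical helper names: Message, msg, format.
interface Message {
  id: string;       // template with numbered placeholders, used as the catalog key
  args: unknown[];  // values substituted at render time
}

// Tagged template: msg`Hello, ${name}!` -> { id: "Hello, {0}!", args: [name] }
function msg(strings: TemplateStringsArray, ...args: unknown[]): Message {
  const id = strings.reduce(
    (acc, part, i) => acc + part + (i < args.length ? `{${i}}` : ""),
    "",
  );
  return { id, args };
}

// Look up the translated template and fill in placeholders,
// falling back to the source template when no translation exists.
function format(m: Message, translations: Record<string, string>): string {
  const template = translations[m.id] ?? m.id;
  return template.replace(/\{(\d+)\}/g, (_, n) => String(m.args[Number(n)]));
}

const m = msg`${"Ana"} sent you ${3} files`;
// Sample translation that swaps the order of the two arguments.
const de = { "{0} sent you {1} files": "Du hast {1} Dateien von {0} erhalten" };
const out = format(m, de);
```

With concatenation (`name + " sent you " + count + " files"`), the translator would only ever see the fragments, and the argument order would be fixed by the source language.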


Replies

Burj 10/02/2024

Yeah, this tracks! The steps are basically 1) determine user-facing elements, 2) determine strings, 3) map user-facing elements to the strings. (We use the AST and no LLMs for this.)

The upside of this approach is that we get a lot of context for accurate translation. The other upside is that down the line we can pull off fully automatic translation, but as others have pointed out, this is more of a gimmick. We think it's cool, but it's more like the cherry on top.

Also, yeah, that pattern would make life infinitely easier. Most developers really should think like this already, and not mix user-facing strings with strings for other logic. But from what I've seen, pre-i18n, devs don't think like this. Someday...