Hacker News

gcr today at 12:37 PM | 3 replies

This confused me too but the formula and rules for variants are specified by the configured language out-of-band, so there is support for this.

Let's take your example. In English, counting files looks like this:

    You have {file_count, plural,
       =0 {no files}
       one {1 file}
       other {# files}
    }

In Polish, there are several possible variants depending on the count:

    Masz 1 plik
    Masz 2,3,4 pliki
    Masz 5-21 pliko'w
    Masz 22-24 pliki
    Masz 25-31 pliko'w

Your Polish translators would write:

    Masz {file_count, plural,
       one {# plik}
       few {# pliki}
       other {# pliko'w}
    }

The library (and your translators) know that in Polish, the `few` variant kicks in when `i%10 = 2..4 && i%100 != 12..14`, etc. I think the library just knows these rules for each language as part of the standard. Mozilla says that it was an explicit design goal to put "variant selection logic in the hands of localizers rather than developers".
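To make the selection step concrete, here is a minimal Python sketch of what a library might do for Polish, restricted to non-negative integers. The function names `polish_plural_category` and `select_variant` are made up for illustration and are not any real library's API. One detail worth flagging: CLDR also defines a `many` category for Polish, and integers such as 5 or 12 land there. The message above still works because MessageFormat falls back to `other` when the selected category has no variant.

```python
def polish_plural_category(n: int) -> str:
    """CLDR plural category for a non-negative integer in Polish.

    Hand-written from the CLDR rules for illustration; a real library
    reads these rules from its bundled locale data.
    """
    if n == 1:
        return "one"
    # few: i % 10 = 2..4 and i % 100 != 12..14
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    # Every remaining integer is "many"; "other" only covers fractions.
    return "many"


def select_variant(n: int, variants: dict[str, str]) -> str:
    """Pick the variant for n, falling back to "other" if the category is absent."""
    category = polish_plural_category(n)
    template = variants.get(category, variants["other"])
    return template.replace("#", str(n))


# Variants as in the message above (spelled with ó, as a reply below points out).
variants = {"one": "# plik", "few": "# pliki", "other": "# plików"}

for n in (1, 3, 5, 12, 22, 25):
    print("Masz", select_variant(n, variants))
```

Translators who want to be explicit could add a `many {# plików}` variant instead of relying on the fallback.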

The point is that it's supported, it simplifies developer logic, and your translators know how to work with it.

See https://www.unicode.org/cldr/charts/48/supplemental/language...

(Apologies if I got the above translation strings wrong, I don't speak Polish. Just working from the GNU gettext example.)


Replies

yorwba today at 1:49 PM

"the library just knows these rules for each language as part of the standard" sounds great until you try to support a small minority language that the library just doesn't know about and then you're left trying to hack around it by pretending that it's actually a regional variety of another language with similar plural rules.

AFAIK, unlike gettext, MessageFormat doesn't allow you to specify a formula for the plural forms as part of the localization data, so the variant selection logic ended up in the hands of library developers rather than localizers or application developers.

And the standard does get updated occasionally, which can also lead to bugs with localization data written against another version of the standard: https://github.com/cakephp/cakephp/issues/18740
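For comparison, this is roughly what "a formula as part of the localization data" looks like on the gettext side. The header string is the Polish `Plural-Forms` entry from the GNU gettext manual. Everything else (`polish_plural_index`, `MSGSTR`, `ngettext_pl`) is a hypothetical hand translation for illustration. A real gettext implementation parses the C expression out of the catalog header rather than hard-coding it.

```python
# Header value as it would appear in a Polish .po file
# (formula taken from the GNU gettext manual).
PLURAL_FORMS_HEADER = (
    "nplurals=3; "
    "plural=n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;"
)


def polish_plural_index(n: int) -> int:
    """Hand translation of the C expression in PLURAL_FORMS_HEADER."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# msgstr[0], msgstr[1], msgstr[2] as a translator would supply them.
MSGSTR = ["Masz %d plik", "Masz %d pliki", "Masz %d plików"]


def ngettext_pl(n: int) -> str:
    """Select the translated string by the index the formula yields."""
    return MSGSTR[polish_plural_index(n)] % n


for n in (1, 2, 12, 22, 25):
    print(ngettext_pl(n))
```

Because the formula travels with the catalog, a language the library has never heard of only needs a new header line, which is the flexibility described above. The cost is that the application has to evaluate an expression supplied by the catalog.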

Muromec today at 5:22 PM

>This confused me too but the formula and rules for variants are specified by the configured language out-of-band, so there is support for this.

Well, making it out-of-band sure is one way to prevent lazy people from doing eval on plural forms from the po file. I hope the library is actually good then.

npodbielski today at 1:00 PM

usually it is ó instead of o' but otherwise very good :)