Introduction
These tools convert CLDR data to and from the XMB message format. XMB is essentially a flat list of key-value pairs, with no deeper structure. It does have a mechanism for named placeholders, with descriptions and examples. The messages for every other language must correspond 1:1 with those of English.
The goal is to allow bulk translation of CLDR files via existing translation tooling.
Examples:
ENGLISH
<msg id='615EB568A2478EAF' desc='The name of the country or region with BCP47 code = UZ. Before translating, please read cldr.org/translation.'
>Uzbekistan</msg>
<!-- English: MMMM d, y -->
FRENCH
<!-- English: Uzbekistan -->
<!-- English: MMMM d, y -->
<msg id='5D6EA98708B9B43B'
><ph name='DAY_1_DIGIT'><ex>9</ex>d</ph> <ph name='MONTH_LONG'><ex>September</ex>MMMM</ph> <ph name='YEAR'><ex>2010</ex>y</ph></msg>
The id is the same across languages. The translator sees the description, the placeholder names, and the placeholder examples (<ex>), as well as the text between placeholders. The translator can reorder the placeholders but cannot remove or add any.
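The reordering rule can be checked mechanically. The following is a minimal sketch, not code from the CLDR tools: the class PlaceholderCheck and its methods are hypothetical, the sample strings are illustrative, and it assumes placeholders are written as <ph name='...'> with single-quoted names, as in the examples above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the CLDR tooling.
class PlaceholderCheck {
    // Matches the opening tag of a placeholder, e.g. <ph name='YEAR'>.
    private static final Pattern PH_NAME = Pattern.compile("<ph name='([^']+)'>");

    // Returns the placeholder names found in a message body, sorted so order is ignored.
    static List<String> placeholderNames(String messageBody) {
        List<String> names = new ArrayList<>();
        Matcher m = PH_NAME.matcher(messageBody);
        while (m.find()) {
            names.add(m.group(1));
        }
        Collections.sort(names);
        return names;
    }

    // True if the translation uses exactly the placeholders of the English message.
    static boolean samePlaceholders(String english, String translated) {
        return placeholderNames(english).equals(placeholderNames(translated));
    }

    public static void main(String[] args) {
        String english = "<ph name='MONTH_LONG'><ex>September</ex>MMMM</ph> "
                + "<ph name='DAY_1_DIGIT'><ex>9</ex>d</ph>, "
                + "<ph name='YEAR'><ex>2010</ex>y</ph>";
        String reordered = "<ph name='DAY_1_DIGIT'><ex>9</ex>d</ph> "
                + "<ph name='MONTH_LONG'><ex>September</ex>MMMM</ph> "
                + "<ph name='YEAR'><ex>2010</ex>y</ph>";
        String missingYear = "<ph name='DAY_1_DIGIT'><ex>9</ex>d</ph> "
                + "<ph name='MONTH_LONG'><ex>September</ex>MMMM</ph>";

        System.out.println(samePlaceholders(english, reordered));   // reordering is accepted
        System.out.println(samePlaceholders(english, missingYear)); // a removed placeholder is rejected
    }
}
```

Comparing sorted name lists rather than sets means a placeholder that is duplicated in the translation is also reported as a mismatch.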
The main tool for converting CLDR to this format is GenerateXMB.java. It reads the en.xml file and builds an EnglishInfo object that maps paths to descriptions and placeholders. It also generates the English XMB file for translation. Next, each of the other CLDR locale files is read, and its data is used to populate a separate XTB file for translation memory.
Files:
Others are at xmb/. The documentation files are at http://cldr.org/translation.
The tool generates log files during processing, targeted at development and debugging.
Examples:
Placeholders
The tool replaces the placeholders ("{0}", "MMM", etc.) in patterns with variable names that carry examples. This is data-driven, using the file xmbPlaceholders.txt.
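A minimal sketch of this substitution step, under stated assumptions: it uses a hypothetical in-memory table rather than the actual xmbPlaceholders.txt format, the class and method names are invented for illustration, and it ignores quoting rules in date patterns and XML escaping of literal text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration; not the implementation used by GenerateXMB.java.
class PlaceholderSubstitution {
    // Name and example shown to the translator for one pattern token.
    static final class Placeholder {
        final String name;
        final String example;

        Placeholder(String name, String example) {
            if (name.matches(".*\\s.*")) {
                throw new IllegalArgumentException("Placeholder name cannot contain spaces: " + name);
            }
            this.name = name;
            this.example = example;
        }
    }

    // Wraps every known token of the pattern in a <ph> element with its name and example.
    static String substitute(String pattern, Map<String, Placeholder> table) {
        if (table.isEmpty()) {
            return pattern;
        }
        // Try longer tokens first so that "MMMM" is not split into shorter tokens.
        String alternation = table.keySet().stream()
                .sorted((a, b) -> b.length() - a.length())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile(alternation).matcher(pattern);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(pattern, last, m.start()); // literal text between tokens
            Placeholder ph = table.get(m.group());
            out.append("<ph name='").append(ph.name).append("'><ex>")
               .append(ph.example).append("</ex>").append(m.group()).append("</ph>");
            last = m.end();
        }
        out.append(pattern, last, pattern.length());
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Placeholder> table = new LinkedHashMap<>();
        table.put("d", new Placeholder("DAY_1_DIGIT", "9"));
        table.put("MMMM", new Placeholder("MONTH_LONG", "September"));
        table.put("y", new Placeholder("YEAR", "2010"));
        System.out.println(substitute("d MMMM y", table));
    }
}
```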
Format:
The name cannot contain spaces.
Example:
Filtering and descriptions
Filtering and descriptions are data-driven, using the file xmbHandling.txt.
Format:
path_regex ; description
Example:
^//ldml/dates/timeZoneNames/metazone\[@type=".*"]/commonlyUsed ; SKIP
^//ldml/dates/timeZoneNames/zone\[@type=".*"]/exemplarCity ; The name of a city in: {0}. See cldr.org/xxxx.
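The two example lines above can be read by a small rule table. This is a sketch under assumptions, not the actual parser: the class PathHandling is hypothetical, the first matching rule is assumed to win, a description of SKIP is taken to mean the path is filtered out, and the {0} argument is left unfilled because its source is not described here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical illustration of reading "path_regex ; description" lines.
class PathHandling {
    static final class Rule {
        final Pattern pathPattern;
        final String description;

        Rule(Pattern pathPattern, String description) {
            this.pathPattern = pathPattern;
            this.description = description;
        }
    }

    // Parses each non-empty line, splitting at the first semicolon.
    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int sep = trimmed.indexOf(';');
            if (sep < 0) {
                throw new IllegalArgumentException("Missing ';' in line: " + line);
            }
            String regex = trimmed.substring(0, sep).trim();
            String description = trimmed.substring(sep + 1).trim();
            rules.add(new Rule(Pattern.compile(regex), description));
        }
        return rules;
    }

    // Returns the description template of the first rule whose regex is found in the path.
    static Optional<String> describe(String path, List<Rule> rules) {
        for (Rule rule : rules) {
            if (rule.pathPattern.matcher(path).find()) {
                return Optional.of(rule.description);
            }
        }
        return Optional.empty();
    }

    // Assumed reading of SKIP: the path is not offered for translation.
    static boolean isSkipped(String path, List<Rule> rules) {
        return describe(path, rules).map("SKIP"::equals).orElse(false);
    }

    public static void main(String[] args) {
        List<Rule> rules = parse(Arrays.asList(
                "^//ldml/dates/timeZoneNames/metazone\\[@type=\".*\"]/commonlyUsed ; SKIP",
                "^//ldml/dates/timeZoneNames/zone\\[@type=\".*\"]/exemplarCity ; The name of a city in: {0}. See cldr.org/xxxx."));
        String city = "//ldml/dates/timeZoneNames/zone[@type=\"Asia/Tashkent\"]/exemplarCity";
        String metazone = "//ldml/dates/timeZoneNames/metazone[@type=\"Uzbekistan\"]/commonlyUsed";
        System.out.println(describe(city, rules).orElse("(no rule)"));
        System.out.println(isSkipped(metazone, rules));
    }
}
```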
Plurals
Plurals are represented with ICU syntax, such as:
<msg id='4AC13E2DA211C113' desc='[ICU Syntax] The pattern used to compose plural for week, including abbreviated forms. These forms are special! Before translating, see cldr.org/translation/plurals.' >{LENGTH, select, abbreviated {{NUMBER_OF_WEEKS, plural, =0 {0 wks} =1 {1 wk} zero {# wks} one {# wk} two {# wks} few {# wks} many {# wks} other {# wks}}} other {{NUMBER_OF_WEEKS, plural, =0 {0 weeks} =1 {1 week} zero {# weeks} one {# week} two {# weeks} few {# weeks} many {# weeks} other {# weeks}}}}</msg>
TODO