Information Hub for Linguists

This page was last modified on: May 24, 2019
The pages listed to the left provide guidelines for translation of CLDR strings. For an overview of the tools, please read the Survey Tool Guide before starting.

Data stability

Please be mindful of data stability by carefully reviewing previously Approved data. When a value is clearly incorrect, it should be changed; but for data stability, don't change a field whose value is already acceptable (even if not optimal). When you have evidence that a variant is much better than the existing Approved data and is in customary use, use the Forum to open a discussion and gain consensus before changing Approved values.

Current Survey Tool stage: Shakedown

The Survey Tool is now in the Shakedown stage. All data submitted during this stage is treated as real data and will be saved and used. Please see the Milestone Schedule in the left navigation for the full schedule.


What's new in this release cycle

Survey Tool

  • The star label is now referred to as the "Baseline". The Baseline is the value that was either last released or last modified by the technical committee. (#11857)
  • Performance enhancements
    • We continue to work on improving the performance of the Survey Tool, and plan to roll out additional improvements during the submission and vetting periods.
    • You should be disconnected from the Survey Tool less often (and more predictably).
    • The data refresh of the Information Panel should be faster.
    • Each locale should load faster.
    • For this release, your feedback on the log-out timing and on the data refresh in the Information Panel was taken into account in the performance work.
  • The following languages have been raised to the 8-vote Approval level:
    • Amharic, Irish, Georgian, Kazakh, and Kyrgyz. (#12032)
  • Not new, but a reminder: the English column in the Survey Tool shows additional icons, "i" for additional information and "e" for an example.

New data

Approximately 200 new data items were introduced (the exact number will vary by locale).
  • Islamic calendar
    • For the following regions (and associated languages), there is increased coverage of Islamic calendar elements (per CLDR-10676): era and month names are now in basic coverage, standard date formats are now in moderate coverage, and flexible date/time formats are now in modern coverage.
      • AZ Azerbaijan
      • ID Indonesia
      • PK Pakistan
      • SO Somalia
      • TD Chad
      • TJ Tajikistan
      • TR Turkey
      • UZ Uzbekistan
  • Units 
    • New units and patterns (CLDR-11910, CLDR-11454)
      • Addition of a "times" pattern (default = {0}⋅{1}) for compound units like foot⋅pound or newton⋅meter.
      • Addition of US therm (energy), decade (duration), pascal and bar (pressure). (Todo-M: link to Wiki)
    • New units in new “graphics” category (CLDR-9996):
      • em: Typographic length equal to a font’s point size.
      • pixel (px) and megapixel (MP): Used for counting the individual elements in a bitmap image; in some contexts pixel means 1⁄96 inch, but that is not the intended usage here.
      • pixel-per-centimeter (ppcm) and pixel-per-inch (ppi): Typically used for indicating display resolution.
      • dot-per-centimeter (dpcm) and dot-per-inch (dpi): Typically used for indicating printer resolution.
  • English names updates (Todo-M: update with the actual information after the cleanup)
  • Emoji
    • Names and search keywords for the draft candidate emoji for Unicode 13.0.
      • While not final, translating the bulk of the emoji during this cycle allows us to speed up the process. (We may include Emoji 13 in the fall collection cycle again if necessary.)
    • Remember to look for Translation quality issues for emoji (below). 
  • New locales added for data contributions
    • Osage (osa)
    • Irish for United Kingdom (ga_GB) 
    • Creek (mus)
  • North Macedonia
    • The English name of the country was changed to "North Macedonia", and the name in other languages should be consistent with that (using a term for "North"). Most languages were updated in the last release in post-processing, and "alt" names have been removed.
    • Please verify the name in your language, and update it to the new name if the change is not yet reflected in your language.
    • The "alt" names will be removed in post-processing for all languages where the main name has been confirmed.
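As an illustration of the "times" pattern listed under Units above, the following is a minimal sketch of how a two-placeholder pattern combines two unit names. Only the default pattern string {0}⋅{1} comes from this page; the function name, the constant name, and the reordered pattern in the usage note are hypothetical and are not part of CLDR or the Survey Tool.

```python
# Default "times" pattern as quoted on this page.
TIMES_PATTERN = "{0}⋅{1}"


def compound_unit(first: str, second: str, pattern: str = TIMES_PATTERN) -> str:
    """Fill the two placeholders of a "times" pattern with unit names.

    Hypothetical helper for illustration only: {0} receives the first
    unit name and {1} the second.
    """
    return pattern.format(first, second)


if __name__ == "__main__":
    print(compound_unit("newton", "meter"))  # newton⋅meter
    print(compound_unit("foot", "pound"))    # foot⋅pound
```

A locale that needs different punctuation or ordering would supply its own pattern; for instance, the made-up pattern "{1}-{0}" passed as the third argument would place the second unit first.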

Translation quality

Please review the following areas to improve translation quality before starting.
  • Timezones.
    • Please focus on Timezone name quality issues, checking for inconsistencies between the names of countries (regions) and the names of timezones. For example, if "Macau" is the spelling used for the region and "Macao" is the spelling used for the timezone, that's a problem.
    • A list of the data that overlaps between Timezone and Territory names is available in this public spreadsheet. Use this spreadsheet as a reference when working on Timezone names, and make Timezone names consistent wherever they are also found in Territory names. [Same workaround as v34]
  • AM/PM
    • For locales that use 24-hour time as the standard format, AM/PM data fields are difficult to handle. Translations of AM/PM may be more confusing than the English strings.
    • If the English AM/PM strings are more commonly understood, vote to inherit the English AM/PM strings. (Related tickets: Hindi #11417, German #10789)
    • If translations of AM/PM are commonly understood in your locale, use the translations.
  • Falklands/Malvinas translation consistency (#11526)
    • In some languages / locales, we have found that the handling of primary/secondary names was incorrect. In some, Falklands is the primary name and Malvinas is secondary; in others it is reversed.
    • Please review the translations for consistency, and check which name should be primary for your language / locale.
    • See additional details in the Translation guide: Geopolitical sensitive names.
  • Avoiding English
    • For items that do not work in your language, please don't simply use English. Find a solution that works for your language. For example, if your language doesn't have a concept of "quarters", use a translation that describes the concept "three-month period" rather than “quarter-of-a-year”.
    • For example, a number of Pashto items were found to be in English and have been removed. Please correct the situation and supply the missing data, reviewing the others for consistency. (#11565)
  • Emoji names and search keywords
    • Not simple translations
      • Remember that the character names and keywords are not translations.
      • They are so-called transcreations, and may be completely different from translations. Don't simply translate the English; use terms that people would use to describe the image (which will show up in the Information Panel on the right).
      • Moreover, there may be more or fewer keywords than in English.
    • Gender-neutral
      • Many of the new items are gender-neutral / gender-inclusive, such as "farmer". 
      • The label you use must apply to either men or women in that role, and must have a different name than the "man" version or "woman" version. 
      • Look at the existing gender-neutral terms, like person running / man running / woman running.
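The Timezone consistency check described above (for example "Macau" versus "Macao") can be approximated with a short script. This is only a sketch under stated assumptions: the function name, the dictionaries, and the list of overlapping pairs are hypothetical sample data standing in for your locale's Territory names, Timezone names, and the rows of the public spreadsheet; it does not read from the Survey Tool.

```python
def find_name_mismatches(territory_names, timezone_names, overlaps):
    """Return one tuple per overlapping pair whose two names differ.

    territory_names: territory code -> translated territory name
    timezone_names:  timezone ID -> translated timezone (city) name
    overlaps:        (territory code, timezone ID) pairs that should match
    Pairs with a missing name on either side are skipped.
    """
    mismatches = []
    for territory_code, zone_id in overlaps:
        territory = territory_names.get(territory_code)
        zone = timezone_names.get(zone_id)
        if territory is not None and zone is not None and territory != zone:
            mismatches.append((territory_code, zone_id, territory, zone))
    return mismatches


# Hypothetical sample data for illustration.
territory_names = {"MO": "Macau", "SG": "Singapore"}
timezone_names = {"Asia/Macau": "Macao", "Asia/Singapore": "Singapore"}
overlaps = [("MO", "Asia/Macau"), ("SG", "Asia/Singapore")]

if __name__ == "__main__":
    for row in find_name_mismatches(territory_names, timezone_names, overlaps):
        print(row)  # ('MO', 'Asia/Macau', 'Macau', 'Macao')
```

An exact string comparison is the simplest choice here; it will also flag pairs that differ only by intended grammatical variation, so each reported row still needs a human decision.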

Translation guides: updated sections

  1. Survey Tool Guide:
    1. Import old votes was updated to include steps on how you can import non-winning votes.
    2. Icons was updated for the "baseline" explanation.

Known Issues

Please review this list before getting started to avoid creating duplicate tickets. This list will be updated as fixes are made available in production. If you hit a problem, please file a ticket.
  1. Shakedown issue: English name changes has 113 emoji name changes. This is not accurate and will be updated shortly. (Todo-M: Remove this known issue after GenerateBirth) 

Resolved Issues

    Issues previously listed under Known Issues that have been resolved:

    TBD: Will be updated for v36 as necessary.

    Survey Tool Stages 

    Shakedown

    The Survey Tool is live and all data that you enter will be saved and used. You can start work, but there may be additional fixes during this period, so the tool may be taken down for updates more frequently than after we exit Shakedown. During Shakedown, your participation in looking for issues with the Survey Tool is essential. If you find any problems in the tool, please file a ticket.

    Submission

    Make sure your coverage level is set correctly at the top of the page.

    There are two types of releases: full, and limited-submission. 

    Version 36 is a full-submission release.

    For a limited-submission release, the Survey Tool will only let you add or vote in certain rows. What you can do depends on your locale:
      1. Newly targeted locales: proceed with Submission (General). 
      2. Other targeted locales: proceed with Submission (General), but start with the Dashboard step and focus on Errors*, Missing†, and English Changed.
      3. Other locales: go to the Dashboard and deal with any Errors*.
    * Note that if the committee finds systematic errors in data, new tests can be added during the submission period, resulting in new Errors.

    If you want to know which locales are in which categories, see Targeted Locales.

    Submission (General)

    Make sure your coverage level is set correctly at the top of the page.
     
    For new locales or ones where the goal is to increase the level, it is best to proceed page-by-page starting with the Core Data section. At the top of each page you can see the number of items open on the page. Then scan down the page to see all the places where you need to vote (including adding items). Some 

    Then please focus on the Dashboard view, first getting all Missing† items entered, and then addressing any remaining Errors* and reviewing the English Changed items (fixing your language if necessary).

    * Note that if the committee finds systematic errors in data, new tests can be added during the submission period, resulting in new Errors.
    † Among the Missing are new items for translation. (On the Dashboard, New means winning values that have changed since the last release.)

    If you are working in a sub-locale (such as fr_CA), coordinate with others on the Forum to work on each section after it is done in the main locale (fr). That way you avoid additional work and gratuitous differences. See voting for inheritance vs. hard votes in the Survey Tool Guide.

    Vetting

    All contributors are encouraged to move their focus to the Dashboard view, and:
    1. Resolve all of the Errors.
    2. Review all items in the Forums that don't show consensus yet, and try to resolve them by posting relevant information.
    3. Consider others' opinions by reviewing the Disputed and Losing items. See guidelines for handling Disputed and Losing.
    4. Review the items that are Flagged for TC and provide comments if you have information that should be considered.
    To see the Flagged items, go to the Gear dropdown and look under Forum for Flagged items.

    Resolution

    The vetting is done, and further work is being done by the CLDR committee to resolve problems. You should periodically take a couple of minutes to check your Forums to see if there are any questions about language-specific items that came up.

    Targeted Locales

    The categories of locales are based on the following:

    Newly targeted locales:
    Other targeted locales:
    • CLDR targets: the 82 languages listed in the Locale Coverage chart with Modern, Moderate or Basic in the CLDR target column (excluding newly targeted), and certain of their regional locales.
    • Highly active communities: Cherokee, Scottish Gaelic, Faroese; other locales with >95% modern coverage in the last release.
    Other locales:
    • All other locales