Information Hub for Linguists

2020-05-29 CLDR v38 shakedown start

The pages listed to the left provide guidelines for translation of CLDR strings. For an overview of the tools, please read the Survey Tool Guide before starting. 


Current Survey Tool stage: v38 is open for Shakedown 
Please refer to the Milestone Schedule in the left navigation for the detailed schedule. Note, however, that the committee will refine the exact dates as it looks across different needs and availability.

See Shakedown phase for details. Shakedown is an invitation-only period; if you have not been invited, we recommend that you wait for General submission.

Shakedown participants, please send your feedback on:
  1. Survey Tool issues that you have not encountered before (e.g., whether login, voting, the Dashboard, performance, and notification emails work as they did in your previous experience).
  2. Your experiences with new Forum features, and any bugs that you may encounter.
  3. Any difficulties with new data points, including whether you need additional translation tips on how to handle new data in your language.
  4. Clarity of the translation guides. Please note the topics that have been updated for this release.
  5. Any feedback on the contents of this Information Hub page. 

Prerequisites

  1. Know the Data stability expectations.
  2. Know the topics under @Getting Started so that you are familiar with what you may encounter when working in the Survey Tool.
  3. The @General translation guides set out the customary expectations for all vetting work.
  4. Please bookmark this page (Information hub for linguists), visit it every other day, and check for news at the top. The information on this page will be updated at least weekly.

What's new in this cycle

  • If you are new to CLDR contribution, please read the prerequisites above first.
  • If you have contributed to CLDR in the past, the information below is new or has changed since the last release.

Notation

💡 marks important translation tips
green marks items that need special attention
yellow marks new or changed text

Survey Tool 

  • In the Dashboard, the category "New" was misleading and has been renamed to "Changed" to describe the category accurately. See details under Changed in the Survey Tool Guide.
  • The Forum feature in the Survey Tool has had major updates to incorporate a workflow.
    💡 Please read the details under Forum in the Survey Tool Guide.
    The enhancements include:

New data 

The following new data have been added for data collection in this release.

  • Unicode Symbols [CLDR-13705]
    • There are ~100 new symbols under Characters\Symbols2. 
    • 💡 Translation Tips for Unicode symbols:
      • To find established names in your language, use common research methods or translator applications.
      • Use Wikipedia documentation of Unicode symbols when available.
      • Research the symbol using the symbol itself (e.g. ‰), its name (per mille), or its Unicode code point (U+2030).
      • You can copy/paste the symbols shown in the Code column in the Survey Tool into your preferred search method.
      • On Windows, you can convert a symbol to its Unicode code point by selecting the symbol in an editor application (e.g. Word) and pressing Alt+X.
      • Search examples: Wikipedia for per mille, Google search for ‰, Bing search for U+2030, or Wiktionary.
    • There are also ~200 additional symbols under the comprehensive coverage level if you have the time to contribute additional data. 
  • Emoji 13.1 [CLDR-13779]
  • Compact decimals and Units. A small amount of new data is also requested for compact decimals and units.
  • New CLDR target languages
    • Basic Level: Dogri, Navajo, Sanskrit
    • Moderate Level: Norwegian Nynorsk
  • Inflections.  [CLDR-13756] For a limited number of locales and units of measurement, we are adding support for inflections for noun case and gender. The following is the limited set of locales in v38:

    Grammatical Feature    Locales
    Case                   pl, ru, de, hi
    Gender                 pl, ru, de, nb, da, sv, hi, es, fr, it, nl, pt

    Not all units will have the extra information: only a subset of about 75. For these units, many (but not all) forms have “seed” data, marked as provisional. Before starting, be sure to read Grammatical Inflection for instructions if your locale is one of the above.
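
The symbol, name, and code point lookups described in the Unicode symbols tips above can also be done with a short script. The following is a minimal sketch in Python using only the standard `unicodedata` module; the function names `describe_symbol` and `symbol_from_code_point` are illustrative and are not part of the Survey Tool. The name it returns is the English Unicode character name, which is only a starting point for research and not a translation.

```python
import unicodedata


def describe_symbol(symbol):
    """Return the code point and English Unicode name for one character."""
    if len(symbol) != 1:
        raise ValueError("expected exactly one character")
    return {
        "symbol": symbol,
        # Format as U+XXXX, the notation used when searching for a code point.
        "code_point": "U+{:04X}".format(ord(symbol)),
        # Fall back to a placeholder for characters that have no name.
        "name": unicodedata.name(symbol, "<unnamed>"),
    }


def symbol_from_code_point(code_point):
    """Convert a string such as 'U+2030' back to the character itself."""
    hex_digits = code_point.upper().replace("U+", "")
    return chr(int(hex_digits, 16))


if __name__ == "__main__":
    info = describe_symbol("\u2030")
    print(info["symbol"], info["code_point"], info["name"])
```

Either the code point or the name printed by the script can then be pasted into your preferred search method.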


Translation quality

The following are areas where we have seen data quality issues or that need closer attention.
  • Avoiding voting for English
    • For items that do not work in your language, please don't simply use English. Find a solution that works for your language. For example, if your language doesn't have a concept of "quarters", use a translation that describes the concept "three-month period" rather than “quarter-of-a-year”.
  • Dealing with “Same as code” errors:
    • Since v37, voting for the code itself raises a “Same as code” error.
    • When translating codes for items such as languages, regions, scripts, and keys, it is normally an error to select the code itself as the translated name (such as “en” as the translated name for code “en” English), except for some specific cases including certain script codes (for example, code “Thai” is also the name for script Thai in several languages).
    • If the error appears under Typography, you can ignore it. [CLDR-13552]
  • Bidi example limitations [CLDR-10674]. If you are working with a bidirectional language, be aware of both the right-to-left and the neutral context. The Survey Tool only shows examples with a strong right-to-left context, and we have seen issues where vetters removed the ALM bidi marks or modified the patterns without considering the neutral context. Please be cautious when changing the bidi formatting data.
  • Handling Display name menu variants 
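
The “Same as code” rule described above can be illustrated with a small check. This is a sketch under stated assumptions, not the Survey Tool's implementation: the function name `same_as_code_error`, the exact-match comparison, and the exception set are hypothetical, and the only exception listed is the script code “Thai” that the text above gives as an example.

```python
# Hypothetical exception list. The guidance above names only the script code
# "Thai" as an example; the real set of allowed cases is not reproduced here.
ALLOWED_SAME_AS_CODE = {("script", "Thai")}


def same_as_code_error(kind, code, translated_name):
    """Return True when the translated name merely repeats the code.

    Assumption: the comparison is an exact match after trimming whitespace.
    """
    if translated_name.strip() != code:
        return False
    # A name identical to the code is an error unless it is a known exception.
    return (kind, code) not in ALLOWED_SAME_AS_CODE


if __name__ == "__main__":
    print(same_as_code_error("language", "en", "en"))       # repeats the code
    print(same_as_code_error("language", "en", "English"))  # a real name
    print(same_as_code_error("script", "Thai", "Thai"))     # listed exception
```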

    Translation guides: updated sections

    If you are new to CLDR, use the @Getting Started topics to get started and review the left Table of Contents under Translation Guides. 

    The following translation guides have had major updates for clarity:

    Known Issues

    Please review this list before getting started to avoid creating duplicate tickets. This list will be updated as fixes are made available in Survey Tool Production. If you hit a problem, please file a ticket.

    Updated on 2020-05-29
    1. Disconnect error. If you see a persistent Loading error with a disconnect message, see Empty Cache.
    2. Same name collision error. If two items differ only by upper/lower case or punctuation, they still count as a collision. Currently, however, only one of them is flagged as an error. [CLDR-11274]
    3. Images for the plain symbols. Non-emoji symbols such as √, », ¹, §, ... do not have images in the info pane.
      • Workaround: Look at the Code column; unlike the new emoji, your browser should display them there.
    4. Miscounted Provisional items. The Dashboard is omitting many provisional items on the units pages. [CLDR-13833]
      • Workaround: Review each of the Unit pages, looking for items without a ✔ mark, such as those marked ✘.
    Older known issues
    1. Brackets "[ ]" under Alphabetic information are used to group the alphabetic information; they are not part of the data. [CLDR-13180]
      • Workaround: Please ignore the [ ] in the Alphabetic information and do not try to update the data to exclude the [].
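
Because only one item of a same name collision is currently flagged, it can help to check a list of names yourself. The following is a minimal sketch of the rule stated in the Known Issues list above (names that differ only by upper/lower case or punctuation collide). The normalization in `collision_key` is an assumption and may not match the Survey Tool's actual comparison, and the sample data is made up for illustration.

```python
import unicodedata
from collections import defaultdict


def collision_key(name):
    """Fold case and drop punctuation so near-identical names compare equal."""
    return "".join(
        ch for ch in name.casefold()
        # Unicode general categories starting with "P" are punctuation.
        if not unicodedata.category(ch).startswith("P")
    )


def find_collisions(names):
    """Group codes whose names collide; `names` maps code -> translated name."""
    groups = defaultdict(list)
    for code, name in names.items():
        groups[collision_key(name)].append(code)
    # Report every member of a collision, not just one of them.
    return [sorted(codes) for codes in groups.values() if len(codes) > 1]


if __name__ == "__main__":
    # Made-up sample data, not taken from CLDR.
    sample = {"code-a": "St. Example", "code-b": "st example", "code-c": "Other"}
    print(find_collisions(sample))
```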

    Resolved Issues

    The following issues, previously listed under Known Issues, have now been resolved:

    2020-05-00
    1. <none so far>
    Older resolved issues
    1. <none so far>