Information Hub for Linguists

2022-07-11 CLDR v42 vetting is now closed and in resolution mode.

Summary

This page and the pages in this section provide guidelines for translation of CLDR strings.

  • Please read this page completely before starting, and visit this page (Information hub for linguists) every other day, and check for news at the top. Also check the Known Issues for any new problems.

  • The information on this page will be updated at least weekly on Wednesdays if there are changes. New information will be highlighted by blue italicized text. Please bookmark it!

New data changes:

    • Coverage, Phase 2, primarily the modern coverage additions of language names, script names, etc.

    • New -alphaNextToNumber and -noCurrency variants of the currency formats, see Number and currency patterns

    • New -atTime variant of the date-time combining pattern, see Date/Time Patterns

    • Person Names

    • New emoji

    • Locale coverage level upgrades:

      • ⇒ Basic: bgc, bho, raj

      • ⇒ Moderate: xh

      • ⇒ Modern: hi_Latn, pcm, ha, ig, yo, nn

    • Unicode 15.0 additions (script names, collation data (Han), etc.)

Current Survey Tool stage: v42 Resolution

Thank you for contributing to CLDR v42!

The Survey Tool has moved on to the resolution stage for CLDR v42. The Survey Tool is no longer open for vetting.

Please refer to the Milestone Schedule for the full v42 release schedule.

Prerequisites


  1. Know Data stability expectations

  2. Know topics under @Getting Started to ensure familiarity with what you may encounter while working in the Survey Tool.

  3. @General translation guides are the customary expectations for all the vetting work.

  4. Disconnect error. If you see a persistent Loading error with a disconnect message or other odd behavior, please empty your cache.

  5. Survey Tool email notification may be going to your spam folder. Check your spam folder regularly.

Notation

💡 marks important translation tips

bolded text marks items that need special attention

blue italicized text marks latest updates

What's new in this cycle


  • If you are new to CLDR contribution, please read the prerequisites above first.

  • If you have contributed to CLDR in the past, below is the information that's new or changed since the last release.

Progress Widget

There is a new style for the progress widget that shows your voting progress on the page in the upper right corner of the Survey Tool next to the Info Panel toggle. You will see details of your progress when you hover over the widget, including what progress is being measured, and the total number of items remaining for you to vote on in that category. Your progress is measured based on the coverage level you have set, so make sure that it is set correctly.

Note: The total progress widget is currently only visible when the dashboard is open.

Page progress

Progress bar shows progress of items on page for your coverage level.

Overall progress

Progress bar shows progress of items overall for your coverage level.

Update to forum post responses

In order to reduce confusion there are no longer buttons to 'Agree' or 'Disagree' with a forum post (earlier people mistakenly thought that clicking the agree button would also change their vote).

If you change your vote an automatic posting indicating agreement will be added to the forum as before.

Examples

The examples have been augmented, especially for Minimal pairs. An example with an ❌ shows a case where the pattern has an inappropriate placeholder substituted. That example should be ungrammatical; if it is grammatical, then either the translated unit or the minimal pair pattern itself is incorrect. The example will show in the Info Panel, and also if you hover over the item.

Recent Changes

June 31

  • Person names now have new errors and warnings. Please go to https://st.unicode.org/cldr-apps/v#/USER/PersonNameFormats/ and scan the page for errors and warnings.

    • Background: we did a review of the person name formatting data at the end of the Submission Phase, and discovered some common problems that people were having. The new errors and warnings give you a chance to fix those. The errors must be fixed, while the warnings indicate that something might be wrong, so you should review them. If you have any questions, please contact your coordinator or file a ticket.

  • Bidi languages: Examples for time and short date formats now also have multiple bidi contexts. Please look them over to help assess the options for your vote.

June 29

  • Bulk uploading of values is not allowed during vetting mode

For Managers

June 27

  • The Survey Tool is now in the Vetting Phase. Please see Survey Tool phase: Vetting for instructions.

  • The Survey Tool now displays bidi examples (Arabic, Hebrew, etc.) in both default and RTL contexts for certain fields. Please recheck the following

  • The Compact number formatter had two problems that have been resolved. Please recheck the Numbers Report for your locales.

    • Numbers with "fallback formatting" (such as less than 1000) now get the correct locale's format.

    • The plural rules for the different categories (eg, in Czech or Russian) are now coming from the correct locale.

June 22

  • Person Names clarification: The zxx is intended ONLY for use in Person Names in one of the two fields:

  • Person Names: As requested, additional patterns have been added for es, ca, gl, gd to allow people to have patterns that handle commas properly when using {surname2} with Sorting/Index.

    • In those locales, you will see the additional items marked with a code like "referring-formal-1".

    • They are marked as Provisional, so you will need to review, fix, and confirm the values.

    • NOTE: when you look at the examples for a pattern, it will show the results for just that pattern. The software will later pick the right pattern based on the fields present in the name record. So in the pattern with {surname2}, look just at the examples with names for {surname2}; and for the pattern without {surname2}, look just at the examples with names that don't have {surname2}.

  • The Progress Meter (and for Managers, the Priority Items report) has been adjusted to show progress relative to the items that needed to be handled, not the total number of items. So if there are 2500 items total but only 250 new items, and 50 of those items are still to be done, then you used to see 98% done (100% - 50/2500) and you will now see 80% (100% - 50/250).

  • There is one additional Known Issue.
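The Progress Meter change described above can be checked with a few lines of arithmetic. This is only a sketch: the function name percent_done is hypothetical, and it assumes the 50 in the example counts items still open, which is what the quoted formulas (100% - 50/2500 and 100% - 50/250) imply.

```python
def percent_done(remaining: int, baseline: int) -> float:
    """Return percent complete, given the items still open and a baseline count."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - remaining) / baseline


total_items = 2500  # every item in the locale (old baseline)
new_items = 250     # items that needed handling this cycle (new baseline)
remaining = 50      # items still open

old_meter = percent_done(remaining, total_items)
new_meter = percent_done(remaining, new_items)
print(f"old: {old_meter:.0f}%  new: {new_meter:.0f}%")  # prints: old: 98%  new: 80%
```

Only the baseline differs between the two meters, which is why the same 50 open items now lower the percentage much more.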

June 15

  • Language/Region names: these will show up as Missing in your locale. If the alternate names are not used in your language, just confirm the inherited value.

    • Revised the alternate name added for New Zealand in the English locale.

    • Alternate name added for Turkey

    • Added "Hinglish" variant name for hi_Latn

  • Coverage levels

    • Increased the default coverage levels for Dogri, Odia, Kashmiri (Devanagari) for India org vetters

    • Decreased the default coverage levels for Maltese, Maori, Tigrinya for the CLDR org (used for general progress tracking) — the values had been set too high. Note that this does not prevent you from setting a higher level for your work.

  • Workaround for problem in setting NameOrder For Locales values (at the top of the Person Name Formats page).

    • As a workaround, please enter the code zxx where you want an empty list. We'll take care of converting that to an empty list for you once we fix the problem. Sorry for the hassle!

June 10

  • The locales requiring 8 votes for approval have been adjusted: the locales am, ga, ky have been set down to 4, while the locales bg, is, en_AU, en_GB, es_MX, fr_CA are now at 8.

For Managers

June 8

  • Reports (such as Datetime) have been updated to have you review and confirm the values. Please look them over and check the values, then pick one of the radio buttons at the top. They will show up as Missing items in the dashboard until you confirm them as acceptable.

  • There is now a variant name for the country Turkey. In English this appears as Türkiye. Your language may or may not have a variant spelling. If it doesn’t have two different spellings you can just confirm the default value.

  • Person Name Formatting

    • The “sorting” formats are intended for sorted lists and indexes (mostly for languages that invert a person’s name for that purpose). They have been retitled Sorting/Index to make that clearer. Note: for surname-first languages, they may be identical to the normal referring formats; in particular, a “,” may not be used in your language.

    • The English patterns for the formal Sorting/Index formats have been changed to use the surname-core and surname-prefix, so that for a Full name the surname-prefix goes at the end, such as “Humboldt, Alexander von”. If your language splits the surname-core and surname-prefix (such as “de”, “de la”, “van der”, etc.), please check that they are in the right positions for your language. Whenever your pattern doesn’t need to split them, you can just use a plain surname. That will include both the prefix (if any) and the core automatically.

    • If your language doesn't require spaces between words (eg, Japanese), a foreign name (Albert Einstein) in your script has been added to the examples in the Info Panel. It shows the effect of the “foreignSpaceReplacement”. For example, in Japanese that name shows up as アルベルト・アインシュタイン. Please check the examples for the patterns you have entered to make sure that the foreignSpaceReplacement (eg, in Japanese ・) shows up in the right places in this example.

    • For more information, a detailed walkthrough and FAQs, see Miscellaneous: Person Name Formats

  • Some broken links in the Info Panel were fixed
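The surname-core and surname-prefix split described in the Person Name Formatting notes above can be illustrated with a small placeholder filler. This is a hedged sketch, not the CLDR formatter: the function format_name and both patterns are hypothetical, chosen only to reproduce the “Humboldt, Alexander von” example and to show a plain {surname} combining prefix and core.

```python
import re


def format_name(pattern: str, fields: dict) -> str:
    """Fill {field} placeholders in a pattern (illustrative stand-in only)."""
    values = dict(fields)
    # A plain {surname} combines the prefix (if any) and the core.
    if "surname" not in values:
        parts = [values.get("surname-prefix", ""), values.get("surname-core", "")]
        values["surname"] = " ".join(p for p in parts if p)
    filled = re.sub(r"\{([a-z0-9-]+)\}", lambda m: values.get(m.group(1), ""), pattern)
    # Collapse whitespace left behind by empty fields.
    return re.sub(r"\s+", " ", filled).strip()


name = {"given": "Alexander", "surname-prefix": "von", "surname-core": "Humboldt"}

# Hypothetical Sorting/Index-style pattern: the prefix moves to the end.
print(format_name("{surname-core}, {given} {surname-prefix}", name))  # Humboldt, Alexander von
# Hypothetical referring-style pattern: no split needed, so plain {surname} is enough.
print(format_name("{given} {surname}", name))  # Alexander von Humboldt
```

If your language never moves the prefix, the second style is all a pattern needs.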

For Managers

  • The Priority Items Summary now shows progress percentages over locales, and the format has been improved to be more ‘spreadsheet-friendly’. The new format will not show up until tomorrow. (The Create New Summary button does not yet work.)

  • The Vetting Participation now has a download button. Once it is enabled, it will download a spreadsheet-friendly format that has more information.

Survey Tool

Translation quality

The following are areas where we have seen data quality issues or that need closer attention.

  • Avoiding voting for English

    • For items that do not work in your language, please don't simply use English. Find a solution that works for your language. For example, if your language doesn't have a concept of "quarters", use a translation that describes the concept "three-month period" rather than “quarter-of-a-year”.

  • Dealing with “Same as code” errors:

    • Since v37, if you vote for the code itself, a Same as Code error is raised.

    • When translating codes for items such as languages, regions, scripts, and keys, it is normally an error to select the code itself as the translated name (such as “en” as the translated name for code “en” English), except for some specific cases including certain script codes (for example, code “Thai” is also the name for script Thai in several languages).

    • If the error appears under Typography, you can ignore it. [CLDR-13552]

  • Bidi example limitations [CLDR-10674]. If you are working with a bidirectional language, be aware of the Right-to-Left and Neutral contexts. Survey Tool only shows examples with a strong RL context, and we have seen issues where vetters removed the ALM bidi marks or modified the patterns without considering the neutral context. Please be cautious when changing the bidi formatting data.

  • Handling Display name menu variants
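The “Same as code” rule described above amounts to comparing the translated name with its code, with a few allowed exceptions. The sketch below is an assumption about how such a check could look, not the Survey Tool's implementation: the function name, the exact-match comparison, and the one-entry allow-list (taken from the Thai script example in the text) are all hypothetical.

```python
# Hypothetical allow-list: (kind, code) pairs where the name may equal the code.
ALLOWED_SAME_AS_CODE = {("script", "Thai")}


def same_as_code_error(kind: str, code: str, translated: str) -> bool:
    """Return True when the translated name merely repeats the code."""
    if (kind, code) in ALLOWED_SAME_AS_CODE:
        return False
    return translated.strip() == code


print(same_as_code_error("language", "en", "en"))       # True: the code was used as the name
print(same_as_code_error("language", "en", "English"))  # False: a real name was supplied
print(same_as_code_error("script", "Thai", "Thai"))     # False: listed exception
```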

Translation guides

If you are new to CLDR, use the @Getting Started topics to get started and review the left Table of Contents under Translation Guides.

Updated sections

The site has been migrated from the classic to the new Google Sites. If you have any difficulty accessing a page following the migration, please report the issue to your PM.

Known Issues

Last updated: 2022-06-29

Please review this list before getting started to avoid creating duplicate tickets. This list will be updated as fixes are made available in Survey Tool Production. If you hit a problem, please file a ticket.

  1. CLDR-15672 GMT short value not showing up in Basic level

  2. Expanded section in left navigation panel flickers if clicked [CLDR-14750]

  3. Same name collision error. If two items differ only by upper/lower case or punctuation, it still counts as a collision. However, currently, only one of them is flagged as an error. [CLDR-11274]

  4. Images for the plain symbols. Non-emoji such as √, », ¹, §, ... do not have images in the Info Panel. [CLDR-13477]

    • Workaround: Look at the Code column; unlike the new emoji, your browser should display them there.

  5. Careful with square brackets. Brackets "[ ]" under Alphabetic information are used to group the alphabetic information and they are not part of the data. [CLDR-13180]

    • Workaround: Please ignore the [ and ] characters in the Alphabetic information and do not try to update the data to exclude the [ and ].
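The same-name collision rule in the list above (names that differ only by upper/lower case or punctuation still collide) can be approximated by comparing a normalized key. This is a rough sketch under stated assumptions, not the Survey Tool's actual check: the helper names are hypothetical, and "punctuation" is taken to mean Unicode general categories starting with P.

```python
import unicodedata


def collision_key(name: str) -> str:
    """Fold case and drop punctuation so near-identical names compare equal."""
    folded = name.casefold()
    return "".join(ch for ch in folded if not unicodedata.category(ch).startswith("P"))


def find_collisions(names: dict) -> list:
    """Group codes whose names share a key; every member of a group is reported."""
    groups = {}
    for code, name in names.items():
        groups.setdefault(collision_key(name), []).append(code)
    return [codes for codes in groups.values() if len(codes) > 1]


# Hypothetical codes and names, for illustration only.
sample = {"a": "St. Lucia", "b": "st lucia", "c": "Samoa"}
print(find_collisions(sample))  # [['a', 'b']]
```

Reporting every member of a colliding group makes it easy to find both items even when only one of them is flagged.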

Resolved Issues

Last updated: 2022-06-29

The following items, previously listed under Known Issues, have now been resolved:

  1. Bulk upload of values not allowed in vetting mode [CLDR-15778]

  2. Inability to see the difference between a narrow non-breaking space and a regular space in Survey Tool. [CLDR-15763]

    1. 5 сентября 1999 г. was displayed as 5 сентября 1999г., with no gap before the г.

  3. Link to default content page doesn't work. [CLDR-15683]

  4. Person Name Formats

    1. The space in foreignSpaceReplacement is now visible and its display has been improved. [CLDR-15686]

    2. The number of patterns for the "Sorting" (aka Index) form has been reduced from 18 to 6. [CLDR-15695]

    3. The Full sample name will have just surname-prefix and surname-core (when you use {surname} those will be combined together). [CLDR-15691]

  5. We are adding check-boxes for completing the Reports. Although you'll see these at the top of each Report, don't worry about them until we add instructions and guidance. [CLDR-8666]

  6. When there are 'constructed values' such as "Simplified Chinese" in region locales, the results will be more accurate. [CLDR-13263]

  7. Unclear error message when not logged into Survey Tool. [CLDR-14845]

  8. “Multiple Languages” confusing. [CLDR-15222]

  9. Issues importing your votes. Error message: "Status 500 Internal Server Error; URL:..." [CLDR-15178]

  10. Unclear to vetters why they can't edit locales that are not open or only open for specific paths [CLDR-15224]

  11. CLDR-15676 bulk update in Survey Tool broken (fixed)

  12. CLDR-15662 LocaleCompletion (third) Meter percent has too high a value

  13. Improvements have been made to this issue; it may be resolved.

  14. CLDR-15678 broken links in the Info Panel (blue box)
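For the narrow non-breaking space issue listed above (CLDR-15763), it can help to inspect a value outside the Survey Tool and see which space character it contains. The sketch below is a generic helper, not part of the Survey Tool; the function name is hypothetical, and the use of U+202F in the sample string is an assumption made for illustration.

```python
import unicodedata


def reveal_spaces(text: str) -> str:
    """Replace each space separator (category Zs) with a visible <U+XXXX> tag."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Zs":
            out.append(f"<U+{ord(ch):04X}>")
        else:
            out.append(ch)
    return "".join(out)


print(reveal_spaces("1999\u202fг."))  # 1999<U+202F>г.  (narrow no-break space)
print(reveal_spaces("1999 г."))       # 1999<U+0020>г.  (regular space)
```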