Key Links: Survey Tool | Survey Tool Guide | Walkthrough | Vetting Phase | FAQ & Known Bugs | Managing Users | Mailing List

Most data in the Unicode Common Locale Data Repository is gathered and processed via the Survey Tool, an online tool for viewing the data for different languages and proposing additions or changes. The tool lets you propose new localized data, see what others have proposed, and communicate with them to resolve differences.

During each submission period, contributors from Unicode Consortium members, other organizations, and the public at large are invited to review the data for their languages and countries, and to propose new or modified translations of terms, including translations for languages entirely new to the repository. For the release schedule, see CLDR Project.

Note: If you would like to add data for a new
locale, please file a bug requesting the addition (see CLDR Change Requests). You should also notify your CLDR contact (see Survey
Tool Accounts).

In this release, new structure has been added to provide for plurals, simple duration formats, and more control over the formatting of locale names. There are also a number of usability changes in the tool: for example, only the timezone names that are important to translate are shown. There are new items for translation as well, such as new territory codes. We would also like contributors to focus on getting enough votes for unapproved items so that they become approved.

The following provides a brief description of the process.

Accounts

You don’t need an account to view data for a particular language. If you wish to propose changes or additions, you will need an account: see Survey Tool Accounts.

Locale List

The main screen of the Survey Tool is located at http://unicode.org/cldr/apps/survey. It displays a list of the languages currently available. Languages may vary by script (Arabic vs. Latin, or Simplified vs. Traditional Chinese), and occasionally by country. For historic reasons, this combination of language with script or country is known as a locale.

For each language, the content is what is appropriate for the most populous country; thus the content for English [en] is whatever is appropriate for the United States. Any variation by country for that language is represented in a country locale: content appropriate for Australia that differs from what is in English [en] goes in the sublocale English (Australia) [en_AU].

Click on the languages (and optionally countries) that you would like to view. You can always get back to this page by clicking on Locales at the top left of the page.
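The relationship between a language locale and its country sublocales can be pictured as a fallback lookup: a sublocale such as en_AU stores only the items that differ, and everything else comes from en. The following is a minimal sketch under stated assumptions, not the Survey Tool's actual implementation. The data table, the key names, and the functions `parent_locale` and `lookup` are hypothetical, and the parent is found by simply truncating the last subtag, which is a simplification of the real inheritance rules.

```python
# Hypothetical, made-up data for illustration only.
# The sublocale holds just the items that override the language locale.
LOCALE_DATA = {
    "en": {
        "territory.CH": "Switzerland",
        "example.term": "color",
    },
    "en_AU": {
        "example.term": "colour",  # differs from en, so it is stored here
    },
}


def parent_locale(locale_id):
    """Return the parent locale by dropping the last subtag, or None at the top."""
    if "_" not in locale_id:
        return None
    return locale_id.rsplit("_", 1)[0]


def lookup(locale_id, key):
    """Walk from the sublocale up to the language locale until the key is found."""
    current = locale_id
    while current is not None:
        value = LOCALE_DATA.get(current, {}).get(key)
        if value is not None:
            return value
        current = parent_locale(current)
    return None  # not present anywhere in the chain


if __name__ == "__main__":
    print(lookup("en_AU", "example.term"))   # overridden in en_AU
    print(lookup("en_AU", "territory.CH"))   # inherited from en
```

In this sketch, an item that is customary in Australia exactly as it is in en never needs to be entered under en_AU at all.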
Reviewing and Submitting Data

There is a key explaining how the windows are laid out at Survey Tool Windows. You should review this before starting. You will then go through each section in turn: languages, scripts, territories, and so on, all the way to supplemental.
The locale data should be in the customary form for the target language, that is, the form in most common usage. For example, for the territory name in English one would use "Switzerland" instead of "Swiss Confederation", and "United Kingdom" instead of "The United Kingdom of Great Britain and Northern Ireland".

Coverage

The warnings about missing items are based on your coverage level. This level can range from comprehensive (all possible items) down to basic (a very minimal set of items). Locales that don't meet at least the basic level may not be complete enough to be in the official release (although the data will be kept in the working repository).
Caution: these warnings are mechanically generated, and do not substitute for your judgment: you may want to translate more items based on your knowledge. For example, a Ukrainian speaker may want to translate the names of the neighboring countries, even if those are not flagged as warnings at the current coverage level.

Country-Specific Information

The language locale should contain the most broadly used data for that language, and should be appropriate for the most populous region. Other, region-specific locales should contain data only where they need to override individual items, that is, where the "inherited" language locale data would not be customary in that region.

Once you've looked over all the sections in your language, go back to the Locale window and scroll back to your language. You'll see the different countries listed on the right side of your language. If usage of your language varies by country, you can enter those variations now. You only need to do this for cases where the usage in a country differs from the main language.

Each language has the default content for one of the countries using the language. You won't be able to edit that country locale; instead, any modifications should go in the main language locale.

Resolving Differences among Translators

After the data submission phase, any differences in the submitted data will be resolved according to the data resolution process. However, even during the submission phase, you should collaborate with the other translators where you have questions, via email and the forums.

Problems?

The tool has undergone substantial revisions based on feedback we received during the last release. There are still some rough edges and we ask for your patience with problems that occur.
In particular, the tool is not designed to handle a large number of people working at the same time, so if it appears unresponsive, please try again later (and save your work as you go).

If you find a problem, you may want to review Known Bugs to see whether it has already been reported (and whether there is a work-around). If not, or if you have suggestions for improvements, please file a bug using the Feedback link at the bottom of each window. If there are other issues, you can raise them on the Unicode CLDR Mailing List.

Special Considerations

Character Repertoire

The data in the locale repository should contain the most appropriate choice of characters for the representation of the text. It may thus include Unicode characters that are not included in a given legacy character set. In particular, the data may contain curly quotes and apostrophes (such as in “can’t”), and similar characters such as the letter modifiers in ʻōlelo Hawaiʻi. These characters provide more distinctions than are available with the generic ASCII repertoire. They may be “downcast” to the best available characters when the data is imported into systems with a more limited repertoire of supported characters. (Downcasting information is provided with character fallback substitutions.)

Hong Kong, Macau

The territory codes HK and MO are to be translated with the native equivalent of “Hong Kong SAR China” and “Macao SAR China”, respectively. SAR stands for “Special Administrative Region” and can be represented with an acronym in the target language. There are also alternative, short versions of these names that should be translated; those omit the "SAR China".
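As an illustration of the downcasting mentioned under Character Repertoire, the sketch below replaces characters that a limited target system cannot represent with a nearest available substitute. It is only a sketch under stated assumptions: the `FALLBACKS` table is a small hand-written example and is not the actual character fallback data, the `downcast` function name is hypothetical, and "supported" is assumed here to mean plain ASCII.

```python
# Illustrative substitutions only; NOT the real character fallback data.
FALLBACKS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark, used as a curly apostrophe
    "\u201C": '"',   # left double quotation mark
    "\u201D": '"',   # right double quotation mark
    "\u02BB": "'",   # modifier letter turned comma, as in Hawaiʻi
}


def is_ascii(ch):
    """Assumed definition of 'supported' for this sketch: plain ASCII."""
    return ord(ch) < 128


def downcast(text, supported=is_ascii):
    """Replace unsupported characters that have a listed fallback.

    Unsupported characters with no entry in FALLBACKS are left unchanged.
    """
    out = []
    for ch in text:
        if supported(ch):
            out.append(ch)
        else:
            out.append(FALLBACKS.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    print(downcast("can\u2019t"))
    print(downcast("Hawai\u02BBi"))
```

The point of the design is that the repository keeps the richer characters, and any loss of distinction happens only at import time on the system that needs it.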
