This document describes the minimal data needed for a new locale. There are two kinds of data:
- Core XML Data - This is data that the CLDR committee needs from the proposer before a new locale is added. The proposer is expected to also get a Survey Tool account, and contribute towards the Minimal Data.
- Minimal Data Commitment - Data that is expected to be provided for each locale. If it is not supplied in a timely fashion, the committee may remove the locale.
(The number in parentheses at the start of each item below is the approximate number of strings for that item.)
Core XML Data
Note to translators: If you have difficulties with or questions about the following data, please contact us. Post a follow-up to your existing bug, file a new bug, or reply to the mailing list.
- (02) Orientation (bidi writing systems only) [main/xxx.xml]
- (01) Plural rules [supplemental/plurals.xml]
- (01) Default content script and region (normally the country with the largest population using that language, and the normal script for that language). [supplemental/supplementalMetadata.xml]
- (N) Verify the country data (i.e., the territories in which the language is spoken widely enough to create a locale) [supplemental/supplementalData.xml]
- (N) Casing information (cased scripts only, according to ScriptMetadata.txt)
- (N) Collation rules [non-Survey Tool]
- (04) Exemplar sets: main, auxiliary, index. [main/xxx.xml]
Recommended Core Data
The following are not required, but are strongly recommended:
- (N) Romanization table (non-Latin writing systems only) [spreadsheet, we'll translate into transforms/xxx-en.xml]
  - If a spreadsheet, for each letter (or sequence) in the exemplars, what is the corresponding Latin letter (or sequence).
  - More sophisticated users can do a better job, supplying a file of rules like transforms/Arabic-Latin-BGN.xml.
- (04) Exemplar set: punctuation. [main/xxx.xml]
- (01) Ordinal rules [supplemental/ordinals.xml]
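To make the spreadsheet form of the romanization table concrete, here is a minimal Python sketch of how such a table could be applied and checked against the exemplar set. It is only an illustration: the names `ROMANIZATION_TABLE`, `romanize`, and `missing_from_table` are hypothetical, the Greek-to-Latin rows are sample values rather than CLDR data, and the actual conversion into transforms/xxx-en.xml is not shown.

```python
# Hypothetical excerpt of a romanization spreadsheet:
# exemplar letter (or sequence) -> Latin letter (or sequence).
# These rows are illustrative only and are not taken from CLDR.
ROMANIZATION_TABLE = {
    "α": "a",
    "ο": "o",
    "υ": "y",
    "ου": "ou",  # a multi-letter sequence with its own row
}


def romanize(text, table):
    """Replace each source letter or sequence with its Latin equivalent.

    Longer sequences are tried first, so "ου" wins over "ο" followed by "υ".
    Characters with no row in the table are passed through unchanged.
    """
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No row matched at this position: keep the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


def missing_from_table(exemplars, table):
    """Return the exemplar letters or sequences that have no row yet."""
    return [item for item in exemplars if item not in table]
```

A check like `missing_from_table` is one way to confirm that every letter or sequence in the exemplars has a corresponding Latin entry before the spreadsheet is submitted.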
Minimal Data Commitment
This data is to be entered using the Survey Tool except as noted.
- (44+) 4 main date/time formats; 12 month names (long and abbreviated, format and stand-alone); 7 day names (long and abbreviated); 2 long day periods.
- (01) Name of the language in the language.
- (N) For any country locales, the name of the country in the language and the name/symbol for that country's currency. There must be at least one, for the default content locale.
- (02) Datetime pattern, intervalFormatFallback
- (05) (for Latn) decimal and grouping separators; decimal, currency, percent formats
- (N) Names of countries (territories) where that language is official.
- (M) Names of exemplarCities in multizone countries where that language is official.
- (05) Timezone patterns [http://cldr.unicode.org/translation/timezones]
- (02) localePattern/Separator [http://cldr.unicode.org/translation/localepattern]
- (03) key names
- (14) long/short unit names (time intervals)