Coverage Levels

Core Data

Core Data is the data required before a new locale can be added. See Core Data for New Locales for details on Core Data and how to submit it for new locales.

Basic Data

It is expected that during the next Survey Tool cycle after a new locale is added, the data for the Basic Coverage Level will be supplied.

This includes:

    1. Delimiter Data — Quotation start/end, including alternates

    2. Numbering system — default numbering system + native numbering system (if default = Latin and native ≠ Latin)

    3. Locale Pattern Info — Locale pattern and separator, and code pattern

    4. Language Names — the names of the native language and of English, written in the native language

    5. Script Name(s) — Scripts customarily used to write the language

    6. Country Name(s) — For countries where commonly used (see "Core XML Data")

    7. Measurement System — metric vs UK vs US

    8. Full Month and Day of Week names

    9. AM/PM period names

    10. Date and Time formats

    11. Date/Time interval patterns — fallback

    12. Timezone baseline formats — region, gmt, gmt-zero, hour, fallback

    13. Number symbols — decimal and grouping separators; plus, minus, percent sign (for Latin number system, plus native if different)

    14. Number patterns — decimal, currency, percent, scientific

Moderate Data

Before submitting data above the Basic Level, the following must be in place:

  1. Plural and Ordinal rules

    • As in supplemental/plurals.xml and supplemental/ordinals.xml

    • Must also include minimal pairs

    • For more information, see cldr-spec/plural-rules.

  2. Casing information (only where the language uses a cased script according to ScriptMetadata.txt)

  3. Collation rules [non-Survey Tool]

    • This can be supplied as a list of characters, or as a rule file.

    • The list is a space-delimited list of the characters used by the language (in the given script). The list may include multiple-character strings, where those are treated specially. For example, if "ch" is sorted after "h" one might see "a b c d .. g h ch i j ..."

    • More sophisticated users can do a better job, supplying a file of rules as in cldr-spec/collation-guidelines.

    • The result will be a file like common/collation/ar.xml or common/collation/da.xml.
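
To make the space-delimited character list more concrete, here is a minimal Python sketch of how such a list could drive a sort order in which "ch" is treated as a single unit sorted after "h". The alphabet, the sample words, and the helper names (parse_alphabet, make_sort_key) are illustrative assumptions, not CLDR data or tooling, and the comparison is a simplified single-level ordering rather than the full collation algorithm used with real rule files.

```python
def parse_alphabet(spec: str) -> list[str]:
    """Split a space-delimited alphabet into its ordered elements."""
    return spec.split()


def make_sort_key(elements: list[str]):
    """Build a sort-key function from an ordered list of elements."""
    # Each element is ranked by its position in the list.
    rank = {el: i for i, el in enumerate(elements)}
    # Try longer elements first so "ch" wins over "c" followed by "h".
    by_length = sorted(elements, key=len, reverse=True)

    def sort_key(word: str) -> list[int]:
        key = []
        lowered = word.lower()
        i = 0
        while i < len(lowered):
            for el in by_length:
                if lowered.startswith(el, i):
                    key.append(rank[el])
                    i += len(el)
                    break
            else:
                # Characters outside the list sort after all listed ones.
                key.append(len(elements) + ord(lowered[i]))
                i += 1
        return key

    return sort_key


# Hypothetical list in the style described above: "ch" follows "h".
ALPHABET = "a b c d e f g h ch i j k l m n o p q r s t u v w x y z"
sort_key = make_sort_key(parse_alphabet(ALPHABET))

words = ["cihla", "chata", "hrad", "cena", "ihned"]
print(sorted(words, key=sort_key))
# ['cena', 'cihla', 'hrad', 'chata', 'ihned']
```

Here "chata" lands after "hrad" because "ch" is matched as one element ranked after "h", which is the special treatment the list format is meant to express.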

The data for the Moderate Level includes:

TBD

Modern Data

Before submitting data above the Moderate Level, the following must be in place:

  1. Grammatical Features

    1. The grammatical cases and other information, as in supplemental/grammaticalFeatures.xml

    2. Must include minimal pair values.

  2. Romanization table (non-Latin scripts only)

    1. This can be supplied as a spreadsheet or as a rule file.

    2. If a spreadsheet, it should give, for each letter (or sequence) in the exemplars, the corresponding Latin letter (or sequence).

    3. More sophisticated users can do a better job, supplying a file of rules like transforms/Arabic-Latin-BGN.xml.
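
As a rough illustration of the spreadsheet form, the following Python sketch applies a letter-to-Latin table to a string, preferring the longest matching source sequence. The table rows and the romanize function are hypothetical examples made up for this sketch; they are not taken from any CLDR transform, and a real rule file can express context-dependent rules that a flat table cannot.

```python
def romanize(text: str, table: dict[str, str]) -> str:
    """Replace each source letter or sequence with its Latin counterpart."""
    # Longest source sequences first, so multi-letter rows win over
    # single-letter rows; empty keys are ignored.
    sources = sorted((s for s in table if s), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for src in sources:
            if text.startswith(src, i):
                out.append(table[src])
                i += len(src)
                break
        else:
            # Pass through anything the table does not cover.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical rows: one source letter (or sequence) per spreadsheet row.
TABLE = {"д": "d", "ж": "zh", "дж": "j", "а": "a", "з": "z"}

print(romanize("джаз", TABLE))  # jaz
```

The sequence row for "дж" is what keeps the result from becoming "dzhaz", which is why the table allows sequences as well as single letters.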

The data for the Modern Level includes:

TBD

Rules

For the coverage in the latest released version of CLDR, see Locale Coverage Chart.

To see the development version of the rules used to determine coverage, see coverageLevels.xml. For a list of the locales at a given level, see coverageLevels.txt.