CLDR 39 Release Note

No. | Date       | Rel. Note | Data   | Charts   | Spec   | Delta Tickets | GitHub Tag        | Delta DTD
39  | 2021-04-07 | v39       | CLDR39 | Charts39 | LDML39 | Δ39           | release-39-alpha4 | ΔDtd39

See Key to Header Links

Overview

The Unicode CLDR v39 alpha is now available for testing. The alpha has already been integrated into the development version of ICU. While the scope of the changes is small in this cycle, there are some significant migration issues, so we would especially appreciate feedback from non-ICU consumers of CLDR data. Feedback can be filed at CLDR Tickets. The public beta (data and specification) is planned for 2021-Mar-24, with the release following on 2021-Apr-07.

Unicode CLDR provides key building blocks for software supporting the world's languages. CLDR data is used by all major software systems (including all mobile phones) for their software internationalization and localization, adapting software to the conventions of different languages.

CLDR v39 had no submission phase. Instead the focus was on modernizing the Survey Tool software, preparing for data submission in the next release (v40). The data fixes in the release were confined to some global changes that are too difficult to do during a submission cycle, and various other fixes. There was a major change in how Norwegian is handled, in order to align the way that the locale identifiers no, nb, and nn are used. The unit support from the last release was integrated into ICU, and some fixes resulting from that process were made to the measurement unit data. Quite a number of fixes are made to the specification, to clarify text or fix problems in keyboards, measurement units, locale identifiers, and a few other areas.

Data Changes

DTD Changes 

  • Units
    • The systems attribute on <convertUnit> has new values: si (for SI units) and metric (for metric units that are not necessarily SI). Units like kilogram have both. Units that are not SI (like carat) just have metric.
    • The <unitConstant> and <unitQuantity> elements now have a description to make the derivation or application clearer, e.g., "derivation from the mean atomic weights according to STANDARD ATOMIC WEIGHTS 2019 on https://ciaaw.org/atomic-weights.htm"
  • Grammar
    • The <grammaticalCase> element adds additional values such as: abessive, ablative, … adessive, allative, causal, …

    Locale Changes (Sample Link)

    There were general changes across all locales:
    • Removal of a translated name for the special root locale identifier.
    • Changed the name of the currency code XOF from CFA to F CFA (this string contains a narrow no-break space).
    • Changed to use μ (U+03BC GREEK SMALL LETTER MU) consistently to represent a lower-case Greek mu.
    • For measuring blood glucose, changed the unit milligram-per-deciliter to milligram-ofglucose-per-deciliter, to allow conversion to and from millimole-per-liter (milligram-per-deciliter is retained as an alias). See Migration for important details.
    • Imposed normalized spacing on fields, with whitespaces (including no-break spaces) trimmed from the start and end of values, and sequences of more than one space converted into a single whitespace character depending on the path and original value. Those whitespace characters include:
      • U+0020 SPACE (aka SP)
      • U+00A0 NO-BREAK SPACE (aka NBSP)
      • U+202F NARROW NO-BREAK SPACE (aka NNBSP)
    • Grouping digits for es_419, es_MX, and es_US have changed from 2 to 1.
    • Combined date/time formats for zh to include a space.
    • Capitalization of "World" in English to follow the lower casing rule. 
    • Compound units spacing for Romanian.
    • Further refinement of Yoruba Alphabets in use in Core data
    • See also Known Issues
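
Two of the global normalizations above (consistent mu and normalized spacing) can be illustrated with a minimal Python sketch. This is not the CLDR tooling: the function names are hypothetical, and the choice of which whitespace character replaces a run is simplified to a parameter, whereas the release note says it depends on the path and the original value.

```python
import re

MICRO_SIGN = "\u00b5"      # U+00B5 MICRO SIGN (Latin-1 compatibility character)
GREEK_SMALL_MU = "\u03bc"  # U+03BC GREEK SMALL LETTER MU

# The three whitespace characters listed in the release note.
SPACES = "\u0020\u00a0\u202f"
_SPACE_RUN = re.compile("[" + SPACES + "]{2,}")


def normalize_mu(value):
    """Replace every micro sign with the regular Greek small letter mu."""
    return value.replace(MICRO_SIGN, GREEK_SMALL_MU)


def normalize_spacing(value, replacement=" "):
    """Trim listed spaces from both ends and collapse runs of two or more.

    Assumption: `replacement` stands in for the path-dependent choice
    of whitespace character, which is not modeled here.
    """
    trimmed = value.strip(SPACES)
    return _SPACE_RUN.sub(replacement, trimmed)
```

A single no-break space inside a value (as in the XOF string) is left untouched by this sketch; only leading, trailing, and repeated spaces are affected.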
    In addition, a number of other corrections were made on a per-locale basis.
    • For example, in three locales the unit times pattern diverged dramatically from the pattern used in compound units such as newton-meter.
    • Changes for Norwegian (no/nb/nn) See Migration for important details.
    • Changed 3 metazones (for translation of timezones)
    • Units
      • Added the special unit ofglucose (for a molar mass value of 180.1557) to allow for the change in blood-glucose measurement listed above. See Migration for important details.
      • Some mixed metric units (meter-and-centimeter and kilogram-and-gram) were removed from some locale preferences, pending verification
      • For preferences among units for road length, some units changed
      • The unit preferences for consumption were merged, allowing correct choice of liter/100 kilometer and mile/gallon. See Migration for important details.
      • Added systems="si" or systems="metric" as appropriate (see DTD Changes), and marked some units explicitly with uksystem or ussystem. Formerly the systems were unmarked.
    • Locales
      • Removed 'mis' from likely subtags (used for locale identifier canonicalization)
      • Dropped many one-way mappings from language mappings (used for finding best matches among locales)
      • Added grammatical case, gender, and definiteness information for additional locales: see Grammar Info 
    • For access to the draft data, see the GitHub tag above. For more details see the Delta Tickets above.
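
The unit changes above can be made concrete with a small Python sketch of the two conversions involved. The glucose molar mass is the value given for ofglucose in this note; the function names and the US gallon and mile constants are assumptions added for illustration, not CLDR data.

```python
# Molar mass of glucose in g/mol, as given for the ofglucose unit above.
GLUCOSE_MOLAR_MASS = 180.1557

# Assumed constants for the consumption example (US gallon, international mile).
KM_PER_MILE = 1.609344
LITERS_PER_US_GALLON = 3.785411784


def mg_per_dl_to_mmol_per_l(mg_per_dl):
    """mg/dL -> mg/L (times 10), then divide by g/mol to get mmol/L."""
    return mg_per_dl * 10.0 / GLUCOSE_MOLAR_MASS


def mmol_per_l_to_mg_per_dl(mmol_per_l):
    """Inverse of mg_per_dl_to_mmol_per_l."""
    return mmol_per_l * GLUCOSE_MOLAR_MASS / 10.0


def l_per_100km_to_mpg(l_per_100km):
    """Volume per distance -> distance per volume (a reciprocal conversion)."""
    miles = 100.0 / KM_PER_MILE
    gallons = l_per_100km / LITERS_PER_US_GALLON
    return miles / gallons


def mpg_to_l_per_100km(mpg):
    """Distance per volume -> volume per distance."""
    km_per_liter = mpg * KM_PER_MILE / LITERS_PER_US_GALLON
    return 100.0 / km_per_liter
```

Both pairs are interconvertible only because extra information is supplied: a molar mass in the first case, and taking the reciprocal in the second.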

    JSON Data Changes

    TBD

    Specification Changes

    • There was a major change to move the LDML specification to Markdown. This conversion is not yet complete in the alpha.
    • Most specification changes have not yet been done.
    • See also Known Issues

    Chart Changes

    • The charts are updated with data for the release.
      • Note that the changes in the delta charts for Norwegian languages are not actual changes; they are an artifact of the Norwegian structural changes (see Migration section below). 
    • The Grammar Info link from the index was incorrect, and is now fixed. The index also shows the grammatical feature information.

    Growth

    The usual growth chart has been omitted, since this release had no data submission phase. For the previous version's chart, see Growth Chart (v38.x)

    Migration

    • Norwegian. There was a significant change in the way that Norwegian was handled. The no/nb/nn codes predated the development of the macrolanguage structure, and this change brings it into alignment with other languages.
      • Formerly, nb was the main locale, and no was an alias to it. With this change, no is now the main locale, and nb inherits from it. All of the data that was in nb was moved to no. Due to locale data inheritance, resolved nb and no have the same contents that they had before, so conformant implementations should see no differences.
      • Additionally, nn now inherits from no. Practically speaking, this means that where there is missing data in nn, the data from no will be used. That would not be as satisfactory as having full data in nn, but is probably better than inheriting from root (English).
      • Implementations need to be aware of these changes, since they may expose assumptions in the code using CLDR that cause problems. 
        • In particular, any fast-path code that assumes that a language subtag alone (like nn) must inherit from root needs to be changed (this was the case for both CLDR internal code and for ICU).
        • nn (and nb) is no longer independent of no: if an implementation strips out locale data, it must not strip out no if it has nn or nb.
    • Blood Glucose. Blood glucose is measured in two different ways, depending on the country: mmol/L and mg/dL. These were not directly convertible to one another in v38, because they are prima facie incomparable units (items per volume vs mass per volume). To account for this, milligram-per-deciliter was changed to the more explicit milligram-ofglucose-per-deciliter, where ofglucose is a special constant (items per gram).
    • Consumption. Fuel consumption is measured differently in different countries (volume per distance vs distance per volume). The unit preferences in v38 separated the usage data for these different measures. This has been changed so that the usage data can contain both units and their inverses: basically any interconvertible units.
    • Mu character. Use of the mu character was very inconsistent, since Latin-1 contains a compatibility-equivalent character, µ (U+00B5 MICRO SIGN). These characters are now normalized to the regular Greek letter μ (U+03BC GREEK SMALL LETTER MU).
    • Metazones. Three metazone values have changed.
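
The Norwegian change can be checked against an implementation's assumptions with a minimal Python sketch of the new inheritance chain. The parent table and function names here are hypothetical and cover only the relationships described above (nb and nn inherit from no); underscore-separated identifiers such as nb_NO are assumed.

```python
# Hypothetical explicit parents, covering only the Norwegian change described
# above. Every other locale falls back by truncation, then to root.
EXPLICIT_PARENTS = {"nb": "no", "nn": "no"}


def parent_of(locale):
    """Return the parent locale, or None for root."""
    if locale == "root":
        return None
    if locale in EXPLICIT_PARENTS:
        return EXPLICIT_PARENTS[locale]
    if "_" in locale:
        # Drop the last subtag, e.g. nb_NO -> nb.
        return locale.rsplit("_", 1)[0]
    # A bare language subtag without an explicit parent inherits from root.
    return "root"


def inheritance_chain(locale):
    """List the locale followed by each ancestor, ending at root."""
    chain = []
    current = locale
    while current is not None:
        chain.append(current)
        current = parent_of(current)
    return chain


def required_locales(kept):
    """Locales that must be retained so that every kept locale still resolves."""
    needed = set()
    for locale in kept:
        needed.update(inheritance_chain(locale))
    return needed
```

A fast path that maps any bare language subtag straight to root would skip no for nn and nb, and a stripping step that keeps only nn would drop data that nn now depends on; required_locales(["nn"]) includes no for that reason.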

    External Data version

    • Refer to properties/external_data_versions.tsv that supplies information on which versions of external data were used in CLDR. <Not yet updated>

    Known Issues

      Acknowledgments

      Many people have made significant contributions to CLDR and LDML; see the Acknowledgments page for a full listing.

      The Unicode Terms of Use apply to CLDR data; in particular, see Exhibit 1.
      For web pages with different views of CLDR data, see http://cldr.unicode.org/index/charts.