CLDR 44 Release Note

No.  | Date       | Rel. Note | Data   | Charts   | Spec     | Delta Tickets | GitHub Tag   | JSON Tag | Delta DTD
44   | 2023-10-31 | v44       | CLDR44 | Charts44 | LDML44   | Δ44           | release-44   | 44.0.0*  | ΔDtd44
44.1 | 2023-12-13 | v44.1     | n/a    | n/a      | LDML44.1 | Δ44.1         | release-44-1 | 44.1.0   | See 44.1 Changes

See Key To Header Links.

*Note: For NPM, the JSON data uses version 44.0.1.

Overview

Unicode CLDR provides key building blocks for software supporting the world’s languages. CLDR data is used by all major software systems (including all mobile phones) for their software internationalization and localization, adapting software to the conventions of different languages.

In CLDR 44, the focus is on:

  1. Formatting Person Names. Added further enhancements (data and structure) for formatting people’s names. For more information on why this feature is being added and what it does, see Background.
  2. Emoji 15.1 Support. Added short names, keywords, and sort-order for the new Unicode 15.1 emoji.
  3. Unicode 15.1 additions. Made the regular additions and changes for a new release of Unicode, including names for new scripts, collation data for Han characters, etc.
  4. Digitally disadvantaged language coverage. Work began to improve DDL coverage, with the following DDL locales now having higher coverage levels:
    1. Modern: Cherokee, Lower Sorbian, Upper Sorbian
    2. Moderate: Anii, Interlingua, Kurdish, Māori, Venetian
    3. Basic: Esperanto, Interlingue, Kangri, Kuvi, Kuvi (Devanagari), Kuvi (Odia), Kuvi (Telugu), Ligurian, Lombard, Low German, Luxembourgish, Makhuwa, Maltese, N’Ko, Occitan, Prussian, Silesian, Swampy Cree, Syriac, Toki Pona, Uyghur, Western Frisian, Yakut, Zhuang

Locale Coverage Status

The coverage status determines how well languages are supported on laptops, phones, and other computing devices. In particular, qualifying at a Basic level is typically a requirement just for being selectable on phones as a language. Note that for each language there are typically multiple locales, so 90 languages at Modern coverage corresponds to more than 350 locales at that coverage.
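As a rough illustration of how several locales roll up into a single language, the sketch below groups a few locale identifiers by their language subtag. The locale list is a made-up sample, not the CLDR 44 coverage data, and `base_language` and `group_by_language` are hypothetical helpers that simply take the text before the first separator; real locale parsing has more cases than this.

```python
from collections import defaultdict


def base_language(locale_id: str) -> str:
    """Return the language subtag: the text before the first '_' or '-'."""
    return locale_id.replace("-", "_").split("_", 1)[0].lower()


def group_by_language(locale_ids):
    """Map each language subtag to the locale identifiers that share it."""
    groups = defaultdict(list)
    for locale_id in locale_ids:
        groups[base_language(locale_id)].append(locale_id)
    return dict(groups)


# Illustrative sample only (assumption): not taken from the coverage table.
sample = ["en", "en_GB", "en_AU", "fr", "fr_CA", "chr", "chr_US"]
groups = group_by_language(sample)
print(len(groups), "languages,", len(sample), "locales")
```

Counting the keys gives the number of languages, while counting the original identifiers gives the number of locales, which is why the two figures differ.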

Below is the coverage in this release:

CLDR v44 Coverage

Version 44.1 Changes

DTD Changes

Specification Changes

Data Changes

Data Changes

DTD Changes

The following is a summary of the DTD changes which reflect changes in the structure. The relevant ones are described more fully in the data changes.

LDML

Supplemental Data

BCP47

Keyboards

BCP47 Changes

Supplemental Data Changes

Locale Changes

File Changes

(Aside from locale files)

Additions:

New XSD files in /common/dtd/.

These correspond to the DTDs, but do not carry the extra validity annotations.

New Test Data files in /common/testData/

Removals:

Files with insufficient data:

Old format keyboards were removed (see Migration):

JSON Data Changes

Keyboard Changes

The keyboard format has a new DTD (keyboard3.dtd, with the <keyboard3> element). It is a complete rewrite of the specification by the Keyboard Subcommittee and is available as a technical preview in CLDR version 44. See TR35 Part 7: Keyboards. The prior DTDs are still included in CLDR but are not used by CLDR data or tooling. Note: prior keyboard data files are not compatible with the new format, were not maintained, and have been removed.
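For tooling that may still encounter older keyboard files, a minimal sketch of telling the two formats apart by root element is shown here. It assumes that new-format files have a <keyboard3> root (as stated above) and that old-format files have a <keyboard> root; the function name `keyboard_format` and the two inline samples are hypothetical, and the samples are not validated against either DTD.

```python
import xml.etree.ElementTree as ET


def keyboard_format(xml_text: str) -> str:
    """Classify a keyboard file by its root element name."""
    root = ET.fromstring(xml_text)
    # Drop an XML namespace prefix if present: "{ns}keyboard3" -> "keyboard3".
    tag = root.tag.rsplit("}", 1)[-1]
    if tag == "keyboard3":
        return "keyboard3"
    if tag == "keyboard":
        # Assumption: the prior format used a <keyboard> root element.
        return "legacy"
    return "unknown"


# Minimal illustrative documents (assumption): real files carry more content.
new_style = '<keyboard3 locale="mt"/>'
old_style = '<keyboard locale="mt"/>'

print(keyboard_format(new_style))
print(keyboard_format(old_style))
```

A check like this only identifies which format a file claims to be; it does not validate the file against keyboard3.dtd.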

Note that there are additional sample keyboard data files in progress which were not complete for v44, but may be consulted as samples:

See the Known Issues section for additional known issues.

Specification Changes

Please see the Modifications section of the draft spec for the list of current changes.

A diff of the changes since CLDR 43 can be viewed here in GitHub, which was last updated on 6 October 2023. Clicking on the rich-diff icon for a page ( 📄 ) will often show the differences with a rich diff, such as the following:

[Image: example of a rich diff in GitHub]

Growth

The following chart shows the growth of CLDR locale-specific data over time. It is restricted to data items in the /main and /annotations directories, so it does not include the non-locale-specific data; nor does it include corrections (which typically outnumber new items). The % values are percentages of the current measure of Modern coverage. That level increases each release, so previous releases had many locales that were at Modern coverage as assessed at the time of their release. There is one line per year, even though there were multiple releases in most years.

There were relatively few additions this cycle; the focus was on improving quality, and such changes do not show up in the chart below.

[Image: chart of the growth of CLDR locale-specific data over time]

Migration

Known Issues

These are not always the same. In the future, some of these functions will be separated out; see CLDR-17095.

Acknowledgments

Many people have made significant contributions to CLDR and LDML; see the Acknowledgments page for a full listing.

The Unicode Terms of Use apply to CLDR data; in particular, see Exhibit 1.

For web pages with different views of CLDR data, see http://cldr.unicode.org/index/charts.