CLDR 43 Release Note

This version is currently at Beta (for data). See the latest release.

The planned schedule is:

2023 Mar 29, Wed — public Beta2 (data & spec)
2023 Apr 12, Wed — Release

The links will be changed for the final release.

Overview

Unicode CLDR provides key building blocks for software supporting the world's languages. CLDR data is used by all major software systems (including all mobile phones) for their software internationalization and localization, adapting software to the conventions of different languages.

CLDR 43 is a limited-submission release, focusing on just a few areas:

For details, see below.

Locale Status

The bar for each coverage level increases each release. Faroese (fo) increased from Basic to Moderate, while Cherokee (chr), Lower Sorbian (dsb), and Upper Sorbian (hsb) dropped from Modern to Moderate.

CLDR v43 Coverage

Data Changes

Locale Changes

File Changes

New files:

Note: All files were moved from seed to common (see the Migration section).

JSON Data Changes

See the Migration section for general data changes.

Specification Changes

[###TBD not yet complete — the target for the spec is March 29]

The following are the most significant changes in the specification:

Growth

The following chart shows the growth of CLDR locale-specific data over time. It is restricted to data items in /main and /annotations directories, so it does not include the non-locale-specific data. The % values are percent of the current measure of Modern coverage. That level is notched up each release, so previous releases had many locales that were at Modern coverage as assessed at the time of their release. There is one line per year, even though there were multiple releases in most years.

The detailed information on changes between the v43 and v42 releases is in v43 delta_summary.tsv: look at the TOTAL line for the overall counts of Added/Deleted/Changed.

Because this was a limited-submission release, only a small number of changes are visible.

Language Matching

CLDR has data for language matching, as in this chart. Its purpose and usage are sometimes misunderstood.

So how is this used? Consider a user whose first language is Breton. If they open an application that only has localizations for English, German, and French, then Breton will not be available. In that case, the data in CLDR can be used to select French as a fallback localization — in the absence of other information. 

That last clause is important. The CLDR data is based on the likelihood that a person using language X understands text written in language Y, but large portions of the population for X might prefer other languages. 

The CLDR language matching data can and should be overridden whenever more information is available that allows an implementation to do a better job. It is strongly recommended that systems allow users to specify not only their preferred language but also any secondary languages. Thus a person speaking Kazakh who also knows French could specify French as a secondary language, and get a French localization for an app instead of the CLDR match. This has been done on both Android and iOS, for example.
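The order of precedence described above can be sketched in a few lines of Python. This is a simplified illustration, not the LDML matching algorithm: the FALLBACK table and the pick_localization function are hypothetical names, the table holds only the Breton to French pair mentioned in the text, and real implementations (such as the ICU locale matchers) work with weighted distances over full locale identifiers rather than a simple lookup.

```python
# Hypothetical stand-in for CLDR language matching data. Only the
# Breton -> French pair comes from the text; real data is far richer.
FALLBACK = {"br": ["fr"]}


def pick_localization(preferred, supported, default="en"):
    """Pick a localization for a user's ordered language preferences.

    preferred: language codes the user specified, primary first.
    supported: language codes the application is localized into.
    default:   assumed last-resort language of the application.
    """
    # 1. Anything the user explicitly listed (primary or secondary)
    #    overrides the fallback data.
    for lang in preferred:
        if lang in supported:
            return lang
    # 2. Only in the absence of other information, consult the
    #    fallback table for each preferred language in order.
    for lang in preferred:
        for candidate in FALLBACK.get(lang, []):
            if candidate in supported:
                return candidate
    # 3. Nothing matched: use the application's default.
    return default


app_languages = ["en", "de", "fr"]
breton_user = pick_localization(["br"], app_languages)
kazakh_user = pick_localization(["kk", "fr"], app_languages)
```

Here the Breton speaker reaches French through the fallback table, while the Kazakh speaker reaches French because it was listed as a secondary language, so the fallback table is never consulted.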

Important: language matching is different from the CLDR inheritance mechanism. The two serve different purposes and are not aligned. The CLDR inheritance mechanism is how CLDR organizes localized data, and should not be used for language matching. Applications do not need to follow the CLDR inheritance chain.

References: LDML Language Matching, LDML Inheritance vs Related Information, ICU4J Locale Matcher, ICU4C Locale Matcher 

Migration

Known Issues

None at this time.

Acknowledgments

Many people have made significant contributions to CLDR and LDML; see the Acknowledgments page for a full listing.


The Unicode Terms of Use apply to CLDR data; in particular, see Exhibit 1.

For web pages with different views of CLDR data, see http://cldr.unicode.org/index/charts.