CLDR Process


This document describes the Unicode CLDR Technical Committee's process for data collection, resolution, public feedback and release. 

Formal Technical Committee Procedures

For more information on the formal procedures for the Unicode CLDR Technical Committee, see the Technical Committee Procedures for the Unicode Consortium.

Specification Changes

The UTS #35: Locale Data Markup Language (LDML) specification is kept up to date with each release, with changed or added structure for new data types or other features.

Data Submission and Vetting

The contributors of locale data are expected to be language speakers residing in the country/region. In particular, national standards organizations are encouraged to be involved in the data vetting process.

There are two types of data in the repository:

The following 4 states are used to differentiate the data contribution levels. The initial data contributions are normally marked as draft; this may be changed once the data is vetted.

Implementations may choose the level at which they wish to accept data. They may choose to accept even unconfirmed data if having some data is better than no data for their purpose. Approved data is vetted by language speakers; however, this does not guarantee that the data is error-free. It reflects the best judgment of the vetters and the committee according to the process.
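
As an illustration of choosing an acceptance level, the following minimal Python sketch keeps only values at or above a chosen draft level. It assumes the four levels are the LDML draft attribute values unconfirmed, provisional, contributed and approved, and that a value without a draft attribute counts as approved. The function name accepted_values and the sample data are hypothetical and are not part of any CLDR tool.

```python
import xml.etree.ElementTree as ET

# Draft levels from least to most vetted (assumed to match the LDML draft attribute).
DRAFT_ORDER = ["unconfirmed", "provisional", "contributed", "approved"]


def accepted_values(ldml_xml, minimum="contributed"):
    """Return (tag, text) pairs whose draft level is at least `minimum`."""
    threshold = DRAFT_ORDER.index(minimum)
    root = ET.fromstring(ldml_xml)
    kept = []
    for element in root.iter():
        if element.text is None or not element.text.strip():
            continue  # skip structural elements that carry no value
        # Assumption: an element without a draft attribute is treated as approved.
        level = element.get("draft", "approved")
        if DRAFT_ORDER.index(level) >= threshold:
            kept.append((element.tag, element.text.strip()))
    return kept


# Hypothetical sample data, for illustration only.
SAMPLE = """<ldml><localeDisplayNames><languages>
<language type="de">German</language>
<language type="fr" draft="provisional">French</language>
<language type="xx" draft="unconfirmed">Example</language>
</languages></localeDisplayNames></ldml>"""

if __name__ == "__main__":
    print(accepted_values(SAMPLE))                 # strict: default threshold
    print(accepted_values(SAMPLE, "unconfirmed"))  # permissive: accept everything
```

An implementation that prefers some data over no data would pass a lower minimum, as in the second call.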

Survey Tool User Levels

There are multiple levels of access and control:

These levels are decided by the technical committee and the TC representative for the respective organizations.

Voting Process

Optimal Field Value

For each release, there is one optimal field value determined by the following:

Draft Status of Optimal Field Value

It is difficult to develop a formulation that provides for stability, yet allows people to make needed changes. The CLDR committee welcomes suggestions for tuning this mechanism. Such suggestions can be made by filing a new ticket.

Data Resolution

After the data has been collected and vetted, it needs to be refined and cleared of errors for the release:

If a locale does not have minimal data (at least at a provisional level), then it may be excluded from the release. Where this is done, it may be restored to the repository for the next submission cycle.

This process can be fine-tuned by the Technical Committee as needed to resolve any problems that turn up. A committee decision can also override any part of the above process for specific values.

For more information see the key links in CLDR Survey Tool (especially the Vetting Phase).



There may be conflicting common practices or standards for a given country and language. LDML therefore provides keyword variants to reflect the different practices (for example, for German it allows the distinction between PHONEBOOK and DICTIONARY collation).
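
As a rough illustration, a collation keyword variant can be expressed in a locale identifier with the Unicode extension key co. The Python sketch below assumes the BCP 47 form (for example, de-u-co-phonebk for German phonebook collation). The helper with_collation is hypothetical and handles only tags that have no existing -u- extension.

```python
def with_collation(language_tag, collation_type):
    """Append a Unicode collation keyword (-u-co-<type>) to a plain language tag."""
    # Merging into an existing -u- extension is deliberately not handled in this sketch.
    if "-u-" in language_tag.lower():
        raise ValueError("tag already has a -u- extension: " + language_tag)
    return f"{language_tag}-u-co-{collation_type}"


if __name__ == "__main__":
    # "phonebk" is the BCP 47 short name for phonebook collation.
    print(with_collation("de", "phonebk"))
```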

When there is an existing national standard for a country that is widely accepted in practice, the goal is to follow that standard as much as possible. Where the common practice in the country deviates from the national standard, or if there are multiple conflicting common practices, or options in conforming to the national standard, or conflicting national standards, multiple variants may be entered into the CLDR, distinguished by keyword variants or variant locale identifiers.

Where a data value is identified as following a particular national standard (or other reference), the goal is to keep that data aligned with that standard. There is, however, no guarantee that data will be tagged with any or all of the national standards that it follows.

Maintenance Releases

Maintenance releases, such as 26.1, are issued whenever the standard identifiers change (that is, BCP 47 identifiers, time zone identifiers, or ISO 4217 currency identifiers). Updates to identifiers also require updating the English names for those identifiers.

Corrigenda may also be included in maintenance releases. Maintenance releases may also be issued if there are substantive changes to supplemental data (non-language data such as script info or transforms) or other critical data changes that affect the community of CLDR data users.

The structure and DTD may change, but except for additions or for small bug fixes, data will not be changed in a way that would affect the content of resolved data.

Public Feedback Process

The public can supply formal feedback to CLDR via the Survey Tool or by filing a Bug Report or Feature Request. There is also a public forum for questions at CLDR Mailing List (details on archives are found there).

There is also a members-only CLDR mailing list for members of the CLDR Technical Committee.

Public Review Issues may be posted in cases where broader public feedback is desired on a particular issue.

Be aware that changes and updates to CLDR are made only in response to information entered in the Survey Tool or filed as a Bug Report or Feature Request. Discussion on public mailing lists is not monitored, and no action will be taken in response to such discussion; action is taken only in response to filed bugs. Checking and entering data takes time and effort, so even when bug reports or feature requests are accepted, it may take some time before the changes appear in a release of CLDR.

Data Release Process

Version Numbering

The locale data is frozen per version. Once a version is released, it is never modified. Any change, however minor, results in a newer version of the locale data being released. The version numbering scheme is "xy.z", where z is incremented for maintenance releases and xy is incremented for the regular semi-annual releases defined by the release schedule.
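
The numbering scheme can be sketched in Python as follows. The names CldrVersion and parse_version are hypothetical, and treating a bare "26" as 26.0 is an assumption made so that versions can be compared, not something this process defines.

```python
import re
from typing import NamedTuple


class CldrVersion(NamedTuple):
    major: int  # "xy": incremented for regular semi-annual releases
    minor: int  # "z": incremented for maintenance releases

    @property
    def is_maintenance(self):
        return self.minor > 0


def parse_version(text):
    """Parse a version string such as "26" or "26.1"."""
    match = re.fullmatch(r"(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("not a version of the form xy.z: " + repr(text))
    # Assumption: a missing ".z" part means z = 0.
    return CldrVersion(int(match.group(1)), int(match.group(2) or 0))


if __name__ == "__main__":
    version = parse_version("26.1")
    print(version, version.is_maintenance)
    # Tuple comparison orders versions by xy first, then by z.
    print(parse_version("26.1") > parse_version("26"))
```

Because released versions are never modified, two data sets with the same parsed version can be assumed to be identical.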

Release Schedule

Early releases of a version of the common locale data will be issued as either alpha or beta releases, available for public feedback. The dates for the next scheduled release are posted on CLDR Project.

The schedule milestones are listed below.

Labels in the Jira column correspond to the phase field in Jira. The phase field is used to identify tickets that need to be completed before the start of each milestone (table above).

Meetings and Communication

The currently scheduled meetings are listed on the Unicode Calendar. Meetings are held by phone every week at 8:00 AM Pacific Time (GMT-08:00 in winter, GMT-07:00 in summer). An additional meeting is scheduled every other Monday, depending on need and people's availability.

There is an internal email list for the Unicode CLDR Technical Committee, open to Unicode members and invited experts. All national standards bodies who are interested in locale data are also invited to become involved by establishing a Liaison membership in the Unicode Consortium, to gain access to this list.


