Each release of the Unicode CLDR is a stable release and may be used as reference material or cited as a normative reference by other specifications. Each version, once published, is absolutely stable and will never change. Implementations may also apply CLDR Corrigenda to a release. Bug reports and feature requests for subsequent versions may be filed at Bug Reports.
Key to Header Links
Access to the latest working snapshot of CLDR, and to data collected for other platforms, is available through the web. The GitHub tag can be used to get the contents of the release, as described below.
The JSON data is available at https://github.com/unicode-cldr/cldr-json - see that page for more information.
CLDR files are maintained in a git source code repository at https://github.com/unicode-org/cldr.git .
Note: On Sep 14, 2021, the default branch was renamed from 'master' to 'main'; please see This Page for how to fix existing clones.
There are several ways to access the repository contents.
For browsing a particular file for a particular version, or revision history of a particular file, use the GitHub Browser. For example:
Go to the latest French LDML file at https://github.com/unicode-org/cldr/blob/master/common/main/fr.xml.
See all the files in a directory structure using https://github.com/unicode-org/cldr.
Find a file using https://github.com/unicode-org/cldr/find/master (click after "cldr /" above the blue box).
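The browser links above follow a regular pattern, so they can be built for any file and any branch or tag. The following is a minimal sketch, not an official CLDR utility: the function names are hypothetical, the raw.githubusercontent.com host is GitHub's general raw-file service rather than something this page documents, and the tag name used in the usage note is an assumed example.

```python
from urllib.parse import quote

REPO = "https://github.com/unicode-org/cldr"
RAW = "https://raw.githubusercontent.com/unicode-org/cldr"


def browser_url(path: str, ref: str = "main") -> str:
    """Return the GitHub web-viewer URL for a file at a branch or tag."""
    return f"{REPO}/blob/{quote(ref)}/{quote(path)}"


def history_url(path: str, ref: str = "main") -> str:
    """Return the URL listing the revision history of a file."""
    return f"{REPO}/commits/{quote(ref)}/{quote(path)}"


def raw_url(path: str, ref: str = "main") -> str:
    """Return the URL of the unrendered file contents (assumed raw host)."""
    return f"{RAW}/{quote(ref)}/{quote(path)}"


if __name__ == "__main__":
    print(browser_url("common/main/fr.xml"))
    print(history_url("common/main/fr.xml"))
    print(raw_url("common/main/fr.xml"))
```

To look at a file as of a particular release, pass the release tag as ref, for example browser_url("common/main/fr.xml", ref="release-40"), where "release-40" is an assumed tag name that should be checked against the repository's tag list.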
Advanced Git Access
For more access to the source repository, you can use a git client to check out or export LDML files directly from the repository at https://github.com/unicode-org/cldr.git
You will need "git-lfs" installed to be able to compile the CLDR tools. See https://git-lfs.github.com or use the GitHub Desktop client.
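As an illustration of the checkout step, the sketch below assembles a git clone command for the repository and checks that git-lfs is on the PATH before running it. It is one possible approach under stated assumptions, not the project's prescribed procedure: the helper names are hypothetical, the shallow clone (--depth 1) is an optional choice made here to reduce download size, and "git lfs install" changes the user's git configuration.

```python
import shutil
import subprocess

CLDR_REPO = "https://github.com/unicode-org/cldr.git"


def clone_command(dest, ref=None, shallow=True):
    """Build the argument list for cloning CLDR into dest.

    ref may be a branch or tag name; None means the default branch.
    """
    cmd = ["git", "clone"]
    if shallow:
        cmd += ["--depth", "1"]  # fetch only the most recent commit
    if ref is not None:
        cmd += ["--branch", ref]  # --branch accepts tags as well as branches
    cmd += [CLDR_REPO, dest]
    return cmd


def have_git_lfs():
    """Report whether the git-lfs executable can be found on the PATH."""
    return shutil.which("git-lfs") is not None


def checkout(dest, ref=None):
    """Clone the repository, refusing to proceed without git-lfs."""
    if not have_git_lfs():
        raise RuntimeError("git-lfs not found; see https://git-lfs.github.com")
    subprocess.run(["git", "lfs", "install"], check=True)
    subprocess.run(clone_command(dest, ref), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is downloaded.
    print(" ".join(clone_command("cldr")))
```

Calling checkout("cldr") performs the actual clone. Separating clone_command from checkout lets the command be inspected or logged before anything is downloaded.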
At the top level of each GitHub repository tree, there are a number of special folders, plus a number of platform folders.
common — CLDR data corresponding to the release
  annotations — annotations and TTS names for characters
  annotationsDerived — names algorithmically derived based on structure
  bcp47 — data for Unicode locale extensions
  casing — intended capitalization for various categories in each language, for use by the Survey Tool
  collation — collation LDML files
  dtd — the latest XML DTD files for the release
  main — the main locale-dependent LDML files
  properties — property files in UCD format
  rbnf — rule-based number formats
  segments — rules for segmenting text
  subdivisions — names of region (country) subdivisions.
  supplemental — additional files with non-linguistic data.
  testData — folders of test data for implementations.
  transforms — data for transliteration and other text transforms
  uca — customized Unicode collation data
  validity — data for validating BCP47 identifiers
docs — the source of the LDML spec and other documents
exemplars/main — preliminary exemplar character data for locales which do not have
keyboards — source files for the CLDR keyboard data
seed — preliminary locales that do not yet have sufficient vetted data.
  transforms — these folders have the same structure as their counterparts in common. Note that supplemental is not duplicated.
specs — deprecated, with contents moved to docs.
tools — source for internal tools for processing CLDR data
  SurveyConsole — (not currently deployed) This is a tool providing an operational dashboard for the Survey Tool
  c/genldml — The only C language tool, this was used to convert ICU format data into LDML.
  cldr-apps-watcher — (not currently deployed) This is a tool which will watch the Survey Tool and ensure that it remains operational.
  cldr-apps — Survey Tool source code
  cldr-unittest — Unit tests against the CLDR code in the “java” directory. (Not to be confused with CheckCLDR tests.)
  java — main source code for the CLDR tooling
  python — utility Python code
  scripts — accessory shell scripts, used for CLDR process and Survey Tool deployment
The common, dtd, and tools folders are in each release.
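Given the layout above, a local checkout can be explored with a few lines of code. The sketch below lists the locale files under common/main and looks up one display name in a locale file. It is an illustration only: the function names are hypothetical, and the element path localeDisplayNames/languages/language is taken from the general LDML structure rather than from this page. Where a file carries several entries for the same code (for example alternate forms), only the first is returned.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def list_locales(checkout):
    """Return the sorted locale IDs that have a file in common/main."""
    main_dir = Path(checkout) / "common" / "main"
    return sorted(p.stem for p in main_dir.glob("*.xml"))


def language_name(checkout, locale, code):
    """Return the name of language `code` as written in `locale`.

    Returns None when the locale file has no entry for that code.
    """
    path = Path(checkout) / "common" / "main" / f"{locale}.xml"
    root = ET.parse(path).getroot()
    # Assumed LDML path; find() returns the first matching element.
    node = root.find(f"localeDisplayNames/languages/language[@type='{code}']")
    return None if node is None else node.text


if __name__ == "__main__":
    checkout = "cldr"  # assumed location of a local clone
    if (Path(checkout) / "common" / "main").is_dir():
        print(len(list_locales(checkout)), "locale files found")
        print(language_name(checkout, "fr", "en"))
```

The lookup reads a single file and does not apply locale inheritance, so a missing entry in a regional locale such as fr_CA does not fall back to fr.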
Note: Beginning with CLDR v21, the CLDR project no longer publishes POSIX-format locale sources as part of its distribution. The POSIX locale generation tools will continue to be made available as a part of the release. Developers who require POSIX compliant locales can generate them using these tools.
CLDR historically included reference versions of POSIX-format locale source files, generated using the default options for each supported locale. The reference versions of POSIX source information contain those data fields that are included in the POSIX specification.
Many operating system platforms provide additional extensions to the minimal POSIX required field set. Individual implementations may require addition of the platform-specific fields or a non-default character repertoire in order to provide full functionality on a given POSIX compliant operating system. As of the current release, the POSIX locale generation tools do not generate such platform-specific extensions, but they can be modified to support this.
CLDR 1.0 Release
The 1.0 version of CLDR is described here for historical interest only. It was hosted on the OpenI18N site before the CLDR project moved to the Unicode Consortium.