Unicode CLDR Version 35 Language/Locale Data Released

Unicode CLDR 35 provides an update to the key building blocks for software supporting the world's languages. CLDR data is used by all major software systems for internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 35 included a limited Survey Tool data collection phase. The following summarizes the changes in the release.

Data: 70,000+ new data fields, 13,400+ revised data fields
Basic coverage: New languages at Basic coverage: Cebuano (ceb), Hausa (ha), Igbo (ig), Yoruba (yo)
Modern coverage: Somali (so) and Javanese (jv) increased coverage from Moderate to Modern
Emoji 12.0: Names and annotations (search keywords) for 90+ new emoji; also includes fixes for previous names and keywords
Collation: Collation updated to Unicode 12.0, including new emoji; Japanese single-character (ligature) era names added to collation and search collation
Measurement units: 23 additional units
Date formats: Two additional flexible formats, and 20 new interval formats
Japanese calendar: In the Japanese locale, updated to use Gannen (元年) year numbering for non-numeric formats (which include 年); also more consistent use of narrow eras in numeric date formats such as “H31/3/27”
Region names: Many names updated to local equivalents of “North Macedonia” (MK) and “Eswatini” (SZ)
Segmentation: Enhanced Grapheme Cluster Boundary rules for 6 Indic scripts: Gujr, Telu, Mlym, Orya, Beng, Deva
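
To illustrate the Japanese calendar item, here is a minimal sketch of Gannen year numbering. It does not read CLDR data or use any CLDR API: the function names, the hard-coded patterns, and the restriction to the Heisei era (8 January 1989 to 30 April 2019) are assumptions made for the example. The idea is that the first year of an era is written 元年 in formats containing 年, while numeric formats keep the digit and a narrow era letter.

```python
from datetime import date

# Assumption for this sketch: only the Heisei era is handled.
HEISEI_START = date(1989, 1, 8)
HEISEI_END = date(2019, 4, 30)


def heisei_year(d: date) -> int:
    """Return the era year (1-based) of a date within the Heisei era."""
    if not (HEISEI_START <= d <= HEISEI_END):
        raise ValueError("date is outside the Heisei era")
    return d.year - HEISEI_START.year + 1


def format_heisei(d: date, numeric: bool = False) -> str:
    """Format a Heisei date with hypothetical, hard-coded patterns."""
    year = heisei_year(d)
    if numeric:
        # Numeric format: narrow era letter, year always written as a digit.
        return f"H{year}/{d.month}/{d.day}"
    # Non-numeric format (contains 年): year 1 is written as 元 (Gannen).
    year_text = "元" if year == 1 else str(year)
    return f"平成{year_text}年{d.month}月{d.day}日"


print(format_heisei(date(2019, 3, 27), numeric=True))
print(format_heisei(date(1989, 2, 1)))
```

In this sketch, a date in 1989 is rendered with 元年 in the long form but as year 1 in the numeric form; real applications would take their patterns and era data from a CLDR-based library instead of hard-coding them.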

A dot release, version 35.1, is expected in April, with further changes for the Japanese calendar.

For details, see Detailed Specification Changes, Detailed Structure Changes, and Detailed Data Changes. For further details and links to documentation, see the CLDR Release Notes.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, Emojipedia, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Shopify, Sultanate of Oman MARA, Oracle, Rajya Marathi Vikas Sanstha, SAP, Symantec, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to http://www.unicode.org/consortium/members.html

For more information, please contact the Unicode Consortium at http://www.unicode.org/contacts.html.
