Information Hub for Linguists

💡🆕

The sequence above marks items that have been recently added. In your browser you can copy this sequence, then use ⌘-F (Mac) or Ctrl-F (Windows) to find all the places it occurs.

When a section below changes, the date will be in the header.

Starting Submission

Before you start Submission, please read the CLDR training (if new to the survey tool). Please prioritize the sections Missing, Provisional, and Errors. Please read the Updates. For more information about the priorities during Submission, see Survey Tool stages.

Prerequisites

  1. If you’re new to CLDR, take the CLDR training below.
  2. If you’re already experienced with CLDR, read the Critical reminders section (mandatory).
  3. Review the Status and Schedule, New Areas, Survey Tool, and Known Issues.
  4. Once you are ready, go to the Survey Tool and log in.

Status and Schedule

The Survey Tool is now open for General Submission for CLDR 48. The General Submission phase will be followed by the Vetting phase starting on June 11th.

New languages

The following new languages are available in the Survey Tool for submission during the CLDR 48 period:

New Areas

Most of the following are relevant to locales at the Modern Coverage Level.

New emoji

Seven new emoji have been added (images below). These were released in Unicode 17 in September 2025.

emoji image

Core Data

There are new Alphabetic Information items.

More information is available in the Exemplars section of the Unicode Sets page.

Locale display names

💡🆕 Sark, CQ, is now in modern coverage under Locale Display Names > Territories (Europe) > Northern Europe

Language names that were added or changed in English

As new locales reach Basic Coverage, their language names are added for locales targeting modern coverage. This will be happening the week of April 28th.

Core/Extensions

There is a new mechanism for better menu names. When you see a Code with -core or -extension, please read Locale Option Value Names.

💡🆕 The link in the Info Panel was not pointing to Locale Option Value Names; that has been fixed. There is also now a Full List of the option names on that page.

Scripts

There are 5 new scripts for Unicode 17. Currently the names are in English: Beria Erfe, Chisoi, Sidetic, Tai Yo, Tolong Siki. Coverage for other languages is at comprehensive.

DateTime formats

Gregorian Calendar (Year First) calendar

💡🆕 This is a variant of the Gregorian calendar whose formats always use year-month-day ordering and a 24-hour time cycle. See Year First Calendar for more details. Note: the code is iso8601, but disregard that; it will be changed after submission.

New flexible format and interval patterns

💡🆕 There are some new patterns for you to supply. Make sure that the format is consistent with related patterns.

Note: Some locales have inconsistent patterns using eras: in some patterns using G (AD vs BC in Gregorian) but in related patterns using GGGGG (which is a narrow form: A vs B in Gregorian). The GGGGG is not typically needed except in special cases, such as the Japanese calendar.
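The era inconsistency described in the note can be spotted mechanically. The following is a minimal sketch, not part of the Survey Tool: the pattern IDs and values are hypothetical samples, and quoted literal text is handled in a simplified way.

```python
import re

def era_widths(pattern: str) -> set[int]:
    """Return the lengths of all runs of the era letter G in a pattern."""
    # Drop quoted literal text first (simplified handling of '...' literals).
    unquoted = re.sub(r"'[^']*'", "", pattern)
    return {len(run) for run in re.findall(r"G+", unquoted)}

def narrow_era_patterns(patterns: dict[str, str]) -> list[str]:
    """List the IDs of patterns that use the narrow era form GGGGG."""
    return sorted(pid for pid, p in patterns.items() if 5 in era_widths(p))

def has_mixed_era_widths(patterns: dict[str, str]) -> bool:
    """True if some patterns use GGGGG while related ones use a shorter form."""
    widths = set()
    for p in patterns.values():
        widths |= era_widths(p)
    return 5 in widths and any(w < 5 for w in widths)

# Hypothetical sample data for illustration only.
sample = {
    "GyMMMd": "MMM d, y G",
    "GyMd": "M/d/y GGGGG",
    "Hm": "HH:mm",
}
print(narrow_era_patterns(sample))
print(has_mixed_era_widths(sample))
```

A result of True only flags the patterns for review; whether GGGGG is appropriate still depends on the calendar and locale.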

New “relative” variant for date-time combining pattern

There is a new “-relative” variant for Date-Time Combined Formats.

Before CLDR 48, there were two variants:

However, in some languages the use of a relative date such as “tomorrow” or “2 days ago” required a different combining pattern than for a fixed date like “March 20”. So in CLDR 48 a new “relative” variant is introduced. This will be used (instead of the “atTime” variant) for the combination of a relative date and a single time.

If you do not supply this, that combination will fall back to using the “standard” variant; in English that would produce “tomorrow, 3:00 PM”. If instead you want the same combining behavior for a relative date with a single time as for a fixed date with a single time (as was the case in CLDR 47 and earlier), then for each length style copy the existing “atTime” form to the new “relative” form.
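The selection and fallback described above can be sketched as follows. This is an illustration only: the pattern values are simplified English-like samples, the function names are hypothetical, and using “standard” when there is no single time is an assumption of this sketch.

```python
# Illustrative English-like values; real locale data may differ and may
# quote literal text. In this sketch {1} is the date and {0} is the time.
COMBINING_PATTERNS = {
    "standard": "{1}, {0}",
    "atTime": "{1} at {0}",
    # "relative" is deliberately not supplied, to show the fallback.
}

def pick_variant(date_is_relative: bool, single_time: bool) -> str:
    """Choose which combining variant applies to a date/time pair."""
    if not single_time:
        return "standard"  # assumption: anything but a single time uses standard
    return "relative" if date_is_relative else "atTime"

def combine(date_text: str, time_text: str, date_is_relative: bool,
            single_time: bool = True, patterns: dict = COMBINING_PATTERNS) -> str:
    """Combine date and time text, falling back to the standard variant."""
    variant = pick_variant(date_is_relative, single_time)
    pattern = patterns.get(variant, patterns["standard"])
    return pattern.replace("{1}", date_text).replace("{0}", time_text)

# Without a "relative" pattern, the relative date falls back to "standard".
print(combine("tomorrow", "3:00 PM", date_is_relative=True))

# Copying "atTime" to "relative" restores the CLDR 47 behavior.
copied = dict(COMBINING_PATTERNS, relative=COMBINING_PATTERNS["atTime"])
print(combine("tomorrow", "3:00 PM", date_is_relative=True, patterns=copied))
```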

Missing date & time patterns

💡🆕 Some dates and times are ambiguous due to missing patterns. These additional patterns have been added to resolve this issue:

Timezones, metazones and exemplar cities

New gmtUnknownFormat

Normally, time zones formatted using a UTC offset (like xxxx) use the gmtFormat pattern (“GMT{0}” in root). The new gmtUnknownFormat is used when formatting time zones using a UTC offset in cases where the offset or zone is unknown. The root value “GMT+?” need not be changed if it works for your locale; however, it should be consistent with the gmtFormat and gmtZeroFormat in your locale. See Time Zones and City names.
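As a rough sketch of how the three patterns relate, the following uses the root values quoted above for gmtFormat and gmtUnknownFormat. The gmtZeroFormat value and the +HH:mm offset layout are assumptions of this sketch, and the function name is hypothetical.

```python
from typing import Optional

GMT_FORMAT = "GMT{0}"          # root gmtFormat, as quoted in the text
GMT_UNKNOWN_FORMAT = "GMT+?"   # root gmtUnknownFormat, as quoted in the text
GMT_ZERO_FORMAT = "GMT"        # assumed value for gmtZeroFormat

def format_offset(offset_minutes: Optional[int]) -> str:
    """Format a UTC offset in minutes; None means the offset is unknown."""
    if offset_minutes is None:
        return GMT_UNKNOWN_FORMAT
    if offset_minutes == 0:
        return GMT_ZERO_FORMAT
    sign = "+" if offset_minutes > 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    # Assumed offset layout: sign, two-digit hours, colon, two-digit minutes.
    return GMT_FORMAT.replace("{0}", f"{sign}{hours:02d}:{minutes:02d}")

for value in (None, 0, 330, -480):
    print(value, format_offset(value))
```

Seeing the three outputs side by side makes it easier to judge whether a localized gmtUnknownFormat looks consistent with the other two.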

“Unknown City” → “Unknown Location”

For zone Etc/Unknown, the exemplarCity name was changed in English from “Unknown City” to “Unknown Location”; other locales should update accordingly.

Changes to the root and/or English names of many exemplar cities and some metazones

Exemplar cities were added or changed in English. This was typically to move towards the official spelling in the country in question, such as retaining accents. You should check these, but don’t hesitate to retain the older version if it is a different script or more customary in your language. For example, English still uses "Mexico City" instead of "Ciudad de México".

Code New Value Code New Value
Africa/El_Aaiun El Aaiún Antarctica/Casey Casey Station
Africa/Lome Lomé Antarctica/DumontDUrville Dumont d’Urville Station
Africa/Ndjamena N’Djamena Antarctica/McMurdo McMurdo Station
America/Araguaina Araguaína Asia/Aqtau Aktau
America/Argentina/Rio_Gallegos Río Gallegos Asia/Hovd Khovd
America/Argentina/Tucuman Tucumán Asia/Qyzylorda Kyzylorda
America/Belem Belém Asia/Urumqi Ürümqi
America/Bogota Bogotá Atlantic/Canary Canarias
America/Cordoba Córdoba Europe/Busingen Büsingen
America/Cuiaba Cuiabá Europe/Chisinau Chișinău
America/Eirunepe Eirunepé Europe/Tirane Tirana
America/Maceio Maceió Indian/Chagos Chagos Archipelago
America/Mazatlan Mazatlán Indian/Comoro Comoros
America/Mexico_City Ciudad de México Indian/Kerguelen Kerguelen Islands
America/Miquelon Saint-Pierre Indian/Mahe Mahé
America/Santarem Santarém Pacific/Chatham Chatham Islands
America/Sao_Paulo São Paulo Pacific/Galapagos Galápagos
Antarctica/Rothera Rothera Station Pacific/Kwajalein Kwajalein Atoll
Antarctica/Palmer Palmer Land Pacific/Marquesas Marquesas Islands
Antarctica/Troll Troll Station Pacific/Midway Midway Atoll
Antarctica/Syowa Showa Station Pacific/Noumea Nouméa
Antarctica/Mawson Mawson Station Pacific/Pitcairn Pitcairn Islands
Antarctica/Vostok Vostok Station Pacific/Wallis Wallis & Futuna

Metazones:

Number formats

Currency patterns alphaNextToNumber, noCurrency

For more information see Number and currency patterns.

Rational formats

These patterns specify the formatting of rational fractions in your language. Rational fractions contain a numerator and denominator, such as ½, and may also have an integer, such as 5½. There are two different “combination patterns”, one with a space and one without; both are needed because fonts sometimes don’t properly support fractions (displaying something like 5 1/2). It can be tricky to understand the difference, so be sure to carefully read Rational Formatting before making any changes.

Here are the English values and a short description of their purpose:

Code Default Value Description
Rational {0}⁄{1} The format for a rational fraction with arbitrary numerator and denominator; the English pattern uses the Unicode character ‘⁄’ U+2044 FRACTION SLASH which causes composition of fractions such as 22⁄7.
Integer + Rational {0} {1} The format for combining an integer with a rational fraction composed using the pattern above; the English pattern uses U+202F NARROW NO-BREAK SPACE (NNBSP) to produce a non-breaking thin space.
Integer + Rational-superSub {0}⁠{1} The format for combining an integer with a rational fraction composed using the pattern above; the English pattern uses U+2060 WORD JOINER, a zero-width no-break space.
Usage sometimes An indication of the extent to which rational fractions are used in the locale; must be either never or sometimes.

If an integer and fraction (5½) is best expressed in your language with a space between them (5 ½), then copy the pattern from integerAndRationalPattern to integerAndRationalPattern-superSub. However, you cannot do the reverse. Some fonts and rendering systems don’t properly handle the fraction slash, and the user would see something like 51/2 (fifty-one halves). So in that case, implementations must have the integerAndRationalPattern with a space in it to fall back on, unless they have verified that the font / rendering system supports superscripting the numerator.
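The way the patterns combine can be sketched as follows, using the English values from the table above. The function names and the boolean flag are hypothetical; how an implementation actually verifies font support is outside the scope of this sketch.

```python
FRACTION_SLASH = "\u2044"  # U+2044 FRACTION SLASH
NNBSP = "\u202F"           # U+202F NARROW NO-BREAK SPACE
WORD_JOINER = "\u2060"     # U+2060 WORD JOINER

# English values from the table above.
RATIONAL = "{0}" + FRACTION_SLASH + "{1}"
INTEGER_AND_RATIONAL = "{0}" + NNBSP + "{1}"
INTEGER_AND_RATIONAL_SUPERSUB = "{0}" + WORD_JOINER + "{1}"

def fill(pattern: str, first: str, second: str) -> str:
    """Substitute the two placeholders of a pattern."""
    return pattern.replace("{0}", first).replace("{1}", second)

def format_mixed(integer: int, numerator: int, denominator: int,
                 superscripting_verified: bool) -> str:
    """Format an integer plus fraction, with the spaced pattern as fallback."""
    fraction = fill(RATIONAL, str(numerator), str(denominator))
    pattern = (INTEGER_AND_RATIONAL_SUPERSUB if superscripting_verified
               else INTEGER_AND_RATIONAL)
    return fill(pattern, str(integer), fraction)

# Safe fallback: a thin no-break space keeps "5" and "1/2" visibly apart.
print(repr(format_mixed(5, 1, 2, superscripting_verified=False)))
# Only when the renderer is known to compose the fraction properly.
print(repr(format_mixed(5, 1, 2, superscripting_verified=True)))
```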

Units

Rework certain concentration units

The keys for two units changed (the translations can probably remain the same) and there is one new unit that is used for constructing certain other kinds of concentration units:

Please check over the values for concentr-part-per-1e6 and concentr-part-per-1e9 in your locale. Some languages had used the equivalent of “millionths” instead of the equivalent of “parts per million”.

For more information see Concentrations.

Many new units in English

Many new units were added in English. The metric ones are used in scientific contexts, and will need to be translated in all languages. However, the case inflections (accusative, dative, etc.) will not be requested.

The units (English names) are:

Survey Tool

Once trained and up to speed on Critical reminders (below), log in to the Survey Tool to begin your work.

Survey Tool Changes

  1. The ability to search in the Survey Tool has been added in CLDR-18423. It supports searching for values, English values, and codes.
  2. There has been substantial performance work that will show up for the first time. If there are performance issues, please file a ticket with a row URL and an explanation for what happened.
  3. In the Dashboard, you can filter the messages instead of jumping to the first one. In the Dashboard header, each notification category (such as “Missing” or “Abstained”) has a checkbox determining whether it is shown or hidden.
  4. In each row of the vetting page, an icon now appears at the right side of the English column when there are forum messages:
    1. 👁️‍🗨️ if there are any open posts
    2. 💬 if there are posts, but all are closed
  5. For Units and a few other sections, the Pages have changed to reduce the size on the page to improve performance.
    1. Pages may be split, and/or retitled
    2. Rows may move to a different page.
  6. The symbols in the A column have been changed to be searchable in browsers (with Find in Page) and stand out more on the page. See below for a table. They override the symbols in Survey Tool Guide: Icons.

See Recent changes for additional recent changes in the Survey Tool.

Important Notes

New Approve Status Icons

Symbol Status Notes
✅ Approved Enough votes for use in implementations
☑ Contributed Enough votes for use in implementations
✖ Provisional Not enough votes for implementations
❌ Unconfirmed Not enough votes for implementations
🕳️ Missing Completely missing
⬆️ Inherited Used in combination with ✖ and ❌

Enhanced “Show Hidden”

💡🆕 If a field contains characters that are invisible or certain characters that look like others, a special Show Hidden bar will appear below the field that helps distinguish them. For example, see Example Hidden — here is a screen-shot.

Example of hidden characters

Note that if you hover over the Show Hidden bar, you’ll see the name of the special character and a short description. Some of the commonly used special characters are listed below, with an example from CLDR.

Symbol Example Show Hidden Name Description
❰NDASH❱ {0}–{1} {0}❰NDASH❱{1} En dash Slightly wider than a hyphen; used for ranges of numbers and dates in many languages; for clarity may have ❰TSP❱s around it.
❰TSP❱ d – d d❰TSP❱❰NDASH❱❰TSP❱d Thin space A space character that is narrower (in most fonts) than the regular one.
❰NB❱ {0}⁠{1} {0}❰NB❱{1} No Break An invisible character that doesn’t allow linebreaks on either side; also limits fraction super/subscripting
❰NBTSP❱ h a h❰NBTSP❱a No-break thin space A thin space that disallows linebreaks; equivalent to ❰TSP❱❰NB❱
❰NBSP❱ re call re❰NBSP❱call No-break space A regular space that disallows linebreaks; equivalent to adding ❰NB❱ after a space
❰NBHY❱ re‑call re❰NBHY❱call No-break hyphen A regular hyphen that disallows linebreaks; equivalent to -❰NB❱

The BIDI controls (❰ALM❱, ❰LRM❱, ❰RLM❱) are used in bidirectional scripts (Arabic, Hebrew, etc.) to control the bidirectional order if needed, typically next to numbers or punctuation.

To see how to input these from the keyboard, and for a key to all the escapes, see Key for Show Hidden.
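To make the escapes above concrete, here is a minimal sketch of how a value could be rewritten with the ❰…❱ names. It is not the Survey Tool's implementation: the escape names come from the table above, while the code points are assumptions inferred from those names, and the function name is hypothetical.

```python
# Escape names from the table above; code points are assumed from the names.
ESCAPES = {
    "\u2013": "NDASH",  # en dash
    "\u2009": "TSP",    # thin space
    "\u2060": "NB",     # word joiner (no break)
    "\u202F": "NBTSP",  # narrow no-break space
    "\u00A0": "NBSP",   # no-break space
    "\u2011": "NBHY",   # non-breaking hyphen
    "\u061C": "ALM",    # Arabic letter mark
    "\u200E": "LRM",    # left-to-right mark
    "\u200F": "RLM",    # right-to-left mark
}

def show_hidden(text: str) -> str:
    """Replace each special character with its visible escape."""
    return "".join(f"❰{ESCAPES[ch]}❱" if ch in ESCAPES else ch for ch in text)

def has_hidden(text: str) -> bool:
    """True if the text contains any character that would be escaped."""
    return any(ch in ESCAPES for ch in text)

print(show_hidden("{0}\u2013{1}"))
print(show_hidden("d\u2009\u2013\u2009d"))
print(show_hidden("h\u202Fa"))
```

A check like has_hidden can also be handy when comparing two values that look identical on screen but differ in an invisible character.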

Known Issues

Last updated: 2025-05-17

This list will be updated as fixes are made available in Survey Tool Production. If you find a problem, please file a ticket, but please review this list first to avoid creating duplicate tickets.

  1. CLDR-18577 - If your language does not have a variant value, you can vote for inheritance from the standard version.
  2. CLDR-17829 - some links in the Info panel not displaying properly
  3. CLDR-13477 - Images for the plain symbols. Non-emoji such as €, √, », ¹, §, … do not have images in the Info Panel. Workaround: Look at the Code column; unlike the new emoji, your browser should display them there.
  4. CLDR-17683 - Some items are not able to be flagged for TC review. This is being investigated. Meanwhile, please enter forum posts with any comments.
  5. CLDR-18637 - Some example pop-ups are showing ‘undefined’ instead of the expected example
  6. CLDR-18607 - Unable to download current votes in CSV
  7. CLDR-18615 - Unclear error message if a link sends you to a page that no longer exists in the Survey Tool
  8. CLDR-18627 - Some locale display names at comprehensive are not available in all locales

Resolved Issues

Last updated: 2025-05-17

  1. 💡🆕 CLDR-18605 - Fix issue blocking import of winning votes from the previous cycle
  2. 💡🆕 CLDR-18649 - Same as root is now a warning if English is the same as root as well
  3. CLDR-18513 - Redirect from read-only locale to the default content locale does not work
  4. CLDR-17694 - Back button in browser fails in forum under certain conditions
  5. CLDR-17658 - Dashboard slowness

Recent Changes

  1. CLDR-17658 - In the Dashboard, the Abstains items will only have one entry per page. You can use that entry to go to its page, and then fix Abstains on that page. Once you are done on that page, hit the Dashboard refresh button (↺). This fixes a performance problem for people with a large number of Abstains, and reduces clutter in the Dashboard.

CLDR training (for new linguists)

Before you start contributing data to CLDR and using the Survey Tool, it is important that you understand the CLDR process and take the CLDR training. It takes about 2-3 hours to complete the training.

  1. To understand the basics of the CLDR process, read the Survey Tool Guide and an overview of the Survey Tool Stages.
    • New: A video is available which shows how to log in and begin contributing data for your locale.
  2. Read the Getting Started topics on the Information Hub:

*If you (individual or your organization) have not established a connection with the CLDR technical committee, start with Survey Tool Accounts.

Critical reminders (for all linguists)

You’re already familiar with the CLDR process, but do keep the following in mind:

  1. Aim at commonly used language - CLDR should reflect common-usage standards, not academic/official standards (unless commonly followed). Keep that perspective in mind.
  2. Carefully consider changes to existing standards - any change to a value from a previous CLDR release (blue star) should be carefully considered and discussed with your fellow linguists in the CLDR Forum. Remember your change will be reflected across thousands of online products — and potentially almost all online users of your language.
  3. Keep consistency across logical groups - ensure that all related entries are consistent. If you change the name of a weekday, make sure it’s reflected across all related items. Check that the order of month and day are consistent in all the date formats, etc.
    • Tip: The Reports are a great way to validate consistency across related logical groups, e.g. translations of date formats. Use them to proofread your work for consistency.
  4. Avoid voting for English - for items that do not work in your language, don’t simply use English. Find a solution that works for your language. For example, if your language doesn’t have a concept of calendar “quarters”, use a translation that describes the concept “three-month period” rather than “quarter-of-a-year”.
  5. Watch out for complex sections and read the instructions carefully if in doubt:
    1. Date & Time
    2. Time zones
    3. Plural forms

Tip: The links in the Info Panel will point you to relevant instructions for the entry you’re editing/vetting. Use it if in doubt.