Information Hub for Linguists
đĄđ
The sequence above marks items that have been recently added. In your browser you can copy this sequence, then use ⌘-F (Mac) or Ctrl-F (Windows) to find all the places it occurs.
- đĄđ Added missing date & time formats:
- GyM and GyMEd
- Time formats: EBh, Eh, EH when there is an existing EBhm, Ehm, and EHm in the calendar respectively.
- Changed HH patterns in available and interval formats to include a reference to hour, since seeing an hour number alone is ambiguous.
- đĄđ The region Sark, CQ, is now in modern coverage
- New flexible format and interval patterns
- Gregorian Calendar (Year First) calendar
- Enhanced “Show Hidden”
When a section below changes, the date will be in the header.
Starting Submission
Before you start Submission, please read the CLDR training (if new to the survey tool). Please prioritize the sections Missing, Provisional, and Errors. Please read the Updates. For more information about the priorities during Submission, see Survey Tool stages.
Prerequisites
- If you’re new to CLDR, take the CLDR training below.
- If you’re already experienced with CLDR, read the Critical reminders section (mandatory).
- Review the Status and Schedule, New Areas, Survey Tool, and Known Issues.
- Once you are ready, go to the Survey Tool and log in.
Status and Schedule
The Survey Tool is now open for General Submission for CLDR 48. The General Submission phase will be followed by the Vetting phase starting on June 11th.
- Disconnect error. If you see a persistent Loading error with a disconnect message or other odd behavior, please empty your cache.
- Survey Tool email notification may be going to your spam folder. Check your spam folder regularly.
- “Same as code” errors: when translating codes for items such as languages, regions, scripts, and keys, it is normally an error to select the code itself as the translated name. If the error appears under Typography, you can ignore it.
New languages
The following new languages are available in the Survey Tool for submission during the CLDR 48 period:
- Buryat (bua)
- Coptic (cop)
- Haitian Creole (ht)
- đĄđ Hmong Daw (mww)
- Kazakh (Latin) (kk_Latn)
- Laz (lzz)
- Luri Bakhtiari (bqi)
- Nselxcin (Okanagan) (oka)
- Pāli (pi)
- Piedmontese (pms)
- Q’eqchi’ (kek)
- Samogitian (sgs)
- Sunuwar (suz)
- Chinese (Latin) (zh-Latn)
New Areas
Most of the following are relevant to locales at the Modern Coverage Level.
New emoji
Seven new emoji have been added (images below). These were released in Unicode 17 in September 2025.
Core Data
There are new Alphabetic Information items.
- numbers-auxiliary: If there are characters used in numbers that are not customarily used, but may occur, add them here instead of in auxiliary.
- punctuation-auxiliary: If there are punctuation characters that are not customarily used, but may occur, add them here instead of in auxiliary.
- punctuation-person: If there are punctuation characters that are customarily used in people’s names in standard documents, add them here. This should be a small list such as “.” or “-”. Do not include “fanciful” characters such as emoji or kaomoji.
More information is available in the Exemplars section of the Unicode Sets page.
Locale display names
đĄđ Sark, CQ, is now in modern coverage under Locale Display Names > Territories (Europe) > Northern Europe
Language names which were added or changed in English
As new locales reach Basic Coverage, their language names are added for locales targeting modern coverage. This will be happening the week of April 28th.
- tkl: English name changed to Tokelauan.
Core/Extensions
There is a new mechanism for better menu names. When you see a Code with -core or -extension, please read Locale Option Value Names.
đĄđ The link in the Info Panel was not pointing to Locale Option Value Names; that has been fixed. There is also now a Full List of the option names on that page.
Scripts
There are 5 new scripts for Unicode 17. Currently the names are in English: Beria Erfe, Chisoi, Sidetic, Tai Yo, Tolong Siki. Coverage for other languages is at comprehensive.
DateTime formats
Gregorian Calendar (Year First) calendar
đĄđ This is a variant of the Gregorian calendar whose formats always use year-month-day ordering and a 24-hour time cycle. See Year First Calendar for more details.
Note: the code is iso8601, but disregard that; it will be changed after submission.
New flexible format and interval patterns
đĄđ There are some new patterns for you to supply. Make sure that the format is consistent with related patterns.
Note: Some locales have inconsistent patterns using eras: in some patterns using G (AD vs BC in Gregorian) but in related patterns using GGGGG (which is a narrow form: A vs B in Gregorian). The GGGGG is not typically needed except in special cases, such as the Japanese calendar.
New “relative” variant for date-time combining pattern
There is a new “-relative” variant for Date-Time Combined Formats.
Before CLDR 48, there were two variants:
- A “standard” variant for combining date with time, typically without literal text. In English this was “{1}, {0}” and resulted in combined date patterns like “March 20, 3:00 PM”, “March 20, 3:00-5:00 PM”, “tomorrow, 3:00 PM”, “tomorrow, 3:00-5:00 PM”, “in 2 days, 3:00 PM”.
- An “atTime” variant for combining date with a single time (not a range). For longer styles in English this was “{1} ‘at’ {0}” and resulted in combined date patterns like “March 20 at 3:00 PM”, “tomorrow at 3:00 PM”, “2 days ago at 3:00 PM”.
However, in some languages the use of a relative date such as “tomorrow” or “2 days ago” required a different combining pattern than for a fixed date like “March 20”. So in CLDR 48 a new “relative” variant is introduced. This will be used (instead of the “atTime” variant) for the combination of a relative date and a single time.
If you do not supply this, that combination will fall back to using the “standard” variant; in English that would produce “tomorrow, 3:00 PM”. If instead you want the same combining behavior for a relative date with a single time as for a fixed date with a single time (as was the case in CLDR 47 and earlier), then for each length style copy the existing “atTime” form to the new “relative” form.
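The selection and fallback described above can be sketched in a few lines of Python. This is only an illustration, not the Survey Tool's or any library's implementation: the dictionary layout, the function names, and the simplistic handling of quoted literal text are assumptions made for the example, as is the fallback from a missing “atTime” to “standard”. The English “standard” and “atTime” values are the ones quoted above, written with ASCII apostrophes.

```python
# Hypothetical pattern table for one length style. {1} is the date, {0} the time.
ENGLISH_PATTERNS = {
    "standard": "{1}, {0}",
    "atTime": "{1} 'at' {0}",
    "relative": None,  # not supplied by the locale
}

def choose_variant(date_is_relative, time_is_range, patterns):
    """Pick the combining variant, falling back to "standard" when needed."""
    if time_is_range:
        # "atTime" and "relative" apply only to a single time, not a range.
        return "standard"
    wanted = "relative" if date_is_relative else "atTime"
    return wanted if patterns.get(wanted) else "standard"

def combine(date_text, time_text, date_is_relative=False,
            time_is_range=False, patterns=ENGLISH_PATTERNS):
    """Combine already-formatted date and time strings."""
    pattern = patterns[choose_variant(date_is_relative, time_is_range, patterns)]
    # Simplification: treat apostrophes only as markers around literal text.
    pattern = pattern.replace("'", "")
    return pattern.replace("{1}", date_text).replace("{0}", time_text)

print(combine("March 20", "3:00 PM"))                         # March 20 at 3:00 PM
print(combine("tomorrow", "3:00 PM", date_is_relative=True))  # tomorrow, 3:00 PM

# Copying "atTime" into "relative" restores the CLDR 47 behavior.
copied = dict(ENGLISH_PATTERNS, relative=ENGLISH_PATTERNS["atTime"])
print(combine("tomorrow", "3:00 PM", date_is_relative=True, patterns=copied))
# tomorrow at 3:00 PM
```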
Missing date & time patterns
đĄđ Some dates and times are ambiguous due to missing patterns. These additional patterns have been added to resolve this issue:
- GyM and GyMEd
- Time formats: EBh, Eh, EH when there is an existing EBhm, Ehm, and EHm in the respective calendar.
- Changed HH patterns in available and interval formats to include a reference to hour, since seeing an hour number alone is ambiguous.
Timezones, metazones and exemplar cities
New gmtUnknownFormat
Normally time zones formatted using UTC offset (like xxxx) use the gmtFormat pattern (“GMT{0}” in root). The new gmtUnknownFormat is used when formatting time zones using a UTC offset for cases when the offset or zone is unknown. The root value “GMT+?” need not be changed if it works for your locale; however it should be consistent with the gmtFormat and gmtZeroFormat in your locale. See Time Zones and City names.
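To see why the three patterns should look consistent, here is a minimal Python sketch of how a formatter might choose between them. The function name and the use of None for an unknown offset are assumptions for illustration; “GMT{0}” and “GMT+?” are the root values quoted above, while the “GMT” zero format and the +HH:mm offset style are assumed to match root.

```python
from typing import Optional

GMT_FORMAT = "GMT{0}"           # root gmtFormat
GMT_ZERO_FORMAT = "GMT"         # assumed root gmtZeroFormat
GMT_UNKNOWN_FORMAT = "GMT+?"    # root gmtUnknownFormat (new in CLDR 48)

def format_utc_offset(offset_minutes: Optional[int]) -> str:
    """Format a UTC offset in minutes; None stands for an unknown offset or zone."""
    if offset_minutes is None:
        return GMT_UNKNOWN_FORMAT
    if offset_minutes == 0:
        return GMT_ZERO_FORMAT
    sign = "+" if offset_minutes > 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return GMT_FORMAT.replace("{0}", f"{sign}{hours:02d}:{minutes:02d}")

for value in (330, -480, 0, None):
    print(format_utc_offset(value))
# GMT+05:30
# GMT-08:00
# GMT
# GMT+?
```

A user may see all three outputs side by side in a list of time zones, which is why a locale that changes gmtFormat should check that its gmtUnknownFormat still matches.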
“Unknown City” → “Unknown Location”
For zone Etc/Unknown, the exemplarCity name was changed in English from “Unknown City” to “Unknown Location”; other locales should update accordingly.
Changes to the root and/or English names of many exemplar cities and some metazones
Exemplar cities added or changed in English.
This was typically to move towards the official spelling in the country in question, such as retaining accents.
You should check these, but don’t hesitate to retain the older version(
Code | New Value | Code | New Value |
---|---|---|---|
Africa/El_Aaiun | El Aaiún | Antarctica/Casey | Casey Station |
Africa/Lome | Lomé | Antarctica/DumontDUrville | Dumont d’Urville Station |
Africa/Ndjamena | N’Djamena | Antarctica/McMurdo | McMurdo Station |
America/Araguaina | Araguaína | Asia/Aqtau | Aktau |
America/Argentina/Rio_Gallegos | Río Gallegos | Asia/Hovd | Khovd |
America/Argentina/Tucuman | Tucumán | Asia/Qyzylorda | Kyzylorda |
America/Belem | Belém | Asia/Urumqi | Ürümqi |
America/Bogota | Bogotá | Atlantic/Canary | Canarias |
America/Cordoba | Córdoba | Europe/Busingen | Büsingen |
America/Cuiaba | Cuiabá | Europe/Chisinau | Chișinău |
America/Eirunepe | Eirunepé | Europe/Tirane | Tirana |
America/Maceio | Maceió | Indian/Chagos | Chagos Archipelago |
America/Mazatlan | Mazatlán | Indian/Comoro | Comoros |
America/Mexico_City | Ciudad de México | Indian/Kerguelen | Kerguelen Islands |
America/Miquelon | Saint-Pierre | Indian/Mahe | Mahé |
America/Santarem | Santarém | Pacific/Chatham | Chatham Islands |
America/Sao_Paulo | São Paulo | Pacific/Galapagos | Galápagos |
Antarctica/Rothera | Rothera Station | Pacific/Kwajalein | Kwajalein Atoll |
Antarctica/Palmer | Palmer Land | Pacific/Marquesas | Marquesas Islands |
Antarctica/Troll | Troll Station | Pacific/Midway | Midway Atoll |
Antarctica/Syowa | Showa Station | Pacific/Noumea | Nouméa |
Antarctica/Mawson | Mawson Station | Pacific/Pitcairn | Pitcairn Islands |
Antarctica/Vostok | Vostok Station | Pacific/Wallis | Wallis & Futuna |
Metazones:
- Hovd Time changed to Khovd Time
- Qyzylorda Time changed to Kyzylorda Time
Number formats
Currency patterns alphaNextToNumber, noCurrency
- The alphaNextToNumber patterns should be used when the currency symbol is alphabetic, such as “USD”; in this case the pattern may add a space to offset the currency symbol from the numeric value, if the standard pattern does not already include a space.
  - Note that some currency units may only be alphabetic at the start or end, such as CA$ or $CA. This pattern will be used if an alphabetic character would end up being adjacent to a number in the regular pattern. So suppose that the regular pattern is “¤#,##0” and this pattern is “¤ #,##0”: $CA would use this pattern (“$CA 123”), but CA$ would just use the regular pattern to get “CA$123”.
- The noCurrency patterns should be used when the currency amount is to be formatted without a currency symbol, as in a table of values all using the same currency. This pattern must not include the currency symbol pattern character “¤”.
For more information see Number and currency patterns.
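The rule for when alphaNextToNumber applies can be illustrated with a small Python sketch. It is a simplification written for this page, not the algorithm used by any particular library: the function names are made up, only the “#,##0” number part is handled, and a plain space is used in the alphaNextToNumber pattern where a locale might prefer a no-break space.

```python
CURRENCY_SIGN = "\u00A4"  # the pattern character ¤

def pick_pattern(symbol: str, regular: str, alpha_next_to_number: str) -> str:
    """Use alphaNextToNumber only if a letter would touch the number."""
    pos = regular.index(CURRENCY_SIGN)
    first_digit = min(i for i, ch in enumerate(regular) if ch in "#0")
    if pos < first_digit:
        # Symbol comes before the number: its last character is the one at risk.
        touching_char, neighbour = symbol[-1], regular[pos + 1]
    else:
        # Symbol comes after the number: its first character is the one at risk.
        touching_char, neighbour = symbol[0], regular[pos - 1]
    if touching_char.isalpha() and neighbour in "#0":
        return alpha_next_to_number
    return regular

def format_amount(amount: int, symbol: str, regular: str,
                  alpha_next_to_number: str) -> str:
    pattern = pick_pattern(symbol, regular, alpha_next_to_number)
    number = f"{amount:,}"  # stands in for the "#,##0" part only
    return pattern.replace("#,##0", number).replace(CURRENCY_SIGN, symbol)

REGULAR = "\u00A4#,##0"
ALPHA = "\u00A4 #,##0"
print(format_amount(123, "$CA", REGULAR, ALPHA))   # $CA 123
print(format_amount(123, "CA$", REGULAR, ALPHA))   # CA$123
print(format_amount(1234, "USD", REGULAR, ALPHA))  # USD 1,234
```

If the regular pattern already separates the symbol from the digits with a space, the neighbour check fails and the regular pattern is kept, matching the condition stated above.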
Rational formats
These patterns specify the formatting of rational fractions in your language. Rational fractions contain a numerator and denominator, such as ½, and may also have an integer, such as 5½. There are two different “combination patterns”, needed because sometimes fonts don’t properly support fractions (such as displaying 5 1/2), and need two patterns: one with a space and one without. It can be tricky to understand the difference, so be sure to carefully read Rational Formatting before making any changes.
Here are the English values and a short description of their purpose:
Code | Default Value | Description |
---|---|---|
Rational | {0}⁄{1} | The format for a rational fraction with arbitrary numerator and denominator; the English pattern uses the Unicode character “⁄” U+2044 FRACTION SLASH, which causes composition of fractions such as 22⁄7. |
Integer + Rational | {0} {1} | The format for combining an integer with a rational fraction composed using the pattern above; the English pattern uses U+202F NARROW NO-BREAK SPACE (NNBSP) to produce a non-breaking thin space. |
Integer + Rational-superSub | {0}⁠{1} | The format for combining an integer with a rational fraction composed using the pattern above; the English pattern uses U+2060 WORD JOINER, a zero-width no-break space. |
Usage | sometimes | An indication of the extent to which rational fractions are used in the locale; must be either never or sometimes. |
If an integer and fraction (5½) is best expressed in your language with a space between them (5 ½), then copy the pattern from integerAndRationalPattern to integerAndRationalPattern-superSub. However, you cannot do the reverse. Some fonts and rendering systems don’t properly handle the fraction slash, and the user would see something like 51/2 (fifty-one halves). So in that case, implementations must have the integerAndRationalPattern with a space in it to fall back on, unless they have verified that the font / rendering system supports superscripting the numerator.
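As a rough illustration of how the two combination patterns are meant to be used, the following Python sketch picks one of them depending on whether the rendering system is known to compose fractions. The dictionary key rationalPattern, the function name, and the boolean flag are assumptions made for the example; the pattern values are the English ones from the table above.

```python
FRACTION_SLASH = "\u2044"   # composes fractions such as 22⁄7 when supported
NNBSP = "\u202F"            # narrow no-break space
WORD_JOINER = "\u2060"      # zero-width no-break

ENGLISH = {
    "rationalPattern": "{0}" + FRACTION_SLASH + "{1}",   # hypothetical key name
    "integerAndRationalPattern": "{0}" + NNBSP + "{1}",
    "integerAndRationalPattern-superSub": "{0}" + WORD_JOINER + "{1}",
}

def format_mixed(integer, numerator, denominator,
                 font_composes_fractions, patterns=ENGLISH):
    """Format an integer plus fraction, e.g. 5 and 1/2."""
    fraction = patterns["rationalPattern"].format(numerator, denominator)
    if font_composes_fractions:
        # Safe only when the numerator is known to render as a superscript.
        key = "integerAndRationalPattern-superSub"
    else:
        # Fallback with a visible space, so that 5 1⁄2 is not read as 51⁄2.
        key = "integerAndRationalPattern"
    return patterns[key].format(integer, fraction)

print(format_mixed(5, 1, 2, font_composes_fractions=False))
print(format_mixed(5, 1, 2, font_composes_fractions=True))
```

A locale that always wants a space would store the same value under both keys, which is the copy described in the paragraph above.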
Units
Rework certain concentration units
The keys for two units changed (the translations can probably remain the same) and there is one new unit that is used for constructing certain other kinds of concentration units:
- key permillion changed to concentr-part-per-1e6; English values remain “parts per million”, “{0} part per million”, etc.
- key portion-per-1e9 changed to concentr-part-per-1e9; English values remain “parts per billion”, “{0} part per billion”, etc.
- new key part used for constructing arbitrary concentrations such as “parts per 100,000”; English values “parts”, “{0} part”, etc.
Please check over the values for concentr-part-per-1e6 and concentr-part-per-1e9 in your locale.
Some languages had used the equivalent of “millionths” instead of the equivalent of “parts per million”.
For more information see Concentrations.
Many new units in English
Many new units were added in English. The metric ones are used in scientific contexts, and will need to be translated in all languages. However, the case inflections (accusative, dative, etc.) will not be requested.
The units (English names) are:
- angle: steradians
- area: bu [JP], cho [JP], se [JP] (Japanese units)
- duration: fortnights
- concentr: katals
- electric: coulombs, farads, henrys, siemens
- energy: becquerels, British thermal units [IT], calories [IT], grays, sieverts
- force: kilograms-force
- length: chains, rods; jo [JP], ken [JP], ri [JP], rin [JP], shaku [cloth, JP], shaku [JP], sun [JP] (Japanese units)
- magnetic: teslas, webers
- mass: slugs; fun [JP] (Japanese unit)
- temperature: rankines
- volume: metric fluid ounces; cups Imperial, pints Imperial; cup [JP], koku [JP], kosaji [JP], osaji [JP], sai [JP], shaku [volume, JP], to [JP] (Japanese units)
Survey Tool
Once trained and up to speed on Critical reminders (below), log in to the Survey Tool to begin your work.
Survey Tool Changes
- The ability to search in the Survey Tool has been added in CLDR-18423. It supports searching for values, English values, and codes.
- There has been substantial performance work that will show up for the first time. If there are performance issues, please file a ticket with a row URL and an explanation for what happened.
- In the Dashboard, you can filter the messages instead of jumping to the first one. In the Dashboard header, each notification category (such as “Missing” or “Abstained”) has a checkbox determining whether it is shown or hidden.
- In each row of the vetting page, there is now a visible icon when there are forum messages at the right side of the English column:
- 👁️‍🗨️ if there are any open posts
- 💬 if there are posts, but all are closed
- For Units and a few other sections, the Pages have been changed to reduce their size and improve performance.
- Pages may be split, and/or retitled
- Rows may move to a different page.
- The symbols in the A column have been changed to be searchable in browsers (with Find in Page) and stand out more on the page. See below for a table. They override the symbols in Survey Tool Guide: Icons.
See Recent changes for additional recent changes in the Survey Tool.
Important Notes
- Some of the Page reorganization may continue.
New Approve Status Icons
Symbol | Status | Notes |
---|---|---|
✅ | Approved | Enough votes for use in implementations … |
☑️ | Contributed | Enough votes for use in implementations … |
✖️ | Provisional | Not enough votes for implementations … |
❌ | Unconfirmed | Not enough votes for implementations … |
🕳️ | Missing | Completely missing |
⬆️ | Inherited | Used in combination with ✖️ and ❌ |
Enhanced “Show Hidden”
đĄđ If a field contains characters that are invisible, or certain characters that look like others, a special Show Hidden bar will appear below the field to help distinguish them. For example, see Example Hidden; here is a screen-shot.
Note that if you hover over the Show Hidden bar, you’ll see the name of the special character and a short description. Some of the commonly used special characters are listed below, with an example from CLDR.
Symbol | Example | Show Hidden | Name | Description |
---|---|---|---|---|
❰NDASH❱ | {0}–{1} | {0}❰NDASH❱{1} | En dash | Slightly wider than a hyphen; used for ranges of numbers and dates in many languages; for clarity may have ❰TSP❱s around it. |
❰TSP❱ | d – d | d❰TSP❱❰NDASH❱❰TSP❱d | Thin space | A space character that is narrower (in most fonts) than the regular one. |
❰NB❱ | {0}⁠{1} | {0}❰NB❱{1} | No Break | An invisible character that doesn’t allow linebreaks on either side; also limits fraction super/subscripting |
❰NBTSP❱ | h a | h❰NBTSP❱a | No-break thin space | A thin space that disallows linebreaks; equivalent to ❰TSP❱❰NB❱ |
❰NBSP❱ | re call | re❰NBSP❱call | No-break space | A regular space that disallows linebreaks; equivalent to adding ❰NB❱ after a space |
❰NBHY❱ | re‑call | re❰NBHY❱call | No-break hyphen | A regular hyphen that disallows linebreaks; equivalent to -❰NB❱ |
The BIDI controls (❰ALM❱ ❰LRM❱ ❰RLM❱) are used in bidirectional scripts (Arabic, Hebrew, etc.) to control the bidirectional order if needed; typically next to numbers or punctuation.
To see how to input these from the keyboard, and for a key to all the escapes, see Key for Show Hidden.
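For readers who want to check a string outside the Survey Tool, the idea behind the Show Hidden bar can be approximated in Python. This is a rough sketch and not the Survey Tool's code: the function name is made up, and the code points in the table are this example's reading of the escapes listed above (for instance, ❰NB❱ is taken to be U+2060 WORD JOINER).

```python
# Escape names from the table above, keyed by the code points assumed for them.
ESCAPES = {
    "\u2013": "NDASH",   # en dash
    "\u2009": "TSP",     # thin space
    "\u2060": "NB",      # word joiner (no break)
    "\u202F": "NBTSP",   # narrow no-break space
    "\u00A0": "NBSP",    # no-break space
    "\u2011": "NBHY",    # non-breaking hyphen
    "\u061C": "ALM",     # Arabic letter mark
    "\u200E": "LRM",     # left-to-right mark
    "\u200F": "RLM",     # right-to-left mark
}

def show_hidden(text: str) -> str:
    """Replace hard-to-see characters with ❰NAME❱ escapes."""
    return "".join(
        f"\u2770{ESCAPES[ch]}\u2771" if ch in ESCAPES else ch
        for ch in text
    )

print(show_hidden("d\u2009\u2013\u2009d"))  # d❰TSP❱❰NDASH❱❰TSP❱d
print(show_hidden("h\u202Fa"))              # h❰NBTSP❱a
```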
Known Issues
Last updated: 2025-05-17
This list will be updated as fixes are made available in Survey Tool Production. If you find a problem, please file a ticket, but please review this list first to avoid creating duplicate tickets.
- CLDR-18577 - If your language does not have a variant value, you can vote for inheritance from the standard version.
- CLDR-17829 - some links in the Info panel not displaying properly
- CLDR-13477 - Images for the plain symbols. Non-emoji such as €, √, », ¹, §, … do not have images in the Info Panel. Workaround: Look at the Code column; unlike the new emoji, your browser should display them there.
- CLDR-17683 - Some items are not able to be flagged for TC review. This is being investigated. Meanwhile, please enter forum posts with any comments.
- CLDR-18637 - Some example pop-ups are showing “undefined” instead of the expected example
- CLDR-18607 - Unable to download current votes in CSV
- CLDR-18615 - Unclear error message if a link sends you to a page that no longer exists in the Survey Tool
- CLDR-18627 - Some locale display names at comprehensive are not available in all locales
Resolved Issues
Last updated: 2025-05-17
- đĄđ CLDR-18605 - Fix issue blocking import of winning votes from the previous cycle
- đĄđ CLDR-18649 - Same as root is now a warning if English is the same as root as well
- CLDR-18513 - Redirect from read-only locale to the default content locale does not work
- CLDR-17694 - Back button in browser fails in forum under certain conditions
- CLDR-17658 - Dashboard slowness
Recent Changes
- CLDR-17658 - In the Dashboard, the Abstains items will only have one entry per page. You can use that entry to go to its page, and then fix Abstains on that page. Once you are done on that page, hit the Dashboard refresh button (↺). This fixes a performance problem for people with a large number of Abstains, and reduces clutter in the Dashboard.
CLDR training (for new linguists)
Before you start contributing data to CLDR and using the Survey Tool, it is important that you understand the CLDR process and take the CLDR training. It takes about 2-3 hours to complete the training.
- To understand the basics of the CLDR process, read the Survey Tool Guide and an overview of the Survey Tool Stages.
- New: A video is available which shows how to log in and begin contributing data for your locale.
- Read the Getting Started topics on the Information Hub:
  - If you (individual or your organization) have not established a connection with the CLDR technical committee, start with Survey Tool Accounts.
Critical reminders (for all linguists)
You’re already familiar with the CLDR process, but do keep the following in mind:
- Aim at commonly used language - CLDR should reflect common-usage standards, not academic/official standards (unless commonly followed). Keep that perspective in mind.
- Carefully consider changes to existing standards - any change to a value from a previous CLDR release (blue star) should be carefully considered and discussed with your fellow linguists in the CLDR Forum. Remember your change will be reflected across thousands of online products, and potentially almost all online users of your language.
- Keep consistency across logical groups - ensure that all related entries are consistent. If you change the name of a weekday, make sure it’s reflected across all related items. Check that the order of month and day is consistent in all the date formats, etc.
- Tip: The Reports are a great way to validate consistency across related logical groups, e.g. translations of date formats. Use them to proofread your work for consistency.
- Avoid voting for English - for items that do not work in your language, don’t simply use English. Find a solution that works for your language. For example, if your language doesn’t have a concept of calendar “quarters”, use a translation that describes the concept “three-month period” rather than “quarter-of-a-year”.
- Watch out for complex sections and read the instructions carefully if in doubt:
Tip: The links in the Info Panel will point you to relevant instructions for the entry you’re editing/vetting. Use them if in doubt.