Language Support Levels

People often ask whether some device or application supports their language. This seems like a simple question: yes or no. But the reality is that there are different levels of support for a language, ranging from allowing the user to read their language on the platform all the way up to having a voice assistant in their language.


This page defines a common set of terminology for language support levels for platforms such as operating systems, browsers, etc. The goal is to have consistent terminology so that people can clearly indicate the level of support for a given language. The focus here is on the incremental changes necessary to add a language to a platform that is already enabled for Unicode.


The following support levels are defined:

Language Support Levels

Notes:

Applications

These levels also apply to individual applications, with some adjustments. For example, applications can typically support only languages that are Selectable on the platform. But once that level is reached, an application can often support a language at a higher level than the platform can.

Good applications and services will often allow a choice of different languages. For example, a word processor can offer a menu for picking the language of selected text, so that the text is handled according to that language in operations such as spell-checking. A spreadsheet can format dates in one column in French, and in a second column in Russian. (This choice of language is independent of the language of the UI for the application or service.)


CLDR Coverage Levels


CLDR provides different coverage levels, referenced by the above table.

Basic coverage — Selection

The most basic language support requires Unicode characters for writing that language, the Unicode properties and algorithms for those characters that let them work on computers and phones (such as line-wrapping), the fonts used to display those characters, and the keyboard layouts needed to enter those characters.

That is, the text is interchangeable across devices, it displays as expected on any device (given fonts), and users can enter and edit text in that language.

Moderate coverage — Minimal i18n support

For content language support, the user sees correct processing for that language, with sorting, matching, display and entry of dates, times, numbers, currencies, and so on. This requires data and algorithms that support these features, so that a phone or other device knows to display a date as “Freitag, 13. Januar 2012” or “Παρασκευή, 13 Ιανουαρίου 2012”.
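As a rough illustration of content-language processing, the sketch below formats the date from the examples above and sorts a few German words. It assumes a JavaScript runtime whose standard `Intl` API is backed by full locale data (for example, a recent Node.js build); the helper names `formatFullDate` and `sortForLocale` are made up for this example and are not part of CLDR or ICU.

```typescript
// Friday, 13 January 2012, pinned to UTC so the local time zone cannot shift the day.
const sampleDate = new Date(Date.UTC(2012, 0, 13));

// Hypothetical helper: format a date in long form using the runtime's locale data.
function formatFullDate(locale: string, date: Date): string {
  return new Intl.DateTimeFormat(locale, {
    weekday: "long",
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  }).format(date);
}

// Hypothetical helper: sort a copy of the list with locale-aware collation.
function sortForLocale(locale: string, words: string[]): string[] {
  return words.slice().sort(new Intl.Collator(locale).compare);
}

console.log(formatFullDate("de", sampleDate)); // Freitag, 13. Januar 2012
console.log(formatFullDate("el", sampleDate)); // Greek; punctuation may vary with the data version

// "\u00C4pfel" is "Äpfel" written with a precomposed A-umlaut.
const words = ["Zebra", "\u00C4pfel", "Apfel"];
console.log(words.slice().sort());       // code-unit order: Apfel, Zebra, Äpfel
console.log(sortForLocale("de", words)); // German order: Apfel, Äpfel, Zebra
```

The default sort compares raw code units, which places "Äpfel" after "Zebra"; the locale-aware collator gives the order a German reader would expect.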

Modern coverage — Full i18n support

For UI-language support, the language is also supported in the user interface of an application, web page, or OS: all of the menus, dialogs, help-text, and so on are in the user’s language. The user doesn’t need to know another language in order to use the application, web page, or OS.

Of course, more sophisticated programs will further layer on top of UI or content language support to offer capabilities that depend on language: complex searching algorithms will identify entities in the text, and allow for matching that takes into account linguistic synonyms and inflections; text-to-speech and speech-to-text capabilities allow the user to easily interact with a device, and so on. These tend to require very sophisticated machine-learning models, typically based on massive amounts of data.

In practice

Much of the world is multilingual, and people are often fluent enough in a second language to be able to use that as their UI language. But they still want and need to be able to use their language as a content language. Take Yoruba, for example, with ca. 30 million speakers. A Yoruba speaker may be able to use English as their UI language, but still needs to be able to write emails in Yoruba, compose documents in Yoruba, and create a spreadsheet in Yoruba. To help with language preservation, content language support on digital devices is an important step above basic language support.

The Unicode Consortium enables vendors to support additional content languages by providing the characters those languages need, the properties and algorithms for those characters that let them work on computers and phones (such as line breaking), and core linguistic support (sorting and matching; keyboard layouts; entering and displaying dates, times, numbers, currencies, measurements, country names, and so on). The consortium doesn't supply fonts, but those are available from other sources, such as Google's Noto fonts project (free, under the SIL Open Font License).

The consortium also provides some UI-language support through CLDR and ICU, but most of the work necessary to support a given language as a UI language depends on the app or OS provider doing the necessary translations.

Example: Cherokee

For basic language support, characters for Cherokee had to be added to Unicode, since it doesn’t use the Latin characters (A, B, C, ...). The Unicode properties had to be supplied, so that the standard Unicode algorithms would work for text comparison, line-wrap, word selection, and so on. Fonts and keyboard layouts for Cherokee are available, but might require an additional installation step if not already on the OS. The content language support for dates, times, and other features is provided in CLDR.
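To make the role of character properties concrete, here is a minimal sketch that uses the Unicode Script property to tell Cherokee syllables apart from Latin letters that look similar. It assumes a runtime that supports Unicode property escapes in regular expressions; `isAllCherokee` is a hypothetical helper name.

```typescript
// Matches text made up only of code points whose Script property is Cherokee.
// The pattern is built from a string so it is interpreted at run time.
const cherokeeOnly = new RegExp("^\\p{Script=Cherokee}+$", "u");

// Hypothetical helper: true when the text is non-empty and entirely Cherokee script.
function isAllCherokee(text: string): boolean {
  return cherokeeOnly.test(text);
}

// U+13E3 (TSA), U+13B3 (LA), U+13A9 (GI): the word "Cherokee" written in Cherokee.
console.log(isAllCherokee("\u13E3\u13B3\u13A9")); // true
// Latin capitals that resemble those syllables belong to a different script.
console.log(isAllCherokee("CWY")); // false
```

Without the property data, software would have no way to distinguish the two strings beyond their raw code point values.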

Developers can produce apps that support Cherokee as a content language using the Unicode ICU libraries. These libraries supply the code to handle the necessary Unicode characters, properties, and algorithms, along with the CLDR content-language data.
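An application can check at run time whether Cherokee content-language data is present before relying on it. The sketch below uses the standard JavaScript `Intl` API rather than calling ICU directly; whether the Cherokee locale (`chr`) is available depends on the locale data the runtime ships, and `formatCherokeeDate` is a name invented for this example.

```typescript
// Hypothetical helper: format a date for Cherokee ("chr") when the runtime has
// that locale data; return undefined so the caller can pick a fallback otherwise.
function formatCherokeeDate(date: Date): string | undefined {
  const supported = Intl.DateTimeFormat.supportedLocalesOf(["chr"]);
  if (supported.length === 0) {
    return undefined;
  }
  return new Intl.DateTimeFormat("chr", {
    weekday: "long",
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  }).format(date);
}

const formatted = formatCherokeeDate(new Date(Date.UTC(2012, 0, 13)));
console.log(formatted !== undefined ? formatted : "Cherokee locale data is not available here");
```

Checking support first avoids silently formatting Cherokee content with another language's conventions.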

Systems that support the most up-to-date Unicode ICU libraries (like Android or the Mac) should see Cherokee as a choice among languages. For example, on a Mac someone can pick a Cherokee keyboard and type Cherokee characters in Gmail.

Language selection UI

Below are examples of selecting Cherokee on different systems. (Cherokee in Cherokee looks like "CWY", and is midway down on each.)

Android

Macintosh

Windows

Cherokee is not typically a UI language for the OS, meaning the system isn't translated into it. So in practice a user must also select an alternative language such as English that will appear in the UI for any applications that don't support Cherokee.
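The fallback described here can be sketched as a simple lookup: walk the user's ordered language preferences and take the first one that the application is translated into. The function name `pickUiLanguage` and the language lists are illustrative assumptions, and the exact-match comparison is a simplification; real systems usually also match regional variants and related languages.

```typescript
// Hypothetical helper: return the first preferred language that the
// application is translated into, or undefined if none matches.
function pickUiLanguage(preferred: string[], translated: string[]): string | undefined {
  for (const lang of preferred) {
    if (translated.indexOf(lang) !== -1) {
      return lang;
    }
  }
  return undefined;
}

// The user prefers Cherokee, then English; this app has no Cherokee translation.
console.log(pickUiLanguage(["chr", "en"], ["en", "fr", "de"])); // en
// An app that is translated into Cherokee gets the user's first choice.
console.log(pickUiLanguage(["chr", "en"], ["chr", "en"])); // chr
```

The content language is unaffected by this choice: the user can still write and format Cherokee text while the menus appear in English.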