Emoji Names and Keywords

CLDR collects short character names and keywords for emoji characters and sequences.

These are found in the Survey Tool under Characters, where they are divided into category types such as Smileys, People, and Animal & Nature.

image

No Font support

If you see a “tofu” box in the Code column (as shown in the screenshot below), it may be that your system doesn’t have font support to display the emoji images.

image

However, an emoji image should still show up in the right-hand navigation bar, as below.

image

Constructed

Many emoji names are constructed: implementations combine multiple parts to build the names of other emoji. The emoji and terms that can be used as parts are found under Characters in the Survey Tool, in Component, People, and Category. Please review these carefully!

  1. Characters\Component contains special emoji whose names are used for emoji with hair colors and skin tones.
    1. Carefully check all of these using the examples provided in the right pane when you hover over each translated term (in either the Winning or Others column).

    image

    image

    2. Correct meaningless translations. In some languages we have seen older terms for the skin tones that won’t mean anything to users. Use understandable terms like “light skin” instead of numbered levels like “peau 1”.
    3. CLDR doesn’t have gender agreement for nouns, so please choose the grammatical forms that work best. For example, in some languages the adjective for “light skinned” would need to agree with the noun (man or woman). It may work to use noun phrases instead, e.g. “light skin” or “bald head”.
  2. Characters\People contains three values which are used to construct emoji.
    1. All of these have examples marked by an ⓔ in the English Column.
    2. Hover over the ⓔ to see how some sample constructed emoji would look in English. image
  3. Characters\Category contains terms like “flag” (used in constructing flag names). These 3 terms are also marked with ⓔ, so make sure to review each of the examples in English and in your language.
  4. Blond/Bearded. The people with blond hair or beards need to have names consistent with those used for hair styles (see dark skin tone examples), such as:
    1. 🧔 — man: beard
    2. 👱 — person: blond hair
    3. 👱‍♂️ — man: blond hair
    4. 👱‍♀️ — woman: blond hair
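
The blond/bearded examples above all follow a “base: component” shape. As a rough illustration only, the sketch below joins a base term with its components in that shape. The function name `construct_name` and the separators (": " and ", ") are assumptions read off the English examples; they are not taken from CLDR data, and other languages may use different patterns.

```python
def construct_name(base: str, components: list[str]) -> str:
    """Join a base term (e.g. "man") with component terms (e.g. "blond hair").

    Assumption: English-style separators, as seen in "man: blond hair".
    """
    if not components:
        return base
    return f"{base}: {', '.join(components)}"


# Names built from the parts a translator reviews under Component and People.
examples = [
    construct_name("man", ["beard"]),
    construct_name("person", ["blond hair"]),
    construct_name("woman", ["dark skin tone", "blond hair"]),
]
for name in examples:
    print(name)
```

Because every constructed name reuses the same component terms, one unclear translation (for example a skin tone term) shows up in every name built from it.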

FAQ Tips for character names and keywords

In CLDR we provide descriptive but short names for the characters across languages. (For reference, see Background: Unicode Std. vs CLDR names.)

Goals for the short names collected in CLDR are:

   
Unique: Be unique among emoji names for that language. (You will receive an error in the Survey Tool if the name is already in use for another emoji.)
Short: Be short (as much as possible), both written and spoken.
Descriptive: Be descriptive of the prevailing color images. Don’t be “over-descriptive”, however. The descriptions only have to be enough to distinguish each image from the others; they shouldn’t have any details beyond that.
Consistent: Be consistent across images with similar features. (Don’t call 📫 a mailbox and 📬 a post box.)
Flexible: It is not a goal to be immutable. In future versions of CLDR, you can improve names by making new suggestions if more appropriate names are available.
Not Literal: Names should not just be literal translations of the English; they should be based on associations to the image in your language. But you can be informed by the English names. (An exception is when there is no equivalent in your language; see Emoji specific to a country/region.)

When voting on the emoji names and keywords:

Unique Names

The names must be unique. If you try to give two different emoji the same name, you will get an error, as in the image below, where the same name is given to “tiger” and “leopard”.

Tips on how to handle unique names:

image image

The following are other common problem cases that must be distinguished. Note that punctuation and uppercase vs. lowercase distinctions are discarded when testing for uniqueness!

   
🌀 🌪 Cyclone vs Tornado
🌒 🌓 🌔 🌖 🌗 🌘 🌙 Different phases of the moon
🎵 🎶 Different numbers of notes
♏ 🦂 Symbol vs animal
💃 🕺 Woman vs man
😡 😬 Pouting vs grimacing
🚽 🚻 🚾 Toilet vs restroom vs WC (water closet)
🆙 🔼 “Up!” button vs upwards button.
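
To see why names that differ only in punctuation or case still collide, here is a minimal sketch of such a uniqueness test. The exact normalization used by the Survey Tool is not described here, so `uniqueness_key` (case folding plus removal of Unicode punctuation) is an assumption, and the sample names are hypothetical.

```python
import unicodedata
from collections import defaultdict


def uniqueness_key(name: str) -> str:
    """Fold case and drop punctuation so near-identical names compare equal."""
    folded = name.casefold()
    kept = [ch for ch in folded if not unicodedata.category(ch).startswith("P")]
    # Collapse any whitespace left behind by the removed punctuation.
    return " ".join("".join(kept).split())


def find_collisions(names: dict[str, str]) -> dict[str, list[str]]:
    """Map each shared key to the emoji whose names reduce to it."""
    by_key = defaultdict(list)
    for emoji, name in names.items():
        by_key[uniqueness_key(name)].append(emoji)
    return {key: emojis for key, emojis in by_key.items() if len(emojis) > 1}


# Hypothetical names: the first two differ only in case and punctuation.
sample = {"🆙": "Up! button", "🔼": "up button", "🌀": "cyclone", "🌪": "tornado"}
collisions = find_collisions(sample)
print(collisions)
```

In this sample the two button names reduce to the same key, so one of them would need a different wording, such as “upwards button”.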

Gender

There are different ways emoji may have gender.

For the full triples, we need three unique names:

In some languages it may be tricky to do this, especially for the neutral case.

Gender-neutral forms

Be sure to review the way in which this has been done for other cases in your language (and related languages) so that you are as consistent as possible.

Character Keywords

Keywords are one or more words or short phrases that can be used to search for the character in your language.

When picking keywords, remember the following goals:

   
Not Unique: Unlike the short names, keywords do not have to be unique across different emoji. For example, in English “mailbox” is a keyword associated with each of 📫 📪 📬 📭.
Not Short: Unlike the short names, the keywords need not be as short as possible. Don’t go overboard in length, however!
Number: Each keyword should help narrow down the choices among emoji. The goal is to provide the best set of keywords that are relevant to the emoji; be mindful of the number of keywords and try to keep it below 5. (You will start to see a warning on keywords beyond 6.) Consumers of CLDR data typically generate synonyms and forms of the keywords that are supplied, so you shouldn’t provide variant forms of keywords. For human emoji (such as “student”), the gender-inclusive form will automatically be fleshed out by adding the keywords of the male and female variants. Here are some tips on how to be mindful of the number of keywords:
  • Don’t add grammatical variants or case inflections: pick one of {“walks”, “walking”}; pick one of {sake, saké}.
  • Don’t add multiple forms with different gender variants: pick a representative form.
  • Don’t add keywords to the gender-neutral form that are already supplied for the male or female form.
  • Don’t add emoji names (these will be added automatically).
  • Don’t add keywords that start with the same word as the emoji name. For example, for the “umbrella with rain drops” emoji, the candidate keywords are {clothing | drop | rain | umbrella | umbrella with rain drops}, but you do not need {umbrella} or {umbrella with rain drops}. Thus, the best set of keywords is {clothing | drop | rain}.
Please follow these guidelines even if the source English does not.
Separation: A “ | ” (pipe) character is used to separate each keyword when there are multiple keywords for an emoji (see screenshot below). When you want to append another keyword, click the + to add, copy the Winning value, and append “ | ” before adding your new keyword.
Ordering: The ordering of keywords is not significant. Do not add new suggestions only to change the ordering of the keywords. Keywords are kept in “alphabetical order” (default Unicode ordering), which will override any ordering you provide in your suggestion.
Not Literal: Like the short names, the keywords should not just be literal translations of the English; they should be based on associations to the image in your language. There does not need to be the same number of keywords in your language as there are in English! But you can be informed by the English names.
Not offensive: Do not add keywords for an emoji where the association of the two could be offensive to many people. For example, do not use “superstition” as a keyword for emoji related to religion, or add a keyword for a particular nation to the goblin emoji (👺).
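
The Number, Separation, and Ordering goals can be sketched as a small clean-up step. This is an illustration only, not the Survey Tool's code: the function name `tidy_keywords` is hypothetical, the rule for dropping keywords is a simplified reading of the “umbrella with rain drops” tip, and Python's default string sort stands in for “default Unicode ordering”.

```python
def tidy_keywords(value: str, emoji_name: str) -> str:
    """Return a pipe-separated keyword list with redundant entries removed.

    Assumptions: emoji_name is non-empty, and a keyword is redundant when its
    first word matches the first word of the emoji name (this also drops the
    emoji name itself).
    """
    first_word = emoji_name.split()[0].casefold()
    kept = set()
    for raw in value.split("|"):
        keyword = raw.strip()
        if not keyword:
            continue  # ignore empty entries such as a trailing pipe
        if keyword.casefold().split()[0] == first_word:
            continue  # repeats the start of the emoji name
        kept.add(keyword)
    # Sorting by code point makes the order of the input irrelevant.
    return " | ".join(sorted(kept))


suggestion = "clothing | drop | rain | umbrella | umbrella with rain drops"
print(tidy_keywords(suggestion, "umbrella with rain drops"))
```

Since the result is always sorted, two suggestions that differ only in keyword order produce the same value, which is why reordering alone is not worth a new suggestion.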

Keywords vote calculation

Emoji keywords are unique in the Survey Tool voting experience because a single vote contains multiple items (words). The following enhancements to the Survey Tool were introduced starting in v34 to make keyword voting more efficient.

  1. Keyword voting: The calculation of the winning set of keywords is now different. Previously, if you had the following choices, #1 would win. Now, the fact that #2 is a subset of #3 gives it a larger weight in voting, and #2 will win.
    1. {small} : 4 votes
    2. {big | large} : 3 votes
    3. {big | large | grand} : 3 votes
  2. Keyword de-duplication: If one keyword phrase is covered by other keyword phrases, then it will be removed. For example, the set {big | bad | wolf | big bad wolf} ⇒ {bad | big | wolf}. This will happen automatically as you enter values.
    1. Note that the items in the set are also automatically alphabetized: {big | bad | wolf} ⇒ {bad | big | wolf}
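
The two behaviors in the list above can be sketched as follows. The actual Survey Tool algorithm is not given here, so both rules are assumptions chosen to reproduce the examples: `pick_winner` adds to each candidate the votes cast for any superset of it, and `dedupe` drops a phrase when all of its words already occur in shorter keywords. Both function names are hypothetical.

```python
def pick_winner(votes: dict[frozenset[str], int]) -> frozenset[str]:
    """Pick the keyword set with the highest weight.

    Assumed weighting: a candidate also counts the votes of every candidate
    that contains it, since those voters endorsed all of its keywords.
    """
    def weight(candidate: frozenset[str]) -> int:
        return sum(n for other, n in votes.items() if candidate <= other)

    return max(votes, key=weight)


def dedupe(keywords: list[str]) -> list[str]:
    """Remove phrases covered by shorter keywords, then alphabetize."""
    unique = sorted(set(keywords))
    result = []
    for keyword in unique:
        words = keyword.split()
        # Words contributed by keywords that have fewer words than this one.
        shorter_words = {
            w for other in unique if len(other.split()) < len(words)
            for w in other.split()
        }
        if all(w in shorter_words for w in words):
            continue  # every word is already searchable on its own
        result.append(keyword)
    return result


votes = {
    frozenset({"small"}): 4,
    frozenset({"big", "large"}): 3,
    frozenset({"big", "large", "grand"}): 3,
}
print(sorted(pick_winner(votes)))
print(dedupe(["big", "bad", "wolf", "big bad wolf"]))
```

Under this assumed weighting, {big | large} counts 3 + 3 = 6 votes against 4 for {small}, which matches the outcome described in item 1.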

      Keywords in Survey Tool view

image

Background: Unicode Std. vs CLDR names

The Unicode Name character property is part of the Unicode Standard. Names are immutable, unique IDs over all Unicode characters, and are limited to uppercase ASCII letters, digits, space, and hyphen. Their main purpose is to serve as unique identifiers, so they may not be particularly descriptive or short. Example: ɞ U+025E LATIN SMALL LETTER CLOSED REVERSED OPEN E. The names are only available in English and often do not apply well to the prevailing practice for emoji images; because the Unicode Name is immutable, it cannot change to reflect that practice.

The Unicode Name may inform the English CLDR short names, and that is what we start with (for English) unless one of the other factors comes into play. CLDR short names are not limited to ASCII or uppercase, even in English.
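
For comparison with the CLDR short names, the Unicode Name of a character can be looked up with Python's standard `unicodedata` module. This is a small sketch; the helper name `unicode_names` is hypothetical, and the regular expression only restates the character set described above.

```python
import re
import unicodedata

# Allowed characters in a Unicode Name: uppercase ASCII letters, digits,
# space, and hyphen.
NAME_PATTERN = re.compile(r"[A-Z0-9 -]+")


def unicode_names(chars: list[str]) -> dict[str, str]:
    """Return the Unicode Name for each single character in chars."""
    return {ch: unicodedata.name(ch) for ch in chars}


# U+025E, plus the dog, dog face, and wolf emoji.
names = unicode_names(["\u025E", "\U0001F415", "\U0001F436", "\U0001F43A"])
for ch, name in names.items():
    assert NAME_PATTERN.fullmatch(name)
    print(f"U+{ord(ch):04X} {name}")
```

The output shows the uppercase, English-only identifiers that CLDR short names are meant to improve on for end users.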

Animal Faces

For some animals, there are two different emoji, one of which has a name including the word “face”: for example, 🐕 U+1F415 dog and 🐶 U+1F436 dog face. In these cases, the use of “face” in the name is important for distinguishing the two emoji, and the name in your language should include an indication that it is the face rather than a whole or partial body.

For other animals, there is no such distinction. For example, there is only one wolf: 🐺 U+1F43A. In that case, you don’t need to use a term corresponding to “face” in your language, even if the English name has the word “face” (that is often due to historical accident).