Character Names and Keywords (Emoji)

 
These are found in the Survey Tool under Characters, and they are divided into different category types, for example Smileys, People, and Animal & Nature.

If you see a "tofu" box in the Code column (as shown in the screenshot below), it may be that your system doesn't have font support to display the emoji images.

 
However, an emoji image should show up in the right-hand navigation bar, as below.


It is often useful to view related languages to see how the emoji are handled. You can see the results from the last release at http://www.unicode.org/cldr/charts/latest/annotations/index.html , or open the Survey Tool in another window and pick another language. You can also find information about other images by doing the following:
  1. Go to the page http://unicode.org/emoji/charts/full-emoji-list.html (you can leave this window open while you work on the emoji section).
  2. Open the "Find in page" menu (typically control-F or command-F), and paste in the character you can't see (e.g., 😅) or search for the English name (e.g., "star-struck").
  3. You should see the images for the emoji, as below.
 

FAQ Tips for character names and keywords

  • Buttonized category: For example, for the "Up! button" 🆙 or Japanese "reserved" button 🈯, use the wording "button" in your language rather than referring to them as Ideogram or Ideograph.
  • Emoji specific to a country/region: For example, we have the Japanese symbol for beginner 🔰. If this symbol represents something else in your locale, you should adjust the name to reflect that. Otherwise you could refer to it as "shoshinsha mark" to make the origin clear.
  • Sensitivities in your locale. For example:
    • where alcohol is prohibited: Names and keywords of emoji that could be associated with alcohol should be handled with sensitivity to your locale. You could describe the emoji without reference to alcohol. For example, "wine glass" 🍷 may be associated with sparkling drinks rather than wine.
    • where gambling is prohibited: Names and keywords of emoji associated with gambling (for example 🎲, ♣️, 🎰) should be handled with sensitivity to your locale. For example, describe them as "dice" or "card", with no association to gambling in names or keywords.
  • Avoid using words that may describe one's opinion or sentiments. Use unbiased descriptive words. (For example, do not use "faith" in relation to a religious emoji.)

Short Character Names

In CLDR we provide descriptive but short names for the characters across languages. (For reference, see Background: Unicode Std. vs CLDR names.)

 

Goals for the short names collected in CLDR are:

  • Be unique among emoji names for that language. (You will receive an error in the survey tool if the name is already in use for another emoji.)
  • Be short (as much as possible), both written and spoken
  • Be descriptive of the prevailing color images. The descriptions only have to be enough to distinguish each image; they don't have to have any details beyond that.
  • Be consistent across images with similar features. (Don’t call 📫 a mailbox and 📬 a post box).
  • It is not a goal to be immutable; in future versions of CLDR, you can improve names by making new suggestions if more appropriate names are available.
When voting on the emoji names and keywords:
  • Follow the middle of sentence rule. See Capitalization guideline.
  • As usual, the names in “en” are American English; where necessary those are customized for “en-GB”. For differences between sub-locales, see Regional Variant guideline.
  • The names in other languages do not have to follow the English name; the descriptions of the colored images may vary by culture. But translators may be informed by the English names.

Unique Names

The names must be unique. If you try to give two different emoji the same name, you will get an error, as in the image below, where the same name is given to "tiger" and "leopard".

Tips on how to handle unique names:
  • For animals and plants, it often helps to check the background of the English name, such as Tiger and Leopard, or look at languages related to yours to see how they are handled. For example, some Nordic languages don't distinguish Octopus and Squid, so people have had to come up with alternative translations.
  • To handle name conflicts between a Zodiac symbol and an animal/bug (Scorpius Zodiac vs Scorpion), try one of the following:
    • Add "zodiac" to all the zodiac signs
    • Add "zodiac" only to the conflicting item (e.g. Scorpius Zodiac)
    • Add a term to qualify the animal/bug
  • Unique name conflicts will be different for every language; however, you could look at how other languages have handled similar conflicts and get ideas on how to get around unique name errors.

Other common problem cases that must be distinguished are listed below. NOTE that punctuation and uppercase vs lowercase distinctions are discarded when testing for uniqueness!

🌀 🌪 Cyclone vs Tornado
🌒 🌓 🌔 🌖 🌗 🌘 🌙 Different phases of the moon
🎵 🎶 Different numbers of notes
♏  🦂 Symbol vs animal
💃 🕺 Woman vs man
😡 😬 Pouting vs grimacing    
🚽  🚻  🚾 Toilet vs restroom vs WC (water closet)    
🆙  🔼  "Up!" button vs upwards button.
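
The uniqueness rule can be illustrated with a minimal sketch. It assumes that the comparison lowercases each name and drops punctuation; the Survey Tool's actual normalization is not described here and may differ. The function names `normalize_name` and `find_name_conflicts`, as well as the candidate names, are hypothetical.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Fold case and drop punctuation (an assumed approximation of the check)."""
    folded = name.casefold()
    # Keep only characters whose Unicode general category is not punctuation (P*).
    kept = "".join(
        ch for ch in folded if not unicodedata.category(ch).startswith("P")
    )
    # Collapse any runs of whitespace left behind by removed punctuation.
    return " ".join(kept.split())


def find_name_conflicts(names):
    """Return the normalized names that are shared by more than one emoji."""
    groups = defaultdict(list)
    for emoji, name in names.items():
        groups[normalize_name(name)].append(emoji)
    return {key: emojis for key, emojis in groups.items() if len(emojis) > 1}


# Hypothetical candidate names, for illustration only.
candidates = {
    "🆙": "Up! button",
    "🔼": "up button",
    "🌀": "cyclone",
    "🌪": "tornado",
}
conflicts = find_name_conflicts(candidates)
print(conflicts)
```

In this sketch the two button names collide because they differ only in case and punctuation, while "cyclone" and "tornado" remain distinct.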

Gender

There are different ways emoji may have gender.
In some languages it may be tricky to do this, especially for the neutral case. Be sure to review the way in which this has been done for other cases in your language (and related languages) so that you are consistent. For example, you don't want the equivalent of "male firefighter" but "man police officer", or have "policeman" but "female police officer".  

Character Keywords

Note: Keywords have been moved to Comprehensive coverage in the Survey Tool and are not visible under Modern coverage. This change was made to help prioritize the workload, and keywords are planned to be moved to Modern in a future contribution cycle.

Keywords are one or more words or short phrases that can be used to search for the character in your language.
When picking keywords, consider:
  • Unlike the short names, keywords do not have to be unique across different emoji. For example, in English "mailbox" is a keyword associated with any of (📫 📪 📬 📭).
  • Unlike the short names, the keywords do not have to be as short as possible.
  • Each keyword should help narrow down the choices among emoji, so provide as many keywords as are relevant to the emoji.
  • A " | " (pipe) character is used to separate each keyword when there are multiple keywords for an emoji (see screenshot below). When you want to append another keyword, click the + to add, copy the Winning value, and append " | " before adding your new keyword.
  • The ordering of keywords is not significant. Do not add new suggestions only to change the ordering of the keywords. Keywords are ordered per Unicode ordering, which will override any ordering you provide in your suggestion.
  • Like the short names, the keywords should not just be literal translations of the English; the keywords should be based on associations to the image in your language.
  • There does not need to be the same number of keywords in your language as there are in English.
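
The pipe-separated format, and the fact that ordering is not significant, can be sketched as follows. This is only an illustration: the helper names `parse_keywords`, `append_keyword`, and `same_keywords` are hypothetical, and the keyword values are examples rather than actual CLDR data.

```python
def parse_keywords(value):
    """Split a ' | '-separated keyword string into trimmed, non-empty keywords."""
    return [part.strip() for part in value.split("|") if part.strip()]


def append_keyword(winning_value, new_keyword):
    """Append a keyword to an existing value, skipping exact duplicates."""
    keywords = parse_keywords(winning_value)
    new_keyword = new_keyword.strip()
    if new_keyword and new_keyword not in keywords:
        keywords.append(new_keyword)
    return " | ".join(keywords)


def same_keywords(a, b):
    """Two values are equivalent when they contain the same set of keywords."""
    return set(parse_keywords(a)) == set(parse_keywords(b))


winning = "mail | mailbox"
suggestion = append_keyword(winning, "postbox")
print(suggestion)                                # mail | mailbox | postbox
print(same_keywords("mailbox | mail", winning))  # True
```

Comparing keyword values as sets mirrors the guidance above: a suggestion that only reorders existing keywords adds nothing new.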

Background: Unicode Std. vs CLDR names

The Unicode Name character property is part of the Unicode Standard. The names are immutable, unique IDs over all Unicode characters, and are limited to uppercase ASCII letters, digits, spaces, and hyphens. The names often do not apply well to the prevailing practice for emoji images, and are only available in English. Their main purpose is to serve as unique identifiers, so they may not be particularly descriptive or short. Example: ɞ  U+025E LATIN SMALL LETTER CLOSED REVERSED OPEN E. Because the Unicode Name is immutable, it cannot be changed to better match prevailing practice.

The Unicode Name may inform the English CLDR short names, and that is what we start with (for English) unless one of the other factors comes into play. CLDR short names are not limited to ASCII or uppercase, even in English.
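
For comparison, the Unicode Name of a character can be looked up with Python's standard `unicodedata` module. The sketch below also checks each name against the restricted character set described above. The regular expression is an assumption based on that description (with the space included), and the names returned depend on the Unicode version bundled with the Python build.

```python
import re
import unicodedata

# Uppercase ASCII letters, digits, space, and hyphen (assumed pattern, see above).
NAME_PATTERN = re.compile(r"[A-Z0-9 -]+")

for ch in ["ɞ", "🔰", "🍷"]:
    name = unicodedata.name(ch)
    in_charset = bool(NAME_PATTERN.fullmatch(name))
    print(f"U+{ord(ch):04X}  {name}  restricted charset: {in_charset}")
```

The formal names printed here are identifiers, not the short, localizable names that CLDR collects.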