Update Language Script Info


    1. Rick should have checked in spreadsheets in https://github.com/unicode-org/cldr/tree/main/tools/cldr-code/src/main/resources/org/unicode/cldr/util/data, with names of the form:

      1. country_language_population_raw.txt

      2. language_script_raw.txt

      3. For a description of the contents, see Language Script Guidelines

        1. Do not edit the above files with a plain text editor; they are tab-delimited UTF-8 files with many fields and should be imported and edited with a spreadsheet editor. Rick uses Excel, but Google Sheets should also work fine.

    2. The World Bank, UN, and Factbook data should be updated as per Updating Population, GDP, Literacy

  1. Note that there is an auxiliary file util/data/external/other_country_data.txt, which contains data that supplements those sources. If the steps below report errors because a country's population is less than its language population, that file may need updating.

    1. Run the tool ConvertLanguageData.

      1. Run with -DADD_POP=true to get error messages.

        1. If there are any different country names, you'll get an error: edit external/alternate_country_names.txt to add them.

        2. Look for failures in the language vs script data, following the line:

          • Problems in language_script_raw.txt

        3. Look for Territory Language data, following the line:

            • Possible Failures ...

                • In Basic Data but not Population > 20%

                • and the reverse.

        4. Look for general problems, following the line:

          • Failures in Output.

            • It will also warn if a country doesn't have an official or de facto official language.

        5. Work until resolved.

    2. The tool updates {cldrdata}/common/supplemental/supplementalData.xml in place.

    3. Carefully diff the updated file against the previous version.

    4. Then run QuickCheck to verify that the DTD is in order, and commit.
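
Because the raw files are tab-delimited with many fields, a quick check after a spreadsheet export can catch rows that lost or gained a field. The following is a minimal Python sketch, not part of the CLDR tooling: the function names are hypothetical, the sample rows are made up, and nothing is assumed about what the columns mean.

```python
import csv
import io
from collections import Counter


def field_count_report(text):
    """Count how many non-blank rows have each number of tab-separated fields."""
    # QUOTE_NONE treats quote characters as ordinary data, so one row is one line.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(len(row) for row in reader if row)


def ragged_rows(text):
    """Return (line_number, field_count) for rows whose width differs from the most common one."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [(n, len(row)) for n, row in enumerate(reader, start=1) if row]
    if not rows:
        return []
    expected = Counter(width for _, width in rows).most_common(1)[0][0]
    return [(n, width) for n, width in rows if width != expected]


# Made-up sample rows (not real CLDR data): the third row is missing a field.
sample = "a\tb\tc\nd\te\tf\ng\th\n"
print(dict(field_count_report(sample)))
print(ragged_rows(sample))
```

To apply it to a real file, read the contents with `open(path, encoding="utf-8", newline="")` and pass the text to either function. Any rows reported by `ragged_rows` are worth inspecting in the spreadsheet editor before running ConvertLanguageData.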

Update the supplementalData.xml <territoryContainment>

  1. Note: should automate this!

  2. Go to https://unstats.un.org/unsd/methodology/m49/, click on the tab Geographic Regions, copy the table, and paste into util/data/external/m49_raw.txt. Diff with the old version and check that the format didn't change.

    1. NOTE: there is now a cleaner way to get the data, at https://unstats.un.org/unsd/methodology/m49/overview/

    2. For the UN, go to http://www.un.org/en/member-states/index.html. Copy the table, and paste into util/data/external/un_member_states_raw.txt. Diff with the old version.

    3. For the EU, do the same with https://europa.eu/european-union/about-eu/countries/member-countries_en, into util/data/external/eu_member_states_raw.txt

  3. For the EZ, do the same with http://ec.europa.eu/economy_finance/euro/adoption/euro_area/index_en.htm, into util/data/external/ez_member_states_raw.txt

    1. If there are changes, update <territoryContainment>
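
The "diff with old and check the format" steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not an existing CLDR tool: the function names are hypothetical, the pasted tables are assumed to be tab-delimited (which may not hold for every source page), and the sample rows are made up.

```python
import difflib


def raw_file_diff(old_text, new_text, name="m49_raw.txt"):
    """Return unified-diff lines between the old and newly pasted contents."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=f"old/{name}",
            tofile=f"new/{name}",
            lineterm="",
        )
    )


def column_counts(text, delimiter="\t"):
    """Distinct field counts across non-blank lines: a rough fingerprint of the format."""
    return {len(line.split(delimiter)) for line in text.splitlines() if line.strip()}


def format_changed(old_text, new_text, delimiter="\t"):
    """True if the set of per-line field counts differs between old and new."""
    return column_counts(old_text, delimiter) != column_counts(new_text, delimiter)


# Made-up rows standing in for an old and a freshly pasted raw file.
old = "Alpha\t001\nBeta\t002\n"
new = "Alpha\t001\nBeta\t003\n"
for line in raw_file_diff(old, new):
    print(line)
print("format changed:", format_changed(old, new))
```

An empty diff means nothing needs updating. A non-empty diff with `format_changed` returning False suggests that only the data changed, which is the case where <territoryContainment> may need an update. The column-count comparison is only a coarse check, so a manual look at the diff is still needed.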