CLDRModify Passes

Main Process

This section describes how to run the CLDRModify passes for the mechanical cleanup before the release.

Successive Passes

You will then run CLDRModify with different options, in multiple passes.

  1. With each step, sanity check the results, fix any problems, tag, and check in, as described below.
    1. This sanity check is important, since the regularizing often reveals problems in the original. For example, a date format like MMM'-yy regularizes to MMM-'yy', but the original was clearly an error.
    2. If you need to redo a single file (e.g., when resolving conflicts), use the -m option on CLDRModify, as described below.
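
One way to spot originals like the one in the example before they are regularized is to look for an odd number of apostrophes in a date pattern. The following is only a minimal sketch: has_unbalanced_quotes is a hypothetical helper, not part of CLDRModify, and it assumes the usual date-pattern convention that apostrophes delimit literal text and a doubled apostrophe stands for a literal apostrophe.

```python
def has_unbalanced_quotes(pattern: str) -> bool:
    """Return True if a date pattern has an unterminated quoted section.

    Assumption: apostrophes delimit literal text and '' is an escaped
    apostrophe, so a well-formed pattern always contains an even number
    of apostrophes.
    """
    return pattern.count("'") % 2 == 1


if __name__ == "__main__":
    for candidate in ["MMM'-yy", "MMM-'yy'", "h 'o''clock' a"]:
        print(candidate, has_unbalanced_quotes(candidate))
```

A pattern flagged this way is worth checking by hand, since the regularized output may be well formed and still not what the translator meant.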

After passes

Details

For the purpose of this document, we’ll assume you are generating into {cldrdata}/dropbox/gen/main/ as the target directory. Change any instance below to the directory that you actually use.

Passes

Options

Standard Options: add to your regular preferences -DSHOW_FILES plus your choice of source/target directories.

  1. After doing the /common/main/ files, do the other directories, with the extra options:
    1. -s{cldrdir}/common/annotations
    2. -s{cldrdir}/common/subdivisions
    3. -s{cldrdir}/seed/main
    4. -s{cldrdir}/seed/annotations
    5. -s{cldrdir}/exemplars/main
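
If you script the passes, the extra source options can be generated from the list above rather than typed by hand. This is a rough sketch only: source_options is a hypothetical helper, the root path is a placeholder for {cldrdir}, and the rest of the command line (preferences, target directory, pass options) is deliberately left out.

```python
from pathlib import PurePosixPath

# Directories from the list above, relative to {cldrdir}.
EXTRA_SOURCE_DIRS = [
    "common/annotations",
    "common/subdivisions",
    "seed/main",
    "seed/annotations",
    "exemplars/main",
]


def source_options(cldr_dir: str) -> list[str]:
    """Return one -s option per extra source directory, in list order."""
    root = PurePosixPath(cldr_dir)
    return [f"-s{root / sub}" for sub in EXTRA_SOURCE_DIRS]


if __name__ == "__main__":
    # "/path/to/cldr" is a placeholder for your own {cldrdir}.
    for option in source_options("/path/to/cldr"):
        print(option)
```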

Other options for each pass:
You will have to repeat this cycle if any outside changes are made to the data!

One-Time Fixes

There are a number of “one time fixes” that are in the CLDRModify code. The code remains in case we want to adapt for future cases, but don’t use them unless you fix the code to do what you want, and carefully diff the results. Here are some of them:

  1. -fk: use a configuration file. See CLDRModify Config for details.
    1. -fe: fix Interindic. If you need to make changes in transliteration, you might want to modify this and run it.
    2. -fs: Fix the stand-alone narrow values.
    3. -fu: Fix the unit patterns.
    4. -fd: Fix dates
    5. -fz: Fix exemplars
    6. -fr: Fix references and standard
    7. -fc: Transition from an old currency to a new currency. This fix is useful when a country introduces a new currency code (usually due to a devaluation) but the name remains the same. To use this fix, modify the following values in the CLDRModify code under the entry that begins fixList.add('c', "Fix transiton from an old currency code to a new one":
      1. Change the String oldCurrencyCode and newCurrencyCode to reflect the currency codes you are transitioning.
      2. Change the int fromDate and toDate to reflect the dates that the old currency was in circulation. These will be used to create the date range in the old currency string.
      3. Run the CLDRModify tool as usual, diff the results and check in.
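
The effect of the -fc steps above can be illustrated roughly as follows. This is not the CLDRModify implementation: transition_currency is a hypothetical function, the data is modeled as a plain code-to-name dictionary, and the exact format of the date range appended to the old name is an assumption. The ZMK/ZMW codes and years are used purely as an illustration.

```python
def transition_currency(names, old_code, new_code, from_date, to_date):
    """Sketch of an old-to-new currency code transition.

    names: dict mapping currency code -> display name.
    The new code takes over the existing name; the old code keeps the
    name with a date range appended (range format is an assumption).
    """
    result = dict(names)  # leave the caller's data untouched
    current_name = result[old_code]
    result[new_code] = current_name
    result[old_code] = f"{current_name} ({from_date}-{to_date})"
    return result


if __name__ == "__main__":
    before = {"ZMK": "Zambian Kwacha"}
    after = transition_currency(before, "ZMK", "ZMW", 1968, 2012)
    for code, name in sorted(after.items()):
        print(code, name)
```

Whatever the real fix produces, diff the results against the previous data before checking in, as the last step says.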

How to check in consistently after each pass

Sanity Check

  1. The console will list changes made, such as:
    • Creating File: {cldrdata}/dropbox\gen\main\zh_Hant.xml
      • *Renaming old {cldrdata}/dropbox\gen\main\zh_Hant.xml
      • %zh_Hant_HK - Replacing: <yy'年'M'月'd'日'> by <yy年M月d日> at: //ldml/dates/calendars/calendar[@type="gregorian"]/dateFormats/dateFormatLength[@type="short"]/dateFormat[@type="standard"]/pattern[@type="standard"]
  2. The diff folder in the output has CompareIt! bat files for each change. Alternatively, copy the files to the SVN folder (see Copy Files) and then use SVN diff.
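
When a pass produces many changes, it can help to pull the "Replacing" lines out of a saved console log and review them as structured records. The sketch below assumes the line format shown in the sample above (locale, old value, new value, path); Change and parse_change are hypothetical names, and real output may contain other line shapes that this pattern simply skips.

```python
import re
from typing import NamedTuple, Optional


class Change(NamedTuple):
    locale: str
    old: str
    new: str
    path: str


# Matches lines shaped like:
#   %<locale> - Replacing: <old> by <new> at: <path>
_REPLACING = re.compile(
    r"%(?P<locale>\S+) - Replacing: <(?P<old>.*?)> by <(?P<new>.*?)> at: (?P<path>\S.*)$"
)


def parse_change(line: str) -> Optional[Change]:
    """Return the change described by a console line, or None if it is not one."""
    match = _REPLACING.search(line.rstrip())
    if match is None:
        return None
    return Change(match["locale"], match["old"], match["new"], match["path"])


if __name__ == "__main__":
    sample = (
        "%zh_Hant_HK - Replacing: <yy'年'M'月'd'日'> by <yy年M月d日> "
        'at: //ldml/dates/calendars/calendar[@type="gregorian"]/dateFormats'
    )
    print(parse_change(sample))
```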

Copy Files

Now ready to check in

If someone checks in a change in the middle of one of your passes, it is generally easier to check in the rest of your changes, check out a clean copy of the affected file, and rerun the pass on only that file. The -m(uk) option can be used to restrict the pass to only uk.xml, for example.
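
To preview which files a -m value would select before rerunning a pass, something like the following can be used. It is a sketch under a stated assumption: that the -m argument is a regular expression that must match the whole locale ID (the file name without .xml). select_files is a hypothetical helper, so confirm the matching behavior against CLDRModify itself.

```python
import re


def select_files(file_names, pattern):
    """Return the .xml files whose locale ID fully matches the pattern.

    Assumption: the pattern is a regular expression applied to the
    file name with the .xml suffix removed.
    """
    regex = re.compile(pattern)
    return [
        name
        for name in file_names
        if name.endswith(".xml") and regex.fullmatch(name[: -len(".xml")])
    ]


if __name__ == "__main__":
    files = ["ru.xml", "uk.xml", "uk_UA.xml"]
    print(select_files(files, "(uk)"))
```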