Adding Transforms/Transliterators

For each transform:

1. There should be a .xml file with rules, and a corresponding .txt file with tests.
2. Put the .xml file in workspace/cldr/common/transforms
3. Put the .txt file in workspace/cldr/tools/cldr-unittest/src/org/unicode/cldr/unittest/data/transformtest/
  1. Note that the .txt file may look reversed if it is RTL, since the 2 fields will show up from right to left.
4. Run org.unicode.cldr.unittest.TestTransforms and verify that it passes.
5. Then run org.unicode.cldr.unittest.TestAll, just to make sure.
6. If either fails, communicate any problems back to the author, then go back to step 1.
7. Check in.
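
Before running the tests, it can help to confirm that every rules file has a test file. The following is a minimal sketch, not part of the CLDR tools: the class name `MissingTransformTests` is hypothetical, and it assumes that the "corresponding" .txt file has the same base name as the .xml file. Both directories are passed in by the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; assumes foo.xml is tested by foo.txt. */
final class MissingTransformTests {
    /**
     * Lists the base names of .xml rule files in rulesDir
     * that have no same-named .txt file in testsDir.
     */
    static List<String> find(Path rulesDir, Path testsDir) throws IOException {
        try (Stream<Path> files = Files.list(rulesDir)) {
            return files
                .map(path -> path.getFileName().toString())
                .filter(name -> name.endsWith(".xml"))
                .map(name -> name.substring(0, name.length() - ".xml".length()))
                .filter(base -> !Files.exists(testsDir.resolve(base + ".txt")))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

Calling `MissingTransformTests.find(rulesDir, testsDir)` with the two folders from steps 2 and 3 returns the transforms that still need a test file; an empty list means every rules file is covered.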

Adding new Transliterators

There is a gotcha when adding transforms. If transform A-B depends on transform X (e.g., it uses ::A-C; ::C-B, so it depends on A-C and C-B), then ICU has to register X before A-B. A table of these dependencies is built for testing CLDR transforms and for ConvertTransforms, the tool that converts them to ICU format. When you add a new transform, you may need to add an entry to that table. (If you are lucky and X sorts alphabetically before A-B, you don't need to do this.)

How to do it:

1. Open CLDRTransforms.
2. Go to class DependencyOrder.
3. There is a static list at the top of that class, with lines like:
   addDependency("es-zh", "es-es_FONIPA", "es_FONIPA-zh");
4. Add a new line of that form.

Make sure you run the tests to verify that the new transliterators are correct.
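
To illustrate why the table matters, here is a minimal sketch of dependency-aware ordering. It is not the actual DependencyOrder code in CLDRTransforms: the class `DependencySketch` and the method `registrationOrder` are hypothetical, and only the shape of the `addDependency` call is taken from the example above. The sketch orders IDs alphabetically but moves each prerequisite ahead of the transform that needs it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical illustration; not the CLDRTransforms.DependencyOrder implementation. */
final class DependencySketch {
    // Maps a transform ID to the IDs that must be registered before it.
    private final Map<String, List<String>> dependsOn = new TreeMap<>();

    void addDependency(String target, String... prerequisites) {
        dependsOn.put(target, Arrays.asList(prerequisites));
    }

    /**
     * Returns the IDs in alphabetical order, except that every prerequisite
     * is placed before the transforms that depend on it. Prerequisites named
     * in the table are included even if they are missing from ids.
     */
    List<String> registrationOrder(Set<String> ids) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String id : new TreeSet<>(ids)) {
            visit(id, ordered, new HashSet<>());
        }
        return new ArrayList<>(ordered);
    }

    // Depth-first visit: emit all prerequisites, then the ID itself.
    private void visit(String id, Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(id)) {
            return;
        }
        if (!inProgress.add(id)) {
            throw new IllegalStateException("Dependency cycle at " + id);
        }
        for (String prerequisite : dependsOn.getOrDefault(id, Collections.emptyList())) {
            visit(prerequisite, ordered, inProgress);
        }
        inProgress.remove(id);
        ordered.add(id);
    }
}
```

With the dependency from the example above, plain alphabetical order would put es-zh before es_FONIPA-zh, because "-" sorts before "_". The sketch instead yields es-es_FONIPA, es_FONIPA-zh, es-zh, so both prerequisites come first.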

Testing Transliterators

Run org.unicode.cldr.unittest.TestTransforms - does a basic test of transforms.

The following need to be merged into the unittest above. For now they are standalone.

org.unicode.cldr.test.TestTransformsSimple - runs a few other tests.
org.unicode.cldr.icu.ConvertTransforms - generates the ICU-style transforms, in a folder of your choice. Do this before running TestTransforms.
org.unicode.cldr.test.TestTransforms - runs the ICU4J transliteration tests. Set -Dfiles to the folder you used for ConvertTransforms, like:
  -Dfiles=${workspace_loc}/Generated/cldr/icu-transforms/

Adding test files

You can add plaintext test files to the following folder. Any files there are run as a part of the unittest.

${workspace_loc}/cldr/tools/java/org/unicode/cldr/util/data/test

Each such test file should have the name of the transliterator + ".txt". The format is:

{source_string}{tab}{expected_result}

For example, for cs-ja.txt:

Achijáš Šíloský アヒヤーシュ・シーロスキー

achnatonova アフナトノヴァ

...
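
The file format above can be checked with a few lines of code. This is a minimal sketch rather than the unittest's actual reader: the class `TransformTestFileSketch` and its methods are hypothetical, blank lines are skipped as an assumption, and the transform is passed in as a plain function so the example stays self-contained. With ICU4J on the classpath, a `com.ibm.icu.text.Transliterator` could be supplied through its `transliterate(String)` method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical reader for {source_string}{tab}{expected_result} lines. */
final class TransformTestFileSketch {
    /** Derives the transliterator name from a test file name, e.g. cs-ja.txt gives cs-ja. */
    static String transformId(String fileName) {
        return fileName.endsWith(".txt")
            ? fileName.substring(0, fileName.length() - ".txt".length())
            : fileName;
    }

    /** Returns one message per line that is malformed or whose result differs from the expected one. */
    static List<String> check(List<String> lines, UnaryOperator<String> transform) {
        List<String> failures = new ArrayList<>();
        int lineNumber = 0;
        for (String line : lines) {
            ++lineNumber;
            if (line.trim().isEmpty()) {
                continue; // Assumption: blank lines separate test cases and are ignored.
            }
            int tab = line.indexOf('\t');
            if (tab < 0) {
                failures.add("line " + lineNumber + ": missing tab separator");
                continue;
            }
            String source = line.substring(0, tab);
            String expected = line.substring(tab + 1);
            String actual = transform.apply(source);
            if (!expected.equals(actual)) {
                failures.add("line " + lineNumber + ": " + source
                    + " => " + actual + ", expected " + expected);
            }
        }
        return failures;
    }
}
```

An empty list from `check` means every line in the file produced its expected result.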