Caution: This is probably not the page you are looking for. Check here
We need to flesh out this page more, but here is an example of adding a conversion: http://unicode.org/cldr/trac/changeset/4691
See also: Working with LDML2ICUConverter and using it for ICU4C

A typical setup to do development both in CLDR (LDML2ICUConverter) and ICU4C is:
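Judging only from the paths that the scripts on this page construct, the three checkouts sit under root_dir as icu4j-trunk/<icu4j_branch_name>, <project>/cldr/<cldr_branch_name> and <project>/icu4c/<icu4c_branch_name>. The following is a minimal sketch that checks such a layout before any of the scripts are run. The function name check_layout and its argument order are hypothetical and not part of the original scripts.

```shell
#!/bin/bash
# check_layout ROOT PROJECT BRANCH ICU4J_BRANCH   (hypothetical helper)
# Prints one status line per directory the scripts on this page expect
# and returns non-zero if any of them is missing.
check_layout() {
    local root=$1 project=$2 branch=$3 icu4j_branch=$4
    local missing=0 dir
    for dir in \
        "$root/icu4j-trunk/$icu4j_branch" \
        "$root/$project/cldr/$branch" \
        "$root/$project/icu4c/$branch"
    do
        if [ -d "$dir" ]; then
            echo "ok       $dir"
        else
            echo "missing  $dir"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_layout "$(pwd)" myproject trunk trunk, run from root_dir, reports which of the three directories are in place (myproject is a placeholder project name).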
Here, <icu4j_branch_name>, <cldr_branch_name> and <icu4c_branch_name> are the branches from which these three projects are checked out. Since icu4j is used only for running the LDML2ICUConverter, it can be any stable branch or even trunk. The scripts below assume that <cldr_branch_name> and <icu4c_branch_name> are the same, i.e., either both are developed on the trunk or the development branches have the same name. Some tweaking of the scripts is needed if they are different.

The rest of this document assumes that the commands are run from root_dir.

Checking out
Checkout can be done as follows:
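The repository locations are not given on this page, so the following is only a sketch. It assumes the projects are checked out with Subversion, which the Trac changeset link above suggests but does not confirm, and it takes the three repository URLs as arguments rather than guessing them. The function name print_checkout_cmds is hypothetical. It prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/bash
# print_checkout_cmds PROJECT BRANCH ICU4J_BRANCH CLDR_URL ICU4C_URL ICU4J_URL
# Hypothetical helper: prints (does not run) one "svn checkout" command per
# project, using the directory layout the scripts on this page expect.
# The URLs are supplied by the caller; none are known from this page.
print_checkout_cmds() {
    local project=$1 branch=$2 icu4j_branch=$3
    local cldr_url=$4 icu4c_url=$5 icu4j_url=$6
    echo "svn checkout $icu4j_url icu4j-trunk/$icu4j_branch"
    echo "svn checkout $cldr_url $project/cldr/$branch"
    echo "svn checkout $icu4c_url $project/icu4c/$branch"
}
```

Once the printed commands look right, they can be run from root_dir.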
Building LDML tools
To build LDML tools, use the following script (build_ldml.sh).

#!/bin/bash
# Builds the CLDR (LDML) tools. Run from root_dir.

BRANCH_NAME='trunk'
ICU4J_BRANCH_NAME='trunk'
CLEAN_FLAG=

while getopts 'b:j:c' BRANCH_OPTION
do
    case "$BRANCH_OPTION" in
        b) BRANCH_NAME="$OPTARG" ;;
        j) ICU4J_BRANCH_NAME="$OPTARG" ;;
        c) CLEAN_FLAG=1 ;;
        *) printf "Usage: %s [-b <branch_name>] [-j <icu4j_branch_name>] [-c] <project>\n" "$(basename "$0")" >&2
           exit 1 ;;
    esac
done
shift $(($OPTIND-1))

PROJECT=$1

# Single-pass loop: "break" skips the remaining steps after the first failed check.
while true; do
    if [ -z "$PROJECT" ]; then
        echo Project is missing.
        break
    fi

    CURR_DIR=`pwd`

    export ICU4JDIR=$CURR_DIR/icu4j-trunk/$ICU4J_BRANCH_NAME
    if [ ! -d "$ICU4JDIR" ]; then
        echo "\$ICU4JDIR=$ICU4JDIR is not a directory."
        break
    fi

    export CLDR_DIR=$CURR_DIR/$PROJECT/cldr/$BRANCH_NAME
    if [ ! -d "$CLDR_DIR" ]; then
        echo "\$CLDR_DIR=$CLDR_DIR is not a directory."
        break
    fi

    echo "\$ICU4JDIR = $ICU4JDIR"
    echo "\$CLDR_DIR = $CLDR_DIR"

    export ICU4J_JAR=$ICU4JDIR/icu4j.jar
    echo "\$ICU4J_JAR = $ICU4J_JAR"
    if [ ! -f "$ICU4J_JAR" ]; then
        echo \$ICU4J_JAR does not exist.
        break
    fi

    export UTILITIES_JAR=$ICU4JDIR/out/cldr_util/lib/utilities.jar
    echo "\$UTILITIES_JAR = $UTILITIES_JAR"
    if [ ! -f "$UTILITIES_JAR" ]; then
        echo \$UTILITIES_JAR does not exist.
        break
    fi

    ICU4J_CLASSES_1="$ICU4JDIR/main/classes/core/out/bin"
    ICU4J_CLASSES_2="$ICU4JDIR/out/cldr_util/bin"
    # Both class directories go on the path (the original listed the first one twice).
    export ICU4J_CLASSES="$ICU4J_CLASSES_1:$ICU4J_CLASSES_2"
    echo "\$ICU4J_CLASSES = $ICU4J_CLASSES"
    if [ ! -d "$ICU4J_CLASSES_1" ]; then
        echo "$ICU4J_CLASSES_1 does not exist."
        break
    fi
    if [ ! -d "$ICU4J_CLASSES_2" ]; then
        echo "$ICU4J_CLASSES_2 does not exist."
        break
    fi

    export XML_APIS_JAR=/usr/share/java/xalan2.jar
    echo "\$XML_APIS_JAR = $XML_APIS_JAR"
    if [ ! -f "$XML_APIS_JAR" ]; then
        echo \$XML_APIS_JAR does not exist.
        break
    fi

    export CLDR_CLASSES="$CLDR_DIR/tools/java/classes:$CLDR_DIR/tools/java"

    cd "$CLDR_DIR/tools/java"
    if [ "$CLEAN_FLAG" ]; then
        ant clean
    fi
    ant all
    ant jar
    ant icu
    break
done

Generating ICU4C locale data files from CLDR data files
To generate ICU4C .txt files from the CLDR .xml files, use the following script (gen_icu4c_txt.sh).

#!/bin/bash
# Generates the ICU4C .txt locale data from the CLDR .xml files. Run from root_dir.

BRANCH_NAME='trunk'
ICU4J_BRANCH_NAME='trunk'
CLEAN_FLAG=

while getopts 'b:j:c' BRANCH_OPTION
do
    case "$BRANCH_OPTION" in
        b) BRANCH_NAME="$OPTARG" ;;
        j) ICU4J_BRANCH_NAME="$OPTARG" ;;
        c) CLEAN_FLAG=1 ;;
        *) printf "Usage: %s [-b <branch_name>] [-j <icu4j_branch_name>] [-c] <project>\n" "$(basename "$0")" >&2
           exit 1 ;;
    esac
done
shift $(($OPTIND-1))

PROJECT=$1

# Single-pass loop: "break" skips the remaining steps after the first failed check.
while true; do
    if [ -z "$PROJECT" ]; then
        echo Project is missing.
        break
    fi

    CURR_DIR=`pwd`

    export ICU4JDIR=$CURR_DIR/icu4j-trunk/$ICU4J_BRANCH_NAME
    echo "\$ICU4JDIR = $ICU4JDIR"
    if [ ! -d "$ICU4JDIR" ]; then
        echo "\$ICU4JDIR=$ICU4JDIR is not a directory."
        break
    fi

    export CLDR_DIR=$CURR_DIR/$PROJECT/cldr/$BRANCH_NAME
    echo "\$CLDR_DIR = $CLDR_DIR"
    if [ ! -d "$CLDR_DIR" ]; then
        echo "\$CLDR_DIR=$CLDR_DIR is not a directory."
        break
    fi

    export ICU4J_JAR=$ICU4JDIR/icu4j.jar
    echo "\$ICU4J_JAR = $ICU4J_JAR"
    if [ ! -f "$ICU4J_JAR" ]; then
        echo \$ICU4J_JAR does not exist.
        break
    fi

    export UTILITIES_JAR=$ICU4JDIR/out/cldr_util/lib/utilities.jar
    echo "\$UTILITIES_JAR = $UTILITIES_JAR"
    if [ ! -f "$UTILITIES_JAR" ]; then
        echo \$UTILITIES_JAR does not exist.
        break
    fi

    ICU4J_CLASSES_1="$ICU4JDIR/main/classes/core/out/bin"
    ICU4J_CLASSES_2="$ICU4JDIR/out/cldr_util/bin"
    export ICU4J_CLASSES="$ICU4J_CLASSES_1:$ICU4J_CLASSES_2"
    echo "\$ICU4J_CLASSES = $ICU4J_CLASSES"
    if [ ! -d "$ICU4J_CLASSES_1" ]; then
        echo "$ICU4J_CLASSES_1 does not exist."
        break
    fi
    if [ ! -d "$ICU4J_CLASSES_2" ]; then
        echo "$ICU4J_CLASSES_2 does not exist."
        break
    fi

    export XML_APIS_JAR=/usr/share/java/xalan2.jar
    echo "\$XML_APIS_JAR = $XML_APIS_JAR"
    if [ ! -f "$XML_APIS_JAR" ]; then
        echo \$XML_APIS_JAR does not exist.
        break
    fi

    CLDR_CLASSES_1="$CLDR_DIR/tools/java/classes"
    CLDR_CLASSES_2="$CLDR_DIR/tools/java"
    export CLDR_CLASSES="$CLDR_CLASSES_1:$CLDR_CLASSES_2"
    echo "\$CLDR_CLASSES = $CLDR_CLASSES"
    if [ ! -d "$CLDR_CLASSES_1" ]; then
        echo "$CLDR_CLASSES_1 does not exist."
        break
    fi
    if [ ! -d "$CLDR_CLASSES_2" ]; then
        echo "$CLDR_CLASSES_2 does not exist."
        break
    fi

    export ICU4C_DIR=$CURR_DIR/$PROJECT/icu4c/$BRANCH_NAME
    echo "\$ICU4C_DIR = $ICU4C_DIR"
    if [ ! -d "$ICU4C_DIR" ]; then
        echo \$ICU4C_DIR is not a directory.
        break
    fi

    # The conversion itself is driven by the ant build in ICU4C's source/data.
    cd "$ICU4C_DIR/source/data"
    if [ "$CLEAN_FLAG" ]; then
        ant clean
    fi
    ant all
    break
done

(Configuring and) Building ICU4C
To build ICU4C, use the following script (build_icu4c.sh).

#!/bin/bash
# Configures (on first use) and builds ICU4C, optionally running the tests.
# Run from root_dir.

BRANCH_NAME='trunk'
CLEAN_FLAG=
TEST_FLAG=

# Only -b, -c and -t are handled; any other option prints the usage message.
while getopts 'b:ct' BRANCH_OPTION
do
    case "$BRANCH_OPTION" in
        b) BRANCH_NAME="$OPTARG" ;;
        c) CLEAN_FLAG=1 ;;
        t) TEST_FLAG=1 ;;
        *) printf "Usage: %s [-b <branch_name>] [-c] [-t] <project>\n" "$(basename "$0")" >&2
           exit 1 ;;
    esac
done
shift $(($OPTIND-1))

PROJECT=$1

# Single-pass loop: "break" skips the remaining steps after a failed check.
while true; do
    if [ -z "$PROJECT" ]; then
        echo Project is missing.
        break
    fi

    CURR_DIR=`pwd`
    DEV_DIR=$CURR_DIR/$PROJECT/icu4c/$BRANCH_NAME/dev

    # Configure only once, when the development directory does not exist yet.
    if [ ! -d "$DEV_DIR" ]; then
        echo Creating development directory...
        mkdir "$DEV_DIR"
        cd "$DEV_DIR"
        echo Configuring build environment....
        ../source/runConfigureICU Linux
    fi

    cd "$DEV_DIR"
    if [ "$CLEAN_FLAG" ]; then
        echo Cleaning...
        make clean
    fi

    echo Building...
    make

    if [ "$TEST_FLAG" ]; then
        echo Testing...
        make check
    fi
    break
done

Running LDML2ICUConverter standalone
Warning: LDML2ICUConverter is partly deprecated. NewLdml2IcuConverter now handles conversion of locale, supplemental and bcp47 data. The script here should not be used for those types of data anymore.
To run LDML2ICUConverter on its own, writing its output by default to root_dir/tmp, the following script (run_ldml2icuconverter.sh) can be used.

#!/bin/bash
# Runs LDML2ICUConverter for one locale of one CLDR module. Run from root_dir.

BRANCH_NAME='trunk'
ICU4J_BRANCH_NAME='trunk'
OUT_DIR=
MODULE=
LOCALE=

while getopts 'b:d:j:l:m:' BRANCH_OPTION
do
    case "$BRANCH_OPTION" in
        b) BRANCH_NAME="$OPTARG" ;;
        d) OUT_DIR="$OPTARG" ;;
        j) ICU4J_BRANCH_NAME="$OPTARG" ;;
        l) LOCALE="$OPTARG" ;;
        m) MODULE="$OPTARG" ;;
        *) printf "Usage: %s [-b <branch_name>] [-d <out_dir>] [-j <icu4j_branch_name>] -l <locale> -m <module> <project>\n" "$(basename "$0")" >&2
           exit 1 ;;
    esac
done
shift $(($OPTIND-1))

PROJECT=$1

# Single-pass loop: "break" skips the remaining steps after the first failed check.
while true; do
    if [ -z "$PROJECT" ]; then
        echo Project is missing.
        break
    fi

    echo "\$MODULE = $MODULE"
    if [ -z "$MODULE" ]; then
        echo Module is missing. Use the -m switch to specify the module.
        break
    fi

    echo "\$LOCALE = $LOCALE"
    if [ -z "$LOCALE" ]; then
        echo Locale is missing. Use the -l switch to specify the locale.
        break
    fi

    CURR_DIR=`pwd`

    # The output directory defaults to root_dir/tmp and must already exist.
    if [ -z "$OUT_DIR" ]; then
        OUT_DIR=$CURR_DIR/tmp
    fi
    if [ ! -d "$OUT_DIR" ]; then
        echo "$OUT_DIR should be present."
        break
    fi

    export ICU4JDIR=$CURR_DIR/icu4j-trunk/$ICU4J_BRANCH_NAME
    if [ ! -d "$ICU4JDIR" ]; then
        echo "\$ICU4JDIR=$ICU4JDIR is not a directory."
        break
    fi

    export CLDR_DIR=$CURR_DIR/$PROJECT/cldr/$BRANCH_NAME
    if [ ! -d "$CLDR_DIR" ]; then
        echo "\$CLDR_DIR=$CLDR_DIR is not a directory."
        break
    fi

    echo "\$ICU4JDIR = $ICU4JDIR"
    echo "\$CLDR_DIR = $CLDR_DIR"

    export ICU4J_JAR=$ICU4JDIR/icu4j.jar
    echo "\$ICU4J_JAR = $ICU4J_JAR"
    if [ ! -f "$ICU4J_JAR" ]; then
        echo \$ICU4J_JAR does not exist.
        break
    fi

    export UTILITIES_JAR=$ICU4JDIR/out/cldr_util/lib/utilities.jar
    echo "\$UTILITIES_JAR = $UTILITIES_JAR"
    if [ ! -f "$UTILITIES_JAR" ]; then
        echo \$UTILITIES_JAR does not exist.
        break
    fi

    ICU4J_CLASSES_1="$ICU4JDIR/main/classes/core/out/bin"
    ICU4J_CLASSES_2="$ICU4JDIR/out/cldr_util/bin"
    # Both class directories go on the path (the original listed the first one twice).
    export ICU4J_CLASSES="$ICU4J_CLASSES_1:$ICU4J_CLASSES_2"
    echo "\$ICU4J_CLASSES = $ICU4J_CLASSES"
    if [ ! -d "$ICU4J_CLASSES_1" ]; then
        echo "$ICU4J_CLASSES_1 does not exist."
        break
    fi
    if [ ! -d "$ICU4J_CLASSES_2" ]; then
        echo "$ICU4J_CLASSES_2 does not exist."
        break
    fi

    export XML_APIS_JAR=/usr/share/java/xalan2.jar
    echo "\$XML_APIS_JAR = $XML_APIS_JAR"
    if [ ! -f "$XML_APIS_JAR" ]; then
        echo \$XML_APIS_JAR does not exist.
        break
    fi

    export CLDR_CLASSES="$CLDR_DIR/tools/java/classes:$CLDR_DIR/tools/java"

    INPUT_DIR=$CLDR_DIR/common/$MODULE
    if [ ! -d "$INPUT_DIR" ]; then
        echo
        echo "\$INPUT_DIR=$INPUT_DIR is not a directory. Probably you have an invalid module \"$MODULE\"."
        echo "Valid modules are bcp47 collation main rbnf segments supplemental transforms."
        break
    fi

    java \
        -cp "$UTILITIES_JAR:$ICU4J_JAR:$XML_APIS_JAR:$CLDR_CLASSES:/usr/share/java/xml-apis.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/ant.jar" \
        -Dfile.encoding=UTF-8 -Xmx700M -DSHOW_FILES -DSHOW "-DCLDR_DIR=$CLDR_DIR" \
        org.unicode.cldr.icu.LDML2ICUConverter \
        -s "$INPUT_DIR" \
        -m "$CLDR_DIR/common/supplemental" \
        -d "$OUT_DIR" \
        "$LOCALE.xml"
    break
done

Comparing generated files and data sizes
The following Python program (comp_dirs.py) can be used to compare the file sizes in two directories.

#!/usr/bin/env python3
# Compares the sizes of the files in two directories.
# Ported to Python 3: print() calls and plain int literals replace the
# Python 2 print statements and 0L literals.
import filecmp
import os
import os.path
import sys


def percentage(old, new):
    # New size as a percentage of the old size; infinite if the old size was 0.
    return new * 100.0 / old if old else float("inf")


class CmpDirs:
    def __init__(self, dir1, dir2):
        self.commonFiles = set()
        self.changedWithSameSizeFiles = set()
        self.shrunkFiles = {}
        self.expandedFiles = {}
        self.dir1 = dir1
        self.dir2 = dir2
        self.totalSize1 = 0
        self.totalSize2 = 0
        self.totalChanged1 = 0
        self.totalChanged2 = 0

    def compareFiles(self, filename):
        file1 = os.path.join(self.dir1, filename)
        file2 = os.path.join(self.dir2, filename)
        size1 = os.path.getsize(file1)
        size2 = os.path.getsize(file2)
        self.totalSize1 += size1
        self.totalSize2 += size2
        if size1 == size2:
            # Same size: record the file only if its contents differ.
            if not filecmp.cmp(file1, file2):
                self.changedWithSameSizeFiles.add(filename)
        else:
            self.totalChanged1 += size1
            self.totalChanged2 += size2
            sizes = [size1, size2]
            if size1 > size2:
                self.shrunkFiles[filename] = sizes
            else:
                self.expandedFiles[filename] = sizes

    def compareDirectories(self):
        files1 = set(os.listdir(self.dir1))
        files2 = set(os.listdir(self.dir2))
        if files1 != files2:
            print("Directories have different contents")
            diff1 = files1 - files2
            if diff1:
                print("The following files exist in %s but not in %s" % (self.dir1, self.dir2))
                print("    " + "\n    ".join(sorted(diff1)))
            diff2 = files2 - files1
            if diff2:
                print("The following files exist in %s but not in %s" % (self.dir2, self.dir1))
                print("    " + "\n    ".join(sorted(diff2)))
        # Only files present in both directories are compared.
        self.commonFiles = files1 & files2
        for f in self.commonFiles:
            self.compareFiles(f)

    def report(self):
        headerFormat = "%-40s = %12d"
        headerFormatWithPercentage = "%-40s = %12d (%10.2f%%)"
        print("COMPARISON OF DIRECTORIES %s and %s" % (self.dir1, self.dir2))
        print()
        print(headerFormat % ("Number of files", len(self.commonFiles)))
        print(headerFormat % ("Number of shrunk files", len(self.shrunkFiles)))
        print(headerFormat % ("Number of expanded files", len(self.expandedFiles)))
        print(headerFormat % ("Total size (original)", self.totalSize1))
        if self.totalSize1 == 0:
            print("No change in total size.")
        else:
            print(headerFormatWithPercentage % ("Total size (final)", self.totalSize2,
                                                percentage(self.totalSize1, self.totalSize2)))
        print(headerFormat % ("Total size of changed files (original)", self.totalChanged1))
        if self.totalChanged1 == 0:
            print("No files changed.")
        else:
            print(headerFormatWithPercentage % ("Total size of changed files (final)", self.totalChanged2,
                                                percentage(self.totalChanged1, self.totalChanged2)))
        print()
        headingLines = "-" * 55
        heading = "%-20s%12s%12s%11s" % ("File", "Old size", "New size", "Percentage")
        if len(self.shrunkFiles) > 0:
            print("\nSHRUNK FILES:")
            print(headingLines)
            print(heading)
            print(headingLines)
            for f in sorted(self.shrunkFiles.keys()):
                sizes = self.shrunkFiles[f]
                print("%-20s%12d%12d%8.2f %%" % (f, sizes[0], sizes[1], percentage(sizes[0], sizes[1])))
        if len(self.expandedFiles) > 0:
            print("\nEXPANDED FILES:")
            print(headingLines)
            print(heading)
            print(headingLines)
            for f in sorted(self.expandedFiles.keys()):
                sizes = self.expandedFiles[f]
                print("%-20s%12d%12d%8.2f %%" % (f, sizes[0], sizes[1], percentage(sizes[0], sizes[1])))
            print(headingLines)


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: comp_dirs.py <old_dir> <new_dir>")
        sys.exit(1)
    cd = CmpDirs(sys.argv[1], sys.argv[2])
    cd.compareDirectories()
    cd.report()
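
As a rough cross-check of the totals that comp_dirs.py reports, the directory sizes can also be summed from the shell. The sketch below is not part of the original tooling, and the function names total_size and size_percent are hypothetical. Unlike comp_dirs.py, it sums every non-hidden regular file in a directory rather than only the files common to both, so the numbers agree only when the two directories contain the same file names.

```shell
#!/bin/bash
# total_size DIR   (hypothetical helper)
# Prints the summed size in bytes of the regular files directly inside DIR.
# Subdirectories and hidden files are skipped.
total_size() {
    local dir=$1 total=0 f size
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# size_percent OLD NEW   (hypothetical helper)
# Prints NEW as a percentage of OLD with two decimals, or "n/a" when OLD is 0.
size_percent() {
    local old=$1 new=$2
    if [ "$old" -eq 0 ]; then
        echo "n/a"
        return
    fi
    awk -v old="$old" -v new="$new" 'BEGIN { printf "%.2f\n", new * 100 / old }'
}
```

For instance, size_percent "$(total_size old_dir)" "$(total_size new_dir)" prints the new total as a percentage of the old one, where old_dir and new_dir stand for the two directories passed to comp_dirs.py.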