We would like to add script reordering as a new collation setting. This will allow, for example, sorting Greek before Latin, and digits after all letters, without listing all affected characters in the rules. Since this is a parameter, it can also be changed at runtime without changing any rules.
This will be implemented via a permutation table for primary collation weights. See the original (somewhat outdated) ICU collation design doc for reference:
Add the 'kr' key, whose types are an ordered list of script codes, in the order they should be sorted. For example, to sort Greek first, then Latin, then everything else (Zzzz = unknown, standing for all other scripts), with digits (Zyyy = Common) last, the following would be used: el-u-kr-grek-latn-zzzz-zyyy. That would modify the ordering found on http://unicode.org/charts/collation/ in the following way:
Issue: do we still want Unsupported at the very end?
The 'digitaft' type for the 'co' key is no longer needed, and can be deprecated (with some minor changes to data).
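As a rough illustration of the permutation-table idea applied to a 'kr' list, here is a minimal Python sketch. The group list, the lead byte values, and the function name build_permutation are all hypothetical and chosen only for readability; they do not reflect actual collation data or any ICU API. The sketch also assumes that scripts not listed explicitly keep their default relative order wherever zzzz appears.

```python
# Hypothetical reorderable groups with made-up primary lead bytes, listed in
# default collation order. Real collation data uses different values.
DEFAULT_GROUPS = [
    ("zyyy", [0x10, 0x11]),  # Common: digits, punctuation, symbols
    ("latn", [0x20, 0x21]),
    ("grek", [0x30]),
    ("cyrl", [0x40]),
    ("hebr", [0x50]),
]


def build_permutation(reorder):
    """Map each original lead byte to its lead byte after reordering.

    `reorder` is the list of script codes taken from the 'kr' key, for
    example ["grek", "latn", "zzzz", "zyyy"]. "zzzz" stands for every
    group that is not listed explicitly.
    """
    lead_bytes = dict(DEFAULT_GROUPS)
    listed = [code for code in reorder if code in lead_bytes]
    others = [code for code, _ in DEFAULT_GROUPS if code not in listed]

    # Expand "zzzz" in place; if it is absent, unlisted groups go last.
    order = []
    placed_others = False
    for code in reorder:
        if code == "zzzz":
            if not placed_others:
                order.extend(others)
                placed_others = True
        elif code in lead_bytes and code not in order:
            order.append(code)
    if not placed_others:
        order.extend(others)

    # Hand out the same pool of lead bytes, ascending, in the new group order.
    pool = sorted(b for _, group_bytes in DEFAULT_GROUPS for b in group_bytes)
    table = {}
    slot = 0
    for code in order:
        for old in lead_bytes[code]:
            table[old] = pool[slot]
            slot += 1
    return table


# The types of el-u-kr-grek-latn-zzzz-zyyy, split into a list.
table = build_permutation("grek-latn-zzzz-zyyy".split("-"))
```

With this toy data, the Greek lead byte maps below both Latin lead bytes, and the Common lead bytes map to the two highest values. Because only the table changes, the rules themselves stay untouched, which is what makes this a runtime parameter.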
Add an additional attribute, scriptReorder, to <settings>. Its value will be the script names separated by spaces, in the order they should be sorted. The script code Zzzz stands for "any other script", and the script code Zyyy stands for Common.
<settings scriptReorder="grek latn zzzz zyyy">
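A client reading this attribute might do something like the following Python sketch. The function name read_script_reorder is hypothetical, and the element is written in self-closing form here only so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET


def read_script_reorder(settings_xml):
    """Return the script codes of a <settings> element, in sort order.

    A missing scriptReorder attribute yields an empty list, meaning
    "no reordering requested".
    """
    settings = ET.fromstring(settings_xml)
    value = settings.get("scriptReorder", "")
    # The attribute value is a space-separated list of script codes.
    return [code.lower() for code in value.split()]


codes = read_script_reorder('<settings scriptReorder="grek latn zzzz zyyy"/>')
# codes is now ['grek', 'latn', 'zzzz', 'zyyy']
```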
Note: after looking at the data, I'm thinking that we might want to change the above:
See http://site.icu-project.org/design/collation/script-reordering
To allow a key to have multiple types (for listing multiple script codes), change:
extension = key "-" type
to
extension = key ("-" type)+
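To make the effect of the revised production concrete, here is a small Python sketch of a parser that accepts one or more types per key. It assumes that a key is exactly two alphanumerics and a type is three to eight alphanumerics, as in the Unicode locale extension syntax; the function name parse_keywords is hypothetical, and attributes that may precede the first key are not handled.

```python
import re

_ALNUM = re.compile(r"[0-9a-z]+")


def parse_keywords(extension):
    """Parse the keywords that follow '-u-' into {key: [type, ...]}.

    Implements: extension = key ("-" type)+
    """
    result = {}
    key = None
    for subtag in extension.lower().split("-"):
        if not _ALNUM.fullmatch(subtag):
            raise ValueError(f"bad subtag: {subtag!r}")
        if len(subtag) == 2:
            # A new key starts; the previous key needs at least one type.
            if key is not None and not result[key]:
                raise ValueError(f"key {key!r} has no type")
            if subtag in result:
                raise ValueError(f"duplicate key {subtag!r}")
            key = subtag
            result[key] = []
        elif 3 <= len(subtag) <= 8 and key is not None:
            result[key].append(subtag)
        else:
            raise ValueError(f"unexpected subtag: {subtag!r}")
    if key is None or not result[key]:
        raise ValueError("a key must be followed by at least one type")
    return result


keywords = parse_keywords("kr-grek-latn-zzzz-zyyy-co-phonebk")
# keywords["kr"] is ['grek', 'latn', 'zzzz', 'zyyy']; keywords["co"] is ['phonebk']
```

Under the old production, the kr keyword above would be rejected after its first type, since the next subtag would have to be a key.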
We want to add the ability for collation to "import" rules from another collator. This provides two useful features:
<import source="de" type="phonebk">
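One way an implementation could expand such an element is sketched below in Python. Everything here is hypothetical: the in-memory rule store, the placeholder rule strings, the locale xx, and the function name resolve_rules. The sketch assumes that an import is replaced by the imported collator's rules at the position where it appears, and that circular imports are an error.

```python
def resolve_rules(store, source, type_, _seen=()):
    """Return the flat rule list for (source, type), expanding imports.

    `store` maps (source, type) to a list whose items are either rule
    strings or ("import", source, type) tuples.
    """
    ident = (source, type_)
    if ident in _seen:
        raise ValueError(f"circular import involving {ident}")
    flat = []
    for item in store[ident]:
        if isinstance(item, tuple) and item[0] == "import":
            _, src, typ = item
            # Splice the imported rules in at the point of the import.
            flat.extend(resolve_rules(store, src, typ, _seen + (ident,)))
        else:
            flat.append(item)
    return flat


# Placeholder rule strings; not actual CLDR data.
store = {
    ("de", "phonebk"): ["<rules of de phonebk>"],
    ("xx", "standard"): [("import", "de", "phonebk"), "<extra rules of xx>"],
}
rules = resolve_rules(store, "xx", "standard")
# rules is ['<rules of de phonebk>', '<extra rules of xx>']
```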
Add private as an additional attribute for <settings>:
<settings private="true"> // mirroring <transform>'s private attribute
This attribute indicates to clients that the collation is intended only for <import>, and should not be available as a stand-alone collator or listed in available collator APIs.
Update CLDR 26 (2014): A collation type is marked "private" via a type naming convention, rather than an attribute, so that it is easy for an implementation to omit such a type from a list of available types without reading its data. See CLDR ticket #3949 comment:18.
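The benefit of a naming convention is that a listing API can filter on the type name alone. A minimal Python sketch follows; the "private-" prefix used here is an assumption made for illustration and should be checked against the CLDR specification, and the function name public_types is hypothetical.

```python
# Assumed naming convention for import-only collation types.
PRIVATE_PREFIX = "private-"


def public_types(type_names):
    """Return the collation types that may be listed as stand-alone collators.

    Only the type names are inspected; no collation data is loaded.
    """
    return [name for name in type_names if not name.startswith(PRIVATE_PREFIX)]


available = public_types(["standard", "phonebk", "private-example"])
# available is ['standard', 'phonebk']
```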