Package net.sf.saxon.codenorm
Class UnicodeDataGenerator
java.lang.Object
net.sf.saxon.codenorm.UnicodeDataGenerator
This class reads the Unicode character database, extracts the information needed
to perform Unicode normalization, and writes that information out in the form of the
Java "source" module UnicodeData.java. The class is therefore executed (via its main()
method) when Saxon is built; it needs to be rerun only when the Unicode data tables
have changed.
The class is derived from the sample program NormalizerData.java published by the Unicode Consortium. That code has been modified so that the run-time data structures are no longer built directly; instead they are written to a Java "source" module, which is then compiled. The ability to construct a condensed version of the data tables has also been removed.
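The generate-then-compile approach described above can be illustrated with a minimal sketch. The class name SourceEmitterSketch, the method emit, and the array name COMBINING_CLASSES are all hypothetical; the actual layout of the generated UnicodeData.java is not shown on this page and may be quite different.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: emits Java source text holding a data table,
// instead of building the table in memory at run time.
final class SourceEmitterSketch {

    // Returns the text of a Java class whose int array holds
    // (code point, combining class) pairs in ascending code point order.
    static String emit(String className, Map<Integer, Integer> combiningClasses) {
        StringBuilder out = new StringBuilder();
        out.append("// Generated file: do not edit by hand.\n");
        out.append("public class ").append(className).append(" {\n");
        out.append("    // Flattened pairs: code point, canonical combining class.\n");
        out.append("    public static final int[] COMBINING_CLASSES = {\n");
        // TreeMap gives a stable, sorted output so regenerated files diff cleanly.
        for (Map.Entry<Integer, Integer> e : new TreeMap<>(combiningClasses).entrySet()) {
            out.append(String.format(Locale.ROOT, "        0x%04X, %d,\n",
                    e.getKey(), e.getValue()));
        }
        out.append("    };\n");
        out.append("}\n");
        return out.toString();
    }
}
```

Writing the returned text to standard output and redirecting it to a file mirrors the usage pattern documented for main; the generated file is then compiled with the rest of the build.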
Copyright (c) 1991-2005 Unicode, Inc.
For terms of use, see http://www.unicode.org/terms_of_use.html
For documentation, see UAX#15.
- Author:
- Mark Davis, Michael Kay: Saxon modifications.
Field Summary
Fields: copyright
Method Summary
Modifier and Type | Method | Description
(package private) static void | build() | Called exactly once by NormalizerData to build the static data
static String | fromHex | Utility: Parses a sequence of hex Unicode characters separated by spaces
static String | hex(char i) | Utility: Supplies a zero-padded hex representation of a Unicode character (without 0x, \\u)
static String | hex | Utility: Supplies a zero-padded hex representation of a Unicode character (without 0x, \\u)
static void | main | Main program.
Field Details
copyright
Method Details
build
static void build()
Called exactly once by NormalizerData to build the static data
fromHex
Utility: Parses a sequence of hex Unicode characters separated by spaces
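As an illustration of the parsing that fromHex describes, here is a minimal sketch. It is not the Saxon implementation: the class name FromHexSketch and the parameter name source are assumptions, as is the choice to accept values above FFFF by emitting surrogate pairs.

```java
// Hypothetical sketch: turns "0041 0300" into the two-character string it names.
final class FromHexSketch {

    static String fromHex(String source) {
        StringBuilder result = new StringBuilder();
        // Tokens are hex code points separated by one or more whitespace characters.
        for (String token : source.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input produces a single empty token
            }
            // appendCodePoint writes a surrogate pair for values above 0xFFFF.
            result.appendCodePoint(Integer.parseInt(token, 16));
        }
        return result.toString();
    }
}
```

An input that contains a non-hex token makes Integer.parseInt throw NumberFormatException, which is a reasonable failure mode for a build-time tool reading a trusted data file.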
hex
static String hex(char i)
Utility: Supplies a zero-padded hex representation of a Unicode character (without 0x, \\u)
hex
Utility: Supplies a zero-padded hex representation of a Unicode character (without 0x, \\u)
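The zero-padded formatting described for the two hex methods might look like the following sketch. The class name HexSketch is hypothetical, and because the parameters of the second overload are not shown on this page, the String version (one four-digit group per char, separated by spaces) is an assumption rather than the documented signature.

```java
import java.util.Locale;

// Hypothetical sketch: not taken from the Saxon source.
final class HexSketch {

    // Four upper-case hex digits for one UTF-16 code unit, with no prefix.
    static String hex(char c) {
        return String.format(Locale.ROOT, "%04X", (int) c);
    }

    // Assumed overload: formats every char of the string, separated by spaces.
    static String hex(String s) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                result.append(' ');
            }
            result.append(hex(s.charAt(i)));
        }
        return result.toString();
    }
}
```

With this shape, the String overload produces space-separated output of the kind that fromHex is documented to parse, which makes round-trip checks straightforward.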
main
Main program. Run this program to regenerate the Java module UnicodeData.java against revised data from the Unicode character database.
Usage: java UnicodeDataGenerator dir >UnicodeData.java
where dir is the directory containing the files UnicodeData.txt and CompositionExclusions.txt from the Unicode character database.
- Throws:
Exception
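For readers unfamiliar with the input, UnicodeData.txt holds one semicolon-separated record per character. The sketch below pulls out the fields most relevant to normalization: the code point (field 0), the canonical combining class (field 3), and the decomposition mapping (field 5). The class name UcdLineSketch and its field names are hypothetical, and which fields the generator actually reads is an assumption, since this page does not say.

```java
// Hypothetical sketch: parses one record of UnicodeData.txt.
final class UcdLineSketch {
    final int codePoint;
    final int combiningClass;
    final boolean compatibility; // true when the mapping carries a <tag>
    final String decomposition;  // empty string when the character has none

    private UcdLineSketch(int codePoint, int combiningClass,
                          boolean compatibility, String decomposition) {
        this.codePoint = codePoint;
        this.combiningClass = combiningClass;
        this.compatibility = compatibility;
        this.decomposition = decomposition;
    }

    static UcdLineSketch parse(String line) {
        // A limit of -1 keeps trailing empty fields.
        String[] fields = line.split(";", -1);
        int codePoint = Integer.parseInt(fields[0], 16);
        int combiningClass = Integer.parseInt(fields[3]);
        String mapping = fields[5].trim();
        // Compatibility mappings start with a tag such as <compat>.
        boolean compatibility = mapping.startsWith("<");
        if (compatibility) {
            mapping = mapping.substring(mapping.indexOf('>') + 1).trim();
        }
        StringBuilder decomposition = new StringBuilder();
        if (!mapping.isEmpty()) {
            for (String hex : mapping.split(" +")) {
                decomposition.appendCodePoint(Integer.parseInt(hex, 16));
            }
        }
        return new UcdLineSketch(codePoint, combiningClass,
                compatibility, decomposition.toString());
    }
}
```

Some records in the file mark the start and end of a large range rather than describing a single character; this sketch treats them like any other record and leaves range handling to the caller.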