Package net.sf.saxon.serialize.codenorm
Class Normalizer
java.lang.Object
net.sf.saxon.serialize.codenorm.Normalizer
Implements Unicode Normalization Forms C, D, KC, KD.
Copyright (c) 1991-2005 Unicode, Inc.
For terms of use, see http://www.unicode.org/terms_of_use.html
For documentation, see UAX#15.
The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information here.
- Author:
- Mark Davis
- Updates for supplementary code points: Vladimir Weinstein & Markus Scherer
- Modified to remove dependency on ICU code: Michael Kay
-
Field Summary
Modifier and Type | Field | Description
static final byte | C | Normalization Form Selector
(package private) static final byte | COMPATIBILITY_MASK | Masks for the form selector
(package private) static final byte | COMPOSITION_MASK | Masks for the form selector
static final byte | D | Normalization Form Selector
static final byte | KC | Normalization Form Selector
static final byte | KD | Normalization Form Selector
static final byte | NO_ACTION | Normalization Form Selector
-
Constructor Summary
Constructor | Description
Normalizer(byte form, Configuration config) | Create a normalizer for a given form.
Normalizer(CharSequence formCS, Configuration config) | Create a normalizer for a given form, expressed as a character string.
-
Method Summary
Modifier and Type | Method | Description
(package private) boolean | getExcluded(char ch) | Just accessible for testing.
(package private) String | getRawDecompositionMapping(char ch) | Just accessible for testing.
 | normalize(CharSequence source) | Normalizes text according to the chosen form.
-
Field Details
-
COMPATIBILITY_MASK
static final byte COMPATIBILITY_MASK
Masks for the form selector
-
COMPOSITION_MASK
static final byte COMPOSITION_MASK
Masks for the form selector
-
D
public static final byte D
Normalization Form Selector
-
C
public static final byte C
Normalization Form Selector
-
KD
public static final byte KD
Normalization Form Selector
-
KC
public static final byte KC
Normalization Form Selector
-
NO_ACTION
public static final byte NO_ACTION
Normalization Form Selector
-
-
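The two mask fields suggest that a form selector is a small bit set, with one bit for compatibility decomposition and one for composition. The sketch below shows how such an encoding could work. The numeric values, the class name FormBits, and the helper methods are assumptions made for illustration; this page does not list the actual constant values.

```java
public class FormBits {
    // Hypothetical values: the real constants are not shown on this page.
    static final byte COMPATIBILITY_MASK = 1;
    static final byte COMPOSITION_MASK = 2;

    // Each form selector is a combination of the two mask bits.
    static final byte D = 0;
    static final byte C = COMPOSITION_MASK;
    static final byte KD = COMPATIBILITY_MASK;
    static final byte KC = (byte) (COMPATIBILITY_MASK | COMPOSITION_MASK);

    // True if the form uses compatibility decomposition (the "K" forms).
    static boolean usesCompatibility(byte form) {
        return (form & COMPATIBILITY_MASK) != 0;
    }

    // True if the form recomposes after decomposing (the "C" forms).
    static boolean composes(byte form) {
        return (form & COMPOSITION_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println("KC composes: " + composes(KC));
        System.out.println("KD composes: " + composes(KD));
        System.out.println("KD uses compatibility: " + usesCompatibility(KD));
    }
}
```

With an encoding like this, a single normalization routine can branch on the two bits instead of carrying a separate code path for each of the four forms.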
Constructor Details
-
Normalizer
Normalizer(byte form, Configuration config)
Create a normalizer for a given form.
Parameters:
form - the normalization form required: for example C, D
Throws:
XPathException
-
Normalizer
Normalizer(CharSequence formCS, Configuration config)
Create a normalizer for a given form, expressed as a character string.
Parameters:
formCS - the normalization form required: for example "NFC" or "NFD"
Throws:
XPathException
-
-
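The second constructor accepts the form as a name such as "NFC" or "NFD". The sketch below shows one way to resolve such a name. It uses the JDK's java.text.Normalizer.Form enum as a stand-in, not the Saxon class documented here (the two classes share the simple name Normalizer). The class FormNames, the trimming and upper-casing of the input, and the use of IllegalArgumentException are all assumptions of this sketch; the documented constructor reports failure with XPathException.

```java
import java.text.Normalizer;
import java.util.Locale;

public class FormNames {
    // Resolve a form name such as "NFC" to the JDK's Normalizer.Form.
    // Trimming and upper-casing the name is an assumption of this sketch.
    static Normalizer.Form parse(CharSequence formCS) {
        String name = formCS.toString().trim().toUpperCase(Locale.ROOT);
        switch (name) {
            case "NFC":
                return Normalizer.Form.NFC;
            case "NFD":
                return Normalizer.Form.NFD;
            case "NFKC":
                return Normalizer.Form.NFKC;
            case "NFKD":
                return Normalizer.Form.NFKD;
            default:
                // Stand-in for the XPathException thrown by the documented constructor.
                throw new IllegalArgumentException("Unknown normalization form: " + formCS);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("NFC"));
        System.out.println(parse(" nfkd "));
    }
}
```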
Method Details
-
normalize
Normalizes text according to the chosen form.
Parameters:
source - the original text, unnormalized
Returns:
the resulting normalized text
-
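To make the effect of the four forms concrete, the sketch below normalizes two short strings. It calls the JDK's java.text.Normalizer rather than the Saxon class documented here, so it shows the behaviour defined by UAX#15 and not this class's exact API. The class name NormalizeDemo and the helper toForm are hypothetical.

```java
import java.text.Normalizer;

public class NormalizeDemo {
    // Normalize the source text to the requested form using the JDK implementation.
    static String toForm(CharSequence source, Normalizer.Form form) {
        return Normalizer.normalize(source, form);
    }

    public static void main(String[] args) {
        String decomposed = "e\u0301"; // 'e' followed by U+0301 COMBINING ACUTE ACCENT
        String composed = "\u00e9";    // U+00E9, the precomposed letter
        String ligature = "\ufb01";    // U+FB01 LATIN SMALL LIGATURE FI

        // Form C composes the pair into the single precomposed character.
        System.out.println(toForm(decomposed, Normalizer.Form.NFC).equals(composed));
        // Form D splits the precomposed character back into base plus accent.
        System.out.println(toForm(composed, Normalizer.Form.NFD).equals(decomposed));
        // Form C leaves the ligature alone; form KC replaces it with "fi".
        System.out.println(toForm(ligature, Normalizer.Form.NFC).equals(ligature));
        System.out.println(toForm(ligature, Normalizer.Form.NFKC).equals("fi"));
    }
}
```

The canonical forms (C and D) only change how equivalent sequences are represented, while the compatibility forms (KC and KD) also fold characters such as ligatures into their plain equivalents.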
getExcluded
boolean getExcluded(char ch)
Just accessible for testing.
Parameters:
ch - a character
Returns:
true if the character is an excluded character
-
getRawDecompositionMapping
String getRawDecompositionMapping(char ch)
Just accessible for testing.
Parameters:
ch - a character
Returns:
the raw decomposition mapping of the character
-