Package net.sf.saxon.serialize.codenorm
Class Normalizer
- java.lang.Object
  - net.sf.saxon.serialize.codenorm.Normalizer
public class Normalizer extends java.lang.Object
Implements Unicode Normalization Forms C, D, KC, and KD. Copyright (c) 1991-2005 Unicode, Inc. For terms of use, see http://www.unicode.org/terms_of_use.html. For documentation, see UAX #15.
The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information here.
Author:
- Mark Davis
- Updates for supplementary code points: Vladimir Weinstein & Markus Scherer
- Modified to remove dependency on ICU code: Michael Kay
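To make the four forms concrete, the sketch below applies each of them to the same input. It uses the JDK's java.text.Normalizer as a stand-in, on the assumption that it follows the same UAX #15 definitions. It does not call the Saxon class documented here, and the class name NormalizationFormsDemo and the helper codeUnits are illustrative only.

```java
import java.text.Normalizer;

public class NormalizationFormsDemo {

    /** Renders each UTF-16 code unit as U+XXXX so invisible differences show up. */
    static String codeUnits(CharSequence s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Precomposed e-acute followed by the "fi" ligature.
        String source = "\u00E9\uFB01";
        for (Normalizer.Form form : Normalizer.Form.values()) {
            System.out.println(form + ": " + codeUnits(Normalizer.normalize(source, form)));
        }
        // NFD:  U+0065 U+0301 U+FB01         (canonical decomposition only)
        // NFC:  U+00E9 U+FB01                (canonical composition)
        // NFKD: U+0065 U+0301 U+0066 U+0069  (ligature also decomposed)
        // NFKC: U+00E9 U+0066 U+0069         (ligature decomposed, accent recomposed)
    }
}
```

The canonical forms (C, D) leave the ligature untouched, while the compatibility forms (KC, KD) replace it with the plain letters "f" and "i".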
Field Summary
Fields (Modifier and Type, Field: Description)
- static byte C: Normalization Form Selector
- (package private) static byte COMPATIBILITY_MASK: Masks for the form selector
- (package private) static byte COMPOSITION_MASK: Masks for the form selector
- static byte D: Normalization Form Selector
- static byte KC: Normalization Form Selector
- static byte KD: Normalization Form Selector
- static byte NO_ACTION: Normalization Form Selector
Constructor Summary
Constructors (Constructor: Description)
- Normalizer(byte form, Configuration config): Create a normalizer for a given form.
- Normalizer(java.lang.CharSequence formCS, Configuration config): Create a normalizer for a given form, expressed as a character string.
Method Summary
Methods (Modifier and Type, Method: Description)
- (package private) boolean getExcluded(char ch): Accessible only for testing.
- (package private) java.lang.String getRawDecompositionMapping(char ch): Accessible only for testing.
- java.lang.CharSequence normalize(java.lang.CharSequence source): Normalizes text according to the chosen form.
Field Detail
COMPATIBILITY_MASK
static final byte COMPATIBILITY_MASK
Masks for the form selector
See Also:
- Constant Field Values

COMPOSITION_MASK
static final byte COMPOSITION_MASK
Masks for the form selector
See Also:
- Constant Field Values

D
public static final byte D
Normalization Form Selector
See Also:
- Constant Field Values

C
public static final byte C
Normalization Form Selector
See Also:
- Constant Field Values

KD
public static final byte KD
Normalization Form Selector
See Also:
- Constant Field Values

KC
public static final byte KC
Normalization Form Selector
See Also:
- Constant Field Values

NO_ACTION
public static final byte NO_ACTION
Normalization Form Selector
See Also:
- Constant Field Values
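The mask names suggest that a form selector combines two independent flags: whether compatibility mappings apply, and whether composition follows decomposition. The sketch below shows how such a bit-flag selector could be decoded. The numeric values are assumptions chosen for illustration, not the values from the Constant Field Values page. The class FormSelectorSketch and the method toJdkForm are hypothetical, and the JDK's java.text.Normalizer.Form is used only as a convenient target type.

```java
import java.text.Normalizer;

public class FormSelectorSketch {

    // Assumed values, for illustration only. The real constants are listed
    // under "Constant Field Values" and are not reproduced here.
    static final byte COMPATIBILITY_MASK = 1;
    static final byte COMPOSITION_MASK = 2;

    // Each selector is a combination of the two flags above.
    static final byte D = 0;
    static final byte C = COMPOSITION_MASK;
    static final byte KD = COMPATIBILITY_MASK;
    static final byte KC = (byte) (COMPATIBILITY_MASK | COMPOSITION_MASK);

    /** Decodes a selector into the corresponding JDK normalization form. */
    static Normalizer.Form toJdkForm(byte selector) {
        boolean compatibility = (selector & COMPATIBILITY_MASK) != 0;
        boolean composition = (selector & COMPOSITION_MASK) != 0;
        if (compatibility) {
            return composition ? Normalizer.Form.NFKC : Normalizer.Form.NFKD;
        }
        return composition ? Normalizer.Form.NFC : Normalizer.Form.NFD;
    }

    public static void main(String[] args) {
        System.out.println(toJdkForm(D));
        System.out.println(toJdkForm(C));
        System.out.println(toJdkForm(KD));
        System.out.println(toJdkForm(KC));
    }
}
```

With this encoding, testing a single bit answers "is this a compatibility form?" or "does this form compose?" without comparing against each selector in turn. NO_ACTION is left out of the sketch because this page does not say how it relates to the two masks.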
Constructor Detail
Normalizer
public Normalizer(byte form, Configuration config) throws XPathException
Create a normalizer for a given form.
Parameters:
- form - the normalization form required, for example C or D
Throws:
- XPathException

Normalizer
public Normalizer(java.lang.CharSequence formCS, Configuration config) throws XPathException
Create a normalizer for a given form, expressed as a character string.
Parameters:
- formCS - the normalization form required, for example "NFC" or "NFD"
Throws:
- XPathException
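The second constructor selects the form by name. As a rough illustration of name-based selection, the sketch below maps a form name onto the JDK's java.text.Normalizer.Form. The helper parseForm is hypothetical, and trimming and upper-casing the name is a choice made for this sketch. This page does not specify which strings the Saxon constructor accepts, only that it reports failure through XPathException, so the IllegalArgumentException here is a stand-in.

```java
import java.text.Normalizer;
import java.util.Locale;

public class FormNameSketch {

    /**
     * Hypothetical helper: maps a name such as "NFC" or "NFD" to a JDK form.
     * Rejects unrecognized names with IllegalArgumentException.
     */
    static Normalizer.Form parseForm(CharSequence formName) {
        String name = formName.toString().trim().toUpperCase(Locale.ROOT);
        try {
            return Normalizer.Form.valueOf(name);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown normalization form: " + formName, e);
        }
    }

    public static void main(String[] args) {
        Normalizer.Form form = parseForm("NFD");
        // "e" followed by a combining acute accent: two code units.
        String decomposed = Normalizer.normalize("\u00E9", form);
        System.out.println(form + " length: " + decomposed.length());
    }
}
```

Resolving the name once, at construction time, means an invalid form is reported before any text is processed.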
Method Detail
normalize
public java.lang.CharSequence normalize(java.lang.CharSequence source)
Normalizes text according to the chosen form.
Parameters:
- source - the original text, unnormalized
Returns:
- the resulting normalized text
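A common reason to normalize is to compare strings that look identical but are encoded differently. The sketch below shows that effect, again using the JDK's java.text.Normalizer as a stand-in on the assumption that it follows the same UAX #15 rules. The class NormalizeEquivalenceDemo and the method canonicallyEqual are illustrative names, not part of this API.

```java
import java.text.Normalizer;

public class NormalizeEquivalenceDemo {

    /** Compares two character sequences after bringing both to Form C. */
    static boolean canonicallyEqual(CharSequence a, CharSequence b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        // Three encodings of the same visible character:
        CharSequence decomposed = new StringBuilder("A\u030A"); // A + combining ring above
        CharSequence precomposed = "\u00C5";                    // A with ring above
        CharSequence angstrom = "\u212B";                       // angstrom sign

        System.out.println(canonicallyEqual(decomposed, precomposed)); // true
        System.out.println(canonicallyEqual(angstrom, precomposed));   // true
        System.out.println(precomposed.toString().contentEquals(decomposed)); // false without normalization
    }
}
```

Because the input is a CharSequence, a StringBuilder or other buffer can be passed directly without first copying it into a String.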
getExcluded
boolean getExcluded(char ch)
Accessible only for testing.
Parameters:
- ch - a character
Returns:
- true if the character is an excluded character

getRawDecompositionMapping
java.lang.String getRawDecompositionMapping(char ch)
Accessible only for testing.
Parameters:
- ch - a character
Returns:
- the raw decomposition mapping of the character