Class Normalizer


  • public class Normalizer
    extends java.lang.Object
    Implements Unicode Normalization Forms C, D, KC, and KD. Copyright (c) 1991-2005 Unicode, Inc. For terms of use, see http://www.unicode.org/terms_of_use.html. For documentation, see UAX #15.
    The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information here.
    Author:
    Mark Davis. Updates for supplementary code points: Vladimir Weinstein and Markus Scherer. Modified to remove dependency on ICU code: Michael Kay.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static byte C
      Normalization Form Selector
      (package private) static byte COMPATIBILITY_MASK
      Mask for the form selector
      (package private) static byte COMPOSITION_MASK
      Mask for the form selector
      static byte D
      Normalization Form Selector
      static byte KC
      Normalization Form Selector
      static byte KD
      Normalization Form Selector
      static byte NO_ACTION
      Normalization Form Selector
    • Constructor Summary

      Constructors 
      Constructor Description
      Normalizer​(byte form, Configuration config)
      Create a normalizer for a given form.
      Normalizer​(java.lang.CharSequence formCS, Configuration config)
      Create a normalizer for a given form, expressed as a character string.
    • Method Summary

      Modifier and Type Method Description
      (package private) boolean getExcluded​(char ch)
      Package-private; accessible only for testing.
      (package private) java.lang.String getRawDecompositionMapping​(char ch)
      Package-private; accessible only for testing.
      java.lang.CharSequence normalize​(java.lang.CharSequence source)
      Normalizes text according to the chosen form.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Normalizer

        public Normalizer​(byte form,
                          Configuration config)
                   throws XPathException
        Create a normalizer for a given form.
        Parameters:
        form - the normalization form required, for example C or D
        Throws:
        XPathException
      • Normalizer

        public Normalizer​(java.lang.CharSequence formCS,
                          Configuration config)
                   throws XPathException
        Create a normalizer for a given form, expressed as a character string.
        Parameters:
        formCS - the normalization form required: for example "NFC" or "NFD"
        Throws:
        XPathException
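
The two constructors accept the form either as a byte selector or as a name such as "NFC". As a rough illustration of how such selectors could be built from the two package-private masks, here is a minimal sketch. The bit values, the class name FormSelectorSketch, and the helper methods are all hypothetical and are not taken from this page; the sketch also throws IllegalArgumentException where the real constructors declare XPathException.

```java
public class FormSelectorSketch {
    // Hypothetical bit values: this page does not document the real constants.
    static final byte COMPATIBILITY_MASK = 1;
    static final byte COMPOSITION_MASK = 2;

    // Each selector is a combination of the two mask bits (assumed layout).
    static final byte D = 0;
    static final byte C = COMPOSITION_MASK;
    static final byte KD = COMPATIBILITY_MASK;
    static final byte KC = (byte) (COMPATIBILITY_MASK | COMPOSITION_MASK);

    /** Maps a form name such as "NFC" to a selector and rejects anything else. */
    static byte selectorFor(CharSequence formName) {
        switch (formName.toString()) {
            case "NFC":  return C;
            case "NFD":  return D;
            case "NFKC": return KC;
            case "NFKD": return KD;
            default:
                throw new IllegalArgumentException("Unknown normalization form: " + formName);
        }
    }

    /** True if the selector asks for composition after decomposition. */
    static boolean composes(byte form) {
        return (form & COMPOSITION_MASK) != 0;
    }

    /** True if the selector asks for compatibility (K) mappings. */
    static boolean usesCompatibility(byte form) {
        return (form & COMPATIBILITY_MASK) != 0;
    }

    public static void main(String[] args) {
        byte form = selectorFor("NFKC");
        System.out.println("composes=" + composes(form)
                + " compatibility=" + usesCompatibility(form));
    }
}
```

With this layout a caller can test a single bit to decide whether to compose or whether to apply compatibility mappings, instead of comparing against each form in turn.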
    • Method Detail

      • normalize

        public java.lang.CharSequence normalize​(java.lang.CharSequence source)
        Normalizes text according to the chosen form.
        Parameters:
        source - the original text, unnormalized
        Returns:
        the resulting normalized text
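
To make the effect of the four forms concrete, the sketch below normalizes one string under each of them. It deliberately uses the JDK's own java.text.Normalizer as a stand-in, not the class documented here, so that it runs without any other library; the demo class name and the hex helper are illustrative only. Both implement the forms defined by UAX #15, but identical output across Unicode versions is an assumption, not something this page states.

```java
import java.text.Normalizer;

public class NormalizationFormsDemo {

    /** Renders a character sequence as space-separated code points, e.g. "U+0065 U+0301". */
    static String hex(CharSequence s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+FB01 (the "fi" ligature), then "e" followed by U+0301 (combining acute accent).
        String source = "\uFB01e\u0301";

        // The canonical forms (NFD, NFC) keep the ligature; the compatibility
        // forms (NFKD, NFKC) replace it with the plain letters "f" and "i".
        // The composing forms (NFC, NFKC) merge "e" + U+0301 into U+00E9.
        for (Normalizer.Form form : Normalizer.Form.values()) {
            System.out.println(form + ": " + hex(Normalizer.normalize(source, form)));
        }
    }
}
```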
      • getExcluded

        boolean getExcluded​(char ch)
        Package-private; accessible only for testing.
        Parameters:
        ch - a character
        Returns:
        true if the character is an excluded character
      • getRawDecompositionMapping

        java.lang.String getRawDecompositionMapping​(char ch)
        Package-private; accessible only for testing.
        Parameters:
        ch - a character
        Returns:
        the raw decomposition mapping of the character