Class Normalizer

java.lang.Object
net.sf.saxon.serialize.codenorm.Normalizer

public class Normalizer extends Object
Implements Unicode Normalization Forms C, D, KC, KD.
Copyright (c) 1991-2005 Unicode, Inc.
For terms of use, see http://www.unicode.org/terms_of_use.html
For documentation, see UAX #15.
The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information here.
Author:
Mark Davis
Updates for supplementary code points: Vladimir Weinstein & Markus Scherer
Modified to remove dependency on ICU code: Michael Kay
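To make the difference between the four forms concrete, here is a minimal sketch. It does not use this class: it relies on the JDK's own `java.text.Normalizer` (a different class that happens to share the simple name) as a stand-in, on the assumption that both follow UAX #15 for the same input. The class name `NormalizationFormsDemo` and the helper `codeUnits` are hypothetical.

```java
import java.text.Normalizer;
import java.text.Normalizer.Form;

public class NormalizationFormsDemo {

    // Hypothetical helper: renders each UTF-16 code unit as U+XXXX
    // so that visually identical strings can be told apart.
    static String codeUnits(CharSequence s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+FB01 LATIN SMALL LIGATURE FI, then "e" followed by
        // U+0301 COMBINING ACUTE ACCENT.
        String source = "\uFB01e\u0301";

        // JDK stand-in for the forms D, C, KD and KC described above.
        for (Form form : Form.values()) {
            String result = Normalizer.normalize(source, form);
            System.out.println(form + ": " + codeUnits(result));
        }
    }
}
```

The canonical forms (C, D) leave the ligature alone and only compose or decompose the accented letter. The compatibility forms (KC, KD) additionally replace the ligature with the plain letters "f" and "i".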
  • Field Details

  • Constructor Details

    • Normalizer

      public Normalizer(byte form, Configuration config) throws XPathException
      Create a normalizer for a given form.
      Parameters:
      form - the normalization form required: for example C, D
      Throws:
      XPathException
    • Normalizer

      public Normalizer(CharSequence formCS, Configuration config) throws XPathException
      Create a normalizer for a given form, expressed as a character string.
      Parameters:
      formCS - the normalization form required: for example "NFC" or "NFD"
      Throws:
      XPathException
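As a rough illustration of selecting a form by name, the sketch below resolves a string such as "NFC" to a normalization form and rejects unknown names. It is an analogy only: it uses the JDK's `java.text.Normalizer.Form` enum rather than this class, the helper `parseForm` is hypothetical, and the set of names this constructor actually accepts is not stated here.

```java
import java.text.Normalizer;

public class FormNameDemo {

    // Hypothetical helper: maps a form name to the JDK enum constant.
    // Unknown names are rejected, loosely mirroring a constructor that
    // throws when the requested form cannot be provided.
    static Normalizer.Form parseForm(CharSequence name) {
        try {
            return Normalizer.Form.valueOf(name.toString());
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                    "Unsupported normalization form: " + name, e);
        }
    }

    public static void main(String[] args) {
        Normalizer.Form form = parseForm("NFC");
        // "e" plus a combining acute accent composes to one code unit.
        System.out.println(Normalizer.normalize("e\u0301", form).length());
    }
}
```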
  • Method Details

    • normalize

      public CharSequence normalize(CharSequence source)
      Normalizes text according to the chosen form.
      Parameters:
      source - the original text, unnormalized
      Returns:
      the resulting normalized text
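A common reason to normalize is to compare strings that look identical but are encoded differently. The sketch below shows that pattern with the JDK's `java.text.Normalizer` as a stand-in for this class; the helper name `canonicallyEqual` is hypothetical, and the choice of form C for the comparison is an assumption (form D would work equally well).

```java
import java.text.Normalizer;

public class NormalizeUsageDemo {

    // Hypothetical helper: true when both texts are equal after
    // normalization to form C, i.e. they are canonically equivalent.
    static boolean canonicallyEqual(CharSequence a, CharSequence b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        String precomposed = "caf\u00E9";  // ends in U+00E9
        String decomposed = "cafe\u0301";  // ends in U+0065 U+0301

        // Plain comparison sees two different code unit sequences.
        System.out.println(precomposed.equals(decomposed));            // false
        // After normalization the two spellings compare equal.
        System.out.println(canonicallyEqual(precomposed, decomposed)); // true
    }
}
```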
    • getExcluded

      boolean getExcluded(char ch)
      Package-private; accessible only for testing.
      Parameters:
      ch - a character
      Returns:
      true if the character is an excluded character
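In UAX #15, a composition exclusion is a character that has a canonical decomposition but is never produced again when text is composed. The sketch below approximates that property from the outside using the JDK's `java.text.Normalizer`; it does not call this package-private method, the helper `looksExcluded` is hypothetical, and the test it applies (decomposes under form D, is not restored under form C) is an assumption that may not match this class's exclusion table exactly.

```java
import java.text.Normalizer;
import java.text.Normalizer.Form;

public class CompositionExclusionDemo {

    // Hypothetical approximation: a character "looks excluded" when it
    // has a canonical decomposition but form C does not restore it.
    static boolean looksExcluded(char ch) {
        String s = String.valueOf(ch);
        boolean decomposes = !Normalizer.normalize(s, Form.NFD).equals(s);
        boolean recomposes = Normalizer.normalize(s, Form.NFC).equals(s);
        return decomposes && !recomposes;
    }

    public static void main(String[] args) {
        // U+0958 DEVANAGARI LETTER QA is listed as a composition
        // exclusion: form C keeps it as U+0915 U+093C.
        System.out.println(looksExcluded('\u0958')); // true
        // U+00E9 decomposes to "e" + U+0301 and composes back again.
        System.out.println(looksExcluded('\u00E9')); // false
    }
}
```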
    • getRawDecompositionMapping

      String getRawDecompositionMapping(char ch)
      Package-private; accessible only for testing.
      Parameters:
      ch - a character
      Returns:
      the raw decomposition mapping of the character