Class NormalizationChecker

    • Field Summary

      Fields 
      Modifier and Type Field Description
      private boolean alreadyComplainedAboutThisRun
      Indicates whether the current run has already caused an error.
      private boolean atStartOfRun
      Indicates whether the next call to characters() is the first call in a run.
      private char[] buf
      A buffer for holding sequences that overlap the SAX buffer boundary.
      private char[] bufHolder
      A holder for the original buffer (for the memory leak prevention mechanism).
      private static com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS
      A thread-safe set of composing characters as per Charmod Norm.
      private org.xml.sax.ErrorHandler errorHandler  
      private org.xml.sax.Locator locator  
      private int pos
      The current used length of the buffer, i.e. the index of the first slot that does not hold current data.
    • Constructor Summary

      Constructors 
      Constructor Description
      NormalizationChecker​(org.xml.sax.Locator locator)
      Constructs a checker that uses the given locator when emitting errors.
    • Method Summary

      Modifier and Type Method Description
      private void appendToBuf​(char[] ch, int start, int end)
      Appends a slice of a UTF-16 code unit array to the internal buffer.
      void characters​(char[] ch, int start, int length)
      Receive notification of a run of UTF-16 code units.
      void end()
      Signals the end of the stream.
      void err​(java.lang.String message)
      Emit an error.
      private void errAboutTextRun()
      Emits an error stating that the current text run or the source text is not in NFC.
      private static boolean isComposingChar​(int c)
      Returns true if the argument is a composing character and false otherwise.
      private static boolean isComposingCharOrSurrogate​(char c)
      Returns true if the argument is a composing BMP character or a surrogate and false otherwise.
      void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)  
      void start()
      Signals the start of the stream.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • errorHandler

        private org.xml.sax.ErrorHandler errorHandler
      • locator

        private org.xml.sax.Locator locator
      • COMPOSING_CHARACTERS

        private static final com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS
        A thread-safe set of composing characters as per Charmod Norm.
      • buf

        private char[] buf
        A buffer for holding sequences that overlap the SAX buffer boundary.
      • bufHolder

        private char[] bufHolder
        A holder for the original buffer (for the memory leak prevention mechanism).
      • pos

        private int pos
        The current used length of the buffer, i.e. the index of the first slot that does not hold current data.
      • atStartOfRun

        private boolean atStartOfRun
        Indicates whether the next call to characters() is the first call in a run.
      • alreadyComplainedAboutThisRun

        private boolean alreadyComplainedAboutThisRun
        Indicates whether the current run has already caused an error.
    • Constructor Detail

      • NormalizationChecker

        public NormalizationChecker​(org.xml.sax.Locator locator)
        Constructs a checker that uses the given locator when emitting errors.
        Parameters:
        locator - the SAX Locator used to report error positions
    • Method Detail

      • err

        public void err​(java.lang.String message)
                 throws org.xml.sax.SAXException
        Emits an error. The error position is taken from the locator.
        Parameters:
        message - the error message
        Throws:
        org.xml.sax.SAXException - if something goes wrong
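
        The following is a minimal sketch of how a message and a locator can be routed to a SAX ErrorHandler. It is not the source of this class: the class name ErrSketch, the null check on the handler, and the demo method are assumptions made for illustration. Only standard org.xml.sax types are used.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.LocatorImpl;

// Hypothetical stand-in; not the actual NormalizationChecker source.
class ErrSketch {
    private final Locator locator;
    private ErrorHandler errorHandler; // stays null until setErrorHandler is called

    ErrSketch(Locator locator) {
        this.locator = locator;
    }

    void setErrorHandler(ErrorHandler errorHandler) {
        this.errorHandler = errorHandler;
    }

    // Wraps the message and the locator's current position in a
    // SAXParseException and passes it to the handler, if one is set.
    void err(String message) throws SAXException {
        if (errorHandler != null) {
            errorHandler.error(new SAXParseException(message, locator));
        }
    }

    // Demo: returns the line number that the handler saw, or -1 on failure.
    static int demo() {
        LocatorImpl loc = new LocatorImpl();
        loc.setLineNumber(7);
        loc.setColumnNumber(3);
        ErrSketch sketch = new ErrSketch(loc);
        final int[] seenLine = { -1 };
        sketch.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { seenLine[0] = e.getLineNumber(); }
            public void fatalError(SAXParseException e) { }
        });
        try {
            sketch.err("Text run is not in NFC.");
        } catch (SAXException e) {
            return -1;
        }
        return seenLine[0];
    }
}
```

        In this sketch the position is captured when the exception is constructed, so the handler sees the locator state at the moment err is called.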
      • isComposingCharOrSurrogate

        private static boolean isComposingCharOrSurrogate​(char c)
        Returns true if the argument is a composing BMP character or a surrogate and false otherwise.
        Parameters:
        c - a UTF-16 code unit
        Returns:
        true if the argument is a composing BMP character or a surrogate and false otherwise
      • isComposingChar

        private static boolean isComposingChar​(int c)
        Returns true if the argument is a composing character and false otherwise.
        Parameters:
        c - a Unicode code point
        Returns:
        true if the argument is a composing character and false otherwise
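
        The two predicates above can be approximated with the standard library alone. This sketch is an assumption-laden stand-in: the documented class consults an ICU UnicodeSet built per Charmod Norm, whereas the code below treats any Unicode mark (general categories Mn, Mc, Me) as composing, which is not the same set. The class name ComposingSketch is hypothetical.

```java
// Hypothetical approximation; the documented class uses an ICU UnicodeSet instead.
class ComposingSketch {
    // Approximates "composing character" as any Unicode mark (Mn, Mc, Me).
    static boolean isComposingChar(int c) {
        int type = Character.getType(c);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    // A single UTF-16 code unit cannot be classified on its own if it is half
    // of a surrogate pair, so surrogates are reported conservatively as true.
    static boolean isComposingCharOrSurrogate(char c) {
        return Character.isSurrogate(c) || isComposingChar(c);
    }
}
```

        Treating surrogates as potentially composing lets a caller make a quick per-code-unit decision and defer the exact code point test until the pair has been assembled.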
      • characters

        public void characters​(char[] ch,
                               int start,
                               int length)
                        throws org.xml.sax.SAXException
        Description copied from interface: CharacterHandler
        Receive notification of a run of UTF-16 code units.
        Specified by:
        characters in interface CharacterHandler
        Parameters:
        ch - the buffer
        start - start index in the buffer
        length - the number of characters to process starting from start
        Throws:
        org.xml.sax.SAXException - if things go wrong
        See Also:
        CharacterHandler.characters(char[], int, int)
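
        To illustrate why sequences that span two calls to characters() need buffering, here is a simplified sketch using java.text.Normalizer. It accumulates the entire run in a StringBuilder, which is an assumption made for brevity; the documented class appears to buffer only the sequences that overlap the boundary. RunSketch and its method names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical, simplified model of checking a text run delivered in chunks.
class RunSketch {
    private final StringBuilder run = new StringBuilder();

    // Resets the state at the start of a run.
    void start() {
        run.setLength(0);
    }

    // Appends one chunk of UTF-16 code units to the accumulated run.
    void characters(char[] ch, int start, int length) {
        run.append(ch, start, length);
    }

    // Reports whether everything received so far is in NFC.
    boolean isNfc() {
        return Normalizer.isNormalized(run, Normalizer.Form.NFC);
    }

    // Checks a single chunk in isolation, ignoring its neighbours.
    static boolean chunkAloneIsNfc(String chunk) {
        return Normalizer.isNormalized(chunk, Normalizer.Form.NFC);
    }

    // Demo: "e" and U+0301 arrive in separate calls.
    static boolean demo() {
        RunSketch s = new RunSketch();
        s.start();
        s.characters("e".toCharArray(), 0, 1);
        s.characters("\u0301".toCharArray(), 0, 1);
        return s.isNfc();
    }
}
```

        Each chunk passes when checked alone, yet the combined run does not, because "e" followed by U+0301 composes to U+00E9 in NFC. A checker that looked at each chunk independently would miss this case.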
      • errAboutTextRun

        private void errAboutTextRun()
                              throws org.xml.sax.SAXException
        Emits an error stating that the current text run or the source text is not in NFC.
        Throws:
        org.xml.sax.SAXException - if the ErrorHandler throws
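
        One plausible reading of the alreadyComplainedAboutThisRun field is that it limits reporting to one error per text run. The sketch below shows that guard pattern only; the class name, the startOfRun method, the message text, and the list used to collect errors are all assumptions, not the behaviour of this class as documented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a once-per-run error guard.
class OncePerRunSketch {
    private boolean alreadyComplainedAboutThisRun = false;
    final List<String> errors = new ArrayList<>();

    // Clears the guard when a new text run begins.
    void startOfRun() {
        alreadyComplainedAboutThisRun = false;
    }

    // Records an error only if this run has not produced one yet.
    void errAboutTextRun() {
        if (!alreadyComplainedAboutThisRun) {
            errors.add("Text run is not in NFC."); // placeholder message
            alreadyComplainedAboutThisRun = true;
        }
    }

    // Demo: two complaints in one run, then one in the next run.
    static int demo() {
        OncePerRunSketch s = new OncePerRunSketch();
        s.startOfRun();
        s.errAboutTextRun();
        s.errAboutTextRun();
        s.startOfRun();
        s.errAboutTextRun();
        return s.errors.size();
    }
}
```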
      • appendToBuf

        private void appendToBuf​(char[] ch,
                                 int start,
                                 int end)
        Appends a slice of a UTF-16 code unit array to the internal buffer.
        Parameters:
        ch - the array from which to copy
        start - the index of the first element that is copied
        end - the index of the first element that is not copied
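
        The half-open slice convention described above (start inclusive, end exclusive) can be sketched with a growable char array and a pos index. The initial capacity, the doubling strategy, and the contents() helper are assumptions for illustration; the growth policy of the documented class is not stated on this page.

```java
import java.util.Arrays;

// Hypothetical sketch of appending a [start, end) slice to a growable buffer.
class BufSketch {
    private char[] buf = new char[4]; // small initial capacity to force growth
    private int pos = 0;              // index of the first unused slot

    void appendToBuf(char[] ch, int start, int end) {
        int len = end - start;
        if (pos + len > buf.length) {
            // Grow to at least the required size, doubling when that is larger.
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, pos + len));
        }
        System.arraycopy(ch, start, buf, pos, len);
        pos += len;
    }

    // Returns the used portion of the buffer as a string.
    String contents() {
        return new String(buf, 0, pos);
    }

    // Demo: append "hello" and then " world" from the same source array.
    static String demo() {
        char[] src = "hello world".toCharArray();
        BufSketch b = new BufSketch();
        b.appendToBuf(src, 0, 5);
        b.appendToBuf(src, 5, 11);
        return b.contents();
    }
}
```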
      • end

        public void end()
                 throws org.xml.sax.SAXException
        Description copied from interface: CharacterHandler
        Signals the end of the stream. Can be used for cleanup. Doesn't mean that the stream ended successfully.
        Specified by:
        end in interface CharacterHandler
        Throws:
        org.xml.sax.SAXException - if things go wrong
        See Also:
        CharacterHandler.end()
      • setErrorHandler

        public void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)