Package nu.validator.htmlparser.extra
Class NormalizationChecker
java.lang.Object
    nu.validator.htmlparser.extra.NormalizationChecker
-
All Implemented Interfaces:
CharacterHandler
public final class NormalizationChecker extends java.lang.Object implements CharacterHandler
-
Field Summary
Fields (Modifier and Type, Field, Description)
private boolean alreadyComplainedAboutThisRun
    Indicates whether the current run has already caused an error.
private boolean atStartOfRun
    Indicates whether the next call to characters() is the first call in a run.
private char[] buf
    A buffer for holding sequences that overlap the SAX buffer boundary.
private char[] bufHolder
    A holder for the original buffer (for the memory leak prevention mechanism).
private static com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS
    A thread-safe set of composing characters as per Charmod Norm.
private org.xml.sax.ErrorHandler errorHandler
private org.xml.sax.Locator locator
private int pos
    The current used length of the buffer, i.e. the index of the first slot that does not hold current data.
-
Constructor Summary
Constructors (Constructor, Description)
NormalizationChecker(org.xml.sax.Locator locator)
    Constructor with mode selection.
-
Method Summary
Methods (Modifier and Type, Method, Description)
private void appendToBuf(char[] ch, int start, int end)
    Appends a slice of a UTF-16 code unit array to the internal buffer.
void characters(char[] ch, int start, int length)
    Receive notification of a run of UTF-16 code units.
void end()
    Signals the end of the stream.
void err(java.lang.String message)
    Emit an error.
private void errAboutTextRun()
    Emits an error stating that the current text run or the source text is not in NFC.
private static boolean isComposingChar(int c)
    Returns true if the argument is a composing character and false otherwise.
private static boolean isComposingCharOrSurrogate(char c)
    Returns true if the argument is a composing BMP character or a surrogate and false otherwise.
void setErrorHandler(org.xml.sax.ErrorHandler errorHandler)
void start()
    Signals the start of the stream.
-
-
-
Field Detail
-
errorHandler
private org.xml.sax.ErrorHandler errorHandler
-
locator
private org.xml.sax.Locator locator
-
COMPOSING_CHARACTERS
private static final com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS
A thread-safe set of composing characters as per Charmod Norm.
-
buf
private char[] buf
A buffer for holding sequences that overlap the SAX buffer boundary.
-
bufHolder
private char[] bufHolder
A holder for the original buffer (for the memory leak prevention mechanism).
-
pos
private int pos
The current used length of the buffer, i.e. the index of the first slot that does not hold current data.
-
atStartOfRun
private boolean atStartOfRun
Indicates whether the next call to characters() is the first call in a run.
-
alreadyComplainedAboutThisRun
private boolean alreadyComplainedAboutThisRun
Indicates whether the current run has already caused an error.
-
-
Method Detail
-
err
public void err(java.lang.String message) throws org.xml.sax.SAXException
Emit an error. The locator is used.
Parameters:
message - the error message
Throws:
org.xml.sax.SAXException - if something goes wrong
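The following is a minimal sketch of how an error might be reported together with the locator, using only the SAX classes that ship with the JDK. The class name ErrSketch is hypothetical, and ignoring the message when no handler has been set is an assumption; the actual implementation of err() may differ.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical stand-in, not the real NormalizationChecker.
final class ErrSketch {
    private final Locator locator;
    private ErrorHandler errorHandler;

    ErrSketch(Locator locator) {
        this.locator = locator;
    }

    void setErrorHandler(ErrorHandler errorHandler) {
        this.errorHandler = errorHandler;
    }

    // Wraps the message in a SAXParseException that carries the
    // locator's current position and passes it to the handler.
    void err(String message) throws SAXException {
        if (errorHandler != null) { // assumption: no handler means no report
            errorHandler.error(new SAXParseException(message, locator));
        }
    }
}
```

SAXParseException copies the line and column from the locator at construction time, so the handler sees the position that was current when err() was called.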
-
isComposingCharOrSurrogate
private static boolean isComposingCharOrSurrogate(char c)
Returns true if the argument is a composing BMP character or a surrogate and false otherwise.
Parameters:
c - a UTF-16 code unit
Returns:
true if the argument is a composing BMP character or a surrogate and false otherwise
-
isComposingChar
private static boolean isComposingChar(int c)
Returns true if the argument is a composing character and false otherwise.
Parameters:
c - a Unicode code point
Returns:
true if the argument is a composing character and false otherwise
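The class backs these checks with an ICU UnicodeSet (COMPOSING_CHARACTERS). As a rough illustration only, the sketch below approximates the two predicates with the JDK alone by treating every combining mark (general categories Mn, Mc, Me) as composing. This is an assumption made for the example: the Charmod Norm set is defined differently and may contain characters this approximation misses. The class and method names are hypothetical.

```java
// Hypothetical approximation, not the ICU-based set used by the class.
final class ComposingCharSketch {

    // Treats any combining mark (Mn, Mc, Me) as a composing character.
    static boolean isComposingCharApprox(int c) {
        int type = Character.getType(c);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    // Code unit variant: a surrogate is also flagged, since a single
    // code unit cannot be classified until it is paired.
    static boolean isComposingCharOrSurrogateApprox(char c) {
        return Character.isSurrogate(c) || isComposingCharApprox(c);
    }
}
```

For example, U+0301 (combining acute accent) is reported as composing by both methods, while a lone high surrogate is reported only by the code unit variant.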
-
start
public void start()
Description copied from interface: CharacterHandler
Signals the start of the stream. Can be used for setup.
Specified by:
start in interface CharacterHandler
See Also:
CharacterHandler.start()
-
characters
public void characters(char[] ch, int start, int length) throws org.xml.sax.SAXException
Description copied from interface: CharacterHandler
Receive notification of a run of UTF-16 code units.
Specified by:
characters in interface CharacterHandler
Parameters:
ch - the buffer
start - start index in the buffer
length - the number of characters to process starting from start
Throws:
org.xml.sax.SAXException - if things go wrong
See Also:
CharacterHandler.characters(char[], int, int)
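To make the (ch, start, length) contract concrete, here is a minimal sketch of an NFC test on one slice, using java.text.Normalizer from the JDK instead of ICU4J. The names NfcRunSketch and isNfc are hypothetical, and this is not how NormalizationChecker itself is implemented.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of NormalizationChecker.
final class NfcRunSketch {

    // Returns true if the code units ch[start] .. ch[start + length - 1]
    // form a string that is already in Normalization Form C.
    static boolean isNfc(char[] ch, int start, int length) {
        String run = new String(ch, start, length);
        return Normalizer.isNormalized(run, Normalizer.Form.NFC);
    }
}
```

Given the three code units 'x', 'e', U+0301, the slice (0, 2) is in NFC on its own, while the slice (1, 2) is not, because 'e' followed by U+0301 composes to U+00E9. A sequence split across two characters() calls can therefore look clean in each call, which is presumably why the class keeps a buffer for sequences that overlap the SAX buffer boundary.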
-
errAboutTextRun
private void errAboutTextRun() throws org.xml.sax.SAXException
Emits an error stating that the current text run or the source text is not in NFC.
Throws:
org.xml.sax.SAXException - if the ErrorHandler throws
-
appendToBuf
private void appendToBuf(char[] ch, int start, int end)
Appends a slice of a UTF-16 code unit array to the internal buffer.
Parameters:
ch - the array from which to copy
start - the index of the first element that is copied
end - the index of the first element that is not copied
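Note that the slice here is given as a half-open range (start inclusive, end exclusive), unlike characters(), which takes a start and a length. The sketch below shows one way such an append could work with a growable array. The class name, the initial capacity, the doubling strategy, and the contents() helper are assumptions made for illustration; only the parameter semantics and the field names buf and pos come from the documentation above.

```java
// Hypothetical stand-in for the private buffer handling.
final class AppendSketch {
    private char[] buf = new char[4]; // assumed initial capacity
    private int pos = 0;              // index of the first unused slot

    // Copies ch[start] .. ch[end - 1] to the end of the buffer,
    // growing the buffer first if the slice does not fit.
    void appendToBuf(char[] ch, int start, int end) {
        int len = end - start;
        if (pos + len > buf.length) {
            char[] grown = new char[Math.max(buf.length * 2, pos + len)];
            System.arraycopy(buf, 0, grown, 0, pos);
            buf = grown;
        }
        System.arraycopy(ch, start, buf, pos, len);
        pos += len;
    }

    // Helper for inspection only: the currently used part of the buffer.
    String contents() {
        return new String(buf, 0, pos);
    }
}
```

Calling appendToBuf(ch, 0, 5) followed by appendToBuf(ch, 5, 11) on the code units of "hello world" leaves the buffer holding the whole string, with pos equal to 11.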
-
end
public void end() throws org.xml.sax.SAXException
Description copied from interface: CharacterHandler
Signals the end of the stream. Can be used for cleanup. Doesn't mean that the stream ended successfully.
Specified by:
end in interface CharacterHandler
Throws:
org.xml.sax.SAXException - if things go wrong
See Also:
CharacterHandler.end()
-
setErrorHandler
public void setErrorHandler(org.xml.sax.ErrorHandler errorHandler)
-
-