Class ErrorReportingTokenizer

  • All Implemented Interfaces:
    org.xml.sax.Locator

    public class ErrorReportingTokenizer
    extends Tokenizer
    • Field Detail

      • SURROGATE_OFFSET

        private static final int SURROGATE_OFFSET
        Magic value for UTF-16 operations.
        See Also:
        Constant Field Values
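        A constant of this kind is typically used to combine a UTF-16 surrogate pair into a code point with one shift and one addition. The sketch below assumes the conventional definition 0x10000 - (0xD800 << 10) - 0xDC00; the value this class actually uses is the one listed under Constant Field Values. The class SurrogateOffsetSketch and its method toCodePoint are hypothetical names for illustration.

```java
// Hypothetical sketch; not the tokenizer's actual code.
final class SurrogateOffsetSketch {
    // Assumed value: the conventional UTF-16 surrogate offset.
    static final int SURROGATE_OFFSET = 0x10000 - (0xD800 << 10) - 0xDC00;

    // Combines a high and a low surrogate into a single code point.
    static int toCodePoint(char high, char low) {
        return (high << 10) + low + SURROGATE_OFFSET;
    }

    public static void main(String[] args) {
        // The pair D83D DE00 encodes U+1F600.
        System.out.println(Integer.toHexString(toCodePoint('\uD83D', '\uDE00'))); // prints 1f600
    }
}
```

        Folding the three constant terms into one value keeps the per-character work to a shift and two additions, which gives the same result as java.lang.Character.toCodePoint for valid pairs.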
      • contentNonXmlCharPolicy

        private XmlViolationPolicy contentNonXmlCharPolicy
        The policy for non-space non-XML characters.
      • alreadyComplainedAboutNonAscii

        private boolean alreadyComplainedAboutNonAscii
        Used together with nonAsciiProhibited.
      • alreadyWarnedAboutPrivateUseCharacters

        private boolean alreadyWarnedAboutPrivateUseCharacters
        Tracks whether the warning about private use area (PUA) characters has already been emitted.
      • line

        private int line
        The current line number in the current resource being parsed. (First line is 1.) Passed on as locator data.
      • linePrev

        private int linePrev
      • col

        private int col
        The current column number in the current resource being tokenized. (First column is 1, counted by UTF-16 code units.) Passed on as locator data.
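        The counting rules stated for line and col (both start at 1, columns counted in UTF-16 code units) can be sketched as follows. This is a minimal illustration under stated assumptions, not the tokenizer's implementation: the class PositionSketch and its methods are hypothetical, and treating CR, LF and CRLF each as a single line break is an assumption that this page does not confirm.

```java
// Hypothetical sketch of 1-based line/column tracking; not the tokenizer's code.
final class PositionSketch {
    int line = 1;                      // first line is 1
    int col = 0;                       // becomes 1 at the first code unit
    boolean nextCharOnNewLine = false; // set after a line break is seen
    char prev = '\0';                  // previously seen code unit

    // Advances the position by one UTF-16 code unit.
    void advance(char c) {
        if (c == '\n' && prev == '\r') {
            // Assumption: the LF of a CRLF pair does not start another line.
            prev = c;
            return;
        }
        if (nextCharOnNewLine) {
            line++;
            col = 1;
            nextCharOnNewLine = false;
        } else {
            col++;
        }
        if (c == '\r' || c == '\n') {
            nextCharOnNewLine = true;
        }
        prev = c;
    }

    // Returns {line, col} after feeding every code unit of s.
    static int[] positionAfter(String s) {
        PositionSketch p = new PositionSketch();
        for (int i = 0; i < s.length(); i++) {
            p.advance(s.charAt(i));
        }
        return new int[] { p.line, p.col };
    }
}
```

        Because columns are counted in code units, a character outside the Basic Multilingual Plane advances the column by 2 in this sketch.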
      • colPrev

        private int colPrev
      • nextCharOnNewLine

        private boolean nextCharOnNewLine
      • prev

        private char prev
      • errorProfileMap

        private java.util.HashMap<java.lang.String,​java.lang.String> errorProfileMap
      • transitionBaseOffset

        private int transitionBaseOffset
    • Constructor Detail

      • ErrorReportingTokenizer

        public ErrorReportingTokenizer​(TokenHandler tokenHandler,
                                       boolean newAttributesEachTime)
        Parameters:
        tokenHandler - the TokenHandler that receives the tokens
        newAttributesEachTime - whether a new attributes object is created each time
      • ErrorReportingTokenizer

        public ErrorReportingTokenizer​(TokenHandler tokenHandler)
        Parameters:
        tokenHandler - the TokenHandler that receives the tokens
    • Method Detail

      • getLineNumber

        public int getLineNumber()
        Specified by:
        getLineNumber in interface org.xml.sax.Locator
        Overrides:
        getLineNumber in class Tokenizer
        See Also:
        Locator.getLineNumber()
      • getColumnNumber

        public int getColumnNumber()
        Specified by:
        getColumnNumber in interface org.xml.sax.Locator
        Overrides:
        getColumnNumber in class Tokenizer
        See Also:
        Locator.getColumnNumber()
      • setContentNonXmlCharPolicy

        public void setContentNonXmlCharPolicy​(XmlViolationPolicy contentNonXmlCharPolicy)
        Sets the policy for non-space non-XML characters in content.
        Overrides:
        setContentNonXmlCharPolicy in class Tokenizer
        Parameters:
        contentNonXmlCharPolicy - the policy to set
      • setErrorProfile

        public void setErrorProfile​(java.util.HashMap<java.lang.String,​java.lang.String> errorProfileMap)
        Sets the error profile map.
        Parameters:
        errorProfileMap - the error profile map to set
      • note

        public void note​(java.lang.String profile,
                         java.lang.String message)
                  throws org.xml.sax.SAXException
        Reports an event according to the selected error profile.
        Parameters:
        profile - the profile this message belongs to
        message - the message itself
        Throws:
        org.xml.sax.SAXException
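        One way profile-based reporting of this kind can work is to look up the profile in the map passed to setErrorProfile and report the message at the level found there. The sketch below is illustrative only: the class ProfileNoteSketch, the level names "warn" and "err", and the profile keys used in the usage note are assumptions, and the sketch collects messages in a list instead of throwing org.xml.sax.SAXException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical sketch of profile-based reporting; not the tokenizer's code.
final class ProfileNoteSketch {
    private HashMap<String, String> errorProfileMap;
    final List<String> reported = new ArrayList<>();

    void setErrorProfile(HashMap<String, String> errorProfileMap) {
        this.errorProfileMap = errorProfileMap;
    }

    // Reports the message at the level mapped to the profile, if any.
    void note(String profile, String message) {
        if (errorProfileMap == null) {
            return; // no profile selected: nothing is reported
        }
        String level = errorProfileMap.get(profile);
        // Assumption: "warn" and "err" are the recognized level names.
        if ("warn".equals(level)) {
            reported.add("warning: " + message);
        } else if ("err".equals(level)) {
            reported.add("error: " + message);
        }
    }
}
```

        With a map such as {"strict" -> "err", "lenient" -> "warn"}, a call to note("strict", "a") in this sketch records an error, note("lenient", "b") records a warning, and a profile missing from the map is ignored.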
      • startErrorReporting

        protected void startErrorReporting()
                                    throws org.xml.sax.SAXException
        Overrides:
        startErrorReporting in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • getLine

        public int getLine()
        Returns the current line number.
        Overrides:
        getLine in class Tokenizer
        Returns:
        the current line number
      • getCol

        public int getCol()
        Returns the current column number.
        Overrides:
        getCol in class Tokenizer
        Returns:
        the current column number
      • isNextCharOnNewLine

        public boolean isNextCharOnNewLine()
        Returns the nextCharOnNewLine.
        Overrides:
        isNextCharOnNewLine in class Tokenizer
        Returns:
        the nextCharOnNewLine
      • complainAboutNonAscii

        private void complainAboutNonAscii()
                                    throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
      • isAlreadyComplainedAboutNonAscii

        public boolean isAlreadyComplainedAboutNonAscii()
        Returns the alreadyComplainedAboutNonAscii.
        Overrides:
        isAlreadyComplainedAboutNonAscii in class Tokenizer
        Returns:
        the alreadyComplainedAboutNonAscii
      • flushChars

        protected void flushChars​(char[] buf,
                                  int pos)
                           throws org.xml.sax.SAXException
        Flushes coalesced character tokens.
        Overrides:
        flushChars in class Tokenizer
        Parameters:
        buf - the buffer holding the characters to flush
        pos - the current position in buf
        Throws:
        org.xml.sax.SAXException
      • checkChar

        protected char checkChar​(char[] buf,
                                 int pos)
                          throws org.xml.sax.SAXException
        Overrides:
        checkChar in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • toUPlusString

        private java.lang.String toUPlusString​(int c)
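        This page gives no description for toUPlusString. Judging by its name alone, it presumably renders a code point in the familiar U+XXXX notation used in messages. The sketch below shows one way to do that; the class UPlusSketch is hypothetical, and the exact output format (uppercase hex, zero-padded to at least four digits) is an assumption.

```java
// Hypothetical sketch; the real method's output format is not documented here.
final class UPlusSketch {
    // Formats a code point as "U+" followed by at least four hex digits.
    static String toUPlusString(int c) {
        return String.format("U+%04X", c);
    }

    public static void main(String[] args) {
        System.out.println(toUPlusString(0x41));    // prints U+0041
        System.out.println(toUPlusString(0x1F600)); // prints U+1F600
    }
}
```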
      • warnAboutPrivateUseChar

        private void warnAboutPrivateUseChar()
                                      throws org.xml.sax.SAXException
        Emits a warning about private use characters if the warning has not been emitted yet.
        Throws:
        org.xml.sax.SAXException
      • isPrivateUse

        private boolean isPrivateUse​(char c)
        Tells whether the argument is a private use area (PUA) character in the Basic Multilingual Plane (BMP).
        Parameters:
        c - the UTF-16 code unit to check
        Returns:
        true if PUA character
      • isAstralPrivateUse

        private boolean isAstralPrivateUse​(int c)
        Tells if the argument is an astral PUA character.
        Parameters:
        c - the code point to check
        Returns:
        true if astral private use
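        The two checks above correspond to the private use ranges defined by Unicode: U+E000 to U+F8FF in the BMP, and U+F0000 to U+FFFFD plus U+100000 to U+10FFFD in the supplementary (astral) planes. A minimal sketch of such range tests follows; the class PrivateUseSketch is hypothetical and the bodies are not taken from this class's source.

```java
// Hypothetical sketch of PUA range checks; not the tokenizer's code.
final class PrivateUseSketch {
    // BMP private use area: U+E000 to U+F8FF.
    static boolean isPrivateUse(char c) {
        return c >= '\uE000' && c <= '\uF8FF';
    }

    // Supplementary private use areas: planes 15 and 16,
    // excluding the two noncharacters that end each plane.
    static boolean isAstralPrivateUse(int c) {
        return (c >= 0xF0000 && c <= 0xFFFFD)
                || (c >= 0x100000 && c <= 0x10FFFD);
    }
}
```

        The BMP check takes a char because every BMP code point fits in one UTF-16 code unit, while the astral check takes an int because those code points need a surrogate pair.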
      • errGarbageAfterLtSlash

        protected void errGarbageAfterLtSlash()
                                       throws org.xml.sax.SAXException
        Overrides:
        errGarbageAfterLtSlash in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errLtSlashGt

        protected void errLtSlashGt()
                             throws org.xml.sax.SAXException
        Overrides:
        errLtSlashGt in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errWarnLtSlashInRcdata

        protected void errWarnLtSlashInRcdata()
                                       throws org.xml.sax.SAXException
        Overrides:
        errWarnLtSlashInRcdata in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errHtml4LtSlashInRcdata

        protected void errHtml4LtSlashInRcdata​(char folded)
                                        throws org.xml.sax.SAXException
        Overrides:
        errHtml4LtSlashInRcdata in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errCharRefLacksSemicolon

        protected void errCharRefLacksSemicolon()
                                         throws org.xml.sax.SAXException
        Overrides:
        errCharRefLacksSemicolon in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNoDigitsInNCR

        protected void errNoDigitsInNCR()
                                 throws org.xml.sax.SAXException
        Overrides:
        errNoDigitsInNCR in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errGtInSystemId

        protected void errGtInSystemId()
                                throws org.xml.sax.SAXException
        Overrides:
        errGtInSystemId in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errGtInPublicId

        protected void errGtInPublicId()
                                throws org.xml.sax.SAXException
        Overrides:
        errGtInPublicId in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNamelessDoctype

        protected void errNamelessDoctype()
                                   throws org.xml.sax.SAXException
        Overrides:
        errNamelessDoctype in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errConsecutiveHyphens

        protected void errConsecutiveHyphens()
                                      throws org.xml.sax.SAXException
        Overrides:
        errConsecutiveHyphens in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errPrematureEndOfComment

        protected void errPrematureEndOfComment()
                                         throws org.xml.sax.SAXException
        Overrides:
        errPrematureEndOfComment in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errBogusComment

        protected void errBogusComment()
                                throws org.xml.sax.SAXException
        Overrides:
        errBogusComment in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errUnquotedAttributeValOrNull

        protected void errUnquotedAttributeValOrNull​(char c)
                                              throws org.xml.sax.SAXException
        Overrides:
        errUnquotedAttributeValOrNull in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errSlashNotFollowedByGt

        protected void errSlashNotFollowedByGt()
                                        throws org.xml.sax.SAXException
        Overrides:
        errSlashNotFollowedByGt in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errHtml4XmlVoidSyntax

        protected void errHtml4XmlVoidSyntax()
                                      throws org.xml.sax.SAXException
        Overrides:
        errHtml4XmlVoidSyntax in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNoSpaceBetweenAttributes

        protected void errNoSpaceBetweenAttributes()
                                            throws org.xml.sax.SAXException
        Overrides:
        errNoSpaceBetweenAttributes in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errHtml4NonNameInUnquotedAttribute

        protected void errHtml4NonNameInUnquotedAttribute​(char c)
                                                   throws org.xml.sax.SAXException
        Overrides:
        errHtml4NonNameInUnquotedAttribute in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errAttributeValueMissing

        protected void errAttributeValueMissing()
                                         throws org.xml.sax.SAXException
        Overrides:
        errAttributeValueMissing in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errBadCharBeforeAttributeNameOrNull

        protected void errBadCharBeforeAttributeNameOrNull​(char c)
                                                    throws org.xml.sax.SAXException
        Overrides:
        errBadCharBeforeAttributeNameOrNull in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEqualsSignBeforeAttributeName

        protected void errEqualsSignBeforeAttributeName()
                                                 throws org.xml.sax.SAXException
        Overrides:
        errEqualsSignBeforeAttributeName in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errBadCharAfterLt

        protected void errBadCharAfterLt​(char c)
                                  throws org.xml.sax.SAXException
        Overrides:
        errBadCharAfterLt in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errLtGt

        protected void errLtGt()
                        throws org.xml.sax.SAXException
        Overrides:
        errLtGt in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errProcessingInstruction

        protected void errProcessingInstruction()
                                         throws org.xml.sax.SAXException
        Overrides:
        errProcessingInstruction in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNotSemicolonTerminated

        protected void errNotSemicolonTerminated()
                                          throws org.xml.sax.SAXException
        Overrides:
        errNotSemicolonTerminated in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNoNamedCharacterMatch

        protected void errNoNamedCharacterMatch()
                                         throws org.xml.sax.SAXException
        Overrides:
        errNoNamedCharacterMatch in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errQuoteBeforeAttributeName

        protected void errQuoteBeforeAttributeName​(char c)
                                            throws org.xml.sax.SAXException
        Overrides:
        errQuoteBeforeAttributeName in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errQuoteOrLtInAttributeNameOrNull

        protected void errQuoteOrLtInAttributeNameOrNull​(char c)
                                                  throws org.xml.sax.SAXException
        Overrides:
        errQuoteOrLtInAttributeNameOrNull in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errExpectedPublicId

        protected void errExpectedPublicId()
                                    throws org.xml.sax.SAXException
        Overrides:
        errExpectedPublicId in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errBogusDoctype

        protected void errBogusDoctype()
                                throws org.xml.sax.SAXException
        Overrides:
        errBogusDoctype in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • maybeWarnPrivateUseAstral

        protected void maybeWarnPrivateUseAstral()
                                          throws org.xml.sax.SAXException
        Overrides:
        maybeWarnPrivateUseAstral in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • maybeWarnPrivateUse

        protected void maybeWarnPrivateUse​(char ch)
                                    throws org.xml.sax.SAXException
        Overrides:
        maybeWarnPrivateUse in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • maybeErrSlashInEndTag

        protected void maybeErrSlashInEndTag​(boolean selfClosing)
                                      throws org.xml.sax.SAXException
        Overrides:
        maybeErrSlashInEndTag in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrNonCharacter

        protected char errNcrNonCharacter​(char ch)
                                   throws org.xml.sax.SAXException
        Overrides:
        errNcrNonCharacter in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrSurrogate

        protected void errNcrSurrogate()
                                throws org.xml.sax.SAXException
        Overrides:
        errNcrSurrogate in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrControlChar

        protected char errNcrControlChar​(char ch)
                                  throws org.xml.sax.SAXException
        Overrides:
        errNcrControlChar in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrCr

        protected void errNcrCr()
                         throws org.xml.sax.SAXException
        Overrides:
        errNcrCr in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrInC1Range

        protected void errNcrInC1Range()
                                throws org.xml.sax.SAXException
        Overrides:
        errNcrInC1Range in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInPublicId

        protected void errEofInPublicId()
                                 throws org.xml.sax.SAXException
        Overrides:
        errEofInPublicId in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInComment

        protected void errEofInComment()
                                throws org.xml.sax.SAXException
        Overrides:
        errEofInComment in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInDoctype

        protected void errEofInDoctype()
                                throws org.xml.sax.SAXException
        Overrides:
        errEofInDoctype in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInAttributeValue

        protected void errEofInAttributeValue()
                                       throws org.xml.sax.SAXException
        Overrides:
        errEofInAttributeValue in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInAttributeName

        protected void errEofInAttributeName()
                                      throws org.xml.sax.SAXException
        Overrides:
        errEofInAttributeName in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofWithoutGt

        protected void errEofWithoutGt()
                                throws org.xml.sax.SAXException
        Overrides:
        errEofWithoutGt in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInTagName

        protected void errEofInTagName()
                                throws org.xml.sax.SAXException
        Overrides:
        errEofInTagName in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInEndTag

        protected void errEofInEndTag()
                               throws org.xml.sax.SAXException
        Overrides:
        errEofInEndTag in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofAfterLt

        protected void errEofAfterLt()
                              throws org.xml.sax.SAXException
        Overrides:
        errEofAfterLt in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrOutOfRange

        protected void errNcrOutOfRange()
                                 throws org.xml.sax.SAXException
        Overrides:
        errNcrOutOfRange in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrUnassigned

        protected void errNcrUnassigned()
                                 throws org.xml.sax.SAXException
        Overrides:
        errNcrUnassigned in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errDuplicateAttribute

        protected void errDuplicateAttribute()
                                      throws org.xml.sax.SAXException
        Overrides:
        errDuplicateAttribute in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errEofInSystemId

        protected void errEofInSystemId()
                                 throws org.xml.sax.SAXException
        Overrides:
        errEofInSystemId in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errExpectedSystemId

        protected void errExpectedSystemId()
                                    throws org.xml.sax.SAXException
        Overrides:
        errExpectedSystemId in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errMissingSpaceBeforeDoctypeName

        protected void errMissingSpaceBeforeDoctypeName()
                                                 throws org.xml.sax.SAXException
        Overrides:
        errMissingSpaceBeforeDoctypeName in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errHyphenHyphenBang

        protected void errHyphenHyphenBang()
                                    throws org.xml.sax.SAXException
        Overrides:
        errHyphenHyphenBang in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrControlChar

        protected void errNcrControlChar()
                                  throws org.xml.sax.SAXException
        Overrides:
        errNcrControlChar in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNcrZero

        protected void errNcrZero()
                           throws org.xml.sax.SAXException
        Overrides:
        errNcrZero in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • errNoSpaceBetweenPublicAndSystemIds

        protected void errNoSpaceBetweenPublicAndSystemIds()
                                                    throws org.xml.sax.SAXException
        Overrides:
        errNoSpaceBetweenPublicAndSystemIds in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • noteAttributeWithoutValue

        protected void noteAttributeWithoutValue()
                                          throws org.xml.sax.SAXException
        Overrides:
        noteAttributeWithoutValue in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • noteUnquotedAttributeValue

        protected void noteUnquotedAttributeValue()
                                           throws org.xml.sax.SAXException
        Overrides:
        noteUnquotedAttributeValue in class Tokenizer
        Throws:
        org.xml.sax.SAXException
      • setTransitionHandler

        public void setTransitionHandler​(TransitionHandler transitionHandler)
        Sets the transitionHandler.
        Parameters:
        transitionHandler - the transitionHandler to set
      • setTransitionBaseOffset

        public void setTransitionBaseOffset​(int offset)
        Sets an offset to be added to the position reported to TransitionHandler.
        Overrides:
        setTransitionBaseOffset in class Tokenizer
        Parameters:
        offset - the offset