Package com.fasterxml.aalto.async
Class AsyncByteScanner
java.lang.Object
  com.fasterxml.aalto.in.XmlScanner
    com.fasterxml.aalto.in.ByteBasedScanner
      com.fasterxml.aalto.async.AsyncByteScanner
All Implemented Interfaces:
AsyncInputFeeder, XmlConsts, javax.xml.namespace.NamespaceContext, javax.xml.stream.XMLStreamConstants
Direct Known Subclasses:
AsyncByteArrayScanner, AsyncByteBufferScanner
public abstract class AsyncByteScanner extends ByteBasedScanner implements AsyncInputFeeder
-
-
Field Summary
Fields
Modifier and Type, Field, Description
protected XmlCharTypes
_charTypes
This is a simple container object that is used to access the decoding tables for characters.
protected int
_currQuad
Bytes parsed for the current, incomplete, quad.
protected int
_currQuadBytes
Number of bytes pending/buffered, stored in _currQuad.
protected boolean
_elemAllNsBound
protected boolean
_elemAttrCount
protected PName
_elemAttrName
protected int
_elemAttrPtr
Pointer to the next character position, within the attribute value buffer, of the attribute value currently being parsed.
protected byte
_elemAttrQuote
protected int
_elemNsPtr
Pointer to the next character position of the namespace URI currently being parsed for the current namespace declaration.
protected boolean
_endOfInput
Flag that is set when the calling application indicates that there will be no more input to parse.
protected int
_entityValue
Entity value accumulated so far.
protected boolean
_inDtdDeclaration
Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.
protected int
_nextEvent
Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete.
protected int
_pendingInput
There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs.
protected int[]
_quadBuffer
This buffer is used for name parsing.
protected int
_quadCount
Number of complete quads parsed for current name (quads themselves are stored in _quadBuffer).
protected int
_state
In addition to the event type, there is a need for additional state information.
protected int
_surroundingEvent
For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
protected ByteBasedPNameTable
_symbols
For now, symbol table contains prefixed names.
protected static int
EVENT_INCOMPLETE
protected static int
PENDING_STATE_ATTR_VALUE_AMP
protected static int
PENDING_STATE_ATTR_VALUE_AMP_HASH
protected static int
PENDING_STATE_ATTR_VALUE_AMP_HASH_X
protected static int
PENDING_STATE_ATTR_VALUE_DEC_DIGIT
protected static int
PENDING_STATE_ATTR_VALUE_ENTITY_NAME
protected static int
PENDING_STATE_ATTR_VALUE_HEX_DIGIT
protected static int
PENDING_STATE_CDATA_BRACKET1
protected static int
PENDING_STATE_CDATA_BRACKET2
protected static int
PENDING_STATE_COMMENT_HYPHEN1
protected static int
PENDING_STATE_COMMENT_HYPHEN2
protected static int
PENDING_STATE_CR
protected static int
PENDING_STATE_ENT_IN_DEC_DIGIT
protected static int
PENDING_STATE_ENT_IN_HEX_DIGIT
protected static int
PENDING_STATE_ENT_SEEN_HASH
protected static int
PENDING_STATE_ENT_SEEN_HASH_X
protected static int
PENDING_STATE_PI_QMARK
protected static int
PENDING_STATE_TEXT_AMP
protected static int
PENDING_STATE_TEXT_AMP_HASH
protected static int
PENDING_STATE_TEXT_BRACKET1
protected static int
PENDING_STATE_TEXT_BRACKET2
protected static int
PENDING_STATE_TEXT_DEC_ENTITY
protected static int
PENDING_STATE_TEXT_HEX_ENTITY
protected static int
PENDING_STATE_TEXT_IN_ENTITY
protected static int
PENDING_STATE_XMLDECL_LT
protected static int
PENDING_STATE_XMLDECL_LTQ
protected static int
PENDING_STATE_XMLDECL_TARGET
protected static int
STATE_CDATA_C
protected static int
STATE_CDATA_CD
protected static int
STATE_CDATA_CDA
protected static int
STATE_CDATA_CDAT
protected static int
STATE_CDATA_CDATA
protected static int
STATE_CDATA_CONTENT
protected static int
STATE_COMMENT_CONTENT
protected static int
STATE_COMMENT_HYPHEN
protected static int
STATE_COMMENT_HYPHEN2
protected static int
STATE_DEFAULT
Default starting state for many events/contexts -- nothing has been seen so far, no event incomplete.
protected static int
STATE_DTD_AFTER_DOCTYPE
protected static int
STATE_DTD_AFTER_PUBLIC
protected static int
STATE_DTD_AFTER_PUBLIC_ID
protected static int
STATE_DTD_AFTER_ROOT_NAME
protected static int
STATE_DTD_AFTER_SYSTEM
protected static int
STATE_DTD_AFTER_SYSTEM_ID
protected static int
STATE_DTD_BEFORE_IDS
protected static int
STATE_DTD_BEFORE_PUBLIC_ID
protected static int
STATE_DTD_BEFORE_ROOT_NAME
protected static int
STATE_DTD_BEFORE_SYSTEM_ID
protected static int
STATE_DTD_DOCTYPE
protected static int
STATE_DTD_EXPECT_CLOSING_GT
protected static int
STATE_DTD_INT_SUBSET
protected static int
STATE_DTD_PUBLIC_ID
protected static int
STATE_DTD_PUBLIC_OR_SYSTEM
protected static int
STATE_DTD_ROOT_NAME
protected static int
STATE_DTD_SYSTEM_ID
protected static int
STATE_EE_NEED_GT
protected static int
STATE_PI_AFTER_TARGET
protected static int
STATE_PI_AFTER_TARGET_QMARK
protected static int
STATE_PI_AFTER_TARGET_WS
protected static int
STATE_PI_IN_DATA
protected static int
STATE_PI_IN_TARGET
protected static int
STATE_PROLOG_DECL
protected static int
STATE_PROLOG_INITIAL
State in which a less-than sign has been seen
protected static int
STATE_PROLOG_SEEN_LT
protected static int
STATE_SE_ATTR_NAME
protected static int
STATE_SE_ATTR_VALUE_NORMAL
protected static int
STATE_SE_ATTR_VALUE_NSDECL
protected static int
STATE_SE_ELEM_NAME
protected static int
STATE_SE_SEEN_SLASH
protected static int
STATE_SE_SPACE_OR_ATTRNAME
protected static int
STATE_SE_SPACE_OR_ATTRVALUE
protected static int
STATE_SE_SPACE_OR_END
protected static int
STATE_SE_SPACE_OR_EQ
protected static int
STATE_TEXT_AMP
protected static int
STATE_TEXT_AMP_NAME
protected static int
STATE_TREE_NAMED_ENTITY_START
protected static int
STATE_TREE_NUMERIC_ENTITY_START
protected static int
STATE_TREE_SEEN_AMP
protected static int
STATE_TREE_SEEN_EXCL
protected static int
STATE_TREE_SEEN_LT
protected static int
STATE_TREE_SEEN_SLASH
protected static int
STATE_XMLDECL_AFTER_ENCODING
protected static int
STATE_XMLDECL_AFTER_ENCODING_VALUE
protected static int
STATE_XMLDECL_AFTER_STANDALONE
protected static int
STATE_XMLDECL_AFTER_STANDALONE_VALUE
protected static int
STATE_XMLDECL_AFTER_VERSION
protected static int
STATE_XMLDECL_AFTER_VERSION_VALUE
protected static int
STATE_XMLDECL_AFTER_XML
protected static int
STATE_XMLDECL_BEFORE_ENCODING
protected static int
STATE_XMLDECL_BEFORE_STANDALONE
protected static int
STATE_XMLDECL_BEFORE_VERSION
protected static int
STATE_XMLDECL_ENCODING
protected static int
STATE_XMLDECL_ENCODING_EQ
protected static int
STATE_XMLDECL_ENCODING_VALUE
protected static int
STATE_XMLDECL_ENDQ
protected static int
STATE_XMLDECL_STANDALONE
protected static int
STATE_XMLDECL_STANDALONE_EQ
protected static int
STATE_XMLDECL_STANDALONE_VALUE
protected static int
STATE_XMLDECL_VERSION
protected static int
STATE_XMLDECL_VERSION_EQ
protected static int
STATE_XMLDECL_VERSION_VALUE
-
Fields inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_inputEnd, _inputPtr, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x
-
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _pastBytesOrChars, _publicId, _rowStartOffset, _startColumn, _startRawOffset, _startRow, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
-
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
-
-
Constructor Summary
Constructors
Modifier, Constructor, Description
protected
AsyncByteScanner(ReaderConfig cfg)
-
Method Summary
Modifier and Type, Method, Description
protected void
_activateEncoding()
Initialization method to call when encoding has been definitely figured out, from XML declarations, or, from lack of one (using defaults).
protected void
_closeSource()
Since the async scanner has no access to whatever passes content, there is no input source in same sense as with blocking scanner; and there is nothing to close.
protected abstract byte
_currentByte()
protected PName
_findXmlDeclName(int lastQuad, int lastByteCount)
protected abstract byte
_nextByte()
private PName
_parseNewXmlDeclName(byte b)
private PName
_parseXmlDeclName()
protected abstract byte
_prevByte()
protected void
_releaseBuffers()
protected int
_startDocumentNoXmlDecl()
Helper method called when it is determined that the document does NOT start with an xml declaration.
protected PName
addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes)
protected abstract boolean
asyncSkipSpace()
protected void
checkPITargetName(PName targetName)
protected int
decodeCharForError(byte b)
Method called by methods when encountering a byte that can not be part of a valid character in the current context.
void
endOfInput()
Method that should be called after last chunk of data to parse has been fed.
protected PName
findPName(int lastQuad, int lastByteCount)
Method called to process a sequence of bytes that is likely to be a PName.
protected void
finishCData()
protected abstract void
finishCharacters()
protected void
finishComment()
protected void
finishDTD(boolean copyContents)
protected void
finishPI()
protected void
finishSpace()
protected void
finishToken()
This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text etc).
protected abstract boolean
handleAttrValue()
protected abstract int
handleComment()
private int
handleDTD()
protected abstract boolean
handleDTDInternalSubset(boolean init)
protected abstract boolean
handleNsDecl()
protected abstract boolean
handlePartialCR()
protected abstract int
handlePI()
private int
handlePrologDeclStart(boolean isProlog)
protected abstract int
handleStartElement()
protected abstract int
handleStartElementStart(byte b)
private int
handleXmlDeclaration()
Method called to complete parsing of XML declaration, once it has been reliably detected.
protected boolean
loadMore()
int
nextFromProlog(boolean isProlog)
private boolean
parseDtdId(char[] outputBuffer, int outputPtr, boolean system)
protected abstract PName
parseNewName(byte b)
protected abstract PName
parsePName()
protected boolean
parseXmlDeclAttr(char[] outputBuffer, int outputPtr)
Method called to try to parse an XML pseudo-attribute value.
protected void
reportInvalidOther(int mask, int ptr)
protected void
skipCData()
protected abstract boolean
skipCharacters()
protected void
skipComment()
protected void
skipPI()
protected void
skipSpace()
protected abstract int
startCharacters(byte b)
Method called to initialize state for CHARACTERS event, after just a single byte has been seen.
private java.lang.Boolean
startXmlDeclaration()
Method that deals with recognizing XML declaration, but not with parsing its contents.
protected int
throwInternal()
protected boolean
validPublicIdChar(int c)
Checks that a character is valid for a PublicId.
protected void
verifyAndAppendEntityCharacter(int charFromEntity)
Method called to verify validity of given character (from entity) and append it to the text buffer.
protected void
verifyAndSetPublicId()
protected void
verifyAndSetSystemId()
protected void
verifyAndSetXmlEncoding()
protected void
verifyAndSetXmlStandalone()
protected void
verifyAndSetXmlVersion()
-
Methods inherited from class com.fasterxml.aalto.in.ByteBasedScanner
addUTFPName, getCurrentColumnNr, getCurrentLocation, getEndingByteOffset, getEndingCharOffset, getStartingByteOffset, getStartingCharOffset, markLF, markLF, reportInvalidInitial, reportInvalidOther, setStartLocation
-
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getCurrentLineNr, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, handleInvalidXmlChar, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, nextFromTree, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportPrologUnexpElement, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipCoalescedText, skipToken, throwInvalidSpace, throwNullChar, throwUnexpectedChar, verifyXmlChar
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.fasterxml.aalto.AsyncInputFeeder
needMoreInput
-
-
-
-
Field Detail
-
EVENT_INCOMPLETE
protected static final int EVENT_INCOMPLETE
- See Also:
- Constant Field Values
-
STATE_DEFAULT
protected static final int STATE_DEFAULT
Default starting state for many events/contexts -- nothing has been seen so far, no event incomplete. Not used for all event types.
- See Also:
- Constant Field Values
-
STATE_PROLOG_INITIAL
protected static final int STATE_PROLOG_INITIAL
State in which a less-than sign has been seen
- See Also:
- Constant Field Values
-
STATE_PROLOG_SEEN_LT
protected static final int STATE_PROLOG_SEEN_LT
- See Also:
- Constant Field Values
-
STATE_PROLOG_DECL
protected static final int STATE_PROLOG_DECL
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_LT
protected static final int STATE_TREE_SEEN_LT
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_AMP
protected static final int STATE_TREE_SEEN_AMP
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_EXCL
protected static final int STATE_TREE_SEEN_EXCL
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_SLASH
protected static final int STATE_TREE_SEEN_SLASH
- See Also:
- Constant Field Values
-
STATE_TREE_NUMERIC_ENTITY_START
protected static final int STATE_TREE_NUMERIC_ENTITY_START
- See Also:
- Constant Field Values
-
STATE_TREE_NAMED_ENTITY_START
protected static final int STATE_TREE_NAMED_ENTITY_START
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_XML
protected static final int STATE_XMLDECL_AFTER_XML
- See Also:
- Constant Field Values
-
STATE_XMLDECL_BEFORE_VERSION
protected static final int STATE_XMLDECL_BEFORE_VERSION
- See Also:
- Constant Field Values
-
STATE_XMLDECL_VERSION
protected static final int STATE_XMLDECL_VERSION
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_VERSION
protected static final int STATE_XMLDECL_AFTER_VERSION
- See Also:
- Constant Field Values
-
STATE_XMLDECL_VERSION_EQ
protected static final int STATE_XMLDECL_VERSION_EQ
- See Also:
- Constant Field Values
-
STATE_XMLDECL_VERSION_VALUE
protected static final int STATE_XMLDECL_VERSION_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_VERSION_VALUE
protected static final int STATE_XMLDECL_AFTER_VERSION_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_BEFORE_ENCODING
protected static final int STATE_XMLDECL_BEFORE_ENCODING
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENCODING
protected static final int STATE_XMLDECL_ENCODING
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_ENCODING
protected static final int STATE_XMLDECL_AFTER_ENCODING
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENCODING_EQ
protected static final int STATE_XMLDECL_ENCODING_EQ
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENCODING_VALUE
protected static final int STATE_XMLDECL_ENCODING_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_ENCODING_VALUE
protected static final int STATE_XMLDECL_AFTER_ENCODING_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_BEFORE_STANDALONE
protected static final int STATE_XMLDECL_BEFORE_STANDALONE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_STANDALONE
protected static final int STATE_XMLDECL_STANDALONE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_STANDALONE
protected static final int STATE_XMLDECL_AFTER_STANDALONE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_STANDALONE_EQ
protected static final int STATE_XMLDECL_STANDALONE_EQ
- See Also:
- Constant Field Values
-
STATE_XMLDECL_STANDALONE_VALUE
protected static final int STATE_XMLDECL_STANDALONE_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_STANDALONE_VALUE
protected static final int STATE_XMLDECL_AFTER_STANDALONE_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENDQ
protected static final int STATE_XMLDECL_ENDQ
- See Also:
- Constant Field Values
-
STATE_DTD_DOCTYPE
protected static final int STATE_DTD_DOCTYPE
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_DOCTYPE
protected static final int STATE_DTD_AFTER_DOCTYPE
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_ROOT_NAME
protected static final int STATE_DTD_BEFORE_ROOT_NAME
- See Also:
- Constant Field Values
-
STATE_DTD_ROOT_NAME
protected static final int STATE_DTD_ROOT_NAME
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_ROOT_NAME
protected static final int STATE_DTD_AFTER_ROOT_NAME
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_IDS
protected static final int STATE_DTD_BEFORE_IDS
- See Also:
- Constant Field Values
-
STATE_DTD_PUBLIC_OR_SYSTEM
protected static final int STATE_DTD_PUBLIC_OR_SYSTEM
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_PUBLIC
protected static final int STATE_DTD_AFTER_PUBLIC
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_SYSTEM
protected static final int STATE_DTD_AFTER_SYSTEM
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_PUBLIC_ID
protected static final int STATE_DTD_BEFORE_PUBLIC_ID
- See Also:
- Constant Field Values
-
STATE_DTD_PUBLIC_ID
protected static final int STATE_DTD_PUBLIC_ID
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_PUBLIC_ID
protected static final int STATE_DTD_AFTER_PUBLIC_ID
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_SYSTEM_ID
protected static final int STATE_DTD_BEFORE_SYSTEM_ID
- See Also:
- Constant Field Values
-
STATE_DTD_SYSTEM_ID
protected static final int STATE_DTD_SYSTEM_ID
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_SYSTEM_ID
protected static final int STATE_DTD_AFTER_SYSTEM_ID
- See Also:
- Constant Field Values
-
STATE_DTD_INT_SUBSET
protected static final int STATE_DTD_INT_SUBSET
- See Also:
- Constant Field Values
-
STATE_DTD_EXPECT_CLOSING_GT
protected static final int STATE_DTD_EXPECT_CLOSING_GT
- See Also:
- Constant Field Values
-
STATE_TEXT_AMP
protected static final int STATE_TEXT_AMP
- See Also:
- Constant Field Values
-
STATE_TEXT_AMP_NAME
protected static final int STATE_TEXT_AMP_NAME
- See Also:
- Constant Field Values
-
STATE_COMMENT_CONTENT
protected static final int STATE_COMMENT_CONTENT
- See Also:
- Constant Field Values
-
STATE_COMMENT_HYPHEN
protected static final int STATE_COMMENT_HYPHEN
- See Also:
- Constant Field Values
-
STATE_COMMENT_HYPHEN2
protected static final int STATE_COMMENT_HYPHEN2
- See Also:
- Constant Field Values
-
STATE_CDATA_CONTENT
protected static final int STATE_CDATA_CONTENT
- See Also:
- Constant Field Values
-
STATE_CDATA_C
protected static final int STATE_CDATA_C
- See Also:
- Constant Field Values
-
STATE_CDATA_CD
protected static final int STATE_CDATA_CD
- See Also:
- Constant Field Values
-
STATE_CDATA_CDA
protected static final int STATE_CDATA_CDA
- See Also:
- Constant Field Values
-
STATE_CDATA_CDAT
protected static final int STATE_CDATA_CDAT
- See Also:
- Constant Field Values
-
STATE_CDATA_CDATA
protected static final int STATE_CDATA_CDATA
- See Also:
- Constant Field Values
-
STATE_PI_AFTER_TARGET
protected static final int STATE_PI_AFTER_TARGET
- See Also:
- Constant Field Values
-
STATE_PI_AFTER_TARGET_WS
protected static final int STATE_PI_AFTER_TARGET_WS
- See Also:
- Constant Field Values
-
STATE_PI_AFTER_TARGET_QMARK
protected static final int STATE_PI_AFTER_TARGET_QMARK
- See Also:
- Constant Field Values
-
STATE_PI_IN_TARGET
protected static final int STATE_PI_IN_TARGET
- See Also:
- Constant Field Values
-
STATE_PI_IN_DATA
protected static final int STATE_PI_IN_DATA
- See Also:
- Constant Field Values
-
STATE_SE_ELEM_NAME
protected static final int STATE_SE_ELEM_NAME
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_END
protected static final int STATE_SE_SPACE_OR_END
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_ATTRNAME
protected static final int STATE_SE_SPACE_OR_ATTRNAME
- See Also:
- Constant Field Values
-
STATE_SE_ATTR_NAME
protected static final int STATE_SE_ATTR_NAME
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_EQ
protected static final int STATE_SE_SPACE_OR_EQ
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_ATTRVALUE
protected static final int STATE_SE_SPACE_OR_ATTRVALUE
- See Also:
- Constant Field Values
-
STATE_SE_ATTR_VALUE_NORMAL
protected static final int STATE_SE_ATTR_VALUE_NORMAL
- See Also:
- Constant Field Values
-
STATE_SE_ATTR_VALUE_NSDECL
protected static final int STATE_SE_ATTR_VALUE_NSDECL
- See Also:
- Constant Field Values
-
STATE_SE_SEEN_SLASH
protected static final int STATE_SE_SEEN_SLASH
- See Also:
- Constant Field Values
-
STATE_EE_NEED_GT
protected static final int STATE_EE_NEED_GT
- See Also:
- Constant Field Values
-
PENDING_STATE_CR
protected static final int PENDING_STATE_CR
- See Also:
- Constant Field Values
-
PENDING_STATE_XMLDECL_LT
protected static final int PENDING_STATE_XMLDECL_LT
- See Also:
- Constant Field Values
-
PENDING_STATE_XMLDECL_LTQ
protected static final int PENDING_STATE_XMLDECL_LTQ
- See Also:
- Constant Field Values
-
PENDING_STATE_XMLDECL_TARGET
protected static final int PENDING_STATE_XMLDECL_TARGET
- See Also:
- Constant Field Values
-
PENDING_STATE_PI_QMARK
protected static final int PENDING_STATE_PI_QMARK
- See Also:
- Constant Field Values
-
PENDING_STATE_COMMENT_HYPHEN1
protected static final int PENDING_STATE_COMMENT_HYPHEN1
- See Also:
- Constant Field Values
-
PENDING_STATE_COMMENT_HYPHEN2
protected static final int PENDING_STATE_COMMENT_HYPHEN2
- See Also:
- Constant Field Values
-
PENDING_STATE_CDATA_BRACKET1
protected static final int PENDING_STATE_CDATA_BRACKET1
- See Also:
- Constant Field Values
-
PENDING_STATE_CDATA_BRACKET2
protected static final int PENDING_STATE_CDATA_BRACKET2
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_SEEN_HASH
protected static final int PENDING_STATE_ENT_SEEN_HASH
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_SEEN_HASH_X
protected static final int PENDING_STATE_ENT_SEEN_HASH_X
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_IN_DEC_DIGIT
protected static final int PENDING_STATE_ENT_IN_DEC_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_IN_HEX_DIGIT
protected static final int PENDING_STATE_ENT_IN_HEX_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_AMP
protected static final int PENDING_STATE_ATTR_VALUE_AMP
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_AMP_HASH
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_AMP_HASH_X
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH_X
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_ENTITY_NAME
protected static final int PENDING_STATE_ATTR_VALUE_ENTITY_NAME
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_DEC_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_DEC_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_HEX_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_HEX_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_AMP
protected static final int PENDING_STATE_TEXT_AMP
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_AMP_HASH
protected static final int PENDING_STATE_TEXT_AMP_HASH
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_DEC_ENTITY
protected static final int PENDING_STATE_TEXT_DEC_ENTITY
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_HEX_ENTITY
protected static final int PENDING_STATE_TEXT_HEX_ENTITY
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_IN_ENTITY
protected static final int PENDING_STATE_TEXT_IN_ENTITY
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_BRACKET1
protected static final int PENDING_STATE_TEXT_BRACKET1
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_BRACKET2
protected static final int PENDING_STATE_TEXT_BRACKET2
- See Also:
- Constant Field Values
-
_charTypes
protected XmlCharTypes _charTypes
This is a simple container object that is used to access the decoding tables for characters. Indirection is needed since we actually support multiple UTF-8 compatible encodings, not just UTF-8 itself.
NOTE: non-final due to xml declaration handling occurring later.
-
_symbols
protected ByteBasedPNameTable _symbols
For now, symbol table contains prefixed names. In the future it is possible that they will be split into prefixes and local names.
NOTE: non-final for async scanners.
-
_quadBuffer
protected int[] _quadBuffer
This buffer is used for name parsing. Will be expanded if/as needed; 32 ints can hold names up to 128 ASCII characters long (4 bytes per int).
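The arithmetic behind "32 ints hold 128 ASCII characters" can be sketched as follows. This is an illustration only, not the scanner's own code: the class and method names are hypothetical, and the byte order within each int (earlier bytes in higher-order positions) is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class QuadPackingSketch {
    /**
     * Packs name bytes into ints, 4 bytes per int. A trailing partial
     * quad holds the remaining 1 to 3 bytes in its low-order positions.
     */
    static int[] packQuads(byte[] name) {
        int[] quads = new int[(name.length + 3) / 4];
        for (int i = 0; i < name.length; i++) {
            // Assumed order: shift what we have so far, append the new byte
            quads[i / 4] = (quads[i / 4] << 8) | (name[i] & 0xFF);
        }
        return quads;
    }

    public static void main(String[] args) {
        int[] q = packQuads("xmlns".getBytes(StandardCharsets.US_ASCII));
        System.out.println(q.length + " quads, first = 0x" + Integer.toHexString(q[0]));

        byte[] longName = new byte[128];
        Arrays.fill(longName, (byte) 'a');
        System.out.println(packQuads(longName).length + " quads for 128 ASCII bytes");
    }
}
```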
-
_nextEvent
protected int _nextEvent
Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete. Type of that event is stored here.
-
_state
protected int _state
In addition to the event type, there is a need for additional state information.
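A toy model may help show why both an event type and a finer-grained state are kept. The sketch below is not the scanner's implementation: the class, its constants and the `feed` method are hypothetical, and it handles ASCII only. It resumes scanning for the `-->` comment end marker when the marker is split across two input chunks.

```java
import java.nio.charset.StandardCharsets;

final class ResumableCommentSketch {
    static final int INCOMPLETE = -1; // hypothetical "event not yet complete" result
    static final int COMMENT = 5;     // same value as XMLStreamConstants.COMMENT

    // Hypothetical sub-states within the comment event
    static final int CONTENT = 0, HYPHEN1 = 1, HYPHEN2 = 2;

    int nextEvent = COMMENT; // what we are parsing, known before it is complete
    int state = CONTENT;     // how far into it we are
    final StringBuilder text = new StringBuilder();

    /** Consumes one chunk; returns the event type once "-->" has been seen. */
    int feed(byte[] chunk) {
        for (byte b : chunk) {
            char c = (char) (b & 0xFF);
            switch (state) {
            case CONTENT:
                if (c == '-') state = HYPHEN1; else text.append(c);
                break;
            case HYPHEN1:
                if (c == '-') {
                    state = HYPHEN2;
                } else {
                    text.append('-').append(c); // lone hyphen was content
                    state = CONTENT;
                }
                break;
            default: // HYPHEN2: "--" must be followed by '>'
                if (c != '>') throw new IllegalStateException("'--' inside comment");
                return nextEvent;
            }
        }
        return INCOMPLETE; // state is retained for the next chunk
    }

    public static void main(String[] args) {
        ResumableCommentSketch s = new ResumableCommentSketch();
        System.out.println(s.feed(" note -".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(s.feed("->".getBytes(StandardCharsets.US_ASCII)));
        System.out.println("[" + s.text + "]");
    }
}
```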
-
_surroundingEvent
protected int _surroundingEvent
For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
-
_pendingInput
protected int _pendingInput
There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs. Since they can be split across input buffer boundaries, first byte(s) may need to be temporarily stored.
If so, this int will store byte(s), in little-endian format (that is, first pending byte is at 0x000000FF, second [if any] at 0x0000FF00, and third at 0x00FF0000). This can be (and is) used to figure out actual number of bytes pending, for multi-byte (UTF-8) character decoding.
Note: it is assumed that if value is 0, there is no data. Thus, if a 0 byte needs to be stored as pending, it has to be masked.
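The little-endian layout described above can be illustrated with a small sketch. The class and helper names are hypothetical and this is not the scanner's own code. It assumes that no stored byte is zero, which holds for the lead and continuation bytes of multi-byte UTF-8 sequences. The example stores the first two bytes of the three-byte sequence E2 82 AC (U+20AC) and completes the character when the third byte arrives.

```java
final class PendingInputSketch {
    /** Number of pending bytes (0 to 3), assuming no stored byte is zero. */
    static int pendingCount(int pending) {
        if ((pending & 0x00FF0000) != 0) return 3;
        if ((pending & 0x0000FF00) != 0) return 2;
        return (pending != 0) ? 1 : 0;
    }

    /** Stores one more byte in the next free slot: 0xFF, then 0xFF00, then 0xFF0000. */
    static int addPending(int pending, int b) {
        int count = pendingCount(pending);
        if (count >= 3) throw new IllegalStateException("at most 3 pending bytes");
        return pending | ((b & 0xFF) << (8 * count));
    }

    /** Completes a 3-byte UTF-8 sequence whose first two bytes are pending. */
    static int completeThreeByte(int pending, int third) {
        int b1 = pending & 0xFF;        // first pending byte (lead byte)
        int b2 = (pending >> 8) & 0xFF; // second pending byte
        return ((b1 & 0x0F) << 12) | ((b2 & 0x3F) << 6) | (third & 0x3F);
    }

    public static void main(String[] args) {
        int pending = addPending(0, 0xE2);  // chunk ends after the first two bytes
        pending = addPending(pending, 0x82);
        System.out.println("pending = 0x" + Integer.toHexString(pending)
                + ", count = " + pendingCount(pending));
        int cp = completeThreeByte(pending, 0xAC); // next chunk supplies the third
        System.out.println("code point = U+" + Integer.toHexString(cp).toUpperCase());
    }
}
```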
-
_endOfInput
protected boolean _endOfInput
Flag that is set when the calling application indicates that there will be no more input to parse.
-
_quadCount
protected int _quadCount
Number of complete quads parsed for current name (quads themselves are stored in _quadBuffer).
-
_currQuad
protected int _currQuad
Bytes parsed for the current, incomplete, quad
-
_currQuadBytes
protected int _currQuadBytes
Number of bytes pending/buffered, stored in _currQuad
-
_entityValue
protected int _entityValue
Entity value accumulated so far
-
_elemAllNsBound
protected boolean _elemAllNsBound
-
_elemAttrCount
protected boolean _elemAttrCount
-
_elemAttrQuote
protected byte _elemAttrQuote
-
_elemAttrName
protected PName _elemAttrName
-
_elemAttrPtr
protected int _elemAttrPtr
Pointer to the next character position, within the attribute value buffer, of the attribute value currently being parsed.
-
_elemNsPtr
protected int _elemNsPtr
Pointer to the next character position of the namespace URI currently being parsed for the current namespace declaration.
-
_inDtdDeclaration
protected boolean _inDtdDeclaration
Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.
-
-
Constructor Detail
-
AsyncByteScanner
protected AsyncByteScanner(ReaderConfig cfg)
-
-
Method Detail
-
_activateEncoding
protected void _activateEncoding()
Initialization method to call when the encoding has been definitely figured out, either from the XML declaration or from the lack of one (using defaults).
- Since:
- 1.1.1
-
endOfInput
public void endOfInput()
Description copied from interface: AsyncInputFeeder
Method that should be called after last chunk of data to parse has been fed. May be called regardless of what AsyncInputFeeder.needMoreInput() returns. After calling this method, no more data can be fed; and parser assumes no more data will be available.
- Specified by:
endOfInput in interface AsyncInputFeeder
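The contract above can be modelled with a small stand-in. Everything in the sketch is hypothetical: `Feeder` only mirrors the two method names mentioned above, `ToyFeeder` and its `feed` method are illustrative, and rejecting late input with an exception and letting `needMoreInput()` return false after `endOfInput()` are assumptions rather than documented behavior.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class EndOfInputSketch {
    /** Hypothetical stand-in mirroring the two AsyncInputFeeder methods named above. */
    interface Feeder {
        boolean needMoreInput();
        void endOfInput();
    }

    /** Toy feeder: buffers chunks and refuses new ones once endOfInput() was called. */
    static final class ToyFeeder implements Feeder {
        private final Deque<byte[]> chunks = new ArrayDeque<>();
        private boolean endOfInput;

        /** Hypothetical feed method; a real feeder defines its own. */
        void feed(byte[] chunk) {
            if (endOfInput) {
                throw new IllegalStateException("no more data can be fed after endOfInput()");
            }
            chunks.add(chunk);
        }

        /** Hands out the next buffered chunk, or null if none is left. */
        byte[] poll() {
            return chunks.poll();
        }

        @Override
        public boolean needMoreInput() {
            // Assumption: nothing buffered and the caller has not signalled the end
            return chunks.isEmpty() && !endOfInput;
        }

        @Override
        public void endOfInput() {
            endOfInput = true; // may be called whatever needMoreInput() returns
        }
    }

    public static void main(String[] args) {
        ToyFeeder feeder = new ToyFeeder();
        feeder.feed("<root/>".getBytes());
        feeder.endOfInput();
        try {
            feeder.feed("late".getBytes());
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```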
-
_releaseBuffers
protected void _releaseBuffers()
- Overrides:
_releaseBuffers in class XmlScanner
-
_closeSource
protected void _closeSource() throws java.io.IOException
Since the async scanner has no access to whatever passes content, there is no input source in same sense as with blocking scanner; and there is nothing to close. But we can at least mark input as having ended.
- Specified by:
_closeSource in class ByteBasedScanner
- Throws:
java.io.IOException
-
verifyAndSetXmlVersion
protected void verifyAndSetXmlVersion() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetXmlEncoding
protected void verifyAndSetXmlEncoding() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetXmlStandalone
protected void verifyAndSetXmlStandalone() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetPublicId
protected void verifyAndSetPublicId() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetSystemId
protected void verifyAndSetSystemId() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_currentByte
protected abstract byte _currentByte() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_nextByte
protected abstract byte _nextByte() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_prevByte
protected abstract byte _prevByte() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handlePI
protected abstract int handlePI() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleDTDInternalSubset
protected abstract boolean handleDTDInternalSubset(boolean init) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleComment
protected abstract int handleComment() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleStartElementStart
protected abstract int handleStartElementStart(byte b) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleStartElement
protected abstract int handleStartElement() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parsePName
protected abstract PName parsePName() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parseNewName
protected abstract PName parseNewName(byte b) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
asyncSkipSpace
protected abstract boolean asyncSkipSpace() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handlePartialCR
protected abstract boolean handlePartialCR() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
finishToken
protected final void finishToken() throws javax.xml.stream.XMLStreamException
Description copied from class: XmlScanner
This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text, etc.).
- Specified by:
finishToken
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
startCharacters
protected abstract int startCharacters(byte b) throws javax.xml.stream.XMLStreamException
Method called to initialize state for a CHARACTERS event after just a single byte has been seen. What needs to be done next depends on whether coalescing mode is set. If it is not set, just a single character needs to be decoded, after which the current event will be incomplete but defined as CHARACTERS. In coalescing mode, the whole content must be read before the current event can be defined. The reason for the difference is that once XMLStreamReader.next() returns, no blocking can occur when calling other methods.
- Returns:
- Event type detected: CHARACTERS if at least one full character was decoded (and can be returned); EVENT_INCOMPLETE if not (part of a multi-byte character is split across an input buffer boundary)
- Throws:
javax.xml.stream.XMLStreamException
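The non-coalescing case can be sketched as follows, assuming UTF-8 input. The class `FirstCharSketch`, its method names, and the value chosen for `EVENT_INCOMPLETE` are assumptions made for illustration; the real method also decodes the character and updates scanner state, which is omitted here.

```java
// Illustrative sketch only; names and the EVENT_INCOMPLETE value are assumptions.
final class FirstCharSketch {
    static final int CHARACTERS = 4;         // same value as XMLStreamConstants.CHARACTERS
    static final int EVENT_INCOMPLETE = 257; // placeholder for the "need more input" result

    /** Number of bytes a UTF-8 sequence needs, judged from its lead byte. */
    static int utf8Length(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1;
        if ((b & 0xE0) == 0xC0) return 2;
        if ((b & 0xF0) == 0xE0) return 3;
        if ((b & 0xF8) == 0xF0) return 4;
        throw new IllegalArgumentException("Not a UTF-8 lead byte: 0x" + Integer.toHexString(b));
    }

    /**
     * Reports CHARACTERS if the first character starting at ptr is fully
     * present in buf[ptr, end), and EVENT_INCOMPLETE if it is split across
     * the buffer boundary.
     */
    static int startCharacters(byte[] buf, int ptr, int end) {
        int needed = utf8Length(buf[ptr]);
        return (end - ptr >= needed) ? CHARACTERS : EVENT_INCOMPLETE;
    }
}
```

For a two-byte character whose second byte has not arrived yet, the sketch reports `EVENT_INCOMPLETE`, and the caller would retry after feeding more input.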
-
handleAttrValue
protected abstract boolean handleAttrValue() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleNsDecl
protected abstract boolean handleNsDecl() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
finishCData
protected void finishCData() throws javax.xml.stream.XMLStreamException
- Specified by:
finishCData
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishComment
protected void finishComment() throws javax.xml.stream.XMLStreamException
- Specified by:
finishComment
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishDTD
protected void finishDTD(boolean copyContents) throws javax.xml.stream.XMLStreamException
- Specified by:
finishDTD
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishPI
protected void finishPI() throws javax.xml.stream.XMLStreamException
- Specified by:
finishPI
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishSpace
protected void finishSpace() throws javax.xml.stream.XMLStreamException
- Specified by:
finishSpace
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipCharacters
protected abstract boolean skipCharacters() throws javax.xml.stream.XMLStreamException
- Specified by:
skipCharacters
in class XmlScanner
- Returns:
- True if the whole characters segment was successfully skipped; false if not
- Throws:
javax.xml.stream.XMLStreamException
-
skipCData
protected void skipCData() throws javax.xml.stream.XMLStreamException
- Specified by:
skipCData
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipComment
protected void skipComment() throws javax.xml.stream.XMLStreamException
- Specified by:
skipComment
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipPI
protected void skipPI() throws javax.xml.stream.XMLStreamException
- Specified by:
skipPI
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipSpace
protected void skipSpace() throws javax.xml.stream.XMLStreamException
- Specified by:
skipSpace
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
loadMore
protected boolean loadMore() throws javax.xml.stream.XMLStreamException
- Specified by:
loadMore
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishCharacters
protected abstract void finishCharacters() throws javax.xml.stream.XMLStreamException
- Specified by:
finishCharacters
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
findPName
protected final PName findPName(int lastQuad, int lastByteCount) throws javax.xml.stream.XMLStreamException
Method called to process a sequence of bytes that is likely to be a PName. At this point we have encountered an end marker, and may have hit either a previously seen well-formed PName, an as-yet-unseen well-formed PName, or a non-well-formed sequence (containing one or more non-name characters without any valid end markers).
- Parameters:
lastQuad
- Word with the last 0 to 3 bytes of the PName; not included in the quad array
lastByteCount
- Number of bytes contained in lastQuad; 0 to 3.
- Throws:
javax.xml.stream.XMLStreamException
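To make the `lastQuad` / `lastByteCount` parameters concrete, here is a small sketch of packing name bytes into 32-bit quads. `QuadSketch` and `Packed` are hypothetical names, and the byte order within a quad (earlier bytes in higher-order positions) is an assumption of this example rather than something the documentation above states.

```java
// Illustrative sketch; class names and byte order within a quad are assumptions.
final class QuadSketch {
    /** Complete quads plus the trailing partial quad of a name. */
    static final class Packed {
        final int[] quads;       // complete 4-byte groups
        final int lastQuad;      // remaining 0 to 3 bytes, packed into one int
        final int lastByteCount; // how many bytes lastQuad holds

        Packed(int[] quads, int lastQuad, int lastByteCount) {
            this.quads = quads;
            this.lastQuad = lastQuad;
            this.lastByteCount = lastByteCount;
        }
    }

    static Packed pack(byte[] name) {
        int full = name.length / 4;
        int[] quads = new int[full];
        int i = 0;
        for (int q = 0; q < full; q++) {
            int v = 0;
            for (int k = 0; k < 4; k++) {
                v = (v << 8) | (name[i++] & 0xFF);
            }
            quads[q] = v;
        }
        int lastByteCount = name.length - i; // always 0 to 3
        int lastQuad = 0;
        while (i < name.length) {
            lastQuad = (lastQuad << 8) | (name[i++] & 0xFF);
        }
        return new Packed(quads, lastQuad, lastByteCount);
    }
}
```

Comparing names as a short array of ints plus one partial word is typically cheaper than comparing decoded strings, which is one plausible motivation for this representation.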
-
addPName
protected final PName addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndAppendEntityCharacter
protected void verifyAndAppendEntityCharacter(int charFromEntity) throws javax.xml.stream.XMLStreamException
Method called to verify the validity of the given character (from an entity) and append it to the text buffer.
- Throws:
javax.xml.stream.XMLStreamException
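A minimal sketch of this kind of check, assuming XML 1.0 rules (the `Char` production) and using a `StringBuilder` in place of the scanner's text buffer. `EntityCharSketch` and its methods are hypothetical, and the exception type is a stand-in for `XMLStreamException`.

```java
// Illustrative sketch; assumes XML 1.0 and uses stand-in names and exception type.
final class EntityCharSketch {
    /** XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXml10Char(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Appends the code point if valid; supplementary characters become a surrogate pair. */
    static void verifyAndAppend(int charFromEntity, StringBuilder out) {
        if (!isXml10Char(charFromEntity)) {
            throw new IllegalArgumentException(
                    "Invalid character from entity: 0x" + Integer.toHexString(charFromEntity));
        }
        out.appendCodePoint(charFromEntity);
    }
}
```

For instance, the character reference `&#x1F600;` would append two `char` values, while `&#0;` would be rejected.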
-
validPublicIdChar
protected boolean validPublicIdChar(int c)
Checks whether a character is valid for a PublicId.
- Parameters:
c
- A character
- Returns:
- true if the character is valid for use in the Public ID of an XML doctype declaration
- See Also:
- "http://www.w3.org/TR/xml/#NT-PubidLiteral"
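As an illustration, the `PubidChar` production from the XML specification referenced above can be checked directly. This is a sketch under that reading of the specification, not Aalto's implementation (which may, for example, use lookup tables); `PubidCharSketch` is a hypothetical name.

```java
// Illustrative sketch of the XML PubidChar production; not Aalto's actual code.
final class PubidCharSketch {
    // Punctuation allowed by PubidChar: [-'()+,./:=?;!*#@$_%]
    private static final String PUNCTUATION = "-'()+,./:=?;!*#@$_%";

    /** PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%] */
    static boolean isPubidChar(int c) {
        return c == 0x20 || c == 0xD || c == 0xA
                || (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || PUNCTUATION.indexOf(c) >= 0;
    }
}
```

Note that tab, `<`, `&`, and the double quote are not valid under this production, even though they are legal XML characters elsewhere.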
-
decodeCharForError
protected int decodeCharForError(byte b) throws javax.xml.stream.XMLStreamException
Description copied from class: ByteBasedScanner
Method called when encountering a byte that cannot be part of a valid character in the current context. Should return the actual decoded character for error reporting purposes.
- Specified by:
decodeCharForError
in class ByteBasedScanner
- Throws:
javax.xml.stream.XMLStreamException
-
checkPITargetName
protected void checkPITargetName(PName targetName) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
throwInternal
protected int throwInternal()
-
reportInvalidOther
protected void reportInvalidOther(int mask, int ptr) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
nextFromProlog
public final int nextFromProlog(boolean isProlog) throws javax.xml.stream.XMLStreamException
- Specified by:
nextFromProlog
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
_startDocumentNoXmlDecl
protected int _startDocumentNoXmlDecl() throws javax.xml.stream.XMLStreamException
Helper method called when it is determined that the document does NOT start with an XML declaration. Needs to return START_DOCUMENT and initialize other state appropriately.
- Throws:
javax.xml.stream.XMLStreamException
-
handlePrologDeclStart
private final int handlePrologDeclStart(boolean isProlog) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
startXmlDeclaration
private final java.lang.Boolean startXmlDeclaration() throws javax.xml.stream.XMLStreamException
Method that deals with recognizing an XML declaration, but not with parsing its contents.
- Returns:
- null if parsing is inconclusive (may or may not be an XML declaration); Boolean.TRUE if a complete XML declaration; Boolean.FALSE if something else
- Throws:
javax.xml.stream.XMLStreamException
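The three-valued return can be illustrated with a small sketch that inspects only the bytes available so far. `XmlDeclProbe` is a hypothetical class; it assumes an ASCII-compatible encoding and treats `<?xml` followed by whitespace as the recognition criterion, which is a simplification of what a real scanner has to do.

```java
// Illustrative sketch; assumes ASCII-compatible input and a simplified criterion.
final class XmlDeclProbe {
    private static final byte[] PREFIX = {'<', '?', 'x', 'm', 'l'};

    /**
     * Returns null if buf[start, end) is too short to decide, Boolean.TRUE if
     * it starts with "<?xml" followed by whitespace, Boolean.FALSE otherwise.
     */
    static Boolean probe(byte[] buf, int start, int end) {
        int avail = end - start;
        int n = Math.min(avail, PREFIX.length);
        for (int i = 0; i < n; i++) {
            if (buf[start + i] != PREFIX[i]) {
                return Boolean.FALSE; // mismatch: definitely something else
            }
        }
        if (avail <= PREFIX.length) {
            return null; // matched so far, but the byte after "<?xml" is not available yet
        }
        byte next = buf[start + PREFIX.length];
        boolean ws = next == ' ' || next == '\t' || next == '\r' || next == '\n';
        // "<?xml-stylesheet" and similar are processing instructions, not declarations
        return ws ? Boolean.TRUE : Boolean.FALSE;
    }
}
```

A caller that receives null would wait for more input and probe again from the same start position.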
-
handleXmlDeclaration
private int handleXmlDeclaration() throws javax.xml.stream.XMLStreamException
Method called to complete parsing of the XML declaration once it has been reliably detected.
- Returns:
- Completed token (START_DOCUMENT) if fully parsed; incomplete (EVENT_INCOMPLETE) otherwise
- Throws:
javax.xml.stream.XMLStreamException
-
handleDTD
private int handleDTD() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parseDtdId
private final boolean parseDtdId(char[] outputBuffer, int outputPtr, boolean system) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_parseNewXmlDeclName
private final PName _parseNewXmlDeclName(byte b) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_parseXmlDeclName
private final PName _parseXmlDeclName() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_findXmlDeclName
protected final PName _findXmlDeclName(int lastQuad, int lastByteCount) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parseXmlDeclAttr
protected boolean parseXmlDeclAttr(char[] outputBuffer, int outputPtr) throws javax.xml.stream.XMLStreamException
Method called to try to parse an XML pseudo-attribute value. This is relatively simple, since we can't have linefeeds or entities; and although there are exact rules for what is allowed, we can do coarse parsing and only later verify validity (for encoding, stricter parsing could be done in the future?).
NOTE: pseudo-attribute values are required to be 7-bit ASCII, so a crude cast can be used.
- Returns:
- True if we managed to parse the whole pseudo-attribute
- Throws:
javax.xml.stream.XMLStreamException
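The coarse parsing and crude cast described above might look roughly like the following sketch. `PseudoAttrSketch` and its signature are hypothetical; the real method works on the scanner's internal buffers and keeps its position across calls, which this example does not model.

```java
// Illustrative sketch; signature and buffer handling are assumptions.
final class PseudoAttrSketch {
    /**
     * Copies bytes from in[start, end) into out until the closing quote.
     * Returns true if the closing quote was found (whole value parsed),
     * false if the input ran out first.
     */
    static boolean parseValue(byte[] in, int start, int end, byte quote, StringBuilder out) {
        for (int i = start; i < end; i++) {
            byte b = in[i];
            if (b == quote) {
                return true;
            }
            // Crude cast: acceptable only because valid values are 7-bit ASCII;
            // validity is expected to be checked in a later step.
            out.append((char) b);
        }
        return false;
    }
}
```

With the input `UTF-8" standalone=...` and a double quote as the delimiter, the sketch collects `UTF-8` and reports success; with a truncated input such as `UTF`, it reports that more data is needed.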
-
-