Package com.fasterxml.aalto.in
Class ReaderScanner
java.lang.Object
  com.fasterxml.aalto.in.XmlScanner
    com.fasterxml.aalto.in.ReaderScanner
All Implemented Interfaces: XmlConsts, javax.xml.namespace.NamespaceContext, javax.xml.stream.XMLStreamConstants
public final class ReaderScanner extends XmlScanner
This is the concrete scanner implementation used when input comes as a Reader. In general, using this scanner is quite a bit less efficient than using the InputStream-based scanner. Nonetheless, it is included for completeness, since the Stax interface allows passing Readers as input sources.
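The following is a minimal sketch of the Reader-based Stax entry point mentioned above. It uses only the standard javax.xml.stream API with whatever Stax implementation the runtime provides (not necessarily Aalto), and the class name ReaderInputSketch and method elementNames are hypothetical names chosen for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Aalto: shows a Reader used as a Stax input source.
final class ReaderInputSketch {

    /** Returns the local names of all start elements found in the character source. */
    static List<String> elementNames(Reader in) {
        List<String> names = new ArrayList<>();
        try {
            // The factory picks whichever Stax implementation is available at runtime.
            XMLStreamReader sr = XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                while (sr.hasNext()) {
                    if (sr.next() == XMLStreamConstants.START_ELEMENT) {
                        names.add(sr.getLocalName());
                    }
                }
            } finally {
                sr.close();
            }
        } catch (XMLStreamException e) {
            // Rethrow unchecked to keep the sketch easy to call.
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames(new StringReader("<root><leaf/></root>")));
    }
}
```

When the input is already available as bytes, passing an InputStream instead lets the parser handle decoding itself, which is the case the class description recommends.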
-
-
Field Summary
Fields: Modifier and Type, Field, Description
protected java.io.Reader
_in
Underlying Reader to use for reading content.
protected char[]
_inputBuffer
protected int
_inputEnd
protected int
_inputPtr
protected CharBasedPNameTable
_symbols
For now, symbol table contains prefixed names.
protected int
mTmpChar
Storage location for a single character that cannot be pushed back (for example, a multi-byte char).
private static XmlCharTypes
sCharTypes
Although Java chars are basically UTF-16 in memory, the closest match for char types is Latin1.
-
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _pastBytesOrChars, _publicId, _rowStartOffset, _startColumn, _startRawOffset, _startRow, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
-
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
-
-
Constructor Summary
Constructors: Constructor, Description
ReaderScanner(ReaderConfig cfg, java.io.Reader r)
ReaderScanner(ReaderConfig cfg, java.io.Reader r, char[] buffer, int ptr, int last)
-
Method Summary
Methods: Modifier and Type, Method, Description
protected void
_closeSource()
protected int
_nextEntity()
Helper method used to isolate things that need to be (re)set in cases where
protected void
_releaseBuffers()
protected PName
addPName(char[] nameBuffer, int nameLen, int hash)
protected int
checkInTreeIndentation(char c)
Note: consecutive white space is only considered indentation if the following token seems like a tag (start/end).
protected int
checkPrologIndentation(char c)
private char
checkSurrogate(char firstChar)
This method is called to verify that a surrogate pair found describes a legal surrogate pair (i.e. expands to a legal XML char).
private int
checkSurrogateNameChar(char firstChar, char sec, int index)
private int
collectValue(int attrPtr, char quoteChar, PName attrName)
This method implements the tight loop for parsing attribute values.
private int
decodeSurrogate(char firstChar)
This method is similar to checkSurrogate, but returns the actual character code encoded by the surrogate pair.
protected void
finishCData()
protected void
finishCharacters()
protected void
finishCoalescedCData()
protected void
finishCoalescedCharacters()
protected void
finishCoalescedText()
Method that gets called after a primary text segment (of type CHARACTERS or CDATA, not applicable to SPACE) has been read in text buffer.
protected void
finishComment()
protected void
finishDTD(boolean copyContents)
protected void
finishPI()
protected void
finishSpace()
protected void
finishToken()
This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text, etc.).
int
getCurrentColumnNr()
org.codehaus.stax2.XMLStreamLocation2
getCurrentLocation()
long
getEndingByteOffset()
long
getEndingCharOffset()
long
getStartingByteOffset()
long
getStartingCharOffset()
protected int
handleCharEntity()
protected int
handleCommentOrCdataStart()
private int
handleDtdStart()
protected int
handleEndElement()
protected int
handleEntityInText(boolean inAttr)
private void
handleNsDeclaration(PName name, char quoteChar)
Method called from the main START_ELEMENT handling loop, to parse namespace URI values.
protected int
handlePIStart()
protected int
handlePrologDeclStart(boolean isProlog)
protected int
handleStartElement(char c)
protected boolean
loadAndRetain(int nrOfChars)
protected boolean
loadMore()
protected char
loadOne()
protected char
loadOne(int type)
protected void
markLF()
protected void
markLF(int offset)
private void
matchAsciiKeyword(java.lang.String keyw)
int
nextFromProlog(boolean isProlog)
int
nextFromTree()
protected PName
parsePName(char c)
protected java.lang.String
parsePublicId(char quoteChar)
protected java.lang.String
parseSystemId(char quoteChar)
private void
reportInvalidFirstSurrogate(char ch)
private void
reportInvalidSecondSurrogate(char ch)
protected void
setStartLocation()
protected void
skipCData()
protected boolean
skipCharacters()
protected boolean
skipCoalescedText()
Method that gets called after a primary text segment (of type CHARACTERS or CDATA, not applicable to SPACE) has been skipped.
protected void
skipComment()
protected char
skipInternalWs(boolean reqd, java.lang.String msg)
protected void
skipPI()
protected void
skipSpace()
-
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getCurrentLineNr, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, handleInvalidXmlChar, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportPrologUnexpElement, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipToken, throwInvalidSpace, throwNullChar, throwUnexpectedChar, verifyXmlChar
-
-
-
-
Field Detail
-
sCharTypes
private static final XmlCharTypes sCharTypes
Although Java chars are basically UTF-16 in memory, the closest match for char types is Latin1.
-
_in
protected java.io.Reader _in
Underlying Reader to use for reading content.
-
_inputBuffer
protected char[] _inputBuffer
-
_inputPtr
protected int _inputPtr
-
_inputEnd
protected int _inputEnd
-
mTmpChar
protected int mTmpChar
Storage location for a single character that cannot be pushed back (for example, a multi-byte char).
-
_symbols
protected final CharBasedPNameTable _symbols
For now, symbol table contains prefixed names. In future it is possible that they may be split into prefixes and local names?
-
-
Constructor Detail
-
ReaderScanner
public ReaderScanner(ReaderConfig cfg, java.io.Reader r, char[] buffer, int ptr, int last)
-
ReaderScanner
public ReaderScanner(ReaderConfig cfg, java.io.Reader r)
-
-
Method Detail
-
_releaseBuffers
protected void _releaseBuffers()
- Overrides:
_releaseBuffers
in class XmlScanner
-
_closeSource
protected void _closeSource() throws java.io.IOException
- Specified by:
_closeSource
in class XmlScanner
- Throws:
java.io.IOException
-
finishToken
protected final void finishToken() throws javax.xml.stream.XMLStreamException
Description copied from class: XmlScanner
This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text, etc.).
- Specified by:
finishToken
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
nextFromProlog
public final int nextFromProlog(boolean isProlog) throws javax.xml.stream.XMLStreamException
- Specified by:
nextFromProlog
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
nextFromTree
public final int nextFromTree() throws javax.xml.stream.XMLStreamException
- Specified by:
nextFromTree
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
_nextEntity
protected int _nextEntity()
Helper method used to isolate things that need to be (re)set in cases where
-
handlePrologDeclStart
protected final int handlePrologDeclStart(boolean isProlog) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleDtdStart
private final int handleDtdStart() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleCommentOrCdataStart
protected final int handleCommentOrCdataStart() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handlePIStart
protected final int handlePIStart() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleCharEntity
protected final int handleCharEntity() throws javax.xml.stream.XMLStreamException
- Returns:
- Code point for the entity that expands to a valid XML content character.
- Throws:
javax.xml.stream.XMLStreamException
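As an illustration of what expanding a character entity to a code point involves, here is a minimal sketch that decodes a complete numeric character reference such as `&#38;` or `&#x26;`. It is not the scanner's actual code: the class CharEntitySketch and its methods are hypothetical, the input is assumed to be a whole reference held in a String rather than a streaming buffer, and the validity check follows the XML 1.0 Char production.

```java
// Hypothetical sketch, not Aalto's implementation.
final class CharEntitySketch {
    // Highest Unicode code point; assumed to match the inherited MAX_UNICODE_CHAR constant.
    static final int MAX_UNICODE_CHAR = 0x10FFFF;

    /** Decodes "&#NNN;" (decimal) or "&#xHHH;" (hex) into a code point. */
    static int decodeCharReference(String ref) {
        if (!ref.startsWith("&#") || !ref.endsWith(";")) {
            throw new IllegalArgumentException("Not a numeric character reference: " + ref);
        }
        String body = ref.substring(2, ref.length() - 1);
        boolean hex = body.startsWith("x");
        String digits = hex ? body.substring(1) : body;
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("Missing digits in " + ref);
        }
        int radix = hex ? 16 : 10;
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            int d = asciiDigit(digits.charAt(i), hex);
            if (d < 0) {
                throw new IllegalArgumentException("Invalid digit in " + ref);
            }
            value = value * radix + d;
            // Checking on every step also prevents int overflow on long inputs.
            if (value > MAX_UNICODE_CHAR) {
                throw new IllegalArgumentException("Value out of Unicode range in " + ref);
            }
        }
        if (!isXml10Char(value)) {
            throw new IllegalArgumentException("Not a legal XML 1.0 character: " + ref);
        }
        return value;
    }

    /** Returns the digit value of an ASCII digit, or -1 if the character is not one. */
    private static int asciiDigit(char ch, boolean hex) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (hex && ch >= 'a' && ch <= 'f') return 10 + (ch - 'a');
        if (hex && ch >= 'A' && ch <= 'F') return 10 + (ch - 'A');
        return -1;
    }

    /** XML 1.0 Char production: tab, LF, CR, and the non-surrogate ranges from 0x20 up. */
    static boolean isXml10Char(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }
}
```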
-
handleStartElement
protected final int handleStartElement(char c) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
collectValue
private int collectValue(int attrPtr, char quoteChar, PName attrName) throws javax.xml.stream.XMLStreamException
This method implements the tight loop for parsing attribute values. It's off-lined from the main start element method to simplify the main method, which makes the code more maintainable and possibly easier for JIT/HotSpot to optimize.
- Throws:
javax.xml.stream.XMLStreamException
-
handleNsDeclaration
private void handleNsDeclaration(PName name, char quoteChar) throws javax.xml.stream.XMLStreamException
Method called from the main START_ELEMENT handling loop, to parse namespace URI values.
- Throws:
javax.xml.stream.XMLStreamException
-
handleEndElement
protected final int handleEndElement() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleEntityInText
protected final int handleEntityInText(boolean inAttr) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
finishComment
protected final void finishComment() throws javax.xml.stream.XMLStreamException
- Specified by:
finishComment
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishPI
protected final void finishPI() throws javax.xml.stream.XMLStreamException
- Specified by:
finishPI
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishDTD
protected final void finishDTD(boolean copyContents) throws javax.xml.stream.XMLStreamException
- Specified by:
finishDTD
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishCData
protected final void finishCData() throws javax.xml.stream.XMLStreamException
- Specified by:
finishCData
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishCharacters
protected final void finishCharacters() throws javax.xml.stream.XMLStreamException
- Specified by:
finishCharacters
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishSpace
protected final void finishSpace() throws javax.xml.stream.XMLStreamException
- Specified by:
finishSpace
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishCoalescedText
protected final void finishCoalescedText() throws javax.xml.stream.XMLStreamException
Method that gets called after a primary text segment (of type CHARACTERS or CDATA, not applicable to SPACE) has been read in text buffer. Method has to see if the following event would be textual as well, and if so, read it (and any other following textual segments).
- Throws:
javax.xml.stream.XMLStreamException
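Coalescing, as described above, is what a Stax caller observes when adjacent CHARACTERS and CDATA segments are requested as one text event. The sketch below shows the caller's side using the standard XMLInputFactory.IS_COALESCING property with whatever Stax implementation the runtime provides. The class CoalescingSketch and method textEvents are hypothetical, and how many separate text events appear in non-coalescing mode depends on the implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Aalto: lists the text segments a Stax reader reports.
final class CoalescingSketch {

    /** Returns the contents of every CHARACTERS or CDATA event, in document order. */
    static List<String> textEvents(String xml, boolean coalescing) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Standard Stax property; when true, adjacent text segments are merged.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
        List<String> segments = new ArrayList<>();
        try {
            XMLStreamReader sr = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (sr.hasNext()) {
                    int type = sr.next();
                    if (type == XMLStreamConstants.CHARACTERS || type == XMLStreamConstants.CDATA) {
                        segments.add(sr.getText());
                    }
                }
            } finally {
                sr.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return segments;
    }

    public static void main(String[] args) {
        String xml = "<a>x<![CDATA[y]]>z</a>";
        System.out.println("coalescing:     " + textEvents(xml, true));
        System.out.println("non-coalescing: " + textEvents(xml, false));
    }
}
```

Either way the concatenated text is the same; coalescing only changes how many events carry it.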
-
finishCoalescedCData
protected final void finishCoalescedCData() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
finishCoalescedCharacters
protected final void finishCoalescedCharacters() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
skipCoalescedText
protected final boolean skipCoalescedText() throws javax.xml.stream.XMLStreamException
Method that gets called after a primary text segment (of type CHARACTERS or CDATA, not applicable to SPACE) has been skipped. Method has to see if the following event would be textual as well, and if so, skip it (and any other following textual segments).
- Specified by:
skipCoalescedText
in class XmlScanner
- Returns:
- True if we encountered an unexpandable entity
- Throws:
javax.xml.stream.XMLStreamException
-
skipComment
protected final void skipComment() throws javax.xml.stream.XMLStreamException
- Specified by:
skipComment
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipPI
protected final void skipPI() throws javax.xml.stream.XMLStreamException
- Specified by:
skipPI
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipCharacters
protected final boolean skipCharacters() throws javax.xml.stream.XMLStreamException
- Specified by:
skipCharacters
in class XmlScanner
- Returns:
- True, if an unexpanded entity was encountered (and is now pending)
- Throws:
javax.xml.stream.XMLStreamException
-
skipCData
protected final void skipCData() throws javax.xml.stream.XMLStreamException
- Specified by:
skipCData
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipSpace
protected final void skipSpace() throws javax.xml.stream.XMLStreamException
- Specified by:
skipSpace
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipInternalWs
protected char skipInternalWs(boolean reqd, java.lang.String msg) throws javax.xml.stream.XMLStreamException
- Returns:
- First character following skipped white space
- Throws:
javax.xml.stream.XMLStreamException
-
matchAsciiKeyword
private final void matchAsciiKeyword(java.lang.String keyw) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
checkInTreeIndentation
protected final int checkInTreeIndentation(char c) throws javax.xml.stream.XMLStreamException
Note: consecutive white space is only considered indentation if the following token seems like a tag (start/end). This is so that if a CDATA section follows, it can be coalesced in coalescing mode. Although we could check if coalescing mode is enabled, this should seldom have a significant effect either way, so it removes one possible source of problems in coalescing mode.
- Returns:
- -1, if indentation was handled; offset in the output buffer, if not
- Throws:
javax.xml.stream.XMLStreamException
-
checkPrologIndentation
protected final int checkPrologIndentation(char c) throws javax.xml.stream.XMLStreamException
- Returns:
- -1, if indentation was handled; offset in the output buffer, if not
- Throws:
javax.xml.stream.XMLStreamException
-
parsePName
protected PName parsePName(char c) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
addPName
protected final PName addPName(char[] nameBuffer, int nameLen, int hash) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parsePublicId
protected java.lang.String parsePublicId(char quoteChar) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parseSystemId
protected java.lang.String parseSystemId(char quoteChar) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
checkSurrogate
private char checkSurrogate(char firstChar) throws javax.xml.stream.XMLStreamException
This method is called to verify that a surrogate pair found describes a legal surrogate pair (i.e. expands to a legal XML char).
- Throws:
javax.xml.stream.XMLStreamException
-
checkSurrogateNameChar
private int checkSurrogateNameChar(char firstChar, char sec, int index) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
decodeSurrogate
private int decodeSurrogate(char firstChar) throws javax.xml.stream.XMLStreamException
This method is similar to checkSurrogate, but returns the actual character code encoded by the surrogate pair. This is needed if further validation rules (such as name character checks) are to be done.
- Throws:
javax.xml.stream.XMLStreamException
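The arithmetic behind turning a surrogate pair into a character code can be sketched as follows. This is an illustration under stated assumptions rather than the scanner's code: the class SurrogateSketch is hypothetical, and unlike the method above, which takes only the first char and presumably reads the second from its input, this version takes both chars as parameters.

```java
// Hypothetical sketch, not Aalto's implementation.
final class SurrogateSketch {

    /** Combines a UTF-16 high/low surrogate pair into a single code point. */
    static int decodeSurrogate(char first, char second) {
        if (!Character.isHighSurrogate(first)) {
            throw new IllegalArgumentException(
                    "Invalid first surrogate: 0x" + Integer.toHexString(first));
        }
        if (!Character.isLowSurrogate(second)) {
            throw new IllegalArgumentException(
                    "Invalid second surrogate: 0x" + Integer.toHexString(second));
        }
        // Each surrogate carries 10 bits; the pair encodes an offset from 0x10000.
        return 0x10000 + ((first - 0xD800) << 10) + (second - 0xDC00);
    }

    public static void main(String[] args) {
        int cp = decodeSurrogate('\uD83D', '\uDE00');
        System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
    }
}
```

A well-formed pair always lands in the range 0x10000 to 0x10FFFF, so the decoded value is what later checks, such as name character validation, would operate on.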
-
reportInvalidFirstSurrogate
private void reportInvalidFirstSurrogate(char ch) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
reportInvalidSecondSurrogate
private void reportInvalidSecondSurrogate(char ch) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
getCurrentLocation
public org.codehaus.stax2.XMLStreamLocation2 getCurrentLocation()
- Specified by:
getCurrentLocation
in class XmlScanner
- Returns:
- Current input location
-
getCurrentColumnNr
public int getCurrentColumnNr()
- Specified by:
getCurrentColumnNr
in class XmlScanner
-
getStartingByteOffset
public long getStartingByteOffset()
- Specified by:
getStartingByteOffset
in class XmlScanner
-
getStartingCharOffset
public long getStartingCharOffset()
- Specified by:
getStartingCharOffset
in class XmlScanner
-
getEndingByteOffset
public long getEndingByteOffset() throws javax.xml.stream.XMLStreamException
- Specified by:
getEndingByteOffset
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
getEndingCharOffset
public long getEndingCharOffset() throws javax.xml.stream.XMLStreamException
- Specified by:
getEndingCharOffset
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
markLF
protected final void markLF(int offset)
-
markLF
protected final void markLF()
-
setStartLocation
protected final void setStartLocation()
-
loadMore
protected final boolean loadMore() throws javax.xml.stream.XMLStreamException
- Specified by:
loadMore
in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
loadOne
protected final char loadOne() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
loadOne
protected final char loadOne(int type) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
loadAndRetain
protected final boolean loadAndRetain(int nrOfChars) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
-