Package org.gjt.xpp.impl.pullparser
Class PullParser
java.lang.Object
org.gjt.xpp.impl.pullparser.PullParser
- All Implemented Interfaces:
XmlPullParser, XmlPullParserBufferControl, XmlPullParserEventPosition
public class PullParser
extends Object
implements XmlPullParser, XmlPullParserBufferControl, XmlPullParserEventPosition
XML Pull Parser (XPP) pulls XML events from an input stream.
Advantages:
- very simple pull interface, ideal for deserializing XML objects (like SOAP)
- simple and efficient thin wrapper around the Tokenizer class: compared with using Tokenizer directly, it adds about 10% processing time for big documents and at most 50% for small documents
- lightweight memory model with minimized memory allocation: element content and attributes are read only on explicit method calls, and both StartTag and EndTag can be reused during parsing
- small: total compiled size is around 20K
- supports namespace parsing by default (can be switched off)
- support for mixed content can be explicitly disabled
Limitations:
- this is a beta version and may still have bugs :-)
- does not parse DTDs (recognizes only predefined entities)
- Author:
- Aleksander Slominski
-
Field Summary
Fields (modifier and type, field: description)
protected Attribute[] attrPos: temporary array of current attributes
protected int attrPosEnd: index for last attribute in attrPos array
protected int attrPosSize: size of attrPos array
protected static final boolean CHECK_ATTRIB_UNIQ: Should attribute uniqueness be checked, as specified in the XML and NS specifications?
protected String elContent: Content of current element if in CONTENT state
protected ElementContent[] elStack: temporary array to keep ElementContent stack
protected int elStackDepth: how many elements are on elStack
protected int elStackSize: size of elStack array
protected boolean emptyElement: Have we read an empty element?
protected int eventEnd: end position of current event in tokenizer buffer
protected int eventStart: start position of current event in tokenizer buffer
protected Hashtable prefix2Ns: mapping of namespace prefixes to URIs
protected boolean reportNsAttribs: should parser report namespace xmlns* attributes?
protected boolean seenRootElement: Have we seen the root element?
protected byte state: what is the current event type as returned from next()?
protected boolean supportNs: should parser support namespaces?
protected byte token: what is the current token returned from tokenizer
protected Tokenizer tokenizer: XML tokenizer that does the actual tokenizing of the input stream.
protected static final boolean USE_QNAMEBUF
Fields inherited from interface org.gjt.xpp.XmlPullParser
CONTENT, END_DOCUMENT, END_TAG, START_TAG
-
Constructor Summary
Constructors (constructor: description)
PullParser(): Create instance of pull parser.
-
Method Summary
Methods (modifier and type, method: description)
protected void ensureAttribs(int size): Make sure there is enough space in the temporary attributes array.
protected void ensureCapacity(int size): Make sure there is enough space to keep an element stack of the passed size.
int getBufferShrinkOffset()
int getColumnNumber()
int getContentLength(): Return how big the content is.
int getDepth(): Returns the current depth of the element.
char[] getEventBuffer(): NOTE: This may be an internal buffer and is valid only until the next call to next(); do NOT attempt to modify it!
int getEventEnd()
int getEventStart()
byte getEventType(): Returns the type of the current element (START_TAG, END_TAG, CONTENT, etc.)
int getHardLimit()
int getLineNumber()
String getLocalName(): Returns the local name of the current element (current event must be START_TAG or END_TAG).
int getNamespacesLength(int depth)
String getNamespaceUri(): Returns the namespace URI of the current element. Returns null if not applicable (current event must be START_TAG or END_TAG).
String getPosDesc(): Return string describing current position of parser in input stream.
String getPrefix(): Returns the prefix of the current element or null if the element has no prefix.
String getQNameLocal(String qName): Return local part of qname.
String getQNameUri(String qName): Return uri part of qname.
String getRawName(): Returns the raw name (prefix + ':' + localName) of the current element (current event must be START_TAG or END_TAG).
int getSoftLimit()
boolean isAllowedMixedContent(): Is mixed element content allowed?
boolean isBufferShrinkable()
boolean isNamespaceAttributesReporting(): Is parser going to report namespace attributes (xmlns*)?
boolean isNamespaceAware(): Is parser namespace aware?
boolean isWhitespaceContent(): Return true if the just-read CONTENT contained only whitespace.
byte next(): This is the key method: it reads more from the input stream and returns the next event type (such as START_TAG, END_TAG, CONTENT).
String readContent(): Return a String that contains the just-read CONTENT.
void readEndTag(XmlEndTag etag): Read the value of the just-read END_TAG into the EndTag passed as argument.
void readNamespacesPrefixes(int depth, String[] prefixes, int off, int len): Return namespace prefixes for element at depth
void readNamespacesUris(int depth, String[] uris, int off, int len): Return namespace URIs for element at depth
byte readNode: Read subtree into node: call readNodeWithoutChildren and then parse the subtree, adding children (values obtained with readContent or readNodeWithoutChildren).
void readNodeWithoutChildren: Read node: it calls readStartTag and then, if the parser is namespace aware, currently declared namespaces will be added and defaultNamespace will be set.
void readStartTag(XmlStartTag stag): Read the value of the just-read START_TAG into the StartTag passed as argument.
void reset(): Reset parser state so it can be used to parse new
protected void resetState()
void setAllowedMixedContent(boolean enable): Allow for mixed element content.
void setBufferShrinkable(boolean shrinkable)
void setHardLimit(int value)
void setInput(char[] buf): Reset parser and set new input.
void setInput(char[] buf, int off, int len): Set the input for parser.
void setInput: Reset parser and set new input.
void setNamespaceAttributesReporting(boolean enable): Make the parser report xmlns* attributes.
void setNamespaceAware(boolean awareness): Set support of namespaces.
void setSoftLimit(int value)
byte skipNode(): If the parser has just read a start tag, this skips the whole subtree contained in this element.
-
Field Details
-
USE_QNAMEBUF
protected static final boolean USE_QNAMEBUF
-
CHECK_ATTRIB_UNIQ
protected static final boolean CHECK_ATTRIB_UNIQ
Should attribute uniqueness be checked, as specified in the XML and NS specifications?
-
emptyElement
protected boolean emptyElement
Have we read an empty element?
-
seenRootElement
protected boolean seenRootElement
Have we seen the root element?
-
elContent
protected String elContent
Content of current element if in CONTENT state
-
tokenizer
protected Tokenizer tokenizer
XML tokenizer that does the actual tokenizing of the input stream.
-
eventStart
protected int eventStart
start position of current event in tokenizer buffer
-
eventEnd
protected int eventEnd
end position of current event in tokenizer buffer
-
state
protected byte state
what is the current event type as returned from next()?
-
token
protected byte token
what is the current token returned from tokenizer
-
supportNs
protected boolean supportNs
should parser support namespaces?
-
reportNsAttribs
protected boolean reportNsAttribs
should parser report namespace xmlns* attributes?
-
prefix2Ns
protected Hashtable prefix2Ns
mapping of namespace prefixes to URIs
-
attrPosEnd
protected int attrPosEnd
index for last attribute in attrPos array
-
attrPosSize
protected int attrPosSize
size of attrPos array
-
attrPos
protected Attribute[] attrPos
temporary array of current attributes
-
elStackDepth
protected int elStackDepth
how many elements are on elStack
-
elStackSize
protected int elStackSize
size of elStack array
-
elStack
protected ElementContent[] elStack
temporary array to keep ElementContent stack
-
-
Constructor Details
-
PullParser
public PullParser()
Create instance of pull parser.
-
-
Method Details
-
setInput
Reset parser and set new input.
- Specified by:
setInput in interface XmlPullParser
-
setInput
public void setInput(char[] buf)
Reset parser and set new input.
- Specified by:
setInput in interface XmlPullParser
-
setInput
Description copied from interface: XmlPullParser
Set the input for parser.
- Specified by:
setInput in interface XmlPullParser
- Throws:
XmlPullParserException
-
reset
public void reset()
Reset parser state so it can be used to parse new
- Specified by:
reset in interface XmlPullParser
-
isAllowedMixedContent
public boolean isAllowedMixedContent()
Description copied from interface: XmlPullParser
Is mixed element content allowed?
- Specified by:
isAllowedMixedContent in interface XmlPullParser
-
setAllowedMixedContent
public void setAllowedMixedContent(boolean enable)
Allow for mixed element content. Enabled by default. When disabled, an element must contain either text or other elements.
- Specified by:
setAllowedMixedContent in interface XmlPullParser
-
isNamespaceAware
public boolean isNamespaceAware()
Description copied from interface: XmlPullParser
Is parser namespace aware?
- Specified by:
isNamespaceAware in interface XmlPullParser
-
setNamespaceAware
Set support of namespaces. Disabled by default.
- Specified by:
setNamespaceAware in interface XmlPullParser
- Throws:
XmlPullParserException
-
isNamespaceAttributesReporting
public boolean isNamespaceAttributesReporting()
Description copied from interface: XmlPullParser
Is parser going to report namespace attributes (xmlns*)?
- Specified by:
isNamespaceAttributesReporting in interface XmlPullParser
-
setNamespaceAttributesReporting
public void setNamespaceAttributesReporting(boolean enable)
Make the parser report xmlns* attributes. Disabled by default. Only meaningful when namespaces are enabled (when namespaces are disabled, all attributes are always reported).
- Specified by:
setNamespaceAttributesReporting in interface XmlPullParser
-
getNamespaceUri
Description copied from interface: XmlPullParser
Returns the namespace URI of the current element. Returns null if not applicable (current event must be START_TAG or END_TAG).
- Specified by:
getNamespaceUri in interface XmlPullParser
-
getLocalName
Description copied from interface: XmlPullParser
Returns the local name of the current element (current event must be START_TAG or END_TAG).
- Specified by:
getLocalName in interface XmlPullParser
-
getPrefix
Description copied from interface: XmlPullParser
Returns the prefix of the current element or null if the element has no prefix (current event must be START_TAG or END_TAG).
- Specified by:
getPrefix in interface XmlPullParser
-
getRawName
Description copied from interface: XmlPullParser
Returns the raw name (prefix + ':' + localName) of the current element (current event must be START_TAG or END_TAG).
- Specified by:
getRawName in interface XmlPullParser
-
getQNameLocal
Description copied from interface: XmlPullParser
Return local part of qname. For example, for 'xsi:type' it returns 'type'.
- Specified by:
getQNameLocal in interface XmlPullParser
-
getQNameUri
Description copied from interface: XmlPullParser
Return uri part of qname. It depends on the current state of the parser to find which namespace URI is mapped from the namespace prefix. For example, for 'xsi:type', if the xsi namespace prefix was declared as 'urn:foo', it will return 'urn:foo'.
- Specified by:
getQNameUri in interface XmlPullParser
- Throws:
XmlPullParserException
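The 'xsi:type' example for getQNameLocal and getQNameUri can be illustrated with a minimal sketch. It is not the XPP implementation: the class QNameSketch, its declare method and its prefixToUri map are hypothetical stand-ins for the parser's current prefix mappings, and returning null for an unprefixed or undeclared name is an assumption (the real getQNameUri is declared to throw XmlPullParserException).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of XPP: splits a qname and resolves its prefix. */
class QNameSketch {
    // Assumed stand-in for the parser's current prefix-to-URI mapping.
    private final Map<String, String> prefixToUri = new HashMap<>();

    /** Records a prefix declaration, as an xmlns:prefix attribute would. */
    void declare(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
    }

    /** Returns the part after the first ':' or the whole name if there is no prefix. */
    static String localPart(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? qName : qName.substring(colon + 1);
    }

    /** Returns the URI bound to the qname's prefix, or null if unprefixed or undeclared. */
    String uriPart(String qName) {
        int colon = qName.indexOf(':');
        if (colon < 0) {
            return null; // this sketch ignores any default namespace
        }
        return prefixToUri.get(qName.substring(0, colon));
    }
}
```

With the prefix xsi declared as 'urn:foo', localPart("xsi:type") gives 'type' and uriPart("xsi:type") gives 'urn:foo', matching the examples in the descriptions above.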
-
getDepth
public int getDepth()
Description copied from interface: XmlPullParser
Returns the current depth of the element.
- Specified by:
getDepth in interface XmlPullParser
-
getNamespacesLength
public int getNamespacesLength(int depth)
- Specified by:
getNamespacesLength in interface XmlPullParser
-
readNamespacesPrefixes
public void readNamespacesPrefixes(int depth, String[] prefixes, int off, int len) throws XmlPullParserException
Return namespace prefixes for element at depth
- Specified by:
readNamespacesPrefixes in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNamespacesUris
public void readNamespacesUris(int depth, String[] uris, int off, int len) throws XmlPullParserException
Return namespace URIs for element at depth
- Specified by:
readNamespacesUris in interface XmlPullParser
- Throws:
XmlPullParserException
-
getPosDesc
Return string describing current position of parser in input stream.
- Specified by:
getPosDesc in interface XmlPullParser
-
getLineNumber
public int getLineNumber()
- Specified by:
getLineNumber in interface XmlPullParser
-
getColumnNumber
public int getColumnNumber()
- Specified by:
getColumnNumber in interface XmlPullParser
-
next
This is the key method: it reads more from the input stream and returns the next event type (such as START_TAG, END_TAG, CONTENT), or END_DOCUMENT if there is no more input.
This is a simple automaton (in pseudo-code):
    byte next() {
      while(state != END_DOCUMENT) {
        token = tokenizer.next(); // get next XML token
        switch(token) {
          case Tokenizer.END_DOCUMENT:
            return state = END_DOCUMENT;
          case Tokenizer.CONTENT:
            // check if content allowed - only inside element
            return state = CONTENT;
          case Tokenizer.ETAG_NAME:
            // pop element from stack - check that start and end tag match
            // if namespaces supported restore namespace prefix mappings
            return state = END_TAG;
          case Tokenizer.STAG_NAME:
            // create new element and push it on stack
            // process attributes (including namespaces)
            // set emptyElement = true; if empty element
            // check attribute uniqueness (including namespace prefixes)
            return state = START_TAG;
        }
      }
    }
Actual parsing is more complex, especially for start tags, because attributes are reported separately by the tokenizer and namespace prefixes and URIs must be declared.
- Specified by:
next in interface XmlPullParser
- Throws:
XmlPullParserException
IOException
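The automaton described for next() can be modelled in a few lines of runnable Java. This is only a toy under stated assumptions: the input is a pre-built list of tokens instead of a real Tokenizer, attributes and namespaces are left out, and every name here (PullLoopSketch, Token, Event) is hypothetical and not part of the XPP API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy model of the pull loop; all names are hypothetical, not the XPP API. */
class PullLoopSketch {
    enum Event { START_TAG, END_TAG, CONTENT, END_DOCUMENT }

    /** A pre-tokenized input item: its kind plus an element name or text. */
    static final class Token {
        final Event kind;
        final String value;
        Token(Event kind, String value) { this.kind = kind; this.value = value; }
    }

    private final Iterator<Token> tokens;
    private final Deque<String> elStack = new ArrayDeque<>(); // open element names
    private Event state;

    PullLoopSketch(List<Token> input) {
        this.tokens = input.iterator();
    }

    /** Advances to the next event, checking element nesting along the way. */
    Event next() {
        if (!tokens.hasNext()) {
            if (!elStack.isEmpty()) {
                throw new IllegalStateException("unclosed element " + elStack.peek());
            }
            return state = Event.END_DOCUMENT;
        }
        Token t = tokens.next();
        switch (t.kind) {
            case START_TAG:
                elStack.push(t.value); // push new element on the stack
                break;
            case END_TAG:
                // pop the element and check that start and end tag match
                if (elStack.isEmpty() || !elStack.pop().equals(t.value)) {
                    throw new IllegalStateException("unexpected end tag " + t.value);
                }
                break;
            case CONTENT:
                // content is allowed only inside an element
                if (elStack.isEmpty()) {
                    throw new IllegalStateException("content outside root element");
                }
                break;
            default:
                break; // explicit END_DOCUMENT token: nothing to track
        }
        return state = t.kind;
    }

    Event getEventType() { return state; }

    int getDepth() { return elStack.size(); }
}
```

A caller would typically loop on next() until it returns END_DOCUMENT and branch on the returned event, which is the usage pattern the pull interface is designed for.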
-
getEventType
public byte getEventType()
Description copied from interface: XmlPullParser
Returns the type of the current element (START_TAG, END_TAG, CONTENT, etc.)
- Specified by:
getEventType in interface XmlPullParser
-
isWhitespaceContent
Return true if the just-read CONTENT contained only whitespace.
- Specified by:
isWhitespaceContent in interface XmlPullParser
- Throws:
XmlPullParserException
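As an illustration of what a whitespace-only check involves, here is a minimal sketch. It assumes the content is available as a range of a char buffer and treats the four XML 1.0 whitespace characters (space, tab, carriage return, line feed) as whitespace; the class and method names are hypothetical and this is not how XPP necessarily implements isWhitespaceContent.

```java
/** Hypothetical helper, not the XPP implementation. */
class WhitespaceSketch {
    /** True if buf[start..end) holds only XML whitespace; an empty range counts as true. */
    static boolean isWhitespace(char[] buf, int start, int end) {
        for (int i = start; i < end; i++) {
            char c = buf[i];
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false; // found a non-whitespace character
            }
        }
        return true;
    }
}
```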
-
getContentLength
Description copied from interface: XmlPullParser
Return how big the content is.
NOTE: the parser must be on a CONTENT event.
- Specified by:
getContentLength in interface XmlPullParser
- Throws:
XmlPullParserException
-
readContent
Return a String that contains the just-read CONTENT.
- Specified by:
readContent in interface XmlPullParser
- Throws:
XmlPullParserException
-
readEndTag
Read the value of the just-read END_TAG into the EndTag passed as argument.
- Specified by:
readEndTag in interface XmlPullParser
- Throws:
XmlPullParserException
-
readStartTag
Read the value of the just-read START_TAG into the StartTag passed as argument.
- Specified by:
readStartTag in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNodeWithoutChildren
Description copied from interface: XmlPullParser
Read node: it calls readStartTag and then, if the parser is namespace aware, currently declared namespaces will be added and defaultNamespace will be set.
NOTE: the parser must be on a START_TAG event, and all events will be written into node!
- Specified by:
readNodeWithoutChildren in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNode
Description copied from interface: XmlPullParser
Read subtree into node: call readNodeWithoutChildren and then parse the subtree, adding children (values obtained with readContent or readNodeWithoutChildren).
NOTE: the parser must be on a START_TAG event, and all events will be written into node!
- Specified by:
readNode in interface XmlPullParser
- Throws:
XmlPullParserException
IOException
-
skipNode
If the parser has just read a start tag, this skips the whole subtree contained in this element. Returns when it encounters the end tag matching the start tag.
- Specified by:
skipNode in interface XmlPullParser
- Throws:
XmlPullParserException
IOException
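Skipping a subtree until the matching end tag is commonly done by counting nesting depth, and the sketch below shows that idea on a plain iterator of events. It is an assumption about the technique, not the XPP code: SkipSketch and its Event enum are hypothetical, and the sketch returns the number of consumed events, whereas the real skipNode returns a byte.

```java
import java.util.Iterator;

/** Hypothetical sketch of subtree skipping by depth counting; not the XPP code. */
class SkipSketch {
    enum Event { START_TAG, END_TAG, CONTENT }

    /**
     * Assumes the START_TAG of the element to skip was already consumed.
     * Consumes events up to and including the matching END_TAG and
     * returns how many events were consumed.
     */
    static int skipNode(Iterator<Event> events) {
        int depth = 1; // the element being skipped is already open
        int consumed = 0;
        while (depth > 0) {
            if (!events.hasNext()) {
                throw new IllegalStateException("input ended inside element");
            }
            Event e = events.next();
            consumed++;
            if (e == Event.START_TAG) {
                depth++; // nested element opens
            } else if (e == Event.END_TAG) {
                depth--; // nested or outer element closes
            }
        }
        return consumed;
    }
}
```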
-
getHardLimit
public int getHardLimit()
- Specified by:
getHardLimit in interface XmlPullParserBufferControl
-
setHardLimit
- Specified by:
setHardLimit in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
getSoftLimit
public int getSoftLimit()
- Specified by:
getSoftLimit in interface XmlPullParserBufferControl
-
setSoftLimit
- Specified by:
setSoftLimit in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
getBufferShrinkOffset
public int getBufferShrinkOffset()
- Specified by:
getBufferShrinkOffset in interface XmlPullParserBufferControl
-
setBufferShrinkable
- Specified by:
setBufferShrinkable in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
isBufferShrinkable
public boolean isBufferShrinkable()
- Specified by:
isBufferShrinkable in interface XmlPullParserBufferControl
-
getEventStart
public int getEventStart()
- Specified by:
getEventStart in interface XmlPullParserEventPosition
-
getEventEnd
public int getEventEnd()
- Specified by:
getEventEnd in interface XmlPullParserEventPosition
-
getEventBuffer
public char[] getEventBuffer()
Description copied from interface: XmlPullParserEventPosition
NOTE: This may be an internal buffer and is valid only until the next call to next(); do NOT attempt to modify it!
- Specified by:
getEventBuffer in interface XmlPullParserEventPosition
-
ensureCapacity
protected void ensureCapacity(int size)
Make sure there is enough space to keep an element stack of the passed size.
-
ensureAttribs
protected void ensureAttribs(int size)
Make sure there is enough space in the temporary attributes array.
-
resetState
protected void resetState()
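The protected helpers ensureCapacity and ensureAttribs described above both guarantee that a reusable array is large enough before it is written to. A minimal sketch of that pattern follows; it is not the XPP code. The String[] array stands in for the ElementContent[] stack, and the doubling growth policy is an assumption made for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch of a grow-on-demand stack array; not the XPP code. */
class GrowSketch {
    String[] elStack = new String[2]; // stand-in for the ElementContent[] stack
    int elStackSize = elStack.length; // current capacity of elStack

    /** Ensures elStack can hold at least `size` entries, keeping existing ones. */
    void ensureCapacity(int size) {
        if (size <= elStackSize) {
            return; // already large enough, nothing is allocated
        }
        int newSize = Math.max(size, 2 * elStackSize); // doubling policy is an assumption
        elStack = Arrays.copyOf(elStack, newSize);     // copies existing entries
        elStackSize = newSize;
    }
}
```

Growing only when needed and reusing the array across elements fits the minimized-allocation goal stated in the class description.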
-