Package org.htmlunit.html.parser.neko
Class HtmlUnitNekoDOMBuilder
java.lang.Object
org.htmlunit.cyberneko.xerces.parsers.XMLParser
org.htmlunit.cyberneko.xerces.parsers.AbstractXMLDocumentParser
org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
org.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder
- All Implemented Interfaces:
org.htmlunit.cyberneko.HTMLTagBalancingListener, org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler, HTMLParserDOMBuilder, ContentHandler, LexicalHandler, XMLReader
final class HtmlUnitNekoDOMBuilder
extends org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
implements ContentHandler, LexicalHandler, org.htmlunit.cyberneko.HTMLTagBalancingListener, HTMLParserDOMBuilder
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
The parser and DOM builder. This class subclasses Xerces's AbstractSAXParser and implements the ContentHandler interface, so all parser APIs are kept private. The ContentHandler methods consume SAX events to build the page DOM.
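As a rough illustration of how ContentHandler callbacks can drive tree construction, the sketch below builds a tiny element tree from SAX events using only the JDK's parser. The names SaxTreeSketch, Node and TreeHandler are hypothetical and are not part of HtmlUnit; the real builder works on DomNode instances and deals with malformed HTML, which this sketch does not attempt.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxTreeSketch {

    /** Hypothetical minimal node type; HtmlUnit's DomNode is far richer. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    /** Builds a tree by keeping the current insertion point on a stack. */
    static final class TreeHandler extends DefaultHandler {
        final Node root = new Node("#document");
        private final Deque<Node> stack = new ArrayDeque<>();

        @Override
        public void startDocument() {
            stack.push(root);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Node node = new Node(qName);
            stack.peek().children.add(node); // attach to the current parent
            stack.push(node);                // the new element becomes the current node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            stack.pop(); // go back to the parent
        }
    }

    static Node build(String xml) throws Exception {
        TreeHandler handler = new TreeHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.root;
    }

    public static void main(String[] args) throws Exception {
        // prints "#document[html[head, body[p]]]"
        System.out.println(build("<html><head/><body><p/></body></html>"));
    }
}
```

The stack mirrors the nesting of open elements, so every start event attaches the new node to whatever is on top and every end event pops back to the parent.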
-
Nested Class Summary
Nested classes/interfaces inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser.AttributesProxy, org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser.LocatorProxy
-
Field Summary
Fields
Modifier and Type / Field / Description
private HtmlElement body_
private final org.htmlunit.cyberneko.xerces.xni.XMLString characters_
private HtmlForm consumingForm_
private final boolean createdByJavascript_
private DomNode currentNode_
private static final String FEATURE_AUGMENTATIONS
private static final String FEATURE_PARSE_NOSCRIPT
private boolean formEndingIsAdjusting_
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS_WITH_CMD
private final HTMLParser htmlParser_
private final int initialSize_
private boolean insideSvg_
private boolean insideTemplate_
private boolean lastTagWasSynthesized_
private Locator locator_
private final HtmlPage page_
private boolean snippetStartNodeOverwritten_
Did the snippet try to overwrite the start node?
Fields inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
fContentHandler, fDTDHandler, fLexicalHandler, fLexicalHandlerParameterEntities, fNamespaceContext, fNamespacePrefixes, fNamespaces, fStandalone, fUseEntityResolver2, fVersion, LEXICAL_HANDLER, NAMESPACES
Fields inherited from class org.htmlunit.cyberneko.xerces.parsers.XMLParser
ERROR_HANDLER, parserConfiguration_
-
Constructor Summary
Constructors
HtmlUnitNekoDOMBuilder(HTMLParser htmlParser, DomNode node, URL url, String htmlContent, boolean createdByJavascript)
Creates a new builder for parsing the specified response contents.
-
Method Summary
Modifier and Type / Method / Description
private void addNodeToRightParent(DomNode currentNode, DomElement newElement)
Adds the new node to the right parent, which is not necessarily the currentNode in case of malformed HTML code.
private static void appendChild(DomNode parent, DomNode child)
void characters(char[] ch, int start, int length)
void comment(char[] ch, int start, int length)
private static void copyAttributes(DomElement to, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs)
private static org.htmlunit.cyberneko.xerces.xni.parser.XMLParserConfiguration createConfiguration(BrowserVersion browserVersion)
Creates the configuration depending on the simulated browser.
void endCDATA()
void endDocument()
void endDTD()
void endElement(String namespaceURI, String localName, String qName)
void endElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void endEntity(String name)
void endPrefixMapping(String prefix)
private DomNode findElementOnStack(String... searchedElementNames)
(package private) HtmlElement getBody()
private void handleCharacters()
Picks up the character data accumulated so far and adds it to the current element as a text node.
void ignorableWhitespace(char[] ch, int start, int length)
void ignoredEndElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void ignoredStartElement(org.htmlunit.cyberneko.xerces.xni.QName elem, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
private static boolean isSynthesized(org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
private static boolean isTableCell(String nodeName)
private static boolean isTableChild(String nodeName)
void parse(org.htmlunit.cyberneko.xerces.xni.parser.XMLInputSource inputSource)
void processingInstruction(String target, String data)
void pushInputString(String html)
Parses and then inserts the specified HTML content into the HTML content currently being parsed.
void setDocumentLocator(Locator locator)
void skippedEntity(String name)
void startCDATA()
void startDocument()
void startDTD(String name, String publicId, String systemId)
void startElement(String namespaceURI, String localName, String qName, Attributes atts)
void startElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attributes, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void startEntity(String name)
void startPrefixMapping(String prefix, String uri)
Methods inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
characters, comment, doctypeDecl, endCDATA, endDocument, endNamespaceMapping, getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getLexicalHandler, getProperty, parse, parse, processingInstruction, reset, setContentHandler, setDTDHandler, setEntityResolver, setErrorHandler, setFeature, setLexicalHandler, setProperty, startCDATA, startDocument, startNamespaceMapping, xmlDecl
Methods inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractXMLDocumentParser
emptyElement, getDocumentSource, setDocumentSource
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.xml.sax.ContentHandler
declaration
-
Field Details
-
HTMLELEMENTS
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS
-
HTMLELEMENTS_WITH_CMD
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS_WITH_CMD
-
htmlParser_
-
page_
-
locator_
-
stack_
-
snippetStartNodeOverwritten_
private boolean snippetStartNodeOverwritten_
Did the snippet try to overwrite the start node?
-
initialSize_
private final int initialSize_
-
currentNode_
-
createdByJavascript_
private final boolean createdByJavascript_
-
characters_
private final org.htmlunit.cyberneko.xerces.xni.XMLString characters_
-
headParsed_
-
body_
-
lastTagWasSynthesized_
private boolean lastTagWasSynthesized_
-
consumingForm_
-
formEndingIsAdjusting_
private boolean formEndingIsAdjusting_
-
insideSvg_
private boolean insideSvg_
-
insideTemplate_
private boolean insideTemplate_
-
FEATURE_AUGMENTATIONS
-
FEATURE_PARSE_NOSCRIPT
-
-
Constructor Details
-
HtmlUnitNekoDOMBuilder
HtmlUnitNekoDOMBuilder(HTMLParser htmlParser, DomNode node, URL url, String htmlContent, boolean createdByJavascript)
Creates a new builder for parsing the specified response contents.
- Parameters:
node - the location at which to insert the new content
url - the page's URL
createdByJavascript - if true, the (script) tag was created by JavaScript
-
-
Method Details
-
pushInputString
Parses and then inserts the specified HTML content into the HTML content currently being parsed.
- Specified by:
pushInputString in interface HTMLParserDOMBuilder
- Parameters:
html - the HTML content to push
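The effect of pushing content into input that is still being read can be illustrated with the JDK's PushbackReader. This is only an analogy under stated assumptions: the class name PushInputSketch is hypothetical, and nothing here claims that HtmlUnit implements pushInputString this way. The sketch shows that pushed text is consumed before the rest of the original input.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

final class PushInputSketch {

    /** Reads part of the input, pushes extra markup, then reads the rest. */
    static String demo() throws IOException {
        String html = "<p>one</p><p>two</p>";
        // The second argument is the pushback capacity; it must cover the pushed text.
        try (PushbackReader in = new PushbackReader(new StringReader(html), 64)) {
            StringBuilder seen = new StringBuilder();

            char[] first = new char[10]; // room for "<p>one</p>"
            int n = in.read(first, 0, first.length);
            seen.append(first, 0, n);

            // Content pushed now is returned by the next reads,
            // ahead of the remaining original input.
            in.unread("<i>pushed</i>".toCharArray());

            int c;
            while ((c = in.read()) != -1) {
                seen.append((char) c);
            }
            return seen.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // prints "<p>one</p><i>pushed</i><p>two</p>"
        System.out.println(demo());
    }
}
```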
-
createConfiguration
private static org.htmlunit.cyberneko.xerces.xni.parser.XMLParserConfiguration createConfiguration(BrowserVersion browserVersion)
Creates the configuration depending on the simulated browser.
- Returns:
the configuration
-
setDocumentLocator
- Specified by:
setDocumentLocator in interface ContentHandler
-
startDocument
- Specified by:
startDocument in interface ContentHandler
- Throws:
SAXException
-
startElement
public void startElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attributes, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
- Specified by:
startElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
- Overrides:
startElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
-
startElement
public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException
- Specified by:
startElement in interface ContentHandler
- Throws:
SAXException
addNodeToRightParent
Adds the new node to the right parent, which is not necessarily the currentNode in case of malformed HTML code. The method tries to emulate the behavior of Firefox.
-
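To make the idea behind addNodeToRightParent more concrete, here is a minimal sketch of choosing a parent by searching the stack of open elements. Everything in it is an assumption for illustration: the class RightParentSketch, the use of plain element names instead of DomNode objects, and the single rule shown (a table cell goes into the nearest open row). The helper names echo findElementOnStack and isTableCell from this page, but their bodies here are guesses and not HtmlUnit's implementation.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class RightParentSketch {

    /** Returns the nearest open element with one of the given names, or null. */
    static String findOnStack(Deque<String> openElements, String... names) {
        // An ArrayDeque filled via push() iterates from the most recent element.
        for (String open : openElements) {
            if (Arrays.asList(names).contains(open)) {
                return open;
            }
        }
        return null;
    }

    static boolean isTableCell(String name) {
        return "td".equals(name) || "th".equals(name);
    }

    /** Picks the parent for a new element, falling back to the current node. */
    static String chooseParent(Deque<String> openElements, String newName) {
        String current = openElements.peek();
        if (isTableCell(newName) && !"tr".equals(current)) {
            String row = findOnStack(openElements, "tr");
            if (row != null) {
                return row; // malformed nesting: attach the cell to the open row
            }
        }
        return current;
    }

    static String demo() {
        Deque<String> open = new ArrayDeque<>();
        // Open elements after "<table><tr><td><b>"; the <b> was never closed.
        for (String name : new String[] {"table", "tr", "td", "b"}) {
            open.push(name);
        }
        return chooseParent(open, "td");
    }

    public static void main(String[] args) {
        // prints "tr"
        System.out.println(demo());
    }
}
```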
findElementOnStack
-
isTableChild
-
isTableCell
-
endElement
public void endElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
- Specified by:
endElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
- Overrides:
endElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
-
endElement
- Specified by:
endElement in interface ContentHandler
- Throws:
SAXException
-
characters
- Specified by:
characters in interface ContentHandler
- Throws:
SAXException
-
ignorableWhitespace
- Specified by:
ignorableWhitespace in interface ContentHandler
- Throws:
SAXException
-
handleCharacters
private void handleCharacters()
Picks up the character data accumulated so far and adds it to the current element as a text node.
-
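The buffering described for handleCharacters can be sketched as follows: character chunks are accumulated, and the buffer is turned into a single text node when the next element boundary arrives. The class CharacterBufferSketch and its string-based "nodes" are hypothetical simplifications, not HtmlUnit code.

```java
import java.util.ArrayList;
import java.util.List;

final class CharacterBufferSketch {
    private final StringBuilder pending = new StringBuilder();
    private final List<String> nodes = new ArrayList<>();

    /** SAX may deliver one run of text in several chunks; only buffer here. */
    void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length);
    }

    void startElement(String name) {
        flushText();
        nodes.add("<" + name + ">");
    }

    void endElement(String name) {
        flushText();
        nodes.add("</" + name + ">");
    }

    /** Emits the accumulated text as one node and resets the buffer. */
    private void flushText() {
        if (pending.length() > 0) {
            nodes.add("text:" + pending);
            pending.setLength(0);
        }
    }

    static List<String> demo() {
        CharacterBufferSketch builder = new CharacterBufferSketch();
        char[] part1 = "Hello, ".toCharArray();
        char[] part2 = "world".toCharArray();
        builder.startElement("p");
        builder.characters(part1, 0, part1.length);
        builder.characters(part2, 0, part2.length);
        builder.endElement("p");
        return builder.nodes;
    }

    public static void main(String[] args) {
        // prints "[<p>, text:Hello, world, </p>]"
        System.out.println(demo());
    }
}
```

Flushing at element boundaries keeps adjacent chunks together, so the two characters calls above yield one text node rather than two.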
endDocument
- Specified by:
endDocument in interface ContentHandler
- Throws:
SAXException
-
startPrefixMapping
- Specified by:
startPrefixMapping in interface ContentHandler
- Throws:
SAXException
-
endPrefixMapping
- Specified by:
endPrefixMapping in interface ContentHandler
- Throws:
SAXException
-
processingInstruction
- Specified by:
processingInstruction in interface ContentHandler
- Throws:
SAXException
-
skippedEntity
- Specified by:
skippedEntity in interface ContentHandler
- Throws:
SAXException
-
comment
public void comment(char[] ch, int start, int length)
- Specified by:
comment in interface LexicalHandler
-
endCDATA
public void endCDATA()
- Specified by:
endCDATA in interface LexicalHandler
-
endDTD
public void endDTD()
- Specified by:
endDTD in interface LexicalHandler
-
endEntity
- Specified by:
endEntity in interface LexicalHandler
-
startCDATA
public void startCDATA()
- Specified by:
startCDATA in interface LexicalHandler
-
startDTD
- Specified by:
startDTD in interface LexicalHandler
-
startEntity
- Specified by:
startEntity in interface LexicalHandler
-
ignoredEndElement
public void ignoredEndElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
- Specified by:
ignoredEndElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
-
ignoredStartElement
public void ignoredStartElement(org.htmlunit.cyberneko.xerces.xni.QName elem, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
- Specified by:
ignoredStartElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
-
copyAttributes
private static void copyAttributes(DomElement to, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs)
-
parse
public void parse(org.htmlunit.cyberneko.xerces.xni.parser.XMLInputSource inputSource) throws org.htmlunit.cyberneko.xerces.xni.XNIException, IOException
- Overrides:
parse in class org.htmlunit.cyberneko.xerces.parsers.XMLParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
IOException
-
getBody
HtmlElement getBody()
-
isSynthesized
private static boolean isSynthesized(org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
-
appendChild
-