Class HtmlUnitNekoDOMBuilder

java.lang.Object
org.htmlunit.cyberneko.xerces.parsers.XMLParser
org.htmlunit.cyberneko.xerces.parsers.AbstractXMLDocumentParser
org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
org.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder
All Implemented Interfaces:
org.htmlunit.cyberneko.HTMLTagBalancingListener, org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler, HTMLParserDOMBuilder, ContentHandler, LexicalHandler, XMLReader

final class HtmlUnitNekoDOMBuilder extends org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser implements ContentHandler, LexicalHandler, org.htmlunit.cyberneko.HTMLTagBalancingListener, HTMLParserDOMBuilder
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
The parser and DOM builder. This class subclasses Xerces's AbstractSAXParser and implements the ContentHandler interface, so all parser APIs are kept private. The ContentHandler methods consume SAX events to build the page DOM.
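The general pattern described above, a ContentHandler that turns SAX events into a tree by tracking open elements on a stack, can be sketched with the JDK's own SAX parser. This is only an illustration under stated assumptions: SketchNode and SketchTreeBuilder are hypothetical names, the input must be well-formed XML, and none of this is HtmlUnit's actual implementation.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical, simplified stand-in for a DOM node (not HtmlUnit's DomNode).
class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name) {
        this.name = name;
    }

    // Renders the subtree as name[child,child,...] for easy inspection.
    String render() {
        StringBuilder sb = new StringBuilder(name);
        if (!children.isEmpty()) {
            sb.append('[');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(children.get(i).render());
            }
            sb.append(']');
        }
        return sb.toString();
    }
}

// Hypothetical handler: consumes SAX events and builds a SketchNode tree.
class SketchTreeBuilder extends DefaultHandler {
    // Open elements, innermost first.
    private final Deque<SketchNode> stack = new ArrayDeque<>();
    private SketchNode root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        SketchNode node = new SketchNode(qName);
        if (stack.isEmpty()) {
            root = node;
        } else {
            stack.peek().children.add(node);
        }
        stack.push(node);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    // Parses well-formed XML and returns the root of the resulting tree.
    static SketchNode build(String xml) {
        SketchTreeBuilder handler = new SketchTreeBuilder();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.root;
    }
}
```

Calling SketchTreeBuilder.build(...) on a small document and printing render() shows the nesting recovered from the event stream. The real builder additionally has to cope with malformed HTML, which this sketch does not attempt.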
  • Field Details

    • HTMLELEMENTS

      private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS
    • HTMLELEMENTS_WITH_CMD

      private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS_WITH_CMD
    • htmlParser_

      private final HTMLParser htmlParser_
    • page_

      private final HtmlPage page_
    • locator_

      private Locator locator_
    • stack_

      private final Deque<DomNode> stack_
    • snippetStartNodeOverwritten_

      private boolean snippetStartNodeOverwritten_
      Whether the snippet tried to overwrite the start node.
    • initialSize_

      private final int initialSize_
    • currentNode_

      private DomNode currentNode_
    • createdByJavascript_

      private final boolean createdByJavascript_
    • characters_

      private final org.htmlunit.cyberneko.xerces.xni.XMLString characters_
    • headParsed_

      private HtmlUnitNekoDOMBuilder.HeadParsed headParsed_
    • body_

      private HtmlElement body_
    • lastTagWasSynthesized_

      private boolean lastTagWasSynthesized_
    • consumingForm_

      private HtmlForm consumingForm_
    • formEndingIsAdjusting_

      private boolean formEndingIsAdjusting_
    • insideSvg_

      private boolean insideSvg_
    • insideTemplate_

      private boolean insideTemplate_
    • FEATURE_AUGMENTATIONS

      private static final String FEATURE_AUGMENTATIONS
    • FEATURE_PARSE_NOSCRIPT

      private static final String FEATURE_PARSE_NOSCRIPT
  • Constructor Details

    • HtmlUnitNekoDOMBuilder

      HtmlUnitNekoDOMBuilder(HTMLParser htmlParser, DomNode node, URL url, String htmlContent, boolean createdByJavascript)
      Creates a new builder for parsing the specified response contents.
      Parameters:
      node - the location at which to insert the new content
      url - the page's URL
      createdByJavascript - if true, the (script) tag was created by JavaScript
  • Method Details

    • pushInputString

      public void pushInputString(String html)
      Parses and then inserts the specified HTML content into the HTML content currently being parsed.
      Specified by:
      pushInputString in interface HTMLParserDOMBuilder
      Parameters:
      html - the HTML content to push
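One way to picture "inserting content into the content currently being parsed" is a stack of input sources, where pushed text is read to completion before the reader returns to the remainder of the original input. The sketch below is an assumption-laden illustration: InputStackSketch and its members are hypothetical, and it does not claim to reflect how HtmlUnit or its scanner handles pushed input internally.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical character source that supports pushing extra input mid-read.
class InputStackSketch {
    // One input string plus the current read position within it.
    private static final class Source {
        final String text;
        int pos;

        Source(String text) {
            this.text = text;
        }
    }

    // Innermost (most recently pushed) source is at the head.
    private final Deque<Source> sources = new ArrayDeque<>();

    InputStackSketch(String initial) {
        sources.push(new Source(initial));
    }

    // Pushed content is consumed before the rest of the enclosing input.
    void pushInputString(String html) {
        sources.push(new Source(html));
    }

    // Returns the next character, or -1 once every source is exhausted.
    int read() {
        while (!sources.isEmpty()) {
            Source top = sources.peek();
            if (top.pos < top.text.length()) {
                return top.text.charAt(top.pos++);
            }
            sources.pop();
        }
        return -1;
    }
}
```

With this shape, content pushed after part of the original input has been read appears in the stream at exactly that point, which is the ordering a caller would expect when a script writes markup during parsing.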
    • createConfiguration

      private static org.htmlunit.cyberneko.xerces.xni.parser.XMLParserConfiguration createConfiguration(BrowserVersion browserVersion)
      Creates the configuration depending on the simulated browser.
      Returns:
      the configuration
    • setDocumentLocator

      public void setDocumentLocator(Locator locator)
      Specified by:
      setDocumentLocator in interface ContentHandler
    • startDocument

      public void startDocument() throws SAXException
      Specified by:
      startDocument in interface ContentHandler
      Throws:
      SAXException
    • startElement

      public void startElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attributes, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
      Specified by:
      startElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
      Overrides:
      startElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
      Throws:
      org.htmlunit.cyberneko.xerces.xni.XNIException
    • startElement

      public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException
      Specified by:
      startElement in interface ContentHandler
      Throws:
      SAXException
    • addNodeToRightParent

      private void addNodeToRightParent(DomNode currentNode, DomElement newElement)
      Adds the new node to the right parent, which is not necessarily the currentNode in the case of malformed HTML code. The method tries to emulate the behavior of Firefox.
    • findElementOnStack

      private DomNode findElementOnStack(String... searchedElementNames)
    • isTableChild

      private static boolean isTableChild(String nodeName)
    • isTableCell

      private static boolean isTableCell(String nodeName)
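The four helpers above suggest a lookup over the stack of open elements to find a more suitable parent than the current node. The following is a minimal sketch of that idea only. The class ParentLookupSketch, its methods, and the single rule it applies (a table cell goes into the nearest open row) are assumptions made for illustration; the actual placement rules are not documented on this page.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper that picks a parent from the stack of open elements.
class ParentLookupSketch {
    // Open element names, innermost first (push adds at the head).
    private final Deque<String> stack = new ArrayDeque<>();

    void open(String name) {
        stack.push(name);
    }

    // Returns the innermost open element matching any given name, or null.
    String findOnStack(String... names) {
        List<String> wanted = Arrays.asList(names);
        for (String openName : stack) {
            if (wanted.contains(openName)) {
                return openName;
            }
        }
        return null;
    }

    // Assumed rule for illustration: cells are housed in the nearest open row.
    static boolean isCell(String name) {
        return "td".equals(name) || "th".equals(name);
    }

    // Chooses the parent for a new element; defaults to the current node.
    String parentFor(String newName) {
        String current = stack.peek();
        if (isCell(newName) && !"tr".equals(current)) {
            String row = findOnStack("tr");
            return row != null ? row : current;
        }
        return current;
    }
}
```

For example, with table, tr, and b open in that order, a new td would be attached to the tr rather than to the unclosed b, while a p would still go under b.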
    • endElement

      public void endElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
      Specified by:
      endElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
      Overrides:
      endElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
      Throws:
      org.htmlunit.cyberneko.xerces.xni.XNIException
    • endElement

      public void endElement(String namespaceURI, String localName, String qName) throws SAXException
      Specified by:
      endElement in interface ContentHandler
      Throws:
      SAXException
    • characters

      public void characters(char[] ch, int start, int length) throws SAXException
      Specified by:
      characters in interface ContentHandler
      Throws:
      SAXException
    • ignorableWhitespace

      public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
      Specified by:
      ignorableWhitespace in interface ContentHandler
      Throws:
      SAXException
    • handleCharacters

      private void handleCharacters()
      Picks up the character data accumulated so far and adds it to the current element as a text node.
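SAX parsers may deliver one run of text through several characters(...) calls, so a builder typically buffers the chunks and emits a single text node when the next structural event arrives. The sketch below illustrates that buffering under stated assumptions: TextBufferSketch, its event list, and the choice to flush on element start and end are hypothetical and are not taken from HtmlUnit's source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event recorder that merges adjacent character chunks.
class TextBufferSketch {
    // Character data received since the last flush.
    private final StringBuilder pending = new StringBuilder();
    // Recorded output, standing in for nodes appended to a DOM.
    final List<String> events = new ArrayList<>();

    // Only accumulates; no text node is produced yet.
    void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length);
    }

    void startElement(String name) {
        flushCharacters();
        events.add("start:" + name);
    }

    void endElement(String name) {
        flushCharacters();
        events.add("end:" + name);
    }

    // Emits the buffered text as one node, then resets the buffer.
    private void flushCharacters() {
        if (pending.length() > 0) {
            events.add("text:" + pending);
            pending.setLength(0);
        }
    }
}
```

Feeding the word "Hello" in two chunks between a start and an end event yields one text entry rather than two, which keeps the resulting tree independent of how the parser happened to split its buffers.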
    • endDocument

      public void endDocument() throws SAXException
      Specified by:
      endDocument in interface ContentHandler
      Throws:
      SAXException
    • startPrefixMapping

      public void startPrefixMapping(String prefix, String uri) throws SAXException
      Specified by:
      startPrefixMapping in interface ContentHandler
      Throws:
      SAXException
    • endPrefixMapping

      public void endPrefixMapping(String prefix) throws SAXException
      Specified by:
      endPrefixMapping in interface ContentHandler
      Throws:
      SAXException
    • processingInstruction

      public void processingInstruction(String target, String data) throws SAXException
      Specified by:
      processingInstruction in interface ContentHandler
      Throws:
      SAXException
    • skippedEntity

      public void skippedEntity(String name) throws SAXException
      Specified by:
      skippedEntity in interface ContentHandler
      Throws:
      SAXException
    • comment

      public void comment(char[] ch, int start, int length)
      Specified by:
      comment in interface LexicalHandler
    • endCDATA

      public void endCDATA()
      Specified by:
      endCDATA in interface LexicalHandler
    • endDTD

      public void endDTD()
      Specified by:
      endDTD in interface LexicalHandler
    • endEntity

      public void endEntity(String name)
      Specified by:
      endEntity in interface LexicalHandler
    • startCDATA

      public void startCDATA()
      Specified by:
      startCDATA in interface LexicalHandler
    • startDTD

      public void startDTD(String name, String publicId, String systemId)
      Specified by:
      startDTD in interface LexicalHandler
    • startEntity

      public void startEntity(String name)
      Specified by:
      startEntity in interface LexicalHandler
    • ignoredEndElement

      public void ignoredEndElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
      Specified by:
      ignoredEndElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
    • ignoredStartElement

      public void ignoredStartElement(org.htmlunit.cyberneko.xerces.xni.QName elem, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
      Specified by:
      ignoredStartElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
    • copyAttributes

      private static void copyAttributes(DomElement to, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs)
    • parse

      public void parse(org.htmlunit.cyberneko.xerces.xni.parser.XMLInputSource inputSource) throws org.htmlunit.cyberneko.xerces.xni.XNIException, IOException
      Overrides:
      parse in class org.htmlunit.cyberneko.xerces.parsers.XMLParser
      Throws:
      org.htmlunit.cyberneko.xerces.xni.XNIException
      IOException
    • getBody

      HtmlElement getBody()
    • isSynthesized

      private static boolean isSynthesized(org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
    • appendChild

      private static void appendChild(DomNode parent, DomNode child)