Class HtmlUnitNekoHtmlParser

  • All Implemented Interfaces:
    HTMLParser

    public final class HtmlUnitNekoHtmlParser
    extends java.lang.Object
    implements HTMLParser

    SAX parser implementation that uses the NekoHTML HTMLConfiguration to parse HTML into an HtmlUnit-specific DOM (HU-DOM) tree.

    • Method Summary

      Modifier and Type Method Description
      (package private) static java.lang.Throwable extractNestedException​(java.lang.Throwable e)
      Extracts the nested exception from within an XNIException. NekoHTML uses reflection, so generated exceptions are wrapped many times within XNIException and InvocationTargetException.
      ElementFactory getElementFactory​(SgmlPage page, java.lang.String namespaceURI, java.lang.String qualifiedName, boolean insideSvg, boolean svgSupport)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Returns the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory.
      ElementFactory getFactory​(java.lang.String tagName)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      ElementFactory getSvgFactory()
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      void parse​(WebResponse webResponse, HtmlPage page, boolean xhtml, boolean createdByJavascript)
      Parses the WebResponse into an object tree representation.
      void parseFragment​(DomNode parent, java.lang.String source)
      Parses the HTML content from the given string into an object tree representation.
      void parseFragment​(DomNode parent, DomNode context, java.lang.String source, boolean createdByJavascript)
      Parses the HTML content from the given string into an object tree representation.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • ELEMENT_FACTORIES

        private static final java.util.Map<java.lang.String,​ElementFactory> ELEMENT_FACTORIES
    • Constructor Detail

      • HtmlUnitNekoHtmlParser

        public HtmlUnitNekoHtmlParser()
        Default constructor.
    • Method Detail

      • parseFragment

        public void parseFragment​(DomNode parent,
                                  java.lang.String source)
                           throws org.xml.sax.SAXException,
                                  java.io.IOException
        Parses the HTML content from the given string into an object tree representation.
        Specified by:
        parseFragment in interface HTMLParser
        Parameters:
        parent - the parent for the new nodes
        source - the (X)HTML to be parsed
        Throws:
        org.xml.sax.SAXException - if a SAX error occurs
        java.io.IOException - if an IO error occurs
      • parseFragment

        public void parseFragment​(DomNode parent,
                                  DomNode context,
                                  java.lang.String source,
                                  boolean createdByJavascript)
                           throws org.xml.sax.SAXException,
                                  java.io.IOException
        Parses the HTML content from the given string into an object tree representation.
        Specified by:
        parseFragment in interface HTMLParser
        Parameters:
        parent - where the new parsed nodes will be added to
        context - the context to build the fragment context stack
        source - the (X)HTML to be parsed
        createdByJavascript - if true, the (script) tag was created by JavaScript
        Throws:
        org.xml.sax.SAXException - if a SAX error occurs
        java.io.IOException - if an IO error occurs
      • parse

        public void parse​(WebResponse webResponse,
                          HtmlPage page,
                          boolean xhtml,
                          boolean createdByJavascript)
                   throws java.io.IOException
        Parses the WebResponse into an object tree representation.
        Specified by:
        parse in interface HTMLParser
        Parameters:
        webResponse - the response data
        page - the HtmlPage to add the nodes to
        xhtml - if true, use the XHTML parser
        createdByJavascript - if true, the (script) tag was created by JavaScript
        Throws:
        java.io.IOException - if there is an IO error
      • extractNestedException

        static java.lang.Throwable extractNestedException​(java.lang.Throwable e)
        Extracts the nested exception from within an XNIException. NekoHTML uses reflection, so generated exceptions are wrapped many times within XNIException and InvocationTargetException.
        Parameters:
        e - the original XNIException
        Returns:
        the cause exception
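
        The unwrapping described above can be illustrated with a minimal sketch. It is not the HtmlUnit implementation: WrapperException is a hypothetical stand-in for XNIException (assumed here to expose its wrapped exception through a getException() accessor), and the class and method names are invented for illustration. Only java.lang.reflect.InvocationTargetException is the real JDK class.

```java
import java.lang.reflect.InvocationTargetException;

public class NestedExceptionSketch {

    /** Hypothetical stand-in for XNIException: a wrapper that keeps its own nested exception. */
    static class WrapperException extends RuntimeException {
        private final Exception wrapped;

        WrapperException(Exception wrapped) {
            super(wrapped.getMessage());
            this.wrapped = wrapped;
        }

        Exception getException() {
            return wrapped;
        }
    }

    /** Peels off wrapper layers until an exception that is not a known wrapper is reached. */
    static Throwable unwrap(Throwable e) {
        Throwable current = e;
        while (true) {
            Throwable inner = null;
            if (current instanceof WrapperException) {
                inner = ((WrapperException) current).getException();
            } else if (current instanceof InvocationTargetException) {
                inner = ((InvocationTargetException) current).getTargetException();
            }
            if (inner == null) {
                // Not a wrapper (or an empty one): this is the most specific exception available.
                return current;
            }
            current = inner;
        }
    }

    public static void main(String[] args) {
        Exception root = new IllegalStateException("boom");
        // Three wrapper layers, similar in spirit to what reflection-heavy parsing can produce.
        Throwable wrapped = new WrapperException(
                new InvocationTargetException(new WrapperException(root)));
        System.out.println(unwrap(wrapped) == root);
    }
}
```

        A caller would typically rethrow or log the unwrapped exception so that the original failure, rather than a chain of reflection wrappers, is what gets reported.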
      • getSvgFactory

        public ElementFactory getSvgFactory()
        INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
        Specified by:
        getSvgFactory in interface HTMLParser
        Returns:
        a factory for creating SvgElements
      • getFactory

        public ElementFactory getFactory​(java.lang.String tagName)
        INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
        Specified by:
        getFactory in interface HTMLParser
        Parameters:
        tagName - an HTML element tag name
        Returns:
        a factory for creating HtmlElements representing the given tag
      • getElementFactory

        public ElementFactory getElementFactory​(SgmlPage page,
                                                java.lang.String namespaceURI,
                                                java.lang.String qualifiedName,
                                                boolean insideSvg,
                                                boolean svgSupport)
        INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
        Returns the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory.
        Specified by:
        getElementFactory in interface HTMLParser
        Parameters:
        page - the page
        namespaceURI - the namespace URI
        qualifiedName - the qualified name
        insideSvg - whether the node is inside an SVG node
        svgSupport - true if called from JavaScript createElementNS
        Returns:
        the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory
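
        The "pre-registered factory, or an unknown-element fallback" behaviour can be sketched as a map lookup with a default. This is an illustration only, not the HtmlUnit code: Factory stands in for ElementFactory, REGISTRY for the private ELEMENT_FACTORIES map, and UNKNOWN for UnknownElementFactory. The lower-casing of the tag name is an assumption, and the namespace and SVG parameters are left out entirely.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class FactoryLookupSketch {

    /** Hypothetical stand-in for ElementFactory. */
    interface Factory {
        String describe();
    }

    /** Fallback used when no factory is registered for a tag (stand-in for UnknownElementFactory). */
    static final Factory UNKNOWN = () -> "unknown";

    /** Stand-in for the pre-registered tag-name-to-factory map. */
    static final Map<String, Factory> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("div", () -> "div");
        REGISTRY.put("input", () -> "input");
    }

    /** Returns the registered factory for the tag, or UNKNOWN if none is registered. */
    static Factory lookup(String tagName) {
        // Assumption: tag names are matched case-insensitively.
        String key = tagName.toLowerCase(Locale.ROOT);
        return REGISTRY.getOrDefault(key, UNKNOWN);
    }

    public static void main(String[] args) {
        System.out.println(lookup("div").describe());
        System.out.println(lookup("DIV").describe());
        System.out.println(lookup("made-up-tag").describe());
    }
}
```

        Returning a fallback object instead of null means callers never need a null check before creating an element, which matches the "or an UnknownElementFactory" wording of the return description.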