Interface HTMLParser

  • All Known Implementing Classes:
    HtmlUnitNekoHtmlParser

    public interface HTMLParser

    Interface for the parser used to parse HTML into an HtmlUnit-specific DOM (HU-DOM) tree.

    • Method Summary

      Modifier and Type Method Description
      ElementFactory getElementFactory(SgmlPage page, java.lang.String namespaceURI, java.lang.String qualifiedName, boolean insideSvg, boolean svgSupport)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Returns the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory.
      ElementFactory getFactory(java.lang.String tagName)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      ElementFactory getSvgFactory()
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      void parse(WebResponse webResponse, HtmlPage page, boolean xhtml, boolean createdByJavascript)
      Parses the WebResponse into an object tree representation.
      void parseFragment(DomNode parent, java.lang.String source)
      Parses the HTML content from the given string into an object tree representation.
      void parseFragment(DomNode parent, DomNode context, java.lang.String source, boolean createdByJavascript)
      Parses the HTML content from the given string into an object tree representation.
    • Method Detail

      • getFactory

        ElementFactory getFactory(java.lang.String tagName)
        INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
        Parameters:
        tagName - an HTML element tag name
        Returns:
        a factory for creating HtmlElements representing the given tag
      • getSvgFactory

        ElementFactory getSvgFactory()
        INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
        Returns:
        a factory for creating SvgElements
      • getElementFactory

        ElementFactory getElementFactory(SgmlPage page,
                                         java.lang.String namespaceURI,
                                         java.lang.String qualifiedName,
                                         boolean insideSvg,
                                         boolean svgSupport)
        INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
        Returns the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory.
        Parameters:
        page - the page
        namespaceURI - the namespace URI
        qualifiedName - the qualified name
        insideSvg - true if the node is inside an SVG node
        svgSupport - true if called from JavaScript createElementNS
        Returns:
        the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory
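
The lookup-with-fallback behaviour described for getElementFactory can be pictured with a minimal, self-contained sketch. Every name in it (FactoryLookupSketch, Factory, REGISTRY, UNKNOWN, lookup) is a hypothetical stand-in, not HtmlUnit's ElementFactory or UnknownElementFactory, and the exact-match lookup by tag name is an assumption of the sketch. The real method also receives the page, the namespace URI, and the two SVG flags.

```java
import java.util.HashMap;
import java.util.Map;

public class FactoryLookupSketch {

    /** Hypothetical stand-in for an element factory; not HtmlUnit's ElementFactory. */
    interface Factory {
        String describe(String qualifiedName);
    }

    /** Fallback returned when no factory is registered for a tag (assumed behaviour). */
    static final Factory UNKNOWN = name -> "unknown element <" + name + ">";

    /** Pre-registered factories, keyed by tag name. */
    static final Map<String, Factory> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("div", name -> "div element");
        REGISTRY.put("span", name -> "span element");
    }

    /** Returns the registered factory for the tag, or the UNKNOWN fallback. */
    static Factory lookup(String qualifiedName) {
        return REGISTRY.getOrDefault(qualifiedName, UNKNOWN);
    }

    public static void main(String[] args) {
        System.out.println(lookup("div").describe("div"));     // prints: div element
        System.out.println(lookup("blink").describe("blink")); // prints: unknown element <blink>
    }
}
```

Returning a fallback object instead of null means callers never need a null check before creating an element.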
      • parseFragment

        void parseFragment(DomNode parent,
                           java.lang.String source)
                    throws org.xml.sax.SAXException,
                           java.io.IOException
        Parses the HTML content from the given string into an object tree representation.
        Parameters:
        parent - the parent for the new nodes
        source - the (X)HTML to be parsed
        Throws:
        org.xml.sax.SAXException - if a SAX error occurs
        java.io.IOException - if an IO error occurs
      • parseFragment

        void parseFragment(DomNode parent,
                           DomNode context,
                           java.lang.String source,
                           boolean createdByJavascript)
                    throws org.xml.sax.SAXException,
                           java.io.IOException
        Parses the HTML content from the given string into an object tree representation.
        Parameters:
        parent - the node to which the newly parsed nodes are added
        context - the context node used to build the fragment context stack
        source - the (X)HTML to be parsed
        createdByJavascript - if true, the (script) tag was created by JavaScript
        Throws:
        org.xml.sax.SAXException - if a SAX error occurs
        java.io.IOException - if an IO error occurs
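
Both parseFragment overloads declare the checked exceptions org.xml.sax.SAXException and java.io.IOException, so every caller has to handle or propagate them. The sketch below shows one possible way to do that. FragmentParser, tryParse, and the toy parser are hypothetical stand-ins that mirror only the throws clause, with a StringBuilder in place of a DomNode. Treating a SAX error as "not parsed" and rethrowing IO errors unchecked is a design choice of the sketch, not something HtmlUnit prescribes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.xml.sax.SAXException;

public class ParseFragmentCallSketch {

    /** Hypothetical stand-in that mirrors only the throws clause of parseFragment. */
    interface FragmentParser {
        void parseFragment(StringBuilder parent, String source) throws SAXException, IOException;
    }

    /** Calls the parser and reports whether the fragment was parsed. */
    static boolean tryParse(FragmentParser parser, StringBuilder parent, String source) {
        try {
            parser.parseFragment(parent, source);
            return true;
        } catch (SAXException e) {
            // Assumed policy: a SAX error means the fragment was not parsed.
            return false;
        } catch (IOException e) {
            // Assumed policy: IO failures are rethrown as unchecked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Toy parser: rejects empty input, otherwise appends the source to the parent.
        FragmentParser toy = (parent, source) -> {
            if (source.isEmpty()) {
                throw new SAXException("empty fragment");
            }
            parent.append(source);
        };
        StringBuilder parent = new StringBuilder();
        System.out.println(tryParse(toy, parent, "<b>hi</b>")); // prints: true
        System.out.println(tryParse(toy, parent, ""));          // prints: false
    }
}
```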
      • parse

        void parse(WebResponse webResponse,
                   HtmlPage page,
                   boolean xhtml,
                   boolean createdByJavascript)
            throws java.io.IOException
        Parses the WebResponse into an object tree representation.
        Parameters:
        webResponse - the response data
        page - the HtmlPage to which the nodes are added
        xhtml - if true, use the XHTML parser
        createdByJavascript - if true, the (script) tag was created by JavaScript
        Throws:
        java.io.IOException - if there is an IO error