Interface HTMLParser

All Known Implementing Classes:
HtmlUnitNekoHtmlParser

public interface HTMLParser

Interface for the parser used to parse HTML into an HtmlUnit-specific DOM (HU-DOM) tree.

  • Method Summary

    Modifier and Type
    Method
    Description
    ElementFactory
    getElementFactory(SgmlPage page, String namespaceURI, String qualifiedName, boolean insideSvg, boolean svgSupport)
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
    Returns the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory.
    ElementFactory
    getFactory(String tagName)
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
    ElementFactory
    getSvgFactory()
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
    void
    parse(WebResponse webResponse, HtmlPage page, boolean xhtml, boolean createdByJavascript)
    Parses the WebResponse into an object tree representation.
    void
    parseFragment(DomNode parent, String source)
    Parses the HTML content from the given string into an object tree representation.
    void
    parseFragment(DomNode parent, DomNode context, String source, boolean createdByJavascript)
    Parses the HTML content from the given string into an object tree representation.
  • Method Details

    • getFactory

      ElementFactory getFactory(String tagName)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Parameters:
      tagName - an HTML element tag name
      Returns:
      a factory for creating HtmlElements representing the given tag
    • getSvgFactory

      ElementFactory getSvgFactory()
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Returns:
      a factory for creating SvgElements
    • getElementFactory

      ElementFactory getElementFactory(SgmlPage page, String namespaceURI, String qualifiedName, boolean insideSvg, boolean svgSupport)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Returns the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory.
      Parameters:
      page - the page
      namespaceURI - the namespace URI
      qualifiedName - the qualified name
      insideSvg - true if the node is inside an SVG node
      svgSupport - true if called from JavaScript createElementNS
      Returns:
      the pre-registered element factory corresponding to the specified tag, or an UnknownElementFactory
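
The lookup-with-fallback contract described above (a registered factory if one exists, otherwise an "unknown element" factory, never null) can be illustrated with a minimal, self-contained sketch. The types SketchFactory and FactoryRegistrySketch are hypothetical stand-ins invented for this example; they are not HtmlUnit's ElementFactory or UnknownElementFactory, and the case-insensitive matching is an assumption of the sketch rather than documented behavior.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for an element factory; not HtmlUnit's ElementFactory.
interface SketchFactory {
    String describe(String qualifiedName);
}

// Illustrative registry: known tags map to a dedicated factory,
// anything else falls back to one shared "unknown element" factory.
final class FactoryRegistrySketch {
    // Fallback used for every tag that was never registered.
    private static final SketchFactory UNKNOWN = name -> "unknown:" + name;

    private final Map<String, SketchFactory> registered = new HashMap<>();

    // Assumption of this sketch: tag names are matched case-insensitively.
    void register(String tagName, SketchFactory factory) {
        registered.put(tagName.toLowerCase(Locale.ROOT), factory);
    }

    // Never returns null: unregistered tags get the fallback factory.
    SketchFactory lookup(String qualifiedName) {
        return registered.getOrDefault(qualifiedName.toLowerCase(Locale.ROOT), UNKNOWN);
    }
}
```

Because the lookup always yields some factory, callers in this sketch do not need a null check before creating an element.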
    • parseFragment

      void parseFragment(DomNode parent, String source) throws SAXException, IOException
      Parses the HTML content from the given string into an object tree representation.
      Parameters:
      parent - the parent for the new nodes
      source - the (X)HTML to be parsed
      Throws:
      SAXException - if a SAX error occurs
      IOException - if an IO error occurs
    • parseFragment

      void parseFragment(DomNode parent, DomNode context, String source, boolean createdByJavascript) throws SAXException, IOException
      Parses the HTML content from the given string into an object tree representation.
      Parameters:
      parent - the node to which the newly parsed nodes are added
      context - the context node used to build the fragment context stack
      source - the (X)HTML to be parsed
      createdByJavascript - if true, the (script) tag was created by JavaScript
      Throws:
      SAXException - if a SAX error occurs
      IOException - if an IO error occurs
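
Both parseFragment overloads declare two checked exceptions, SAXException and IOException, so every caller has to decide how to treat each one. The following is a minimal sketch of one possible policy, not a recommendation from the HtmlUnit authors. FragmentParserSketch and FragmentCallerSketch are hypothetical stand-ins that mirror only the throws clause; a StringBuilder stands in for the parent DomNode so that the example compiles without HtmlUnit on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.xml.sax.SAXException;

// Hypothetical stand-in that mirrors only the checked exceptions of parseFragment.
interface FragmentParserSketch {
    void parseFragment(StringBuilder parent, String source) throws SAXException, IOException;
}

final class FragmentCallerSketch {
    // Policy chosen for this sketch:
    //   - a SAX error is reported to the caller as "false" (markup rejected),
    //   - an IO error is rethrown unchecked because the caller cannot recover from it.
    static boolean tryParse(FragmentParserSketch parser, StringBuilder parent, String source) {
        try {
            parser.parseFragment(parent, source);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Other policies are equally plausible, for example propagating both exceptions unchanged; the point is only that the two failure modes are distinct and can be handled separately.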
    • parse

      void parse(WebResponse webResponse, HtmlPage page, boolean xhtml, boolean createdByJavascript) throws IOException
      Parses the WebResponse into an object tree representation.
      Parameters:
      webResponse - the response data
      page - the HtmlPage to which the nodes are added
      xhtml - if true, use the XHTML parser
      createdByJavascript - if true, the (script) tag was created by JavaScript
      Throws:
      IOException - if there is an IO error