Interface IDOMMarkupParser

All Known Implementing Classes:
DOMMarkupParser

public interface IDOMMarkupParser

Interface to be implemented by all DOM markup parsers. The default implementation is DOMMarkupParser.

DOM trees created by implementations of this interface are built from objects of the classes in the org.attoparser.dom package.

Note that this parser interface is a convenience that makes the DOMBuilderMarkupHandler DOM-conversion handler easier to use.

Sample usage:


   // Obtain a java.io.Reader on the document to be parsed
   final Reader documentReader = ...;

   // Create or obtain the parser instance (note this is not the 'simple' one!)
   final IDOMMarkupParser parser = new DOMMarkupParser(ParseConfiguration.htmlConfiguration());

   // Parse it and return the Document Object Model
   final Document document = parser.parse("Some document", documentReader);
 

Note that implementations of this interface should be thread-safe, so a single parser instance can be reused across multiple parsing operations and shared by any number of concurrent threads.
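
The thread-safety note above can be illustrated with a minimal sketch in which one parser instance is shared by every task in a thread pool. To keep the example free of external dependencies, it does not use AttoParser types: StringParser is a hypothetical stand-in for a single parse(String) method, and SharedParserSketch and parseAll are names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class SharedParserSketch {

    /** Hypothetical stand-in for one parse(String) method; not an AttoParser type. */
    public interface StringParser<T> {
        T parse(String document) throws Exception;
    }

    private SharedParserSketch() {
    }

    /**
     * Parses every document on a fixed-size thread pool, passing the SAME parser
     * instance to all tasks. This is only safe if the parser is thread-safe.
     * Results are returned in the order of the input list.
     */
    public static <T> List<T> parseAll(
            final StringParser<T> parser, final List<String> documents, final int threads) {
        final ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            final List<Future<T>> futures = new ArrayList<>();
            for (final String document : documents) {
                // One task per document; no per-thread parser copies are created.
                final Callable<T> task = () -> parser.parse(document);
                futures.add(pool.submit(task));
            }
            final List<T> results = new ArrayList<>();
            for (final Future<T> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while parsing", e);
        } catch (final ExecutionException e) {
            throw new IllegalStateException("A parsing task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a real IDOMMarkupParser, the stand-in could presumably be supplied as a lambda such as doc -> parser.parse(doc); that wiring is an assumption and is not exercised here.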

Since:
2.0.0
  • Method Summary

    Modifier and Type   Method                                                              Description
    Document            parse(char[] document)                                              Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
    Document            parse(char[] document, int offset, int len)                         Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
    Document            parse(Reader reader)                                                Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
    Document            parse(String document)                                              Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
    Document            parse(String documentName, char[] document)                         Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
    Document            parse(String documentName, char[] document, int offset, int len)    Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
    Document            parse(String documentName, Reader reader)                           Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
    Document            parse(String documentName, String document)                         Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
  • Method Details

    • parse

      Document parse(String document) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Parameters:
      document - the document to be parsed, as a String.
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.
    • parse

      Document parse(char[] document) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Parameters:
      document - the document to be parsed, as a char[].
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.
    • parse

      Document parse(char[] document, int offset, int len) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Parameters:
      document - the document to be parsed, as a char[].
      offset - the index in the char[] at which the document contents start.
      len - the number of chars, counted from offset, that make up the document.
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.
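
      To make the offset and len parameters concrete, the following sketch extracts the region of a buffer that such a call would be expected to read. It assumes the usual Java convention (offset is the index of the first char, len is a count of chars from that point) and does not call the parser; DocumentRegionSketch and region are hypothetical names used only for illustration.

```java
public final class DocumentRegionSketch {

    private DocumentRegionSketch() {
    }

    /**
     * Returns the chars that a call such as parse(document, offset, len) would be
     * expected to read, assuming offset is the index of the first char and len is
     * the number of chars counted from that index.
     */
    public static String region(final char[] document, final int offset, final int len) {
        if (document == null) {
            throw new IllegalArgumentException("document cannot be null");
        }
        // Written as a subtraction so that offset + len cannot overflow.
        if (offset < 0 || len < 0 || offset > document.length - len) {
            throw new IllegalArgumentException("offset/len fall outside of the buffer");
        }
        return new String(document, offset, len);
    }
}
```

      This form is mainly useful when the markup sits inside a larger, possibly reused, buffer and copying it into a new array would be wasteful.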
    • parse

      Document parse(Reader reader) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Implementations of this interface must close the provided Reader object after parsing.

      Parameters:
      reader - a Reader on the document.
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.
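
      Because the contract requires implementations to close the Reader, it can be useful to check that behaviour in a test. The sketch below is one way to do so with the JDK alone: a wrapper that records whether close() was called. CloseTrackingReader is a hypothetical helper, not part of AttoParser.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

public final class CloseTrackingReader extends FilterReader {

    // Set to true the first time close() is invoked on this wrapper.
    private boolean closed = false;

    public CloseTrackingReader(final Reader delegate) {
        super(delegate);
    }

    @Override
    public void close() throws IOException {
        this.closed = true;
        // FilterReader.close() closes the wrapped reader.
        super.close();
    }

    public boolean isClosed() {
        return this.closed;
    }
}
```

      In a test, one would pass the wrapper to parse(Reader) and assert isClosed() afterwards. It also follows from the contract that the caller should not expect to read from the Reader again once parsing has returned.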
    • parse

      Document parse(String documentName, String document) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Parameters:
      documentName - the name of the document to be parsed.
      document - the document to be parsed, as a String.
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.
    • parse

      Document parse(String documentName, char[] document) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Parameters:
      documentName - the name of the document to be parsed.
      document - the document to be parsed, as a char[].
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.
    • parse

      Document parse(String documentName, char[] document, int offset, int len) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Parameters:
      documentName - the name of the document to be parsed.
      document - the document to be parsed, as a char[].
      offset - the index in the char[] at which the document contents start.
      len - the number of chars, counted from offset, that make up the document.
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.
    • parse

      Document parse(String documentName, Reader reader) throws ParseException

      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

      Implementations of this interface must close the provided Reader object after parsing.

      Parameters:
      documentName - the name of the document to be parsed.
      reader - a Reader on the document.
      Returns:
      the Document object resulting from parsing.
      Throws:
      ParseException - if the document cannot be parsed.