Class DOMMarkupParser

  • All Implemented Interfaces:
    IDOMMarkupParser

    public final class DOMMarkupParser
    extends java.lang.Object
    implements IDOMMarkupParser

    Default implementation of the IDOMMarkupParser interface.

    DOM trees created by this class are built from objects of the classes in the org.attoparser.dom package.

    Note that this parser is a convenience wrapper that makes the DOMBuilderMarkupHandler DOM-conversion handler easier to use.

    Sample usage:

    
       // Obtain a java.io.Reader on the document to be parsed
       final Reader documentReader = ...;
    
       // Create or obtain the parser instance (note this is not the 'simple' one!)
       final IDOMMarkupParser parser = new DOMMarkupParser(ParseConfiguration.htmlConfiguration());
    
       // Parse it and return the Document Object Model
       final Document document = parser.parse("Some document", documentReader);
     

    This parser class uses an instance of the MarkupParser class underneath (configured with the default values for its buffer pool) and applies an instance of the DOMBuilderMarkupHandler handler class to it, so that parsing produces a DOM (Document Object Model) tree.

    In fact, using the DOMMarkupParser class as shown above is completely equivalent to:

    
       // Obtain a java.io.Reader on the document to be parsed
       final Reader documentReader = ...;
    
       // Instantiate the DOM-builder handler
       final DOMBuilderMarkupHandler handler = new DOMBuilderMarkupHandler("Some document");
    
       // Create or obtain the parser instance
       final IMarkupParser parser = new MarkupParser(ParseConfiguration.htmlConfiguration());
    
       // Parse the document
       parser.parse(documentReader, handler);
    
       // Obtain the parsed Document Object Model
       final Document document = handler.getDocument();
     

    This parser class is thread-safe.
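
    Because the parser is thread-safe, a single instance can be shared by several threads instead of being created per document. The following is a minimal sketch of that usage pattern using only the JDK. The StringParser interface and the parseAll method are hypothetical names introduced for illustration; they are not part of AttoParser. With AttoParser on the classpath, a lambda such as doc -> parser.parse(doc) on a shared DOMMarkupParser should fit the same shape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SharedParserSketch {

    /** Hypothetical stand-in for the parse(String) method of a thread-safe parser. */
    interface StringParser<T> {
        T parse(String document) throws Exception;
    }

    /** Parses every document on a small thread pool, reusing the same parser instance. */
    static <T> List<T> parseAll(final StringParser<T> sharedParser,
                                final List<String> documents) throws Exception {
        final ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Submit one task per document; all tasks call the same parser object
            final List<Future<T>> futures = new ArrayList<>();
            for (final String document : documents) {
                final Callable<T> task = () -> sharedParser.parse(document);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order; a failed parse surfaces
            // here as an ExecutionException wrapping the original exception
            final List<T> results = new ArrayList<>();
            for (final Future<T> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

    The sketch only relies on the thread-safety statement above: no synchronization is added around the shared parser, so it would not be safe with a parser that lacks that guarantee.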

    Since:
    2.0.0
    • Method Summary

      Modifier and Type Method Description
      Document parse(char[] document)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      Document parse(char[] document, int offset, int len)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      Document parse(java.io.Reader reader)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      Document parse(java.lang.String document)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      Document parse(java.lang.String documentName, char[] document)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      Document parse(java.lang.String documentName, char[] document, int offset, int len)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      Document parse(java.lang.String documentName, java.io.Reader reader)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      Document parse(java.lang.String documentName, java.lang.String document)
      Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • parse

        public Document parse(java.lang.String document)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        document - the document to be parsed, as a String.
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.
      • parse

        public Document parse(char[] document)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        document - the document to be parsed, as a char[].
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.
      • parse

        public Document parse(char[] document,
                              int offset,
                              int len)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        document - the document to be parsed, as a char[].
        offset - the offset to be applied on the char[] document to determine the start of the document contents.
        len - the length (in chars) of the document stored in the char[].
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.
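
        The offset and len parameters designate a window inside a larger char[] buffer, which avoids copying the document into its own array. As a rough illustration using only the JDK, the sketch below shows which chars such a triple selects. The window method and the CharWindowSketch class are hypothetical helpers, and the range check is an assumption of this sketch, not documented behaviour of the parser.

```java
final class CharWindowSketch {

    /** Returns the text a (buffer, offset, len) triple designates, after validating the range. */
    static String window(final char[] buffer, final int offset, final int len) {
        // Reject windows that start before the buffer or run past its end
        if (offset < 0 || len < 0 || offset > buffer.length - len) {
            throw new IndexOutOfBoundsException(
                    "offset=" + offset + ", len=" + len + ", buffer.length=" + buffer.length);
        }
        // String(char[], int, int) copies exactly len chars starting at offset
        return new String(buffer, offset, len);
    }
}
```

        For a buffer holding "xxxx<p>Hello</p>yyyy", an offset of 4 and a len of 12 select "<p>Hello</p>"; those are the same two values that would be passed to parse(buffer, 4, 12).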
      • parse

        public Document parse(java.io.Reader reader)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Implementations of this interface must close the provided Reader object after parsing.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        reader - a Reader on the document.
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.
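
        Since the contract states that the Reader is closed after parsing, callers may want a way to observe that in their own tests. The sketch below uses only the JDK: CloseTrackingReader is a hypothetical wrapper that records whether close() was called, and drainAndClose is a stand-in for any consumer that honours the contract (it is not the parser itself).

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

final class CloseTrackingReader extends FilterReader {

    private boolean closed = false;

    CloseTrackingReader(final Reader delegate) {
        super(delegate);
    }

    @Override
    public void close() throws IOException {
        // Remember the call, then close the wrapped Reader as usual
        this.closed = true;
        super.close();
    }

    boolean isClosed() {
        return this.closed;
    }

    /** Hypothetical consumer honouring the contract: read everything, then close. */
    static int drainAndClose(final Reader reader) throws IOException {
        try {
            final char[] chunk = new char[1024];
            int total = 0;
            int read;
            while ((read = reader.read(chunk)) != -1) {
                total += read;
            }
            return total;
        } finally {
            // Closing in finally releases the Reader even if reading fails
            reader.close();
        }
    }
}
```

        Wrapping the document Reader in such a class before handing it to a parser, and checking isClosed() afterwards, is one way to confirm the behaviour. Closing an already closed Reader has no effect, so a caller that also uses try-with-resources should not be harmed by the extra close.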
      • parse

        public Document parse(java.lang.String documentName,
                              java.lang.String document)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        documentName - the name of the document to be parsed.
        document - the document to be parsed, as a String.
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.
      • parse

        public Document parse(java.lang.String documentName,
                              char[] document)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        documentName - the name of the document to be parsed.
        document - the document to be parsed, as a char[].
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.
      • parse

        public Document parse(java.lang.String documentName,
                              char[] document,
                              int offset,
                              int len)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        documentName - the name of the document to be parsed.
        document - the document to be parsed, as a char[].
        offset - the offset to be applied on the char[] document to determine the start of the document contents.
        len - the length (in chars) of the document stored in the char[].
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.
      • parse

        public Document parse(java.lang.String documentName,
                              java.io.Reader reader)
                       throws ParseException
        Description copied from interface: IDOMMarkupParser

        Parse a document and convert it into a DOM tree, using the classes in the org.attoparser.dom package.

        Implementations of this interface must close the provided Reader object after parsing.

        Specified by:
        parse in interface IDOMMarkupParser
        Parameters:
        documentName - the name of the document to be parsed.
        reader - a Reader on the document.
        Returns:
        the Document object resulting from parsing.
        Throws:
        ParseException - if the document cannot be parsed.