Interface IMarkupParser

All Known Implementing Classes:
MarkupParser

public interface IMarkupParser

Interface to be implemented by all markup parsers. The default implementation is MarkupParser.

AttoParser markup parsers work as SAX-style parsers: they need a markup handler object to handle parsing events. These handlers implement the IMarkupHandler interface and are normally written by users to perform the operations their applications require.

See the documentation of the IMarkupHandler interface for more information on the event handler methods, and also on the handler implementations AttoParser provides out-of-the-box.

Note also that there are two specialized parsers that use MarkupParser underneath but are designed to make specific parsing features easier to use: IDOMMarkupParser for DOM-oriented parsing, and ISimpleMarkupParser for working with a simplified version of the handler interface (ISimpleMarkupHandler).

Sample usage:


   // Obtain a java.io.Reader on the document to be parsed
   final Reader documentReader = ...;

   // Create the handler instance. Extending the no-op AbstractMarkupHandler is a good start
   final IMarkupHandler handler = new AbstractMarkupHandler() {
       ... // some events implemented
   };

   // Create or obtain the parser instance (can be reused). Example uses the default configuration for HTML
   final IMarkupParser parser = new MarkupParser(ParseConfiguration.htmlConfiguration());

   // Parse it!
   parser.parse(documentReader, handler);
 

Note that implementations of this interface should be thread-safe, so a single parser instance can be reused across any number of parsing operations and shared among any number of concurrent threads.
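
The thread-safety contract above can be illustrated with a minimal sketch that uses only the JDK. ToyParser and TagHandler are hypothetical stand-ins, not AttoParser types; they only mirror the parser/handler split. The assumption shown is that a thread-safe parser keeps no per-parse state in its fields, so one instance can be shared while each parsing operation gets its own handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a handler: receives one event per open tag.
interface TagHandler {
    void openTag(String name);
}

// Hypothetical stand-in for a parser. It has no fields, so every piece of
// per-parse state lives in local variables and the instance can be shared.
final class ToyParser {
    void parse(String document, TagHandler handler) {
        final int n = document.length();
        int i = 0;
        while (i < n) {
            final boolean opensTag = document.charAt(i) == '<'
                    && i + 1 < n
                    && Character.isLetter(document.charAt(i + 1));
            if (opensTag) {
                int end = i + 1;
                while (end < n && Character.isLetterOrDigit(document.charAt(end))) {
                    end++;
                }
                handler.openTag(document.substring(i + 1, end));
                i = end;
            } else {
                i++;
            }
        }
    }
}

final class SharedParserSketch {

    // Counts open tags in each document, parsing all documents concurrently
    // with ONE shared parser instance and one handler per parsing operation.
    static List<Integer> countOpenTags(List<String> documents) {
        final ToyParser parser = new ToyParser();
        final ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            final List<Future<Integer>> futures = new ArrayList<>();
            for (final String document : documents) {
                final Callable<Integer> task = () -> {
                    final int[] count = {0};  // state owned by this task only
                    parser.parse(document, name -> count[0]++);
                    return count[0];
                };
                futures.add(pool.submit(task));
            }
            final List<Integer> results = new ArrayList<>();
            for (final Future<Integer> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The same pattern would apply to a real IMarkupParser: keep one parser instance (for example in a static final field) and create a fresh handler for each parse call, since handlers usually accumulate state and are not expected to be shared.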

Since:
2.0.0
  • Method Details

    • parse

      void parse(String document, IMarkupHandler handler) throws ParseException

      Parse a document using the specified IMarkupHandler.

      Parameters:
      document - the document to be parsed, as a String.
      handler - the handler to be used, an IMarkupHandler implementation.
      Throws:
      ParseException - if the document cannot be parsed.
    • parse

      void parse(char[] document, IMarkupHandler handler) throws ParseException

      Parse a document using the specified IMarkupHandler.

      Parameters:
      document - the document to be parsed, as a char[].
      handler - the handler to be used, an IMarkupHandler implementation.
      Throws:
      ParseException - if the document cannot be parsed.
    • parse

      void parse(char[] document, int offset, int len, IMarkupHandler handler) throws ParseException

      Parse a document using the specified IMarkupHandler.

      Parameters:
      document - the document to be parsed, as a char[].
      offset - the index in the char[] at which the document contents start.
      len - the length (in chars) of the document contents stored in the char[].
      handler - the handler to be used, an IMarkupHandler implementation.
      Throws:
      ParseException - if the document cannot be parsed.
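
As a sketch of how offset and len select the document inside a larger buffer, the helper below extracts the same region using only the JDK. It assumes the usual JDK convention, as in new String(char[], int, int), that len counts chars starting at offset; CharWindowSketch and window are hypothetical names used for illustration and are not part of AttoParser.

```java
final class CharWindowSketch {

    // Returns the chars in [offset, offset + len) as a String, which is the
    // region a parse(char[], int, int, handler) call is assumed to read.
    static String window(char[] document, int offset, int len) {
        // Written as "offset > length - len" to avoid int overflow in offset + len.
        if (offset < 0 || len < 0 || offset > document.length - len) {
            throw new IndexOutOfBoundsException(
                    "offset=" + offset + ", len=" + len + ", buffer length=" + document.length);
        }
        return new String(document, offset, len);
    }
}
```

For a buffer holding xx<p>hi</p>yy, an offset of 2 and a len of 9 select <p>hi</p> and skip the padding on both sides. This overload is useful when the markup sits inside a reused or partially filled buffer, because no copy of the relevant region is needed before parsing.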
    • parse

      void parse(Reader reader, IMarkupHandler handler) throws ParseException

      Parse a document using the specified IMarkupHandler.

      Implementations of this interface must close the provided Reader object after parsing.

      Parameters:
      reader - a Reader on the document.
      handler - the handler to be used, an IMarkupHandler implementation.
      Throws:
      ParseException - if the document cannot be parsed.
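
The requirement to close the Reader can be sketched from the implementer's side with JDK classes only. ReaderCloseSketch.consumeAndClose is a hypothetical stand-in for the reading loop of a parser, not AttoParser code, and CloseTrackingReader is a hypothetical test helper that records whether close() was called.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper: wraps any Reader and remembers whether it was closed.
final class CloseTrackingReader extends FilterReader {
    private boolean closed = false;

    CloseTrackingReader(Reader in) {
        super(in);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }

    boolean isClosed() {
        return closed;
    }
}

final class ReaderCloseSketch {

    // Reads the whole document and returns the number of chars consumed.
    // try-with-resources closes the reader on success and on failure alike,
    // which is one way an implementation could honor the contract.
    static int consumeAndClose(Reader reader) {
        try (Reader r = reader) {
            final char[] buffer = new char[4096];
            int total = 0;
            int read;
            while ((read = r.read(buffer, 0, buffer.length)) != -1) {
                total += read;  // a real parser would emit handler events here
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For callers, the consequence is that a Reader handed to parse should be treated as consumed: it should not be read from or reused afterwards, and closing it again in the calling code is unnecessary.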