Class MarkupParser
- java.lang.Object
-
- org.attoparser.MarkupParser
-
- All Implemented Interfaces:
IMarkupParser
public final class MarkupParser extends java.lang.Object implements IMarkupParser
Default implementation of the IMarkupParser interface.
AttoParser markup parsers work as SAX-style parsers that need a markup handler object to handle parsing events. These handlers implement the IMarkupHandler interface, and are normally developed by users to perform the operations their applications require.
See the documentation of the IMarkupHandler interface for more information on the event handler methods, and on the handler implementations AttoParser provides out of the box.
Also note that two specialized parsers use MarkupParser underneath, each oriented towards easy use of a specific parsing feature: IDOMMarkupParser for DOM-oriented parsing, and ISimpleMarkupParser for using a simplified version of the handler interface (ISimpleMarkupHandler).
Sample usage:
    // Obtain a java.io.Reader on the document to be parsed
    final Reader documentReader = ...;
    // Create the handler instance. Extending the no-op AbstractMarkupHandler is a good start
    final IMarkupHandler handler = new AbstractMarkupHandler() {
        ... // some events implemented
    };
    // Create or obtain the parser instance (can be reused). Example uses the default configuration for HTML
    final IMarkupParser parser = new MarkupParser(ParseConfiguration.htmlConfiguration());
    // Parse it!
    parser.parse(documentReader, handler);
This parser class is thread-safe. Note, however, that IMarkupHandler implementations normally are not. So even though parser instances can be reused, handler objects usually cannot.
This parser class uses a (configurable) pool of char[] buffers in order to reduce the amount of memory used for parsing (buffers are large structures). The pool works in a non-blocking mode: if a new buffer is needed and all pooled buffers are currently allocated, a new (unpooled) char[] object is created and returned without waiting for a pooled buffer to become available.
(Note that these pooled buffers will not be used when parsing documents specified as char[] objects. In that case, the char[] documents themselves will be used as buffers, avoiding the need to allocate pooled buffers or use any additional memory.)
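The non-blocking pool behaviour described above can be illustrated with a minimal sketch. This is not the code of the private MarkupParser.BufferPool class, whose implementation is not shown on this page; the class name NonBlockingBufferPool, its methods, and the use of AtomicReferenceArray are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Hypothetical sketch of a non-blocking char[] pool; NOT AttoParser's actual BufferPool. */
final class NonBlockingBufferPool {

    private final AtomicReferenceArray<char[]> slots;
    private final int bufferSize;

    NonBlockingBufferPool(final int poolSize, final int bufferSize) {
        this.slots = new AtomicReferenceArray<>(poolSize);
        this.bufferSize = bufferSize;
        for (int i = 0; i < poolSize; i++) {
            this.slots.set(i, new char[bufferSize]);
        }
    }

    /** Never blocks: takes a free pooled buffer, or allocates an unpooled one. */
    char[] acquire() {
        for (int i = 0; i < slots.length(); i++) {
            final char[] candidate = slots.get(i);
            // The CAS guarantees that only one thread can claim a given slot.
            if (candidate != null && slots.compareAndSet(i, candidate, null)) {
                return candidate;
            }
        }
        // All pooled buffers are in use: hand out a fresh buffer that is not
        // tracked by the pool unless it is later released into an empty slot.
        return new char[bufferSize];
    }

    /** Puts the buffer into the first empty slot; if none is empty, the buffer is dropped. */
    void release(final char[] buffer) {
        for (int i = 0; i < slots.length(); i++) {
            if (slots.compareAndSet(i, null, buffer)) {
                return;
            }
        }
    }

    public static void main(final String[] args) {
        final NonBlockingBufferPool pool = new NonBlockingBufferPool(2, 4096);
        final char[] first = pool.acquire();
        final char[] second = pool.acquire();
        final char[] third = pool.acquire(); // pool exhausted, unpooled buffer
        System.out.println(first != third && second != third);
        pool.release(first);
    }
}
```

In this sketch the number of retained buffers never exceeds the pool size, and a caller never waits: the cost of contention is an extra allocation rather than a blocked thread.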
- Since:
- 2.0.0
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
- private static class MarkupParser.BufferPool
-
Field Summary
Fields (modifier and type, field, description):
- private ParseConfiguration configuration
- static int DEFAULT_BUFFER_SIZE: Default buffer size to be used (buffer size will grow at runtime if an artifact (structure or text) is bigger than the whole buffer).
- static int DEFAULT_POOL_SIZE: Default pool size to be used.
- private MarkupParser.BufferPool pool
-
Constructor Summary
Constructors (constructor, description):
- MarkupParser(ParseConfiguration configuration): Creates a new instance of this parser, using the specified configuration and default sizes for pool (DEFAULT_POOL_SIZE) and pooled buffers (DEFAULT_BUFFER_SIZE).
- MarkupParser(ParseConfiguration configuration, int poolSize, int bufferSize): Creates a new instance of this parser, specifying the pool and buffer size.
-
Method Summary
Methods (modifier and type, method, description):
- void parse(char[] document, int offset, int len, IMarkupHandler handler): Parse a document using the specified IMarkupHandler.
- void parse(char[] document, IMarkupHandler handler): Parse a document using the specified IMarkupHandler.
- void parse(java.io.Reader reader, IMarkupHandler handler): Parse a document using the specified IMarkupHandler.
- void parse(java.lang.String document, IMarkupHandler handler): Parse a document using the specified IMarkupHandler.
- private void parseBuffer(char[] buffer, int offset, int len, IMarkupHandler handler, ParseStatus status)
- (package private) void parseDocument(char[] buffer, int offset, int len, IMarkupHandler handler, ParseStatus status)
- (package private) void parseDocument(java.io.Reader reader, int suggestedBufferSize, IMarkupHandler handler, ParseStatus status)
-
-
-
Field Detail
-
DEFAULT_BUFFER_SIZE
public static final int DEFAULT_BUFFER_SIZE
Default buffer size to be used (buffer size will grow at runtime if an artifact (structure or text) is bigger than the whole buffer). Value: 4096 chars (= 8192 bytes).
- See Also:
- Constant Field Values
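The note that the buffer "will grow at runtime" when an artifact does not fit can be pictured with the sketch below. It is an assumption-laden illustration, not AttoParser code: the class GrowingBufferSketch, the method readStructure, the doubling growth policy, and the use of '>' as the end of a structure are all hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical sketch of a buffer that grows when a structure does not fit. */
final class GrowingBufferSketch {

    /** Reads chars up to and including the first '>', growing the buffer as needed. */
    static String readStructure(final Reader reader, final int initialSize) {
        char[] buffer = new char[Math.max(1, initialSize)];
        int used = 0;
        try {
            int c;
            while ((c = reader.read()) != -1) {
                if (used == buffer.length) {
                    // The structure is bigger than the whole buffer: double it
                    // (the growth factor is an assumption of this sketch).
                    buffer = Arrays.copyOf(buffer, buffer.length * 2);
                }
                buffer[used++] = (char) c;
                if (c == '>') {
                    break;
                }
            }
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer, 0, used);
    }

    public static void main(final String[] args) {
        // A 4-char buffer is too small for this tag, so it grows while reading.
        System.out.println(readStructure(new StringReader("<div class=\"x\">text"), 4));
    }
}
```

With the default of 4096 chars, growth of this kind would only be triggered by a single structure or text longer than the whole buffer.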
-
DEFAULT_POOL_SIZE
public static final int DEFAULT_POOL_SIZE
Default pool size to be used. Buffers will be kept in a pool and reused in order to increase performance. The pool is non-exclusive: if the pool size is 2 and a third request arrives, it is assigned a new buffer object (not linked to the pool, and therefore garbage-collected once it is no longer in use). Value: 2.
- See Also:
- Constant Field Values
-
configuration
private final ParseConfiguration configuration
-
pool
private final MarkupParser.BufferPool pool
-
-
Constructor Detail
-
MarkupParser
public MarkupParser(ParseConfiguration configuration)
Creates a new instance of this parser, using the specified configuration and default sizes for pool (DEFAULT_POOL_SIZE) and pooled buffers (DEFAULT_BUFFER_SIZE).
- Parameters:
configuration - the parsing configuration to be used.
-
MarkupParser
public MarkupParser(ParseConfiguration configuration, int poolSize, int bufferSize)
Creates a new instance of this parser, specifying the pool and buffer size.
Buffer size (in chars) is the size of the char[] structures used as buffers for parsing, which might grow if a certain markup structure (e.g. a text) does not fit inside. Default size is DEFAULT_BUFFER_SIZE.
Pool size is the size of the pool of char[] buffers that will be kept in memory in order to allow their reuse. This pool works in a non-exclusive mode, so that if the pool size is 3 and a 4th request arrives, it is served a new non-pooled buffer without needing to block waiting for one of the pooled instances. Default size is DEFAULT_POOL_SIZE.
Note that these pooled buffers will not be used when parsing documents specified as char[] objects. In that case, the char[] documents themselves will be used as buffers, avoiding the need to allocate buffers or use any additional memory.
- Parameters:
configuration - the parsing configuration to be used.
poolSize - the size of the pool of buffers to be used.
bufferSize - the default size of the buffers to be instantiated for this parser.
-
-
Method Detail
-
parse
public void parse(java.lang.String document, IMarkupHandler handler) throws ParseException
Description copied from interface: IMarkupParser
Parse a document using the specified IMarkupHandler.
- Specified by:
parse in interface IMarkupParser
- Parameters:
document - the document to be parsed, as a String.
handler - the handler to be used, an IMarkupHandler implementation.
- Throws:
ParseException - if the document cannot be parsed.
-
parse
public void parse(char[] document, IMarkupHandler handler) throws ParseException
Description copied from interface: IMarkupParser
Parse a document using the specified IMarkupHandler.
- Specified by:
parse in interface IMarkupParser
- Parameters:
document - the document to be parsed, as a char[].
handler - the handler to be used, an IMarkupHandler implementation.
- Throws:
ParseException - if the document cannot be parsed.
-
parse
public void parse(char[] document, int offset, int len, IMarkupHandler handler) throws ParseException
Description copied from interface: IMarkupParser
Parse a document using the specified IMarkupHandler.
- Specified by:
parse in interface IMarkupParser
- Parameters:
document - the document to be parsed, as a char[].
offset - the offset to be applied on the char[] document to determine the start of the document contents.
len - the length (in chars) of the document stored in the char[].
handler - the handler to be used, an IMarkupHandler implementation.
- Throws:
ParseException - if the document cannot be parsed.
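
To make the offset and len parameters concrete, the sketch below shows which part of a larger char[] they select. It does not call AttoParser; the class OffsetLenSketch and the method parsedRegion are hypothetical helpers, and the assumption is that the parsed region is the half-open range from offset to offset + len, as the parameter descriptions above suggest.

```java
/** Hypothetical helper showing the region selected by (offset, len) in a char[]. */
final class OffsetLenSketch {

    /**
     * Returns the chars that a call such as
     * parser.parse(document, offset, len, handler) would be expected to cover,
     * assuming the region is [offset, offset + len).
     */
    static String parsedRegion(final char[] document, final int offset, final int len) {
        if (offset < 0 || len < 0 || offset > document.length - len) {
            throw new IllegalArgumentException("offset/len outside of the document array");
        }
        return new String(document, offset, len);
    }

    public static void main(final String[] args) {
        // The markup sits in the middle of a larger, reused array.
        final char[] shared = "xx<p>hi</p>yy".toCharArray();
        System.out.println(parsedRegion(shared, 2, 9));
    }
}
```

This form is useful when the markup is a slice of a larger array that is already in memory, since the array itself serves as the parsing buffer and no copy is needed.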
-
parse
public void parse(java.io.Reader reader, IMarkupHandler handler) throws ParseException
Description copied from interface: IMarkupParser
Parse a document using the specified IMarkupHandler.
Implementations of this interface must close the provided Reader object after parsing.
- Specified by:
parse in interface IMarkupParser
- Parameters:
reader - a Reader on the document.
handler - the handler to be used, an IMarkupHandler implementation.
- Throws:
ParseException - if the document cannot be parsed.
-
parseDocument
void parseDocument(java.io.Reader reader, int suggestedBufferSize, IMarkupHandler handler, ParseStatus status) throws ParseException
- Throws:
ParseException
-
parseDocument
void parseDocument(char[] buffer, int offset, int len, IMarkupHandler handler, ParseStatus status) throws ParseException
- Throws:
ParseException
-
parseBuffer
private void parseBuffer(char[] buffer, int offset, int len, IMarkupHandler handler, ParseStatus status) throws ParseException
- Throws:
ParseException
-
-