Interface IMarkupHandler
- All Superinterfaces:
IAttributeSequenceHandler, ICDATASectionHandler, ICommentHandler, IDocTypeHandler, IDocumentHandler, IElementHandler, IProcessingInstructionHandler, ITextHandler, IXMLDeclarationHandler
- All Known Implementing Classes:
AbstractChainedMarkupHandler, AbstractMarkupHandler, AttributeSelectionMarkingMarkupHandler, BlockSelectorMarkupHandler, DiscardMarkupHandler, DOMBuilderMarkupHandler, DuplicateMarkupHandler, HtmlMarkupHandler, MarkupEventProcessorHandler, MinimizeHtmlMarkupHandler, NodeSelectorMarkupHandler, OutputMarkupHandler, PrettyHtmlMarkupHandler, SimplifierMarkupHandler, TextOutputMarkupHandler, TraceBuilderMarkupHandler
Interface to be implemented by all Markup Handlers.
Markup handlers are the objects that receive the events produced during parsing and perform the operations users need. This interface is the main entry point for using AttoParser.
Markup handlers can be stateful, in which case a new instance of the handler class should be created for each parsing operation. Such implementations are not required to be thread-safe.
There is an abstract, basic, no-op implementation of this interface called AbstractMarkupHandler, which can be used to create new handlers easily by overriding only the relevant event handling methods.
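The no-op base class and the one-instance-per-parse rule can be illustrated with a small self-contained sketch. Everything in it (SketchHandler, AbstractSketchHandler, TextLengthHandler, fireSampleEvents) is a hypothetical stand-in written for this example and is not part of AttoParser; the real event methods take different arguments, so treat this only as an outline of the pattern.

```java
final class NoOpBaseSketch {

    // Hypothetical stand-in for a handler interface (NOT the AttoParser API).
    interface SketchHandler {
        void handleDocumentStart();
        void handleOpenElement(String name);
        void handleText(String text);
        void handleCloseElement(String name);
        void handleDocumentEnd();
    }

    // No-op base class: plays the role that AbstractMarkupHandler is described as playing.
    static abstract class AbstractSketchHandler implements SketchHandler {
        @Override public void handleDocumentStart() { }
        @Override public void handleOpenElement(String name) { }
        @Override public void handleText(String text) { }
        @Override public void handleCloseElement(String name) { }
        @Override public void handleDocumentEnd() { }
    }

    // Stateful handler: overrides only the text event and accumulates a counter.
    static final class TextLengthHandler extends AbstractSketchHandler {
        private int totalChars = 0;

        @Override public void handleText(String text) {
            totalChars += text.length();
        }

        int getTotalChars() {
            return totalChars;
        }
    }

    // Stand-in "parser": fires the events for <p>Hello <b>world</b></p> in document order.
    static void fireSampleEvents(SketchHandler handler) {
        handler.handleDocumentStart();
        handler.handleOpenElement("p");
        handler.handleText("Hello ");
        handler.handleOpenElement("b");
        handler.handleText("world");
        handler.handleCloseElement("b");
        handler.handleCloseElement("p");
        handler.handleDocumentEnd();
    }

    // One fresh handler per parsing operation, so state never leaks between parses.
    static int countTextChars() {
        TextLengthHandler handler = new TextLengthHandler();
        fireSampleEvents(handler);
        return handler.getTotalChars();
    }

    public static void main(String[] args) {
        System.out.println(countTextChars());
    }
}
```

Because the counter lives in the handler instance, reusing one instance across parses (or threads) would mix results; creating a new instance per parse avoids the need for synchronization.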
Note also that there is a simplified version of this interface, called ISimpleMarkupHandler, which reduces the number of events and simplifies the operations on textual buffers. It can be easily used with the convenience ad-hoc parser class SimpleMarkupParser.
AttoParser provides several useful implementations of this interface out-of-the-box:
Markup output
OutputMarkupHandler - For writing the received events to a specified Writer object, without any loss of information (case, whitespace, etc.). This handler is useful for filtering or transforming the parsed markup: place it at the end of the handler chain so that it outputs the final result of those operations.
TextOutputMarkupHandler - For writing the received events to a specified Writer object as mere text, ignoring all non-text events. This effectively strips all markup elements, comments, DOCTYPEs, etc. from the original markup.
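The idea of placing an output handler at the end of a handler chain can be sketched as follows. The types below (EventHandler, WriterOutputHandler, CommentStrippingHandler) are hypothetical stand-ins invented for this example, not AttoParser classes, and only two event kinds are modelled; the point is the delegation pattern, under the assumption that each link forwards the events it wants to keep.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class HandlerChainSketch {

    // Hypothetical, reduced event interface (NOT the AttoParser API).
    interface EventHandler {
        void handleText(String text);
        void handleComment(String content);
    }

    // Last link in the chain: writes every event it receives to a Writer.
    static final class WriterOutputHandler implements EventHandler {
        private final Writer writer;

        WriterOutputHandler(Writer writer) {
            this.writer = writer;
        }

        @Override public void handleText(String text) {
            write(text);
        }

        @Override public void handleComment(String content) {
            write("<!--" + content + "-->");
        }

        private void write(String s) {
            try {
                writer.write(s);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Filtering link: forwards text to the next handler and drops comments.
    static final class CommentStrippingHandler implements EventHandler {
        private final EventHandler next;

        CommentStrippingHandler(EventHandler next) {
            this.next = next;
        }

        @Override public void handleText(String text) {
            next.handleText(text);
        }

        @Override public void handleComment(String content) {
            // Intentionally not forwarded, so the output handler never sees it.
        }
    }

    // Fires a few events through the chain and returns what reached the Writer.
    static String stripComments() {
        StringWriter out = new StringWriter();
        EventHandler chain = new CommentStrippingHandler(new WriterOutputHandler(out));
        chain.handleText("Hello");
        chain.handleComment(" internal note ");
        chain.handleText(", world");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripComments());
    }
}
```

The output handler stays generic: it writes whatever reaches it, and the filtering logic lives entirely in the links placed before it.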
Format conversion and transformation operations
DOMBuilderMarkupHandler - For building a DOM tree as a result of parsing a document. This DOM tree will be created using the classes in the org.attoparser.dom package. This handler can be applied more easily by using the convenience ad-hoc parser class DOMMarkupParser.
SimplifierMarkupHandler - For transforming the produced markup parsing events into a much simpler format, removing much of their complexity and allowing users to create their handlers by means of the ISimpleMarkupHandler interface. This handler can be applied more easily by using the convenience ad-hoc parser class SimpleMarkupParser.
MinimizeHtmlMarkupHandler - For minimizing (compacting) HTML markup: removing excess white space, unquoting attributes, etc.
Fragment selection and event management
BlockSelectorMarkupHandler - For applying block selection (element + subtree) on the parsed markup, based on a set of specified markup selectors (see org.attoparser.select).
NodeSelectorMarkupHandler - For applying node selection (element, no subtree) on the parsed markup, based on a set of specified markup selectors (see org.attoparser.select).
AttributeSelectionMarkingMarkupHandler - For synthetically adding an attribute (with the specified name) to markup elements, showing which of the specified selectors (block or node) match those elements.
DuplicateMarkupHandler - For duplicating parsing events, sending each of them to two different implementations of IMarkupHandler.
Testing and Debugging
PrettyHtmlMarkupHandler - For creating an HTML document that visually explains all the events that happened during the parsing of a document: elements, attributes, auto-closing of elements, unmatched artifacts, etc.
TraceBuilderMarkupHandler - For building a trace of parsing events (a list of MarkupTraceEvent objects) detailing all the events launched during the parsing of a document.
- Since:
- 2.0.0
-
Method Summary
Modifier and Type | Method | Description
void | setParseConfiguration(ParseConfiguration parseConfiguration) | Sets the ParseConfiguration object that will be used during the parsing operation.
void | setParseSelection(ParseSelection selection) | Sets the ParseSelection object that represents the different levels of selectors (if any) that are currently active for the fired events.
void | setParseStatus(ParseStatus status) | Sets the ParseStatus object that will be used during the parsing operation.
Methods inherited from interface org.attoparser.IAttributeSequenceHandler
handleAttribute, handleInnerWhiteSpace
Methods inherited from interface org.attoparser.ICDATASectionHandler
handleCDATASection
Methods inherited from interface org.attoparser.ICommentHandler
handleComment
Methods inherited from interface org.attoparser.IDocTypeHandler
handleDocType
Methods inherited from interface org.attoparser.IDocumentHandler
handleDocumentEnd, handleDocumentStart
Methods inherited from interface org.attoparser.IElementHandler
handleAutoCloseElementEnd, handleAutoCloseElementStart, handleAutoOpenElementEnd, handleAutoOpenElementStart, handleCloseElementEnd, handleCloseElementStart, handleOpenElementEnd, handleOpenElementStart, handleStandaloneElementEnd, handleStandaloneElementStart, handleUnmatchedCloseElementEnd, handleUnmatchedCloseElementStart
Methods inherited from interface org.attoparser.IProcessingInstructionHandler
handleProcessingInstruction
Methods inherited from interface org.attoparser.ITextHandler
handleText
Methods inherited from interface org.attoparser.IXMLDeclarationHandler
handleXmlDeclaration
-
Method Details
-
setParseConfiguration
Sets the ParseConfiguration object that will be used during the parsing operation. This object will normally have been specified to the parser object during its instantiation or initialization. This method is always called by the parser before calling any other event handling method.
Note that this method can be safely ignored by most implementations, as there are very few scenarios in which this kind of interaction would be considered relevant.
- Parameters:
parseConfiguration - the configuration object.
-
setParseStatus
Sets the ParseStatus object that will be used during the parsing operation. This object can be used for instructing the parser about specific low-level conditions that arise during event handling. This method is always called by the parser before calling any other event handling method.
Note that this method can be safely ignored by most implementations, as there are very few and very specific scenarios in which this kind of interaction with the parser would be needed. It is therefore mainly for internal use.
- Parameters:
status - the status object.
-
setParseSelection
Sets the ParseSelection object that represents the different levels of selectors (if any) that are currently active for the fired events. This method is always called by the parser before calling any other event handling method.
Note that this method can be safely ignored by most implementations, as there are very few scenarios in which this kind of interaction would be considered relevant.
- Parameters:
selection - the selection object.
-