Class DomSerializer


  • public class DomSerializer
    extends java.lang.Object

    DOM serializer: creates a W3C XML DOM Document from an HTML Cleaner node tree.

    • Field Detail

      • CSS_COMMENT_START

        private static final java.lang.String CSS_COMMENT_START
        See Also:
        Constant Field Values
      • props

        protected CleanerProperties props
        The HTML Cleaner properties set by the user to control the HTML cleaning.
      • escapeXml

        protected boolean escapeXml
        Whether XML entities should be escaped or not.
      • deserializeCdataEntities

        protected boolean deserializeCdataEntities
      • strictErrorChecking

        protected boolean strictErrorChecking
      • xmlVersion

        protected java.lang.String xmlVersion
    • Constructor Detail

      • DomSerializer

        public DomSerializer​(CleanerProperties props,
                             boolean escapeXml,
                             boolean deserializeCdataEntities,
                             boolean strictErrorChecking)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
        escapeXml - if true then escape XML entities
        deserializeCdataEntities - if true then deserialize entities in CData sections
        strictErrorChecking - if false then Document strict error checking is turned off
      • DomSerializer

        public DomSerializer​(CleanerProperties props,
                             boolean escapeXml,
                             boolean deserializeCdataEntities)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
        escapeXml - if true then escape XML entities
        deserializeCdataEntities - if true then deserialize entities in CData sections
      • DomSerializer

        public DomSerializer​(CleanerProperties props,
                             boolean escapeXml)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
        escapeXml - if true then escape XML entities
      • DomSerializer

        public DomSerializer​(CleanerProperties props)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
    • Method Detail

      • getXmlVersion

        public java.lang.String getXmlVersion()
      • setXmlVersion

        public void setXmlVersion​(java.lang.String xmlVersion)
                           throws java.lang.Exception
        Throws:
        java.lang.Exception
      • createDocument

        protected org.w3c.dom.Document createDocument​(TagNode rootNode)
                                               throws javax.xml.parsers.ParserConfigurationException
        Throws:
        javax.xml.parsers.ParserConfigurationException
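The strictErrorChecking and xmlVersion settings have direct counterparts on the W3C Document interface. The following is a minimal sketch using only the JDK, not the library's actual createDocument implementation; the class DocumentSetupSketch and its newDocument helper are hypothetical names introduced for illustration.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class DocumentSetupSketch {

    /**
     * Hypothetical helper: builds a document with a single root element and
     * applies the two document-level settings discussed above.
     */
    static Document newDocument(String rootName, String xmlVersion, boolean strictErrorChecking) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.newDocument();
            // Standard DOM Level 3 switch; false relaxes the implementation's checks.
            document.setStrictErrorChecking(strictErrorChecking);
            if (xmlVersion != null) {
                // Throws DOMException if the implementation does not support the version.
                document.setXmlVersion(xmlVersion);
            }
            Element root = document.createElement(rootName);
            document.appendChild(root);
            return document;
        } catch (ParserConfigurationException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document document = newDocument("html", "1.1", false);
        System.out.println(document.getDocumentElement().getNodeName());
        System.out.println(document.getXmlVersion());
        System.out.println(document.getStrictErrorChecking());
    }
}
```

Unlike this sketch, the real createDocument declares ParserConfigurationException, so callers of createDOM have to handle it.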
      • createDOM

        public org.w3c.dom.Document createDOM​(TagNode rootNode)
                                       throws javax.xml.parsers.ParserConfigurationException
        Parameters:
        rootNode - the HTML Cleaner root node to serialize
        Returns:
        the W3C Document object
        Throws:
        javax.xml.parsers.ParserConfigurationException - if there's an error during serialization
      • isScriptOrStyle

        protected boolean isScriptOrStyle​(org.w3c.dom.Element element)
        Parameters:
        element - the element to check
        Returns:
        true if the passed element is a script or style element
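A check of this kind can be written against the W3C Element interface alone. The sketch below assumes a case-insensitive comparison of the node name, which the documentation above does not state; ScriptOrStyleSketch and newElement are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class ScriptOrStyleSketch {

    /** Assumed behaviour: compare the node name case-insensitively. */
    static boolean isScriptOrStyle(Element element) {
        String name = element.getNodeName();
        return "script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name);
    }

    /** Hypothetical helper that creates a detached element for the demo. */
    static Element newElement(String name) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            return document.createElement(name);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isScriptOrStyle(newElement("script")));
        System.out.println(isScriptOrStyle(newElement("div")));
    }
}
```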
      • dontEscape

        protected boolean dontEscape​(org.w3c.dom.Element element)
        Determines whether an element's content should be wrapped in <![CDATA[ ]]> rather than escaped, as is typical for script and style elements.
        Parameters:
        element - the element to check
        Returns:
        true if <![CDATA[ ]]> should be used.
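To illustrate the effect of such a decision on the resulting DOM, here is a JDK-only sketch that emits a CDATA section for selected tags and a plain text node otherwise. The CDATA_TAGS set is a hypothetical stand-in for whatever the user's CleanerProperties configure, and CdataWrapSketch, appendContent and newDocument are names made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CdataWrapSketch {

    // Assumption: tags whose content is kept verbatim inside a CDATA section.
    static final Set<String> CDATA_TAGS = new HashSet<>(Arrays.asList("script", "style"));

    static boolean dontEscape(Element element) {
        return CDATA_TAGS.contains(element.getNodeName().toLowerCase(Locale.ROOT));
    }

    /** Appends a child element whose content is either a CDATA section or a text node. */
    static Element appendContent(Document document, String tagName, String content) {
        Element element = document.createElement(tagName);
        if (dontEscape(element)) {
            element.appendChild(document.createCDATASection(content));
        } else {
            element.appendChild(document.createTextNode(content));
        }
        document.getDocumentElement().appendChild(element);
        return element;
    }

    /** Hypothetical helper: an empty document with an html root element. */
    static Document newDocument() {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            document.appendChild(document.createElement("html"));
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document document = newDocument();
        Element script = appendContent(document, "script", "if (a < b) { run(); }");
        Element paragraph = appendContent(document, "p", "a < b");
        System.out.println(script.getFirstChild().getNodeName());
        System.out.println(paragraph.getFirstChild().getNodeName());
    }
}
```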
      • outputCData

        protected java.lang.String outputCData​(CData cdata)
      • deserializeCdataEntities

        protected java.lang.String deserializeCdataEntities​(java.lang.String input)
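The documentation gives no detail on how entities in CDATA content are deserialized. One plausible reading, offered purely as an assumption, is that the five predefined XML entities are turned back into literal characters. The sketch below shows that reading; CdataEntitySketch and decode are hypothetical names, and the library's actual entity set and ordering may differ.

```java
final class CdataEntitySketch {

    /**
     * Assumed behaviour: replace the five predefined XML entities with their
     * literal characters. "&amp;" is handled last so that input such as
     * "&amp;lt;" is decoded once (to "&lt;") rather than twice.
     */
    static String decode(String input) {
        return input
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(decode("if (a &lt; b &amp;&amp; c &gt; d) { run(); }"));
    }
}
```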
      • createSubnodes

        protected void createSubnodes​(org.w3c.dom.Document document,
                                      org.w3c.dom.Element element,
                                      java.util.List<? extends BaseToken> tagChildren)
        Serializes the children of an HTML Cleaner node into the given W3C element.
        Parameters:
        document - the W3C Document to use for creating new DOM elements
        element - the W3C element to which the subnodes are added
        tagChildren - the HTML Cleaner nodes to serialize for that node
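The parameter list suggests a recursive walk: each child token becomes a DOM node appended to element, and tag children recurse with the new element as the parent. The sketch below illustrates that pattern with hypothetical token classes (Token, TextToken, TagToken) standing in for HTML Cleaner's BaseToken hierarchy; it is not the library's implementation and leaves out comments, CDATA handling and attributes.

```java
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class SubnodeSketch {

    /** Hypothetical stand-in for a parsed token; not the library's BaseToken. */
    interface Token {}

    static final class TextToken implements Token {
        final String text;
        TextToken(String text) { this.text = text; }
    }

    static final class TagToken implements Token {
        final String name;
        final List<Token> children;
        TagToken(String name, Token... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Appends a DOM node for each child token, recursing into nested tags. */
    static void createSubnodes(Document document, Element element, List<? extends Token> tagChildren) {
        for (Token child : tagChildren) {
            if (child instanceof TextToken) {
                element.appendChild(document.createTextNode(((TextToken) child).text));
            } else if (child instanceof TagToken) {
                TagToken tag = (TagToken) child;
                Element subElement = document.createElement(tag.name);
                element.appendChild(subElement);
                createSubnodes(document, subElement, tag.children);
            }
        }
    }

    /** Hypothetical entry point: creates the root element, then fills in its subtree. */
    static Document toDocument(TagToken root) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element rootElement = document.createElement(root.name);
            document.appendChild(rootElement);
            createSubnodes(document, rootElement, root.children);
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TagToken tree = new TagToken("html",
                new TagToken("body",
                        new TagToken("p", new TextToken("Hello")),
                        new TagToken("p", new TextToken("World"))));
        Document document = toDocument(tree);
        System.out.println(document.getElementsByTagName("p").getLength());
    }
}
```

Passing the Document separately from the parent element mirrors the signature above: the Document acts as the node factory, while element is the insertion point.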