Class DomSerializer

java.lang.Object
org.htmlcleaner.DomSerializer

public class DomSerializer extends Object

DOM serializer: creates a W3C XML DOM Document from an HTML Cleaner node tree.

  • Field Details

    • CSS_COMMENT_START

      private static final String CSS_COMMENT_START
    • CSS_COMMENT_END

      private static final String CSS_COMMENT_END
    • NEW_LINE

      private static final String NEW_LINE
    • XML_10

      private static final String XML_10
    • XML_11

      private static final String XML_11
    • props

      protected CleanerProperties props
      The HTML Cleaner properties set by the user to control the HTML cleaning.
    • escapeXml

      protected boolean escapeXml
      Whether XML entities should be escaped or not.
    • deserializeCdataEntities

      protected boolean deserializeCdataEntities
    • strictErrorChecking

      protected boolean strictErrorChecking
    • xmlVersion

      protected String xmlVersion
  • Constructor Details

    • DomSerializer

      public DomSerializer(CleanerProperties props, boolean escapeXml, boolean deserializeCdataEntities, boolean strictErrorChecking)
      Parameters:
      props - the HTML Cleaner properties set by the user to control the HTML cleaning.
      escapeXml - if true then escape XML entities
      deserializeCdataEntities - if true then deserialize entities in CData sections
      strictErrorChecking - if false then Document strict error checking is turned off
    • DomSerializer

      public DomSerializer(CleanerProperties props, boolean escapeXml, boolean deserializeCdataEntities)
      Parameters:
      props - the HTML Cleaner properties set by the user to control the HTML cleaning.
      escapeXml - if true then escape XML entities
      deserializeCdataEntities - if true then deserialize entities in CData sections
    • DomSerializer

      public DomSerializer(CleanerProperties props, boolean escapeXml)
      Parameters:
      props - the HTML Cleaner properties set by the user to control the HTML cleaning.
      escapeXml - if true then escape XML entities
    • DomSerializer

      public DomSerializer(CleanerProperties props)
      Parameters:
      props - the HTML Cleaner properties set by the user to control the HTML cleaning.
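      The strictErrorChecking flag presumably maps to the standard org.w3c.dom.Document.setStrictErrorChecking call. The sketch below uses only the JDK's DOM API (not DomSerializer itself) to show what that switch changes; the class and method names are hypothetical. With checking turned off, the DOM specification leaves the behaviour up to the implementation, so treat the lax result as implementation-dependent.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

public class StrictCheckingSketch {

    /** Creates an empty W3C Document using the JDK's default JAXP implementation. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if an element with the given name can be created under the given checking mode. */
    static boolean canCreateElement(String name, boolean strictErrorChecking) {
        Document doc = newDocument();
        doc.setStrictErrorChecking(strictErrorChecking);
        try {
            doc.createElement(name);
            return true;
        } catch (DOMException e) {
            // With strict checking on, an illegal XML name is rejected here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canCreateElement("div", true));
        System.out.println(canCreateElement("1bad", true));
        System.out.println(canCreateElement("1bad", false));
    }
}
```

      Messy real-world HTML can contain tag or attribute names that are not legal XML names, which is one plausible reason to relax the check.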
  • Method Details

    • getXmlVersion

      public String getXmlVersion()
    • setXmlVersion

      public void setXmlVersion(String xmlVersion) throws Exception
      Throws:
      Exception
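      The XML_10 and XML_11 constants suggest that only the two standard XML versions are accepted, although this page does not say so. As an illustration, the sketch below shows how the JDK's own org.w3c.dom.Document treats version strings; it does not call DomSerializer, and the class and method names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

public class XmlVersionSketch {

    /** Creates an empty W3C Document using the JDK's default JAXP implementation. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Sets the XML version on a fresh document and reads it back. */
    static String applyVersion(String version) {
        Document doc = newDocument();
        doc.setXmlVersion(version); // throws DOMException for an unsupported version
        return doc.getXmlVersion();
    }

    /** Returns true if the DOM implementation accepts the given version string. */
    static boolean isSupported(String version) {
        try {
            applyVersion(version);
            return true;
        } catch (DOMException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(applyVersion("1.1"));
        System.out.println(isSupported("2.0"));
    }
}
```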
    • createDocument

      protected Document createDocument(TagNode rootNode) throws ParserConfigurationException
      Throws:
      ParserConfigurationException
    • createDOM

      public Document createDOM(TagNode rootNode) throws ParserConfigurationException
      Parameters:
      rootNode - the HTML Cleaner root node to serialize
      Returns:
      the W3C Document object
      Throws:
      ParserConfigurationException - if there's an error during serialization
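      Because createDOM returns a standard W3C Document, the result can be passed to any JAXP API. The sketch below assumes such a document is already available; to stay self-contained it parses a small XHTML string with the JDK instead of calling createDOM, and then queries it with XPath. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomQuerySketch {

    /** Stand-in for a Document produced by createDOM: parses well-formed markup with the JDK. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates an XPath expression against the document and returns its string value. */
    static String evaluate(Document doc, String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><head><title>Hi</title></head><body/></html>");
        System.out.println(evaluate(doc, "/html/head/title"));
    }
}
```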
    • isScriptOrStyle

      protected boolean isScriptOrStyle(Element element)
      Parameters:
      element - the element to check
      Returns:
      true if the passed element is a script or style element
    • dontEscape

      protected boolean dontEscape(Element element)
      Encapsulate content with <![CDATA[ ]]> for things like script and style elements.
      Parameters:
      element - the element to check
      Returns:
      true if <![CDATA[ ]]> should be used.
    • outputCData

      protected String outputCData(CData cdata)
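      Script and style content often contains characters such as < and & that would otherwise need escaping, which is the usual reason for placing it in a CDATA section. The sketch below shows the standard DOM call for this using only the JDK; it illustrates the idea and is not DomSerializer's actual code. The class and method names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CDataSketch {

    /** Creates an empty W3C Document using the JDK's default JAXP implementation. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a script element whose content is stored unescaped in a CDATA section. */
    static Element scriptWithCData(Document doc, String code) {
        Element script = doc.createElement("script");
        script.appendChild(doc.createCDATASection(code));
        return script;
    }

    /** Returns true if the element's first child is a CDATA section. */
    static boolean holdsCData(Element element) {
        return element.getFirstChild() instanceof CDATASection;
    }

    public static void main(String[] args) {
        Element script = scriptWithCData(newDocument(), "if (a < b && c) { run(); }");
        System.out.println(holdsCData(script));
        System.out.println(script.getTextContent());
    }
}
```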
    • deserializeCdataEntities

      protected String deserializeCdataEntities(String input)
    • createSubnodes

      protected void createSubnodes(Document document, Element element, List<? extends BaseToken> tagChildren)
      Serialize a given HTML Cleaner node.
      Parameters:
      document - the W3C Document to use for creating new DOM elements
      element - the W3C element to which the subnodes are added
      tagChildren - the HTML Cleaner nodes to serialize for that node
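      The document/element/children signature suggests a recursive walk that mirrors each child node into the DOM. The sketch below shows that general pattern with the JDK only. SimpleNode is a hypothetical stand-in for HTML Cleaner's token types, and the real method presumably also handles comments, CDATA, attributes and escaping, all omitted here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SubnodeSketch {

    /** Hypothetical stand-in for a source node: an element when name is non-null, otherwise text. */
    static final class SimpleNode {
        final String name;
        final String text;
        final List<SimpleNode> children;

        private SimpleNode(String name, String text, List<SimpleNode> children) {
            this.name = name;
            this.text = text;
            this.children = children;
        }

        static SimpleNode element(String name, SimpleNode... children) {
            return new SimpleNode(name, null, Arrays.asList(children));
        }

        static SimpleNode text(String text) {
            return new SimpleNode(null, text, Collections.<SimpleNode>emptyList());
        }
    }

    /** Recursively mirrors the given children under the target DOM element. */
    static void createSubnodes(Document document, Element element, List<SimpleNode> children) {
        for (SimpleNode child : children) {
            if (child.name == null) {
                element.appendChild(document.createTextNode(child.text));
            } else {
                Element sub = document.createElement(child.name);
                element.appendChild(sub);
                createSubnodes(document, sub, child.children);
            }
        }
    }

    /** Builds a new Document whose root element mirrors the given root node. */
    static Document toDocument(SimpleNode root) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element rootElement = doc.createElement(root.name);
            doc.appendChild(rootElement);
            createSubnodes(doc, rootElement, root.children);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SimpleNode tree = SimpleNode.element("html",
                SimpleNode.element("body",
                        SimpleNode.element("p", SimpleNode.text("hi"))));
        System.out.println(toDocument(tree).getDocumentElement().getTextContent());
    }
}
```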