Package org.w3c.tidy

Class ParserImpl


  • public final class ParserImpl
    extends java.lang.Object
    HTML Parser implementation.
    Author:
    Dave Raggett (dsr@w3.org), Andy Quick (ac.quick@sympatico.ca, translation to Java), Fabrizio Giustina
    • Field Detail

      • HTML

        public static final Parser HTML
        parser for html.
      • HEAD

        public static final Parser HEAD
        parser for head.
      • TITLE

        public static final Parser TITLE
        parser for title.
      • SCRIPT

        public static final Parser SCRIPT
        parser for script.
      • BODY

        public static final Parser BODY
        parser for body.
      • FRAMESET

        public static final Parser FRAMESET
        parser for frameset.
      • INLINE

        public static final Parser INLINE
        parser for inline elements.
      • LIST

        public static final Parser LIST
        parser for list.
      • DEFLIST

        public static final Parser DEFLIST
        parser for definition lists.
      • PRE

        public static final Parser PRE
        parser for pre.
      • BLOCK

        public static final Parser BLOCK
        parser for block elements.
      • TABLETAG

        public static final Parser TABLETAG
        parser for table.
      • COLGROUP

        public static final Parser COLGROUP
        parser for colgroup.
      • ROWGROUP

        public static final Parser ROWGROUP
        parser for rowgroup.
      • ROW

        public static final Parser ROW
        parser for row.
      • NOFRAMES

        public static final Parser NOFRAMES
        parser for noframes.
      • SELECT

        public static final Parser SELECT
        parser for select.
      • TEXT

        public static final Parser TEXT
        parser for text.
      • EMPTY

        public static final Parser EMPTY
        parser for empty elements.
      • OPTGROUP

        public static final Parser OPTGROUP
        parser for optgroup.
    • Method Detail

      • parseTag

        protected static void parseTag(Lexer lexer,
                                       Node node,
                                       short mode,
                                       int nestingLevel)
                                throws ExcessiveNesting
        Parse tag.
        Parameters:
        lexer - the Lexer to use
        node - the node to use
        mode - the mode to use
        nestingLevel - The current nesting level of the document. Excessively deep nesting is treated as an error.
        Throws:
        ExcessiveNesting - When excessive nesting is detected.
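        The nestingLevel parameter suggests a depth counter that is passed down through recursive parsing and checked against a limit. The following is a minimal sketch of that pattern, not the library's implementation: SketchNode, TooDeepException, MAX_NESTING, and the limit value are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a nesting-limit exception; not JTidy's ExcessiveNesting. */
class TooDeepException extends Exception {
    TooDeepException(int level) {
        super("nesting level " + level + " exceeds the limit");
    }
}

/** Minimal tree node used only for this illustration. */
class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name) {
        this.name = name;
    }

    void add(SketchNode child) {
        children.add(child);
    }
}

final class NestingGuardSketch {
    /** Assumed limit, chosen for the example only. */
    static final int MAX_NESTING = 3;

    /** Visits node and all descendants; returns the number of nodes visited. */
    static int visit(SketchNode node, int nestingLevel) throws TooDeepException {
        // Reject the document as soon as the depth counter passes the limit.
        if (nestingLevel > MAX_NESTING) {
            throw new TooDeepException(nestingLevel);
        }
        int visited = 1;
        for (SketchNode child : node.children) {
            // Each recursive step hands the child a level one deeper.
            visited += visit(child, nestingLevel + 1);
        }
        return visited;
    }

    /** Builds a single chain of the given number of nodes, each nested in the previous one. */
    static SketchNode chain(int length) {
        SketchNode root = new SketchNode("n0");
        SketchNode current = root;
        for (int i = 1; i < length; i++) {
            SketchNode next = new SketchNode("n" + i);
            current.add(next);
            current = next;
        }
        return root;
    }

    public static void main(String[] args) {
        try {
            System.out.println("visited " + visit(chain(4), 0));
            visit(chain(5), 0); // one level too deep for the assumed limit
        } catch (TooDeepException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

        Passing the level as an argument, rather than keeping it in shared state, keeps each recursive call self-describing and makes the limit check a single comparison at the top of the method.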
      • moveToHead

        protected static void moveToHead(Lexer lexer,
                                         Node element,
                                         Node node,
                                         int nestingLevel)
                                  throws ExcessiveNesting
        Moves node into the head element, using element as the starting point of the search for the head. Normally called during parsing.
        Parameters:
        lexer - the Lexer to use
        element - the element to use
        node - the node to use
        nestingLevel - The current nesting level of the document. Excessively deep nesting is treated as an error.
        Throws:
        ExcessiveNesting - When excessive nesting is detected.
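        One way to picture the "hunt for head" is to climb from the starting element to the root of the tree, look for a head node below it, and re-parent the node there. The sketch below only illustrates that idea under stated assumptions: HeadHuntNode, findHead, and moveToHead are hypothetical, and the actual search strategy used by the library is not described here.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal tree node for illustration; unrelated to JTidy's Node class. */
class HeadHuntNode {
    final String name;
    HeadHuntNode parent;
    final List<HeadHuntNode> children = new ArrayList<>();

    HeadHuntNode(String name) {
        this.name = name;
    }

    /** Appends child to this node, detaching it from any previous parent first. */
    void append(HeadHuntNode child) {
        if (child.parent != null) {
            child.parent.children.remove(child);
        }
        child.parent = this;
        children.add(child);
    }
}

final class MoveToHeadSketch {
    /** Depth-first search for the first node with the given name. */
    static HeadHuntNode findByName(HeadHuntNode from, String name) {
        if (from.name.equals(name)) {
            return from;
        }
        for (HeadHuntNode child : from.children) {
            HeadHuntNode hit = findByName(child, name);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Climbs from start to the root, then searches downward for "head". */
    static HeadHuntNode findHead(HeadHuntNode start) {
        HeadHuntNode root = start;
        while (root.parent != null) {
            root = root.parent;
        }
        return findByName(root, "head");
    }

    /** Moves node under the head found from start; returns false if no head exists. */
    static boolean moveToHead(HeadHuntNode start, HeadHuntNode node) {
        HeadHuntNode head = findHead(start);
        if (head == null) {
            return false;
        }
        head.append(node);
        return true;
    }

    public static void main(String[] args) {
        HeadHuntNode html = new HeadHuntNode("html");
        HeadHuntNode head = new HeadHuntNode("head");
        HeadHuntNode body = new HeadHuntNode("body");
        HeadHuntNode div = new HeadHuntNode("div");
        HeadHuntNode title = new HeadHuntNode("title");
        html.append(head);
        html.append(body);
        body.append(div);
        div.append(title); // a misplaced title inside the body

        moveToHead(div, title);
        System.out.println("title parent: " + title.parent.name);
    }
}
```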
      • parseDocument

        public static Node parseDocument(Lexer lexer)
        Parses an HTML document. HTML is the top-level element.
        Parameters:
        lexer - the Lexer to use
        Returns:
        the document node
      • XMLPreserveWhiteSpace

        public static boolean XMLPreserveWhiteSpace(Node element,
                                                    TagTable tt)
        Indicates whether whitespace should be preserved for this element. If the element has an xml:space attribute, returns true when the attribute value is preserve and false for any other value. If there is no xml:space attribute, returns true for the element names pre, script, style, and xsl:text. Finally, if a TagTable was passed in and the element appears as the "pre" element in the TagTable, returns true. Otherwise, returns false.
        Parameters:
        element - The Node to test to see if whitespace should be preserved.
        tt - The TagTable used for the getNodePre() test. This may be null, in which case the test is bypassed.
        Returns:
        true or false, as explained above.
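        The decision order described above can be sketched as a small standalone function. This is an illustration only, not the library's code: PreserveWhitespaceSketch and shouldPreserve are hypothetical names, the TagTable lookup is reduced to a boolean argument, and exact (case-sensitive) string comparison is an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class PreserveWhitespaceSketch {
    /** Element names that preserve whitespace when no xml:space attribute is present. */
    private static final List<String> PRESERVING_NAMES =
            Arrays.asList("pre", "script", "style", "xsl:text");

    /**
     * @param elementName         name of the element being tested
     * @param xmlSpaceValue       value of the xml:space attribute, or null if absent
     * @param tagTableTreatsAsPre stand-in for the TagTable check; pass false to
     *                            model a null TagTable (the check is bypassed)
     */
    static boolean shouldPreserve(String elementName,
                                  String xmlSpaceValue,
                                  boolean tagTableTreatsAsPre) {
        // 1. An explicit xml:space attribute decides the outcome on its own.
        if (xmlSpaceValue != null) {
            return "preserve".equals(xmlSpaceValue);
        }
        // 2. Without the attribute, a fixed set of element names preserves whitespace.
        if (PRESERVING_NAMES.contains(elementName)) {
            return true;
        }
        // 3. Otherwise fall back to the tag table, if one was supplied.
        return tagTableTreatsAsPre;
    }

    public static void main(String[] args) {
        System.out.println(shouldPreserve("p", "preserve", false));
        System.out.println(shouldPreserve("pre", "default", true));
        System.out.println(shouldPreserve("style", null, false));
    }
}
```

        Note that in this sketch an xml:space value other than preserve yields false even for pre, because the attribute check comes first, as the description states.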
      • parseXMLElement

        public static void parseXMLElement(Lexer lexer,
                                           Node element,
                                           short mode)
        Parse XML element.
        Parameters:
        lexer - the Lexer to use
        element - the element to parse
        mode - the mode to use
      • parseXMLDocument

        public static Node parseXMLDocument(Lexer lexer)
        Parse XML document.
        Parameters:
        lexer - the Lexer to use
        Returns:
        the document node