Package groovy.util

Class XmlParser

  • All Implemented Interfaces:
    org.xml.sax.ContentHandler

    public class XmlParser
    extends java.lang.Object
    implements org.xml.sax.ContentHandler
    A helper class that parses XML into a tree of Node instances for simple XML processing. The parser does not preserve the full XML Infoset; if you need that, use W3C DOM, dom4j, JDOM, XOM or a similar library. Comments and processing instructions are ignored. Each element becomes a Node carrying its attributes and its children, which are child Nodes and Strings (text content). This model is sufficient for most simple XML processing tasks.

    Example usage:

     def xml = '<root><one a1="uno!"/><two>Some text!</two></root>'
     def rootNode = new XmlParser().parseText(xml)
     assert rootNode.name() == 'root'
     assert rootNode.one[0].@a1 == 'uno!'
     assert rootNode.two.text() == 'Some text!'
     rootNode.children().each { assert it.name() in ['one','two'] }
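
    The model described above (elements become Nodes, text becomes Strings, comments and processing instructions are dropped) can be checked with a small sketch. The sample documents and variable names are illustrative only, and the sketch assumes a Groovy version in which groovy.util.XmlParser is available.

```groovy
// XmlParser and Node live in groovy.util, which Groovy imports by default.
def parser = new XmlParser()

// Mixed content: text and elements are interleaved in children().
def para = parser.parseText('<p>Hello <b>world</b>!</p>')
def kinds = para.children().collect { it instanceof Node ? 'Node' : 'String' }
assert kinds == ['String', 'Node', 'String']
assert para.b[0].text() == 'world'

// Comments and processing instructions do not appear in the tree.
def doc = parser.parseText('<root><!-- a comment --><?target data?><item/></root>')
assert doc.children().size() == 1
assert doc.children()[0].name() == 'item'
```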
     
    Author:
    James Strachan, Paul King
    • Constructor Summary

      Constructors 
      Constructor Description
      XmlParser()  
      XmlParser​(boolean validating, boolean namespaceAware)  
      XmlParser​(javax.xml.parsers.SAXParser parser)  
      XmlParser​(org.xml.sax.XMLReader reader)  
    • Method Summary

      Modifier and Type Method Description
      protected void addTextToNode()  
      void characters​(char[] buffer, int start, int length)  
      protected Node createNode​(Node parent, java.lang.Object name, java.util.Map attributes)
      Creates a new node with the given parent, name, and attributes.
      void endDocument()  
      void endElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)  
      void endPrefixMapping​(java.lang.String prefix)  
      org.xml.sax.Locator getDocumentLocator()  
      org.xml.sax.DTDHandler getDTDHandler()  
      protected java.lang.Object getElementName​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)
      Return a name given the namespaceURI, localName and qName.
      org.xml.sax.EntityResolver getEntityResolver()  
      org.xml.sax.ErrorHandler getErrorHandler()  
      boolean getFeature​(java.lang.String uri)  
      java.lang.Object getProperty​(java.lang.String uri)  
      protected org.xml.sax.XMLReader getXMLReader()  
      void ignorableWhitespace​(char[] buffer, int start, int len)  
      boolean isNamespaceAware()
      Determine if namespace handling is enabled.
      boolean isTrimWhitespace()
      Returns the current trim whitespace setting.
      Node parse​(java.io.File file)
      Parses the content of the given file as XML turning it into a tree of Nodes.
      Node parse​(java.io.InputStream input)
      Parse the content of the specified input stream into a tree of Nodes.
      Node parse​(java.io.Reader in)
      Parse the content of the specified reader into a tree of Nodes.
      Node parse​(java.lang.String uri)
      Parse the content of the specified URI into a tree of Nodes.
      Node parse​(org.xml.sax.InputSource input)
      Parse the content of the specified input source into a tree of Nodes.
      Node parseText​(java.lang.String text)
      A helper method to parse the given text as XML.
      void processingInstruction​(java.lang.String target, java.lang.String data)  
      void setDocumentLocator​(org.xml.sax.Locator locator)  
      void setDTDHandler​(org.xml.sax.DTDHandler dtdHandler)  
      void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)  
      void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)  
      void setFeature​(java.lang.String uri, boolean value)  
      void setNamespaceAware​(boolean namespaceAware)
      Enables or disables namespace handling.
      void setProperty​(java.lang.String uri, java.lang.Object value)  
      void setTrimWhitespace​(boolean trimWhitespace)
      Sets the trim whitespace setting value.
      void skippedEntity​(java.lang.String name)  
      void startDocument()  
      void startElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes list)  
      void startPrefixMapping​(java.lang.String prefix, java.lang.String namespaceURI)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • XmlParser

        public XmlParser()
                  throws javax.xml.parsers.ParserConfigurationException,
                         org.xml.sax.SAXException
        Throws:
        javax.xml.parsers.ParserConfigurationException
        org.xml.sax.SAXException
      • XmlParser

        public XmlParser​(boolean validating,
                         boolean namespaceAware)
                  throws javax.xml.parsers.ParserConfigurationException,
                         org.xml.sax.SAXException
        Throws:
        javax.xml.parsers.ParserConfigurationException
        org.xml.sax.SAXException
      • XmlParser

        public XmlParser​(org.xml.sax.XMLReader reader)
      • XmlParser

        public XmlParser​(javax.xml.parsers.SAXParser parser)
                  throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
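
        The SAXParser and XMLReader constructors let the caller configure the underlying parser before handing it to XmlParser. A minimal sketch, assuming the JAXP implementation bundled with the JDK; the factory settings shown are one possible configuration, not a recommendation made by this class.

```groovy
import javax.xml.XMLConstants
import javax.xml.parsers.SAXParserFactory

// Configure a JAXP factory first, then wrap the resulting SAXParser.
def factory = SAXParserFactory.newInstance()
factory.namespaceAware = true
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true)

def parser = new XmlParser(factory.newSAXParser())
def root = parser.parseText('<root><item id="1"/></root>')

assert root.name() == 'root'
assert root.item[0].@id == '1'
```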
    • Method Detail

      • isTrimWhitespace

        public boolean isTrimWhitespace()
        Returns the current trim whitespace setting.
        Returns:
        true if whitespace will be trimmed
      • setTrimWhitespace

        public void setTrimWhitespace​(boolean trimWhitespace)
        Sets the trim whitespace setting value.
        Parameters:
        trimWhitespace - the desired setting value
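
        The effect of the setting can be seen by parsing the same text with it switched on and off. This sketch sets the value explicitly in both cases so that it does not depend on the default, which may differ between Groovy versions; the sample XML is illustrative.

```groovy
def xml = '<root><item>  padded  </item></root>'

// With trimming enabled, leading and trailing whitespace is removed from text.
def trimming = new XmlParser()
trimming.setTrimWhitespace(true)
def trimmed = trimming.parseText(xml).item[0].text()

// With trimming disabled, the text is kept as written.
def preserving = new XmlParser()
preserving.setTrimWhitespace(false)
def preserved = preserving.parseText(xml).item[0].text()

assert trimming.isTrimWhitespace()
assert trimmed == 'padded'
assert preserved == '  padded  '
```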
      • parse

        public Node parse​(java.io.File file)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parses the content of the given file as XML turning it into a tree of Nodes.
        Parameters:
        file - the File containing the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parse

        public Node parse​(org.xml.sax.InputSource input)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified input source into a tree of Nodes.
        Parameters:
        input - the InputSource for the XML to parse
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parse

        public Node parse​(java.io.InputStream input)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified input stream into a tree of Nodes.

        Note that this method does not give the parser a base URI from which to resolve relative references such as DTDs.

        Parameters:
        input - an InputStream containing the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parse

        public Node parse​(java.io.Reader in)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified reader into a tree of Nodes.

        Note that this method does not give the parser a base URI from which to resolve relative references such as DTDs.

        Parameters:
        in - a Reader to read the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parse

        public Node parse​(java.lang.String uri)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified URI into a tree of Nodes.
        Parameters:
        uri - a String containing a URI pointing to the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
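
        The parse overloads differ only in where the XML comes from. The following sketch feeds the same document through a Reader, an InputStream, a File and a URI string; the temporary file name is a hypothetical choice made for the example.

```groovy
def xml = '<root><item>1</item></root>'
def parser = new XmlParser()

def fromReader = parser.parse(new StringReader(xml))
def fromStream = parser.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

// Write the same XML to a temporary file for the File and URI variants.
def file = File.createTempFile('xmlparser-sample', '.xml')
file.deleteOnExit()
file.setText(xml, 'UTF-8')
def fromFile = parser.parse(file)
def fromUri = parser.parse(file.toURI().toString())

def values = [fromReader, fromStream, fromFile, fromUri].collect { it.item[0].text() }
assert values == ['1', '1', '1', '1']
```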
      • parseText

        public Node parseText​(java.lang.String text)
                       throws java.io.IOException,
                              org.xml.sax.SAXException
        A helper method to parse the given text as XML.
        Parameters:
        text - the XML text to parse
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
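
        Input that is not well formed surfaces as a SAXException from the parse methods. A minimal sketch of handling it, using a deliberately broken sample document; depending on the underlying parser, a diagnostic may also be printed to standard error.

```groovy
import org.xml.sax.SAXException

def parser = new XmlParser()
def failure = null

try {
    // <item> is never closed, so the document is not well formed.
    parser.parseText('<root><item></root>')
} catch (SAXException e) {
    failure = e
}

assert failure != null
println "Parse failed: ${failure.message}"
```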
      • isNamespaceAware

        public boolean isNamespaceAware()
        Determine if namespace handling is enabled.
        Returns:
        true if namespace handling is enabled
      • setNamespaceAware

        public void setNamespaceAware​(boolean namespaceAware)
        Enables or disables namespace handling.
        Parameters:
        namespaceAware - the new desired value
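
        Namespace handling changes the kind of name each Node carries. The sketch below compares a namespace-aware parser with a non-aware one, both created through the two-argument constructor; the namespace URI and element names are made up for the example, and the behavior shown assumes the JDK's default SAX implementation.

```groovy
def xml = '<ns:root xmlns:ns="http://example.com/ns"><ns:child/></ns:root>'

// Namespace-aware: element names are qualified names with a URI and local part.
def awareParser = new XmlParser(false, true)
def aware = awareParser.parseText(xml)
assert awareParser.isNamespaceAware()
assert aware.name().localPart == 'root'
assert aware.name().namespaceURI == 'http://example.com/ns'

// Not namespace-aware: element names are plain Strings including the prefix.
def plain = new XmlParser(false, false).parseText(xml)
assert plain.name() == 'ns:root'
```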
      • getDTDHandler

        public org.xml.sax.DTDHandler getDTDHandler()
      • getEntityResolver

        public org.xml.sax.EntityResolver getEntityResolver()
      • getErrorHandler

        public org.xml.sax.ErrorHandler getErrorHandler()
      • getFeature

        public boolean getFeature​(java.lang.String uri)
                           throws org.xml.sax.SAXNotRecognizedException,
                                  org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • getProperty

        public java.lang.Object getProperty​(java.lang.String uri)
                                     throws org.xml.sax.SAXNotRecognizedException,
                                            org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • setDTDHandler

        public void setDTDHandler​(org.xml.sax.DTDHandler dtdHandler)
      • setEntityResolver

        public void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)
      • setErrorHandler

        public void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)
      • setFeature

        public void setFeature​(java.lang.String uri,
                               boolean value)
                        throws org.xml.sax.SAXNotRecognizedException,
                               org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • setProperty

        public void setProperty​(java.lang.String uri,
                                java.lang.Object value)
                         throws org.xml.sax.SAXNotRecognizedException,
                                org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
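
        The feature and property accessors address the underlying XMLReader by URI. A small sketch using a standard SAX2 feature identifier; the second URI is intentionally bogus to show the failure path, and the exact exception subtype may vary by parser implementation.

```groovy
import org.xml.sax.SAXException

def parser = new XmlParser()

// A standard SAX2 feature: report xmlns declarations as attributes.
def feature = 'http://xml.org/sax/features/namespace-prefixes'
parser.setFeature(feature, true)
def enabled = parser.getFeature(feature)
assert enabled

// An unknown feature URI is rejected with a SAX exception.
def rejected = false
try {
    parser.getFeature('http://example.com/not-a-real-feature')
} catch (SAXException e) {
    rejected = true
}
assert rejected
```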
      • startDocument

        public void startDocument()
                           throws org.xml.sax.SAXException
        Specified by:
        startDocument in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • endDocument

        public void endDocument()
                         throws org.xml.sax.SAXException
        Specified by:
        endDocument in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • startElement

        public void startElement​(java.lang.String namespaceURI,
                                 java.lang.String localName,
                                 java.lang.String qName,
                                 org.xml.sax.Attributes list)
                          throws org.xml.sax.SAXException
        Specified by:
        startElement in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • endElement

        public void endElement​(java.lang.String namespaceURI,
                               java.lang.String localName,
                               java.lang.String qName)
                        throws org.xml.sax.SAXException
        Specified by:
        endElement in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • characters

        public void characters​(char[] buffer,
                               int start,
                               int length)
                        throws org.xml.sax.SAXException
        Specified by:
        characters in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • startPrefixMapping

        public void startPrefixMapping​(java.lang.String prefix,
                                       java.lang.String namespaceURI)
                                throws org.xml.sax.SAXException
        Specified by:
        startPrefixMapping in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • endPrefixMapping

        public void endPrefixMapping​(java.lang.String prefix)
                              throws org.xml.sax.SAXException
        Specified by:
        endPrefixMapping in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • ignorableWhitespace

        public void ignorableWhitespace​(char[] buffer,
                                        int start,
                                        int len)
                                 throws org.xml.sax.SAXException
        Specified by:
        ignorableWhitespace in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • processingInstruction

        public void processingInstruction​(java.lang.String target,
                                          java.lang.String data)
                                   throws org.xml.sax.SAXException
        Specified by:
        processingInstruction in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • getDocumentLocator

        public org.xml.sax.Locator getDocumentLocator()
      • setDocumentLocator

        public void setDocumentLocator​(org.xml.sax.Locator locator)
        Specified by:
        setDocumentLocator in interface org.xml.sax.ContentHandler
      • skippedEntity

        public void skippedEntity​(java.lang.String name)
                           throws org.xml.sax.SAXException
        Specified by:
        skippedEntity in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • getXMLReader

        protected org.xml.sax.XMLReader getXMLReader()
      • addTextToNode

        protected void addTextToNode()
      • createNode

        protected Node createNode​(Node parent,
                                  java.lang.Object name,
                                  java.util.Map attributes)
        Creates a new node with the given parent, name, and attributes. The default implementation returns an instance of groovy.util.Node.
        Parameters:
        parent - the parent node, or null if the node being created is the root node
        name - an Object representing the name of the node (typically an instance of QName)
        attributes - a Map of attribute names to attribute values
        Returns:
        a new Node instance representing the current node
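
        Because createNode is protected, a subclass can intercept node creation. The following sketch counts the nodes built during a parse; CountingParser and its created field are hypothetical names introduced for illustration.

```groovy
// Hypothetical subclass that counts every node the parser creates.
class CountingParser extends XmlParser {
    int created = 0

    @Override
    protected Node createNode(Node parent, Object name, Map attributes) {
        created++
        // Delegate to the default implementation, which builds a groovy.util.Node.
        return super.createNode(parent, name, attributes)
    }
}

def parser = new CountingParser()
def root = parser.parseText('<root><a/><b><c/></b></root>')

assert root.name() == 'root'
assert parser.created == 4
```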
      • getElementName

        protected java.lang.Object getElementName​(java.lang.String namespaceURI,
                                                  java.lang.String localName,
                                                  java.lang.String qName)
        Return a name given the namespaceURI, localName and qName.
        Parameters:
        namespaceURI - the namespace URI
        localName - the local name
        qName - the qualified name
        Returns:
        the newly created representation of the name
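
        Overriding getElementName changes how both element and attribute names are represented in the tree. As a sketch, the hypothetical LocalNameParser below discards namespace information and keeps only local names, which can simplify navigation when namespaces are not significant to the caller.

```groovy
// Hypothetical subclass that names nodes by local name only.
class LocalNameParser extends XmlParser {
    @Override
    protected Object getElementName(String namespaceURI, String localName, String qName) {
        // Fall back to the qualified name if the parser supplied no local name.
        return localName ?: qName
    }
}

def xml = '<ns:root xmlns:ns="http://example.com/ns"><ns:child id="1"/></ns:root>'
def root = new LocalNameParser().parseText(xml)

assert root.name() == 'root'
assert root.child[0].@id == '1'
```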