Class ParserRegistry


  • public class ParserRegistry
    extends Object

    Keeps track of response parsers for each content type. Each parser should be a closure that accepts an HttpResponse instance and returns whatever handler is appropriate for reading the response data for that content-type. For example, a plain-text response should probably be parsed with a Reader, while an XML response might be parsed by an XmlSlurper, whose result would then be passed to the response closure.

    Note that all methods in this class assume that HttpResponse.getEntity() returns a non-null value. It is the job of the HTTPBuilder instance to avoid a NullPointerException by never passing in a response that contains no entity.

    The content-type parsers that are built into the ParserRegistry class are listed in buildDefaultParserMap().

    Author:
    Tom Nichols
    See Also:
    ContentType
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.apache.xml.resolver.tools.CatalogResolver catalogResolver
      This CatalogResolver is static to avoid the overhead of re-parsing the catalog definition file every time.
      static String DEFAULT_CHARSET
      The default charset to use when no charset is given in the Content-Type header of a response.
      protected groovy.lang.Closure DEFAULT_PARSER
      The default parser used for unregistered content-types.
      protected static org.apache.commons.logging.Log log  
    • Constructor Summary

      Constructors 
      Constructor Description
      ParserRegistry()  
    • Method Summary

      Modifier and Type Method Description
      static void addCatalog​(URL catalogLocation)
      Add a new XML catalog definition to the static XML resolver catalog.
      protected Map<String,​groovy.lang.Closure> buildDefaultParserMap()
      Returns a map of default parsers.
      groovy.lang.Closure getAt​(Object contentType)
      Retrieve a parser for the given response content-type string.
      static org.apache.xml.resolver.tools.CatalogResolver getCatalogResolver()
      Access the default catalog used by all HTTPBuilder instances.
      static String getCharset​(org.apache.http.HttpResponse resp)
      Helper method to get the charset from the response.
      static String getContentType​(org.apache.http.HttpResponse resp)
      Helper method to get the content-type string from the response (no charset).
      groovy.lang.Closure getDefaultParser()
      Get the default parser used for unregistered content-types.
      Iterator<Map.Entry<String,​groovy.lang.Closure>> iterator()
      Iterate over the entire parser map.
      Map<String,​String> parseForm​(org.apache.http.HttpResponse resp)
      Default parser used to decode a URL-encoded response.
      groovy.util.slurpersupport.GPathResult parseHTML​(org.apache.http.HttpResponse resp)
      Parse an HTML document by passing it through the NekoHTML parser.
      Object parseJSON​(org.apache.http.HttpResponse resp)
      Default parser used to decode a JSON response.
      InputStream parseStream​(org.apache.http.HttpResponse resp)
      Default parser used for binary data.
      Reader parseText​(org.apache.http.HttpResponse resp)
      Default parser used to handle plain text data.
      groovy.util.slurpersupport.GPathResult parseXML​(org.apache.http.HttpResponse resp)
      Default parser used to decode an XML response.
      groovy.lang.Closure propertyMissing​(Object key)
      Alias for getAt(Object) to allow property-style access.
      void propertyMissing​(Object key, groovy.lang.Closure value)
      Alias for putAt(Object, Closure) to allow property-style access.
      void putAt​(Object contentType, groovy.lang.Closure value)
      Register a new parser for the given content-type.
      static void setDefaultCharset​(String charset)
      Set the charset to use for parsing character streams when no charset is given in the Content-Type header.
      void setDefaultParser​(groovy.lang.Closure defaultParser)
      Set the default parser used for unregistered content-types.
    • Field Detail

      • DEFAULT_PARSER

        protected final groovy.lang.Closure DEFAULT_PARSER
        The default parser used for unregistered content-types. This is a copy of parseStream(HttpResponse), which is effectively a no-op that returns the unaltered response stream.
      • log

        protected static final org.apache.commons.logging.Log log
      • catalogResolver

        protected static org.apache.xml.resolver.tools.CatalogResolver catalogResolver
        This CatalogResolver is static to avoid the overhead of re-parsing the catalog definition file every time. Unfortunately, there's no way to share a single Catalog instance between resolvers. The Catalog class is technically not thread-safe, but as long as you do not parse catalog files while using the resolver, it should be fine.
    • Constructor Detail

      • ParserRegistry

        public ParserRegistry()
    • Method Detail

      • setDefaultCharset

        public static void setDefaultCharset​(String charset)
        Set the charset to use for parsing character streams when no charset is given in the Content-Type header.
        Parameters:
        charset - the charset to use, or null to use DEFAULT_CHARSET
      • getCharset

        public static String getCharset​(org.apache.http.HttpResponse resp)
        Helper method to get the charset from the response. This should be done when manually parsing any text response to ensure it is decoded using the correct charset. For instance:
         Reader reader = new InputStreamReader( resp.getEntity().getContent(),
           ParserRegistry.getCharset( resp ) );
        Parameters:
        resp - the HTTP response whose charset is read
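
The fallback rule described for getCharset can be sketched without any HTTP library. The class below is a hypothetical stand-in, not HTTPBuilder code: it works on a raw Content-Type header string instead of an HttpResponse, and FALLBACK_CHARSET is an assumed placeholder for DEFAULT_CHARSET, whose actual value is not stated on this page.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of HTTPBuilder.
final class CharsetFallbackSketch {

    // Assumed stand-in for ParserRegistry.DEFAULT_CHARSET.
    static final String FALLBACK_CHARSET = "UTF-8";

    /** Returns the charset named in a Content-Type header value, or the fallback. */
    static String charsetOf(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return FALLBACK_CHARSET;
        }
        // Parameters follow the media type, separated by semicolons.
        String[] parts = contentTypeHeader.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding quotes, as in charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        return FALLBACK_CHARSET;
    }

    /** Decodes a response body using the charset resolved from the header. */
    static String decode(byte[] body, String contentTypeHeader) {
        return new String(body, Charset.forName(charsetOf(contentTypeHeader)));
    }
}
```

Resolving the charset before constructing any Reader or String is the point of the helper: decoding with the wrong charset silently corrupts non-ASCII text rather than failing.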
      • getContentType

        public static String getContentType​(org.apache.http.HttpResponse resp)
        Helper method to get the content-type string from the response (no charset).
        Parameters:
        resp - the HTTP response whose content-type is read
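
As a rough illustration of what "content-type string (no charset)" means, the sketch below trims everything after the first semicolon of a header value. The class and method names are hypothetical, and it operates on a plain header string rather than an HttpResponse; how ParserRegistry extracts the value internally is not shown on this page.

```java
// Hypothetical helper for illustration only; not part of HTTPBuilder.
final class MediaTypeSketch {

    /** Returns the media type portion of a Content-Type header value, without parameters. */
    static String mediaTypeOf(String contentTypeHeader) {
        int semicolon = contentTypeHeader.indexOf(';');
        // Everything before the first semicolon is the media type itself.
        String type = semicolon < 0
                ? contentTypeHeader
                : contentTypeHeader.substring(0, semicolon);
        return type.trim();
    }
}
```

A value obtained this way is suitable as a lookup key for a parser, since registered content-types such as "application/json" carry no charset parameter.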
      • parseStream

        public InputStream parseStream​(org.apache.http.HttpResponse resp)
                                throws IOException
        Default parser used for binary data. This simply returns the underlying response InputStream.
        Parameters:
        resp - the HTTP response to read
        Returns:
        the binary response stream, as an InputStream
        Throws:
        IllegalStateException
        IOException
        See Also:
        ContentType.BINARY, HttpEntity.getContent()
      • parseHTML

        public groovy.util.slurpersupport.GPathResult parseHTML​(org.apache.http.HttpResponse resp)
                                                         throws IOException,
                                                                SAXException
        Parse an HTML document by passing it through the NekoHTML parser.
        Parameters:
        resp - HTTP response from which to parse content
        Returns:
        the GPathResult from calling XmlSlurper.parse(Reader)
        Throws:
        IOException
        SAXException
        See Also:
        ContentType.HTML, SAXParser, XmlSlurper.parse(Reader)
      • addCatalog

        public static void addCatalog​(URL catalogLocation)
                               throws IOException
        Add a new XML catalog definition to the static XML resolver catalog. See the HTTPBuilder source catalog for an example.
        Parameters:
        catalogLocation - URL of a catalog definition file
        Throws:
        IOException - if the given URL cannot be parsed or accessed for whatever reason.
      • getCatalogResolver

        public static org.apache.xml.resolver.tools.CatalogResolver getCatalogResolver()
        Access the default catalog used by all HTTPBuilder instances.
        Returns:
        the static CatalogResolver instance
      • getDefaultParser

        public groovy.lang.Closure getDefaultParser()
        Get the default parser used for unregistered content-types.
        Returns:
        the default parser closure
      • setDefaultParser

        public void setDefaultParser​(groovy.lang.Closure defaultParser)
        Set the default parser used for unregistered content-types.
        Parameters:
        defaultParser - if
      • getAt

        public groovy.lang.Closure getAt​(Object contentType)
        Retrieve a parser for the given response content-type string. This is called by HTTPBuilder to retrieve the correct parser for a given content-type. The parser is then used to decode the response data before it is passed to a response handler.
        Parameters:
        contentType - content-type string
        Returns:
        parser that can interpret the given response content type, or the default parser if no parser is registered for the given content-type.
      • putAt

        public void putAt​(Object contentType,
                          groovy.lang.Closure value)
        Register a new parser for the given content-type. The parser closure should accept an HttpResponse argument and return a type suitable to be passed as the 'parsed data' argument of a response handler closure.
        Parameters:
        contentType - content-type string
        value - code that will parse the HttpResponse and return parsed data to the response handler.
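
The lookup-with-fallback behavior of getAt and the registration done by putAt can be modeled in a few lines of plain Java. This is a simplified sketch under stated assumptions, not the HTTPBuilder implementation: parsers are modeled as Function<String, Object> over an already-read body rather than as Groovy closures over an HttpResponse, and the class and method names are hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified stand-in for illustration only; not the HTTPBuilder implementation.
final class RegistrySketch implements Iterable<Map.Entry<String, Function<String, Object>>> {

    private final Map<String, Function<String, Object>> parsers = new LinkedHashMap<>();

    // Pass-through default, loosely analogous to a parser that returns the raw data.
    private Function<String, Object> defaultParser = body -> body;

    /** Registers a parser for a content-type (the role putAt plays). */
    void put(String contentType, Function<String, Object> parser) {
        parsers.put(contentType, parser);
    }

    /** Looks up a parser, falling back to the default (the role getAt plays). */
    Function<String, Object> get(String contentType) {
        return parsers.getOrDefault(contentType, defaultParser);
    }

    /** Replaces the parser used for unregistered content-types. */
    void setDefaultParser(Function<String, Object> parser) {
        this.defaultParser = parser;
    }

    /** Iterates over the registered content-type/parser pairs. */
    @Override
    public Iterator<Map.Entry<String, Function<String, Object>>> iterator() {
        return parsers.entrySet().iterator();
    }
}
```

In this model a lookup never returns null: an unregistered content-type yields the default parser, which matches the contract stated for getAt.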
      • propertyMissing

        public groovy.lang.Closure propertyMissing​(Object key)
        Alias for getAt(Object) to allow property-style access.
        Parameters:
        key - content-type string
        Returns:
        the parser for the given content-type, as returned by getAt(Object)
      • propertyMissing

        public void propertyMissing​(Object key,
                                    groovy.lang.Closure value)
        Alias for putAt(Object, Closure) to allow property-style access.
        Parameters:
        key - content-type string
        value - parser closure
      • iterator

        public Iterator<Map.Entry<String,​groovy.lang.Closure>> iterator()
        Iterate over the entire parser map.
        Returns:
        an iterator over the content-type/parser entries