Class ParsedURLData


  • public class ParsedURLData
    extends java.lang.Object
    Holds the data for most URLs.
    • Constructor Summary

      Constructors 
      Constructor Description
      ParsedURLData()
      Void constructor
      ParsedURLData​(java.net.URL url)
      Build from an existing URL.
    • Method Summary

      Modifier and Type Method Description
      protected java.net.URL buildURL()
      Attempts to build a normal java.net.URL instance from this URL.
      static java.io.InputStream checkGZIP​(java.io.InputStream is)
      Utility function that checks whether is is a GZIP stream. If so, it returns a GZIPInputStream that decodes the contents; otherwise it returns is (or a buffered version of is) untouched.
      boolean complete()
      Returns true if the URL looks well formed and complete.
      boolean equals​(java.lang.Object obj)
      Implement Object.equals for ParsedURLData.
      protected void extractContentTypeParts​(java.lang.String userAgent)
      Extracts the type/subtype and charset parameter from the Content-Type header.
      java.lang.String getContentEncoding​(java.lang.String userAgent)
      Returns the content encoding if available.
      java.lang.String getContentType​(java.lang.String userAgent)
      Returns the content type if available.
      java.lang.String getContentTypeCharset​(java.lang.String userAgent)
      Returns the content type's charset parameter, if available.
      java.lang.String getContentTypeMediaType​(java.lang.String userAgent)
      Returns the content type's type/subtype, if available.
      java.lang.String getPortStr()
      Returns the URL up to and including the port number on the host.
      java.lang.String getPostConnectionURL()
      Returns the URL that was ultimately used to fetch the resource represented by the ParsedURL.
      boolean hasContentTypeParameter​(java.lang.String userAgent, java.lang.String param)
      Returns whether the Content-Type header has the given parameter.
      int hashCode()
      Implement Object.hashCode.
      java.io.InputStream openStream​(java.lang.String userAgent, java.util.Iterator mimeTypes)
      Open the stream and check for common compression types.
      protected java.io.InputStream openStreamInternal​(java.lang.String userAgent, java.util.Iterator mimeTypes, java.util.Iterator encodingTypes)  
      java.io.InputStream openStreamRaw​(java.lang.String userAgent, java.util.Iterator mimeTypes)
      Opens the stream and returns it.
      protected boolean sameFile​(ParsedURLData other)  
      java.lang.String toString()
      Return a string representation of the data.
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
    • Field Detail

      • HTTP_USER_AGENT_HEADER

        protected static final java.lang.String HTTP_USER_AGENT_HEADER
        See Also:
        Constant Field Values
      • HTTP_ACCEPT_HEADER

        protected static final java.lang.String HTTP_ACCEPT_HEADER
        See Also:
        Constant Field Values
      • HTTP_ACCEPT_LANGUAGE_HEADER

        protected static final java.lang.String HTTP_ACCEPT_LANGUAGE_HEADER
        See Also:
        Constant Field Values
      • HTTP_ACCEPT_ENCODING_HEADER

        protected static final java.lang.String HTTP_ACCEPT_ENCODING_HEADER
        See Also:
        Constant Field Values
      • acceptedEncodings

        protected static java.util.List acceptedEncodings
      • GZIP_MAGIC

        public static final byte[] GZIP_MAGIC
        GZIP header magic number bytes, as found in gzipped files, which are encoded in Intel format (i.e. little endian).
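To illustrate what "Intel format" means here, the following minimal sketch derives the two magic bytes from the JDK constant `java.util.zip.GZIPInputStream.GZIP_MAGIC` (the 16-bit value 0x8b1f) by writing the low byte first. The class and method names are hypothetical, and nothing here is taken from this class's implementation; it only shows the byte order the description refers to.

```java
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of ParsedURLData.
final class GzipMagicSketch {
    // Splits the 16-bit GZIP magic number into bytes, low byte first
    // (little endian), which is the order they appear in a gzipped file.
    static byte[] magicBytes() {
        int magic = GZIPInputStream.GZIP_MAGIC; // 0x8b1f
        return new byte[] {
            (byte) (magic & 0xff),        // 0x1f
            (byte) ((magic >> 8) & 0xff)  // 0x8b
        };
    }

    public static void main(String[] args) {
        byte[] m = magicBytes();
        System.out.printf("%02x %02x%n", m[0] & 0xff, m[1] & 0xff);
    }
}
```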
      • protocol

        public java.lang.String protocol
        Since the Data instance is 'hidden' in the ParsedURL instance, we make all our fields public. This makes it easy for the various Protocol Handlers to update an instance as parsing proceeds.
      • host

        public java.lang.String host
      • port

        public int port
      • path

        public java.lang.String path
      • ref

        public java.lang.String ref
      • contentType

        public java.lang.String contentType
      • contentEncoding

        public java.lang.String contentEncoding
      • stream

        public java.io.InputStream stream
      • hasBeenOpened

        public boolean hasBeenOpened
      • contentTypeMediaType

        protected java.lang.String contentTypeMediaType
        The extracted type/subtype from the Content-Type header.
      • contentTypeCharset

        protected java.lang.String contentTypeCharset
        The extracted charset parameter from the Content-Type header.
      • postConnectionURL

        protected java.net.URL postConnectionURL
        The URL that was ultimately used to fetch the resource.
    • Constructor Detail

      • ParsedURLData

        public ParsedURLData()
        Void constructor
      • ParsedURLData

        public ParsedURLData​(java.net.URL url)
        Build from an existing URL.
    • Method Detail

      • checkGZIP

        public static java.io.InputStream checkGZIP​(java.io.InputStream is)
                                             throws java.io.IOException
        Utility function that checks whether is is a GZIP stream. If so, it returns a GZIPInputStream that decodes the contents; otherwise it returns is (or a buffered version of is) untouched.
        Parameters:
        is - Stream that may potentially be a GZIP stream.
        Throws:
        java.io.IOException
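The behavior described for checkGZIP can be approximated with standard JDK streams. The sketch below is an assumption about how such a check might be written, not this class's actual code: it wraps the input in a BufferedInputStream when mark/reset is unsupported, peeks at the first two bytes, and compares them with the GZIP magic number. The names `GzipSniffSketch`, `maybeGunzip`, `gzip`, and `readAll` are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of ParsedURLData.
final class GzipSniffSketch {
    // GZIP magic number as it appears at the start of a gzipped stream.
    private static final byte[] MAGIC = {(byte) 0x1f, (byte) 0x8b};

    // Returns a decoding stream if the input starts with the GZIP magic
    // bytes, otherwise the input itself (possibly buffered), unread.
    static InputStream maybeGunzip(InputStream is) throws IOException {
        if (!is.markSupported()) {
            is = new BufferedInputStream(is); // needed so we can peek
        }
        is.mark(MAGIC.length);
        byte[] head = new byte[MAGIC.length];
        int n = 0;
        while (n < head.length) {
            int r = is.read(head, n, head.length - n);
            if (r < 0) break; // stream shorter than the magic number
            n += r;
        }
        is.reset(); // push the peeked bytes back
        boolean gzip = n == MAGIC.length
                && head[0] == MAGIC[0] && head[1] == MAGIC[1];
        return gzip ? new GZIPInputStream(is) : is;
    }

    // Compresses data in memory; used only to build a demo input.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        }
        return bos.toByteArray();
    }

    // Drains a stream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int r;
        while ((r = in.read(buf)) != -1) {
            bos.write(buf, 0, r);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        InputStream a = maybeGunzip(new ByteArrayInputStream(gzip(text)));
        InputStream b = maybeGunzip(new ByteArrayInputStream(text));
        // Both lines print "hello": one was decompressed, one passed through.
        System.out.println(new String(readAll(a), StandardCharsets.UTF_8));
        System.out.println(new String(readAll(b), StandardCharsets.UTF_8));
    }
}
```

Peeking with mark/reset keeps the caller's view of the stream intact, which is why a buffered wrapper may be returned in place of the original stream.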
      • buildURL

        protected java.net.URL buildURL()
                                 throws java.net.MalformedURLException
        Attempts to build a normal java.net.URL instance from this URL.
        Throws:
        java.net.MalformedURLException
      • hashCode

        public int hashCode()
        Implement Object.hashCode.
        Overrides:
        hashCode in class java.lang.Object
      • equals

        public boolean equals​(java.lang.Object obj)
        Implement Object.equals for ParsedURLData.
        Overrides:
        equals in class java.lang.Object
      • getContentType

        public java.lang.String getContentType​(java.lang.String userAgent)
        Returns the content type if available. This is only available for some protocols.
      • getContentTypeMediaType

        public java.lang.String getContentTypeMediaType​(java.lang.String userAgent)
        Returns the content type's type/subtype, if available. This is only available for some protocols.
      • getContentTypeCharset

        public java.lang.String getContentTypeCharset​(java.lang.String userAgent)
        Returns the content type's charset parameter, if available. This is only available for some protocols.
      • hasContentTypeParameter

        public boolean hasContentTypeParameter​(java.lang.String userAgent,
                                               java.lang.String param)
        Returns whether the Content-Type header has the given parameter.
      • extractContentTypeParts

        protected void extractContentTypeParts​(java.lang.String userAgent)
        Extracts the type/subtype and charset parameter from the Content-Type header.
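As an illustration of what splitting a Content-Type header into its type/subtype and parameters involves, here is a minimal, simplified sketch. It is not this class's implementation: `ContentTypeSketch` and its members are hypothetical names, and the parser naively splits on `;`, so it would mishandle a quoted parameter value that itself contains a semicolon.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of ParsedURLData.
final class ContentTypeSketch {
    final String mediaType;                       // e.g. "text/html"
    final Map<String, String> params = new LinkedHashMap<>();

    ContentTypeSketch(String header) {
        // Simplification: assumes no ';' inside quoted parameter values.
        String[] parts = header.split(";");
        mediaType = parts[0].trim().toLowerCase(Locale.ROOT);
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) continue;                 // skip malformed parameter
            String name = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = parts[i].substring(eq + 1).trim();
            // Strip surrounding double quotes, if any.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            params.put(name, value);
        }
    }

    // Returns the charset parameter, or null if the header has none.
    String charset() {
        return params.get("charset");
    }

    // Parameter names are compared case-insensitively.
    boolean hasParameter(String name) {
        return params.containsKey(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        ContentTypeSketch ct = new ContentTypeSketch("image/svg+xml; charset=UTF-8");
        System.out.println(ct.mediaType + " / " + ct.charset());
    }
}
```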
      • getContentEncoding

        public java.lang.String getContentEncoding​(java.lang.String userAgent)
        Returns the content encoding if available. This is only available for some protocols.
      • complete

        public boolean complete()
        Returns true if the URL looks well formed and complete. This does not guarantee that the stream can be opened but is a good indication that things aren't totally messed up.
      • openStream

        public java.io.InputStream openStream​(java.lang.String userAgent,
                                              java.util.Iterator mimeTypes)
                                       throws java.io.IOException
        Open the stream and check for common compression types. If the stream is found to be compressed with a standard compression type it is automatically decompressed.
        Parameters:
        userAgent - The user agent opening the stream (may be null).
        mimeTypes - The expected MIME types of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities). The elements of the iterator must be strings (may be null).
        Throws:
        java.io.IOException
      • openStreamRaw

        public java.io.InputStream openStreamRaw​(java.lang.String userAgent,
                                                 java.util.Iterator mimeTypes)
                                          throws java.io.IOException
        Opens the stream and returns it. No checks are made to see if the stream is compressed or encoded in any way.
        Parameters:
        userAgent - The user agent opening the stream (may be null).
        mimeTypes - The expected MIME types of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities). The elements of the iterator must be strings (may be null).
        Throws:
        java.io.IOException
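The parameter description says the MIME types may be mapped to the HTTP Accept header. One plausible way to do that mapping is sketched below, assuming a comma-separated header value and a standard `java.net.URLConnection`. The class `AcceptHeaderSketch` and its methods are hypothetical, and how this class actually builds its request headers is not stated here.

```java
import java.net.URLConnection;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical helper, not part of ParsedURLData.
final class AcceptHeaderSketch {
    // Joins the MIME type strings with commas; returns null when the
    // iterator is null or empty, meaning "send no Accept header".
    static String join(Iterator<?> mimeTypes) {
        if (mimeTypes == null) return null;
        StringBuilder sb = new StringBuilder();
        while (mimeTypes.hasNext()) {
            if (sb.length() > 0) sb.append(',');
            sb.append((String) mimeTypes.next()); // elements must be strings
        }
        return sb.length() == 0 ? null : sb.toString();
    }

    // Sets User-Agent and Accept on a connection that is not yet connected.
    static void apply(URLConnection conn, String userAgent, Iterator<?> mimeTypes) {
        if (userAgent != null) {
            conn.setRequestProperty("User-Agent", userAgent);
        }
        String accept = join(mimeTypes);
        if (accept != null) {
            conn.setRequestProperty("Accept", accept);
        }
    }

    public static void main(String[] args) {
        Iterator<String> types = Arrays.asList("image/svg+xml", "image/png").iterator();
        System.out.println(join(types));
    }
}
```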
      • openStreamInternal

        protected java.io.InputStream openStreamInternal​(java.lang.String userAgent,
                                                         java.util.Iterator mimeTypes,
                                                         java.util.Iterator encodingTypes)
                                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • getPortStr

        public java.lang.String getPortStr()
        Returns the URL up to and including the port number on the host. Does not include the path or fragment pieces.
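A minimal sketch of what such a prefix might look like when assembled from the protocol, host, and port fields. This is an illustration under stated assumptions rather than the method's actual code: `PortStrSketch` is a hypothetical name, a null protocol or host is treated as absent, and a port of -1 is assumed to mean "no port" (the convention `java.net.URL.getPort()` uses).

```java
// Hypothetical helper, not part of ParsedURLData.
final class PortStrSketch {
    // Builds "protocol://host:port", omitting any part that is absent.
    // Assumption: port == -1 means no explicit port.
    static String portStr(String protocol, String host, int port) {
        StringBuilder sb = new StringBuilder();
        if (protocol != null) {
            sb.append(protocol).append(':');
        }
        if (host != null || port != -1) {
            sb.append("//");
            if (host != null) sb.append(host);
            if (port != -1) sb.append(':').append(port);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(portStr("http", "example.org", 8080));
    }
}
```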
      • toString

        public java.lang.String toString()
        Return a string representation of the data.
        Overrides:
        toString in class java.lang.Object
      • getPostConnectionURL

        public java.lang.String getPostConnectionURL()
        Returns the URL that was ultimately used to fetch the resource represented by the ParsedURL.