Class URI
Parsing of a URI specification is done according to the URI syntax described in RFC 2396, and amended by RFC 2732.
Every absolute URI consists of a scheme, followed by a colon (':'), followed by a scheme-specific part. For URIs that follow the "generic URI" syntax, the scheme-specific part begins with two slashes ("//") and may be followed by an authority segment (consisting of user information, host, and port), a path segment, a query segment, and a fragment. Note that RFC 2396 no longer specifies the use of the parameters segment and excludes the "user:password" syntax as part of the authority segment. If "user:password" appears in a URI, the entire user/password string is stored as userinfo.
For URIs that do not follow the "generic URI" syntax (e.g. mailto), the entire scheme-specific part is treated as the "path" portion of the URI.
Note that, unlike the java.net.URL class, this class does not provide any built-in network access functionality nor does it provide any scheme-specific functionality (for example, it does not know a default port for a specific scheme). Rather, it only knows the grammar and basic set of operations that can be applied to a URI.
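As a rough illustration of the split described above, the following sketch separates a specification string into its scheme and scheme-specific part and checks whether the latter begins with "//". It is a standalone approximation, not this class's parsing code; the SchemeSplitSketch class and its method names are hypothetical.

```java
final class SchemeSplitSketch {
    /** Returns {scheme, schemeSpecificPart}, or null if the string has no scheme. */
    static String[] split(String spec) {
        int colon = spec.indexOf(':');
        if (colon <= 0) {
            return null;
        }
        // A '/', '?' or '#' before the first colon means the colon does not end a scheme.
        for (int i = 0; i < colon; i++) {
            char c = spec.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                return null;
            }
        }
        return new String[] { spec.substring(0, colon), spec.substring(colon + 1) };
    }

    /** Approximates the "generic URI" test by looking for the leading "//". */
    static boolean looksGeneric(String schemeSpecificPart) {
        return schemeSpecificPart.startsWith("//");
    }

    public static void main(String[] args) {
        for (String spec : new String[] { "http://user@example.org:8080/docs?x=1#top",
                                          "mailto:someone@example.org" }) {
            String[] parts = split(spec);
            System.out.println(parts[0] + " | " + parts[1] + " | generic=" + looksGeneric(parts[1]));
        }
    }
}
```

For a mailto specification the whole remainder after the colon would be kept as one string, which matches the statement that the scheme-specific part is treated as the path.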
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | URI.MalformedURIException | MalformedURIExceptions are thrown in the process of building a URI or setting fields on a URI when an operation would result in an invalid URI specification.
Field Summary
Fields
Modifier and Type | Field | Description
private static final int | ASCII_ALPHA_CHARACTERS | ASCII letter characters
private static final int | ASCII_DIGIT_CHARACTERS | ASCII digit characters
private static final int | ASCII_HEX_CHARACTERS | ASCII hex characters
private static final byte[] | fgLookupTable |
private String | fragment_ | If specified, stores the fragment for this URI; otherwise null
private String | host_ | If specified, stores the host for this URI; otherwise null
private static final int | MARK_CHARACTERS | URI punctuation mark characters: -_.!~*'() - these, combined with alphanumerics, constitute the "unreserved" characters
private static final int | MASK_ALPHA_NUMERIC | Mask for alpha-numeric characters
private static final int | MASK_PATH_CHARACTER | Mask for path characters
private static final int | MASK_SCHEME_CHARACTER | Mask for scheme characters
private static final int | MASK_UNRESERVED_MASK | Mask for unreserved characters
private static final int | MASK_URI_CHARACTER | Mask for URI allowable characters except for %
private static final int | MASK_USERINFO_CHARACTER | Mask for userinfo characters
private String | path_ | If specified, stores the path for this URI; otherwise null
private static final int | PATH_CHARACTERS | Path characters
private int | port_ | If specified, stores the port for this URI; otherwise -1
private String | queryString_ | If specified, stores the query string for this URI; otherwise null
private String | regAuthority_ | If specified, stores the registry based authority for this URI; otherwise null
private static final int | RESERVED_CHARACTERS | Reserved characters: ;/?:@&=+$,[]
private String | scheme_ | Stores the scheme (usually the protocol) for this URI.
private static final int | SCHEME_CHARACTERS | A scheme can be composed of alphanumerics and these characters: +-.
private String | userinfo_ | If specified, stores the userinfo for this URI; otherwise null
private static final int | USERINFO_CHARACTERS | Userinfo can be composed of unreserved, escaped and these characters: ;:&=+$,
Constructor Summary
Constructors
Constructor | Description
URI(String uriSpec) | Construct a new URI from a URI specification string.
URI(String uriSpec, boolean allowNonAbsoluteURI) | Construct a new URI from a URI specification string.
URI(String scheme, String userinfo, String host, int port, String path, String queryString, String fragment) | Construct a new URI that follows the generic URI syntax from its component parts.
URI(String scheme, String host, String path, String queryString, String fragment) | Construct a new URI that follows the generic URI syntax from its component parts.
URI(URI base, String uriSpec) | Construct a new URI from a base URI and a URI specification string.
URI(URI base, String uriSpec, boolean allowNonAbsoluteURI) | Construct a new URI from a base URI and a URI specification string.
Method Summary
Modifier and Type | Method | Description
void | absolutize(URI base) | Absolutize URI with given base URI.
boolean | equals(Object) | Determines if the passed-in Object is equivalent to this URI.
String | getFragment() | Get the fragment for this URI.
String | getHost() | Get the host for this URI.
String | getPath() | Get the path for this URI.
int | getPort() | Get the port for this URI.
String | getQueryString() | Get the query string for this URI.
String | getRegBasedAuthority() | Get the registry based authority for this URI.
String | getScheme() | Get the scheme for this URI.
String | getSchemeSpecificPart() | Get the scheme-specific part for this URI (everything following the scheme and the first colon).
String | getUserinfo() | Get the userinfo for this URI.
private void | initialize(URI other) | Initialize all fields of this URI from another URI.
private void | initialize(URI base, String uriSpec) | Initializes this URI from a base URI and a URI specification string.
private void | initialize(URI base, String uriSpec, boolean allowNonAbsoluteURI) | Initializes this URI from a base URI and a URI specification string.
private boolean | initializeAuthority(String uriSpec) | Initialize the authority (either server or registry based) for this URI from a URI string spec.
private void | initializePath(String uriSpec, int nStartIndex) | Initialize the path for this URI from a URI string spec.
private void | initializeScheme(String uriSpec) | Initialize the scheme for this URI from a URI string spec.
boolean | isAbsoluteURI() | Returns whether this URI represents an absolute URI.
private static boolean | isAlpha(char ch) | Determine whether a char is an alphabetic character: a-z or A-Z
private static boolean | isAlphanum(char ch) | Determine whether a char is an alphanumeric: 0-9, a-z or A-Z
private static boolean | isConformantSchemeName(String scheme) | Determine whether a scheme conforms to the rules for a scheme name.
private static boolean | isDigit(char chr) | Determine whether a char is a digit.
boolean | isGenericURI() | Get the indicator as to whether this URI uses the "generic URI" syntax.
private static boolean | isHex(char ch) | Determine whether a character is a hexadecimal character.
private static boolean | isPathCharacter(char ch) | Determine whether a char is a path character.
private static boolean | isSchemeCharacter(char ch) | Determine whether a char is a scheme character.
private static boolean | isURICharacter(char ch) | Determine whether a char is a URI character (reserved or unreserved, not including '%' for escaped octets).
private static boolean | isURIString(String uric) | Determine whether a given string contains only URI characters (also called "uric" in RFC 2396).
private static boolean | isUserinfoCharacter(char ch) | Determine whether a char is a userinfo character.
private boolean | isValidRegistryBasedAuthority(String authority) | Determines whether the given string is a registry based authority.
private boolean | isValidServerBasedAuthority(String host, int port, String userinfo) | Determines whether the components host, port, and user info are valid as a server authority.
static boolean | isWellFormedAddress(String address) | Determine whether a string is syntactically capable of representing a valid IPv4 address, IPv6 reference or the domain name of a network host.
static boolean | isWellFormedIPv4Address(String address) | Determines whether a string is an IPv4 address as defined by RFC 2373, and under the further constraint that it must be a 32-bit address.
static boolean | isWellFormedIPv6Reference(String address) | Determines whether a string is an IPv6 reference as defined by RFC 2732, where IPv6address is defined in RFC 2373.
private static int | scanHexSequence(String address, int index, int end, int[] counter) | Helper method for isWellFormedIPv6Reference which scans the hex sequences of an IPv6 address.
void | setFragment(String fragment) | Set the fragment for this URI.
void | setHost(String host) | Set the host for this URI.
void | setPath(String path) | Set the path for this URI.
void | setPort(int port) | Set the port for this URI.
void | setQueryString(String queryString) | Set the query string for this URI.
void | setScheme(String scheme) | Set the scheme for this URI.
void | setUserinfo(String userinfo) | Set the userinfo for this URI.
String | toString() | Get the URI as a string specification.
-
Field Details
-
fgLookupTable
private static final byte[] fgLookupTable
-
RESERVED_CHARACTERS
private static final int RESERVED_CHARACTERS
Reserved characters: ;/?:@&=+$,[]
-
MARK_CHARACTERS
private static final int MARK_CHARACTERS
URI punctuation mark characters: -_.!~*'() - these, combined with alphanumerics, constitute the "unreserved" characters
-
SCHEME_CHARACTERS
private static final int SCHEME_CHARACTERS
A scheme can be composed of alphanumerics and these characters: +-.
-
USERINFO_CHARACTERS
private static final int USERINFO_CHARACTERS
Userinfo can be composed of unreserved, escaped and these characters: ;:&=+$,
-
ASCII_ALPHA_CHARACTERS
private static final int ASCII_ALPHA_CHARACTERS
ASCII letter characters
-
ASCII_DIGIT_CHARACTERS
private static final int ASCII_DIGIT_CHARACTERS
ASCII digit characters
-
ASCII_HEX_CHARACTERS
private static final int ASCII_HEX_CHARACTERS
ASCII hex characters
-
PATH_CHARACTERS
private static final int PATH_CHARACTERS
Path characters
-
MASK_ALPHA_NUMERIC
private static final int MASK_ALPHA_NUMERIC
Mask for alpha-numeric characters
-
MASK_UNRESERVED_MASK
private static final int MASK_UNRESERVED_MASK
Mask for unreserved characters
-
MASK_URI_CHARACTER
private static final int MASK_URI_CHARACTER
Mask for URI allowable characters except for %
-
MASK_SCHEME_CHARACTER
private static final int MASK_SCHEME_CHARACTER
Mask for scheme characters
-
MASK_USERINFO_CHARACTER
private static final int MASK_USERINFO_CHARACTER
Mask for userinfo characters
-
MASK_PATH_CHARACTER
private static final int MASK_PATH_CHARACTER
Mask for path characters
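The character-class constants and masks listed here suggest a table-driven classifier, in which each ASCII character maps to a set of bit flags and a mask tests for membership in one or more classes. The sketch below illustrates that idea only. The flag values, the table contents, and the CharClassSketch name are assumptions and are not taken from this class.

```java
final class CharClassSketch {
    // Hypothetical flag bits; the values used by the documented class are not shown here.
    static final int FLAG_ALPHA = 0x01;
    static final int FLAG_DIGIT = 0x02;
    static final int FLAG_MARK = 0x04;          // -_.!~*'()
    static final int FLAG_SCHEME_EXTRA = 0x08;  // +-.

    // Masks combine flags so that one bitwise AND answers a membership question.
    static final int MASK_ALPHA_NUMERIC = FLAG_ALPHA | FLAG_DIGIT;
    static final int MASK_UNRESERVED = MASK_ALPHA_NUMERIC | FLAG_MARK;
    static final int MASK_SCHEME = MASK_ALPHA_NUMERIC | FLAG_SCHEME_EXTRA;

    // One entry per ASCII character; each entry holds the flags for that character.
    static final byte[] LOOKUP = new byte[128];

    static {
        for (char c = 'a'; c <= 'z'; c++) LOOKUP[c] |= FLAG_ALPHA;
        for (char c = 'A'; c <= 'Z'; c++) LOOKUP[c] |= FLAG_ALPHA;
        for (char c = '0'; c <= '9'; c++) LOOKUP[c] |= FLAG_DIGIT;
        for (char c : "-_.!~*'()".toCharArray()) LOOKUP[c] |= FLAG_MARK;
        for (char c : "+-.".toCharArray()) LOOKUP[c] |= FLAG_SCHEME_EXTRA;
    }

    static boolean isUnreserved(char ch) {
        // Characters outside the table are never members of any class.
        return ch < 128 && (LOOKUP[ch] & MASK_UNRESERVED) != 0;
    }

    static boolean isSchemeCharacter(char ch) {
        return ch < 128 && (LOOKUP[ch] & MASK_SCHEME) != 0;
    }
}
```

A lookup of this kind replaces a chain of range comparisons with a single array read and a bitwise AND per character.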
-
scheme_
private String scheme_
Stores the scheme (usually the protocol) for this URI.
-
userinfo_
private String userinfo_
If specified, stores the userinfo for this URI; otherwise null
-
host_
private String host_
If specified, stores the host for this URI; otherwise null
-
port_
private int port_
If specified, stores the port for this URI; otherwise -1
-
regAuthority_
private String regAuthority_
If specified, stores the registry based authority for this URI; otherwise null
-
path_
private String path_
If specified, stores the path for this URI; otherwise null
-
queryString_
private String queryString_
If specified, stores the query string for this URI; otherwise null
-
fragment_
private String fragment_
If specified, stores the fragment for this URI; otherwise null
-
-
Constructor Details
-
URI
Construct a new URI from a URI specification string. If the specification follows the "generic URI" syntax (two slashes following the first colon), it is parsed accordingly, setting the scheme, userinfo, host, port, path, query string and fragment fields as necessary. If the specification does not follow the "generic URI" syntax, it is parsed into a scheme and scheme-specific part (stored as the path) only.
Parameters:
uriSpec - the URI specification string (cannot be null or empty)
Throws:
URI.MalformedURIException - if uriSpec violates any syntax rules
-
URI
Construct a new URI from a URI specification string. If the specification follows the "generic URI" syntax (two slashes following the first colon), it is parsed accordingly, setting the scheme, userinfo, host, port, path, query string and fragment fields as necessary. If the specification does not follow the "generic URI" syntax, it is parsed into a scheme and scheme-specific part (stored as the path) only. If allowNonAbsoluteURI is true and uriSpec is not a valid absolute URI, a relative URI is constructed instead of throwing an exception.
Parameters:
uriSpec - the URI specification string (cannot be null or empty)
allowNonAbsoluteURI - true to permit non-absolute URIs, false otherwise
Throws:
URI.MalformedURIException - if uriSpec violates any syntax rules
-
URI
Construct a new URI from a base URI and a URI specification string. The URI specification string may be a relative URI.
Parameters:
base - the base URI (cannot be null if uriSpec is null or empty)
uriSpec - the URI specification string (cannot be null or empty if base is null)
Throws:
URI.MalformedURIException - if uriSpec violates any syntax rules
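Resolving a relative specification against a base follows RFC 2396 Section 5. The example below uses the standard java.net.URI class as a stand-in to show what such a resolution produces for the base URI used in the RFC's examples. It does not call the class documented here, whose behavior may differ in edge cases; ResolveSketch is a hypothetical name.

```java
final class ResolveSketch {
    /** Resolves a relative reference against a base using java.net.URI (not the class documented here). */
    static String resolve(String base, String relative) {
        return java.net.URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String base = "http://a/b/c/d;p?q";
        System.out.println(resolve(base, "g"));     // sibling of the last path segment
        System.out.println(resolve(base, "../g"));  // one directory up
    }
}
```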
-
URI
Construct a new URI from a base URI and a URI specification string. The URI specification string may be a relative URI. If allowNonAbsoluteURI is true, uriSpec is not a valid absolute URI, and base is null, a relative URI is constructed instead of throwing an exception.
Parameters:
base - the base URI (cannot be null if uriSpec is null or empty)
uriSpec - the URI specification string (cannot be null or empty if base is null)
allowNonAbsoluteURI - true to permit non-absolute URIs, false otherwise
Throws:
URI.MalformedURIException - if uriSpec violates any syntax rules
-
URI
public URI(String scheme, String host, String path, String queryString, String fragment) throws URI.MalformedURIException
Construct a new URI that follows the generic URI syntax from its component parts. Each component is validated for syntax, and some basic semantic checks are performed as well. See the individual setter methods for specifics.
Parameters:
scheme - the URI scheme (cannot be null or empty)
host - the hostname, IPv4 address or IPv6 reference for the URI
path - the URI path; if the path contains '?' or '#', the query string and/or fragment are set from the path. However, if the query and fragment are specified both in the path and as separate parameters, an exception is thrown
queryString - the URI query string (cannot be specified if path is null)
fragment - the URI fragment (cannot be specified if path is null)
Throws:
URI.MalformedURIException - if any of the parameters violates syntax rules or semantic rules
-
URI
public URI(String scheme, String userinfo, String host, int port, String path, String queryString, String fragment) throws URI.MalformedURIException
Construct a new URI that follows the generic URI syntax from its component parts. Each component is validated for syntax, and some basic semantic checks are performed as well. See the individual setter methods for specifics.
Parameters:
scheme - the URI scheme (cannot be null or empty)
userinfo - the URI userinfo (cannot be specified if host is null)
host - the hostname, IPv4 address or IPv6 reference for the URI
port - the URI port (may be -1 for "unspecified"; cannot be specified if host is null)
path - the URI path; if the path contains '?' or '#', the query string and/or fragment are set from the path. However, if the query and fragment are specified both in the path and as separate parameters, an exception is thrown
queryString - the URI query string (cannot be specified if path is null)
fragment - the URI fragment (cannot be specified if path is null)
Throws:
URI.MalformedURIException - if any of the parameters violates syntax rules or semantic rules
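To make the relationship between the component parts and the generic URI syntax concrete, the following sketch joins the seven components into a specification string, using null (or -1 for the port) to mean "not specified". It performs no validation and is only an assumption about how the parts fit together, not this class's toString() implementation; AssembleSketch is a hypothetical name.

```java
final class AssembleSketch {
    /** Joins URI components in generic-syntax order; null (or -1 for port) means "absent". */
    static String assemble(String scheme, String userinfo, String host, int port,
                           String path, String queryString, String fragment) {
        StringBuilder sb = new StringBuilder();
        sb.append(scheme).append(':');
        if (host != null) {
            // The authority is introduced by "//" and only exists when a host is present.
            sb.append("//");
            if (userinfo != null) {
                sb.append(userinfo).append('@');
            }
            sb.append(host);
            if (port != -1) {
                sb.append(':').append(port);
            }
        }
        if (path != null) {
            sb.append(path);
        }
        if (queryString != null) {
            sb.append('?').append(queryString);
        }
        if (fragment != null) {
            sb.append('#').append(fragment);
        }
        return sb.toString();
    }
}
```

The nesting mirrors the parameter constraints listed above: userinfo and port are only emitted inside the host branch, since neither may be specified when the host is null.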
-
-
Method Details
-
initialize
Initialize all fields of this URI from another URI.
Parameters:
other - the URI to copy (cannot be null)
-
initialize
private void initialize(URI base, String uriSpec, boolean allowNonAbsoluteURI) throws URI.MalformedURIException
Initializes this URI from a base URI and a URI specification string. See RFC 2396 Section 4 and Appendix B for specifications on parsing the URI and Section 5 for specifications on resolving relative URIs and relative paths.
Parameters:
base - the base URI (may be null if uriSpec is an absolute URI)
uriSpec - the URI spec string, which may be an absolute or relative URI (can only be null/empty if base is not null)
allowNonAbsoluteURI - true to permit non-absolute (relative) URIs, false otherwise
Throws:
URI.MalformedURIException - if base is null and uriSpec is not an absolute URI, or if uriSpec violates syntax rules
-
initialize
Initializes this URI from a base URI and a URI specification string. See RFC 2396 Section 4 and Appendix B for specifications on parsing the URI and Section 5 for specifications on resolving relative URIs and relative paths.
Parameters:
base - the base URI (may be null if uriSpec is an absolute URI)
uriSpec - the URI spec string, which may be an absolute or relative URI (can only be null/empty if base is not null)
Throws:
URI.MalformedURIException - if base is null and uriSpec is not an absolute URI, or if uriSpec violates syntax rules
-
absolutize
Absolutize URI with given base URI.
Parameters:
base - base URI for absolutization
-
initializeScheme
Initialize the scheme for this URI from a URI string spec.
Parameters:
uriSpec - the URI specification (cannot be null)
Throws:
URI.MalformedURIException - if URI does not have a conformant scheme
-
initializeAuthority
Initialize the authority (either server or registry based) for this URI from a URI string spec.
Parameters:
uriSpec - the URI specification (cannot be null)
Returns:
true if the given string matched server or registry based authority
-
isValidServerBasedAuthority
Determines whether the components host, port, and user info are valid as a server authority.
Parameters:
host - the host component of authority
port - the port number component of authority
userinfo - the user info component of authority
Returns:
true if the given host, port, and userinfo compose a valid server authority
-
isValidRegistryBasedAuthority
Determines whether the given string is a registry based authority.
Parameters:
authority - the authority component of a URI
Returns:
true if the given string is a registry based authority
-
initializePath
Initialize the path for this URI from a URI string spec.
Parameters:
uriSpec - the URI specification (cannot be null)
nStartIndex - the index to begin scanning from
Throws:
URI.MalformedURIException - if uriSpec violates syntax rules
-
getScheme
Get the scheme for this URI.
Returns:
the scheme for this URI
-
getSchemeSpecificPart
Get the scheme-specific part for this URI (everything following the scheme and the first colon). See RFC 2396 Section 5.2 for spec.
Returns:
the scheme-specific part for this URI
-
getUserinfo
Get the userinfo for this URI.
Returns:
the userinfo for this URI (null if not specified).
-
getHost
Get the host for this URI.
Returns:
the host for this URI (null if not specified).
-
getPort
public int getPort()
Get the port for this URI.
Returns:
the port for this URI (-1 if not specified).
-
getRegBasedAuthority
Get the registry based authority for this URI.
Returns:
the registry based authority (null if not specified).
-
getPath
Get the path for this URI. Note that the value returned is the path only and does not include the query string or fragment.
Returns:
the path for this URI.
-
getQueryString
Get the query string for this URI.
Returns:
the query string for this URI. Null is returned if there was no "?" in the URI spec, empty string if there was a "?" but no query string following it.
-
getFragment
Get the fragment for this URI.
Returns:
the fragment for this URI. Null is returned if there was no "#" in the URI spec, empty string if there was a "#" but no fragment following it.
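The distinction between a null and an empty query string or fragment can be shown with a small standalone sketch. It works on the raw specification string and is not the class's own parsing code; QueryFragmentSketch and its method names are hypothetical.

```java
final class QueryFragmentSketch {
    /** Returns the text between the first '?' and any '#', or null if there is no '?'. */
    static String queryOf(String spec) {
        int hash = spec.indexOf('#');
        // A '?' inside the fragment does not start a query, so cut the fragment off first.
        String beforeFragment = hash < 0 ? spec : spec.substring(0, hash);
        int question = beforeFragment.indexOf('?');
        return question < 0 ? null : beforeFragment.substring(question + 1);
    }

    /** Returns the text after the first '#', or null if there is no '#'. */
    static String fragmentOf(String spec) {
        int hash = spec.indexOf('#');
        return hash < 0 ? null : spec.substring(hash + 1);
    }
}
```

With this sketch, a specification ending in "?" yields an empty query string, while one containing no "?" at all yields null, matching the return values described for getQueryString and getFragment.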
-
setScheme
Set the scheme for this URI. The scheme is converted to lowercase before it is set.
Parameters:
scheme - the scheme for this URI (cannot be null)
Throws:
URI.MalformedURIException - if scheme is not a conformant scheme name
-
setUserinfo
Set the userinfo for this URI. If a non-null value is passed in and the host value is null, then an exception is thrown.
Parameters:
userinfo - the userinfo for this URI
Throws:
URI.MalformedURIException - if userinfo contains invalid characters
-
setHost
Set the host for this URI. If null is passed in, the userinfo field is also set to null and the port is set to -1.
Note: This method overwrites registry based authority if it previously existed in this URI.
Parameters:
host - the host for this URI
Throws:
URI.MalformedURIException - if host is not a valid IP address or DNS hostname.
-
setPort
Set the port for this URI. -1 is used to indicate that the port is not specified; otherwise valid port numbers are between 0 and 65535. If a valid port number is passed in and the host field is null, an exception is thrown.
Parameters:
port - the port number for this URI
Throws:
URI.MalformedURIException - if port is not -1 and not a valid port number
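The port rules above amount to three checks, sketched below as a standalone method. The sketch throws IllegalArgumentException as a stand-in for URI.MalformedURIException, and the PortCheckSketch name and the error messages are assumptions.

```java
final class PortCheckSketch {
    /**
     * Applies the documented port rules: -1 means "unspecified", other values must lie
     * in 0..65535, and a specified port requires a non-null host.
     */
    static void checkPort(String host, int port) {
        if (port == -1) {
            return; // unspecified is always acceptable
        }
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("Invalid port number: " + port);
        }
        if (host == null) {
            throw new IllegalArgumentException("Port cannot be set when host is null");
        }
    }
}
```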
-
setPath
Set the path for this URI. If the supplied path is null, then the query string and fragment are set to null as well. If the supplied path includes a query string and/or fragment, these fields will be parsed and set as well. Note that, for URIs following the "generic URI" syntax, the path specified should start with a slash. For URIs that do not follow the generic URI syntax, this method sets the scheme-specific part.
Parameters:
path - the path for this URI (may be null)
Throws:
URI.MalformedURIException - if path contains invalid characters
-
setQueryString
Set the query string for this URI. A non-null value is valid only if this is a URI conforming to the generic URI syntax and the path value is not null.
Parameters:
queryString - the query string for this URI
Throws:
URI.MalformedURIException - if queryString is not null and this URI does not conform to the generic URI syntax or if the path is null
-
setFragment
Set the fragment for this URI. A non-null value is valid only if this is a URI conforming to the generic URI syntax and the path value is not null.
Parameters:
fragment - the fragment for this URI
Throws:
URI.MalformedURIException - if fragment is not null and this URI does not conform to the generic URI syntax or if the path is null
-
equals
Determines if the passed-in Object is equivalent to this URI.
-
toString
Get the URI as a string specification. See RFC 2396 Section 5.2.
-
isGenericURI
public boolean isGenericURI()
Get the indicator as to whether this URI uses the "generic URI" syntax.
Returns:
true if this URI uses the "generic URI" syntax, false otherwise
-
isAbsoluteURI
public boolean isAbsoluteURI()
Returns whether this URI represents an absolute URI.
Returns:
true if this URI represents an absolute URI, false otherwise
-
isConformantSchemeName
Determine whether a scheme conforms to the rules for a scheme name. A scheme is conformant if it starts with an alphabetic character and contains only alphanumerics, '+', '-' and '.' (see RFC 2396 Section 3.1).
Parameters:
scheme - the scheme
Returns:
true if the scheme is conformant, false otherwise
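RFC 2396 Section 3.1 gives the grammar scheme = alpha *( alpha | digit | "+" | "-" | "." ). A minimal sketch of a check against that grammar follows. It is written from the RFC rather than from this class's source, and SchemeNameSketch is a hypothetical name.

```java
final class SchemeNameSketch {
    /** Checks a candidate scheme against the RFC 2396 Section 3.1 grammar. */
    static boolean isConformant(String scheme) {
        if (scheme == null || scheme.isEmpty()) {
            return false;
        }
        // The first character must be a letter.
        if (!isAsciiLetter(scheme.charAt(0))) {
            return false;
        }
        // The remaining characters may be letters, digits, '+', '-' or '.'.
        for (int i = 1; i < scheme.length(); i++) {
            char c = scheme.charAt(i);
            boolean allowed = isAsciiLetter(c) || (c >= '0' && c <= '9')
                    || c == '+' || c == '-' || c == '.';
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```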
-
isWellFormedAddress
Determine whether a string is syntactically capable of representing a valid IPv4 address, IPv6 reference or the domain name of a network host. A valid IPv4 address consists of four decimal digit groups separated by a '.'. Each group must consist of one to three digits. See RFC 2732 Section 3, and RFC 2373 Section 2.2, for the definition of IPv6 references. A hostname consists of domain labels (each of which must begin and end with an alphanumeric but may contain '-') separated by a '.'. See RFC 2396 Section 3.2.2.
Parameters:
address - the address
Returns:
true if the string is a syntactically valid IPv4 address, IPv6 reference or hostname
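The hostname rule as worded above (labels that begin and end with an alphanumeric, may contain '-', and are separated by '.') can be sketched as follows. This is a simplification: it ignores the additional toplabel and trailing-dot rules of RFC 2396 Section 3.2.2 and any length limits, and it is not this class's implementation. HostnameSketch is a hypothetical name.

```java
final class HostnameSketch {
    /** Returns true if every '.'-separated label satisfies the simplified label rule. */
    static boolean isHostname(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        // A limit of -1 keeps empty trailing strings, so "a." and "a..b" expose an empty label.
        for (String label : s.split("\\.", -1)) {
            if (!isLabel(label)) {
                return false;
            }
        }
        return true;
    }

    /** A label is non-empty, starts and ends with an alphanumeric, and otherwise allows '-'. */
    static boolean isLabel(String label) {
        if (label.isEmpty()) {
            return false;
        }
        if (!isAlphanum(label.charAt(0)) || !isAlphanum(label.charAt(label.length() - 1))) {
            return false;
        }
        for (int i = 0; i < label.length(); i++) {
            char c = label.charAt(i);
            if (!isAlphanum(c) && c != '-') {
                return false;
            }
        }
        return true;
    }

    static boolean isAlphanum(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }
}
```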
-
isWellFormedIPv4Address
Determines whether a string is an IPv4 address as defined by RFC 2373, and under the further constraint that it must be a 32-bit address. Though not expressed in the grammar, in order to satisfy the 32-bit address constraint, each segment of the address cannot be greater than 255 (8 bits of information).
IPv4address = 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT
Parameters:
address - the address
Returns:
true if the string is a syntactically valid IPv4 address
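The grammar and the 255 limit can be combined into a short check, sketched below. It is an independent illustration of the stated rules (four groups of one to three digits, each at most 255), not the method's actual source; IPv4Sketch is a hypothetical name.

```java
final class IPv4Sketch {
    /** Checks 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT with each group at most 255. */
    static boolean isWellFormedIPv4(String address) {
        if (address == null) {
            return false;
        }
        // A limit of -1 keeps empty trailing groups, so "1.2.3.4." is rejected by the count check.
        String[] groups = address.split("\\.", -1);
        if (groups.length != 4) {
            return false;
        }
        for (String group : groups) {
            int length = group.length();
            if (length < 1 || length > 3) {
                return false;
            }
            int value = 0;
            for (int i = 0; i < length; i++) {
                char c = group.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
                value = value * 10 + (c - '0');
            }
            // Each group carries 8 bits of information.
            if (value > 255) {
                return false;
            }
        }
        return true;
    }
}
```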
-
isWellFormedIPv6Reference
Determines whether a string is an IPv6 reference as defined by RFC 2732, where IPv6address is defined in RFC 2373. The IPv6 address is parsed according to Section 2.2 of RFC 2373, with the additional constraint that the address be composed of 128 bits of information.
IPv6reference = "[" IPv6address "]"
Note: The BNF expressed in RFC 2373 Appendix B does not accurately describe section 2.2, and was in fact removed from RFC 3513, the successor of RFC 2373.
Parameters:
address - the address
Returns:
true if the string is a syntactically valid IPv6 reference
-
scanHexSequence
Helper method for isWellFormedIPv6Reference which scans the hex sequences of an IPv6 address. It returns the index of the next character to scan in the address, or -1 if the string cannot match a valid IPv6 address.
Parameters:
address - the string to be scanned
index - the beginning index (inclusive)
end - the ending index (exclusive)
counter - a counter for the number of 16-bit sections read in the address
Returns:
the index of the next character to scan, or -1 if the string cannot match a valid IPv6 address
-
isDigit
private static boolean isDigit(char chr)
Determine whether a char is a digit.
Returns:
true if the char is between '0' and '9', false otherwise
-
isHex
private static boolean isHex(char ch)
Determine whether a character is a hexadecimal character.
Returns:
true if the char is between '0' and '9', 'a' and 'f' or 'A' and 'F', false otherwise
-
isAlpha
private static boolean isAlpha(char ch)
Determine whether a char is an alphabetic character: a-z or A-Z
Returns:
true if the char is alphabetic, false otherwise
-
isAlphanum
private static boolean isAlphanum(char ch)
Determine whether a char is an alphanumeric: 0-9, a-z or A-Z
Returns:
true if the char is alphanumeric, false otherwise
-
isURICharacter
private static boolean isURICharacter(char ch)
Determine whether a char is a URI character (reserved or unreserved, not including '%' for escaped octets).
Returns:
true if the char is a URI character, false otherwise
-
isSchemeCharacter
private static boolean isSchemeCharacter(char ch)
Determine whether a char is a scheme character.
Returns:
true if the char is a scheme character, false otherwise
-
isUserinfoCharacter
private static boolean isUserinfoCharacter(char ch)
Determine whether a char is a userinfo character.
Returns:
true if the char is a userinfo character, false otherwise
-
isPathCharacter
private static boolean isPathCharacter(char ch)
Determine whether a char is a path character.
Returns:
true if the char is a path character, false otherwise
-
isURIString
Determine whether a given string contains only URI characters (also called "uric" in RFC 2396). The uric set consists of all reserved characters, unreserved characters and escaped characters.
Returns:
true if the string consists only of uric characters, false otherwise
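A uric check along these lines can be sketched with the reserved and mark character sets listed in the field documentation, plus "%" followed by two hex digits for escaped characters. The sketch is an independent illustration, and its treatment of null and empty input is an assumption rather than documented behavior of this class; UricSketch is a hypothetical name.

```java
final class UricSketch {
    private static final String RESERVED = ";/?:@&=+$,[]";
    private static final String MARK = "-_.!~*'()";

    /** Returns true if every character is reserved, unreserved, or part of a %HH escape. */
    static boolean isUriString(String s) {
        if (s == null) {
            return false; // assumption: null is treated as not a URI string
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%') {
                // An escape needs two hex digits after the '%'.
                if (i + 2 >= s.length() || !isHex(s.charAt(i + 1)) || !isHex(s.charAt(i + 2))) {
                    return false;
                }
                i += 2;
            } else if (!isAlphanum(c) && RESERVED.indexOf(c) < 0 && MARK.indexOf(c) < 0) {
                return false;
            }
        }
        return true;
    }

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    static boolean isAlphanum(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }
}
```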
-