Class CharEscapers
Utility functions for dealing with CharEscapers, and some commonly used CharEscaper instances.
- Since:
- 1.0
-
Method Summary
Modifier and Type, Method, Description
- static String decodeUri(String uri): Percent-decodes a US-ASCII string into a Unicode string.
- static String escapeUri(String value): Escapes the string value so it can be safely included in URIs.
- static String escapeUriPath(String value): Escapes the string value so it can be safely included in URI path segments.
- static String escapeUriPathWithoutReserved(String value): Escapes a URI path but retains all reserved characters, including all general delimiters.
- static String escapeUriQuery(String value): Escapes the string value so it can be safely included in URI query string segments.
- static String escapeUriUserInfo(String value): Escapes the string value so it can be safely included in URI user info part.
-
Field Details
-
URI_ESCAPER
-
URI_PATH_ESCAPER
-
URI_RESERVED_ESCAPER
-
URI_USERINFO_ESCAPER
-
URI_QUERY_STRING_ESCAPER
-
-
Constructor Details
-
CharEscapers
private CharEscapers()
-
-
Method Details
-
escapeUri
Escapes the string value so it can be safely included in URIs. For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The special characters ".", "-", "*", and "_" remain the same.
- The space character " " is converted into a plus sign "+".
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
This escaper has identical behavior to (but is potentially much faster than) URLEncoder.encode(String, String) with the encoding name "UTF-8".
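The rules listed for escapeUri can be observed with the JDK alone, since the text describes the escaper as behaving identically to URLEncoder with UTF-8. The following is a minimal sketch under that assumption; the class and method names (FormEncodingDemo, formEncode) are hypothetical, and the Charset overload used here requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustration only: this calls java.net.URLEncoder, not CharEscapers itself.
// FormEncodingDemo and formEncode are hypothetical names.
final class FormEncodingDemo {

    // Form-style (application/x-www-form-urlencoded) encoding with UTF-8.
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formEncode("a b"));      // a+b        (space becomes '+')
        System.out.println(formEncode("x.y-z*_"));  // x.y-z*_    ('.', '-', '*', '_' are kept)
        System.out.println(formEncode("\u00e9"));   // %C3%A9     (two UTF-8 bytes, uppercase hex)
        System.out.println(formEncode("a=b&c"));    // a%3Db%26c  ('=' and '&' are escaped)
    }
}
```

Because a space becomes "+", this form of escaping suits form-encoded data rather than path segments, where the other escapers on this page produce %20.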
-
decodeUri
Percent-decodes a US-ASCII string into a Unicode string. UTF-8 encoding is used to determine what characters are represented by any consecutive sequences of the form "%XX".
This method also replaces each occurrence of '+' with a space ' ', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
- Parameters:
- uri - a percent-encoded US-ASCII string
- Returns:
- a Unicode string
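The warning about '+' is easiest to see with a concrete input. The sketch below uses the JDK's URLDecoder, which has the same two behaviours described for decodeUri ('+' becomes a space, and "%XX" sequences are decoded as UTF-8); that the two agree in every case is an assumption, and FormDecodingDemo and formDecode are hypothetical names. The Charset overload requires Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Illustration only: this calls java.net.URLDecoder, not CharEscapers itself.
// FormDecodingDemo and formDecode are hypothetical names.
final class FormDecodingDemo {

    // Form-style decoding: '+' becomes ' ', "%XX" runs are read as UTF-8 bytes.
    static String formDecode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formDecode("a+b%20c"));   // "a b c"
        System.out.println(formDecode("%C3%A9"));    // U+00E9, decoded from two UTF-8 bytes

        // A path is not form-encoded, so its literal '+' characters are lost.
        System.out.println(formDecode("/files/c++/notes")); // "/files/c  /notes"
    }
}
```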
-
escapeUriPath
Escapes the string value so it can be safely included in URI path segments. For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The unreserved characters ".", "-", "~", and "_" remain the same.
- The general delimiters "@" and ":" remain the same.
- The subdelimiters "!", "$", "&", "'", "(", ")", "*", ",", ";", and "=" remain the same.
- The space character " " is converted into %20.
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
-
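The path-segment rules listed for escapeUriPath can be sketched as a small hand-written percent-encoder. This is an illustration that follows the list literally, not the library's implementation; PathSegmentEscapeSketch and escapePathSegment are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the listed path-segment rules; not the library's code.
final class PathSegmentEscapeSketch {

    // Kept as-is, besides ASCII letters and digits:
    // unreserved ".-~_", general delimiters "@:", and the listed subdelimiters.
    private static final String SAFE = ".-~_" + "@:" + "!$&'()*,;=";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String escapePathSegment(String value) {
        StringBuilder out = new StringBuilder();
        // Work on UTF-8 bytes: ASCII characters are single bytes below 0x80,
        // every byte of a multi-byte character is 0x80 or above.
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean alnum = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9');
            if (alnum || (v < 0x80 && SAFE.indexOf(v) >= 0)) {
                out.append((char) v);
            } else {
                // "%XY" with two uppercase hexadecimal digits.
                out.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapePathSegment("a b"));        // a%20b
        System.out.println(escapePathSegment("docs/2024"));  // docs%2F2024
        System.out.println(escapePathSegment("c++"));        // c%2B%2B
        System.out.println(escapePathSegment("\u00fc"));     // %C3%BC
        System.out.println(escapePathSegment("v1.0_final~@home:8080")); // unchanged
    }
}
```

Note that "/" is escaped here, so the sketch is meant for one segment at a time rather than a whole path.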
escapeUriPathWithoutReserved
Escapes a URI path but retains all reserved characters, including all general delimiters. This is the same as escapeUriPath(String) except that it keeps '?', '+', and '/' unescaped.
-
escapeUriUserInfo
Escapes the string value so it can be safely included in URI user info part. For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The unreserved characters ".", "-", "~", and "_" remain the same.
- The general delimiter ":" remains the same.
- The subdelimiters "!", "$", "&", "'", "(", ")", "*", ",", ";", and "=" remain the same.
- The space character " " is converted into %20.
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
- Since:
- 1.15
-
escapeUriQuery
Escapes the string value so it can be safely included in URI query string segments. When the query string consists of a sequence of name=value pairs separated by &, the names and values should be individually encoded. If you escape an entire query string in one pass with this escaper, then the "=" and "&" characters used as separators will also be escaped.
This escaper is also suitable for escaping fragment identifiers.
For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The unreserved characters ".", "-", "~", and "_" remain the same.
- The general delimiters "@" and ":" remain the same.
- The path delimiters "/" and "?" remain the same.
- The subdelimiters "!", "$", "'", "(", ")", "*", ",", and ";" remain the same.
- The space character " " is converted into %20.
- The equals sign "=" is converted into %3D.
- The ampersand "&" is converted into %26.
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
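The advice to encode names and values individually can be illustrated with a hand-written encoder that follows the query rules listed for escapeUriQuery. This is a sketch under that assumption, not the library's implementation; QueryEscapeSketch, escapeQueryPart, and buildQuery are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch of the listed query rules; not the library's code.
final class QueryEscapeSketch {

    // Kept as-is, besides ASCII letters and digits. '=' and '&' are
    // deliberately absent, so they are percent-encoded.
    private static final String SAFE = ".-~_" + "@:" + "/?" + "!$'()*,;";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String escapeQueryPart(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean alnum = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9');
            if (alnum || (v < 0x80 && SAFE.indexOf(v) >= 0)) {
                out.append((char) v);
            } else {
                out.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return out.toString();
    }

    // Escape each name and value separately, then add the literal separators.
    static String buildQuery(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(escapeQueryPart(e.getKey()) + "=" + escapeQueryPart(e.getValue()));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "a&b=c");
        params.put("lang", "en US");
        System.out.println(buildQuery(params));           // q=a%26b%3Dc&lang=en%20US

        // Escaping a finished query string in one pass destroys its separators.
        System.out.println(escapeQueryPart("q=a&lang=en")); // q%3Da%26lang%3Den
    }
}
```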
-