Class CharEscapers


  • public final class CharEscapers
    extends java.lang.Object
    Utility functions for encoding and decoding URIs.
    Since:
    1.0
    • Field Detail

      • APPLICATION_X_WWW_FORM_URLENCODED

        private static final Escaper APPLICATION_X_WWW_FORM_URLENCODED
      • URI_ESCAPER

        private static final Escaper URI_ESCAPER
      • URI_PATH_ESCAPER

        private static final Escaper URI_PATH_ESCAPER
      • URI_RESERVED_ESCAPER

        private static final Escaper URI_RESERVED_ESCAPER
      • URI_USERINFO_ESCAPER

        private static final Escaper URI_USERINFO_ESCAPER
      • URI_QUERY_STRING_ESCAPER

        private static final Escaper URI_QUERY_STRING_ESCAPER
    • Constructor Detail

      • CharEscapers

        private CharEscapers()
    • Method Detail

      • escapeUri

        @Deprecated
        public static java.lang.String escapeUri​(java.lang.String value)
        Deprecated.
        Escapes the string value so it can be safely included in application/x-www-form-urlencoded data. This is not appropriate for generic URI escaping. In particular, it encodes the space character as a plus sign instead of percent-escaping it, in contravention of the URI specification. For details on application/x-www-form-urlencoded encoding, see the HTML 4 specification, section 17.13.4.1.

        When encoding a String, the following rules apply:

        • The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
        • The special characters ".", "-", "*", and "_" remain the same.
        • The space character " " is converted into a plus sign "+".
        • All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.

        Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
        "URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."

        This escaper has identical behavior to (but is potentially much faster than):

        • URLEncoder.encode(String, String) with the encoding name "UTF-8"
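
Because the description above states that this escaper behaves identically to URLEncoder.encode(String, String) with "UTF-8", the rules can be illustrated with the JDK alone. This is a minimal sketch, not the library's implementation; the class name FormEncodingSketch and the helper formEncode are hypothetical names introduced for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormEncodingSketch {
  // Hypothetical helper: delegates to the JDK encoder that the text above
  // names as behaviorally identical to escapeUri.
  static String formEncode(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is a charset every Java platform must support.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(formEncode("a b&c=d")); // a+b%26c%3Dd
    System.out.println(formEncode("x.y-z*_")); // x.y-z*_
    System.out.println(formEncode("\u00e9"));  // %C3%A9
  }
}
```

Note the plus sign in the first result: it is what makes this encoding unsuitable for generic URI escaping.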
      • escapeUriConformant

        public static java.lang.String escapeUriConformant​(java.lang.String value)
        Escapes the string value so it can be safely included in any part of a URI. For details on escaping URIs, see RFC 3986 - section 2.4.

        When encoding a String, the following rules apply:

        • The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
        • The special characters ".", "-", "*", and "_" remain the same.
        • The space character " " is converted into "%20".
        • All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.

        Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
        "URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
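
The rule list above can be sketched as a small standalone percent-encoder. This is an illustration written from the listed rules only, not the library's code; ConformantEscapeSketch, escape, and SAFE_PUNCTUATION are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class ConformantEscapeSketch {
  // Punctuation left untouched, taken from the rule list above.
  private static final String SAFE_PUNCTUATION = ".-*_";
  private static final char[] UPPER_HEX = "0123456789ABCDEF".toCharArray();

  static String escape(String value) {
    StringBuilder out = new StringBuilder();
    // Work on UTF-8 bytes so multi-byte characters yield one %XY per byte.
    for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
      int c = b & 0xFF; // treat the byte as unsigned
      boolean alphanumeric =
          (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
      if (alphanumeric || SAFE_PUNCTUATION.indexOf(c) >= 0) {
        out.append((char) c);
      } else {
        // Everything else, including the space, becomes uppercase %XY.
        out.append('%').append(UPPER_HEX[c >> 4]).append(UPPER_HEX[c & 0x0F]);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(escape("a b"));    // a%20b
    System.out.println(escape("x/y?z"));  // x%2Fy%3Fz
    System.out.println(escape("\u00e9")); // %C3%A9
  }
}
```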

      • decodeUri

        public static java.lang.String decodeUri​(java.lang.String uri)
        Decodes application/x-www-form-urlencoded strings. The UTF-8 character set determines what characters are represented by any consecutive sequences of the form "%XX".

        This replaces each occurrence of '+' with a space, ' '. This method should not be used for non-application/x-www-form-urlencoded strings such as host and path.

        Parameters:
        uri - a percent-encoded US-ASCII string
        Returns:
        a string without any percent escapes or plus signs
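
The decoding behavior described above (UTF-8 percent sequences plus '+' to space) matches what the JDK's URLDecoder does, so it can be illustrated without the library. That decodeUri and URLDecoder agree on every input is an assumption, and FormDecodingSketch and formDecode are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class FormDecodingSketch {
  // Hypothetical helper: JDK form decoding, used here as a stand-in.
  static String formDecode(String encoded) {
    try {
      return URLDecoder.decode(encoded, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is a charset every Java platform must support.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(formDecode("a+b%26c")); // a b&c
    System.out.println(formDecode("%C3%A9"));  // the single character U+00E9
  }
}
```

The first call shows why this kind of decoding is wrong for hosts and paths: a literal '+' there would be silently turned into a space.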
      • decodeUriPath

        public static java.lang.String decodeUriPath​(java.lang.String path)
        Decodes the path component of a URI, transforming percent-encoded values into the characters they represent. Unlike URLDecoder.decode(String, String), this method does not convert + into spaces.

        e.g. decodeUriPath("%3Co%3E") returns "<o>"

        Parameters:
        path - the value to be decoded
        Returns:
        decoded version of path
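
A path decoder of this kind can be sketched as follows: percent sequences are collected as bytes and interpreted as UTF-8, while '+' passes through unchanged. This is an assumption-laden illustration, not the library's implementation. PathDecodingSketch and decodePath are hypothetical names, and throwing IllegalArgumentException on malformed input is a choice made for the sketch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class PathDecodingSketch {
  static String decodePath(String path) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    int i = 0;
    while (i < path.length()) {
      char c = path.charAt(i);
      if (c == '%') {
        if (i + 2 >= path.length()) {
          throw new IllegalArgumentException("Truncated escape at index " + i);
        }
        int hi = Character.digit(path.charAt(i + 1), 16);
        int lo = Character.digit(path.charAt(i + 2), 16);
        if (hi < 0 || lo < 0) {
          throw new IllegalArgumentException("Invalid escape at index " + i);
        }
        bytes.write((hi << 4) | lo);
        i += 3;
      } else {
        // Unescaped characters, including '+', are copied through as UTF-8.
        int cp = path.codePointAt(i);
        byte[] raw = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
        bytes.write(raw, 0, raw.length);
        i += Character.charCount(cp);
      }
    }
    return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(decodePath("%3Co%3E")); // <o>
    System.out.println(decodePath("a+b%20c")); // a+b c
  }
}
```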
      • escapeUriPath

        public static java.lang.String escapeUriPath​(java.lang.String value)
        Escapes the string value so it can be safely included in URI path segments. For details on escaping URIs, see RFC 3986 - section 2.4.

        When encoding a String, the following rules apply:

        • The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
        • The unreserved characters ".", "-", "~", and "_" remain the same.
        • The general delimiters "@" and ":" remain the same.
        • The subdelimiters "!", "$", "&", "'", "(", ")", "*", ",", ";", and "=" remain the same.
        • The space character " " is converted into %20.
        • All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.

        Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
        "URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
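
One way to read the rules above is that each path segment is escaped on its own and the segments are then joined with '/'. The sketch below follows the listed safe characters only and is not the library's code; PathSegmentEscapeSketch, escapeSegment, and joinPath are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class PathSegmentEscapeSketch {
  // Unreserved punctuation, then "@" and ":", then the listed subdelimiters.
  private static final String SAFE = ".-~_" + "@:" + "!$&'()*,;=";
  private static final char[] UPPER_HEX = "0123456789ABCDEF".toCharArray();

  static String escapeSegment(String segment) {
    StringBuilder out = new StringBuilder();
    for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
      int c = b & 0xFF; // treat the byte as unsigned
      boolean alphanumeric =
          (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
      if (alphanumeric || SAFE.indexOf(c) >= 0) {
        out.append((char) c);
      } else {
        out.append('%').append(UPPER_HEX[c >> 4]).append(UPPER_HEX[c & 0x0F]);
      }
    }
    return out.toString();
  }

  // Escapes each segment separately so a '/' inside a segment cannot
  // be mistaken for a path separator.
  static String joinPath(String... segments) {
    StringBuilder path = new StringBuilder();
    for (String segment : segments) {
      path.append('/').append(escapeSegment(segment));
    }
    return path.toString();
  }

  public static void main(String[] args) {
    System.out.println(joinPath("docs", "a b/c", "v1+2")); // /docs/a%20b%2Fc/v1%2B2
  }
}
```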

      • escapeUriPathWithoutReserved

        public static java.lang.String escapeUriPathWithoutReserved​(java.lang.String value)
        Escapes a URI path but retains all reserved characters, including all general delimiters. This behaves like escapeUriPath(String), except that it does not escape '?', '+', and '/'.
      • escapeUriUserInfo

        public static java.lang.String escapeUriUserInfo​(java.lang.String value)
        Escapes the string value so it can be safely included in the user info part of a URI. For details on escaping URIs, see RFC 3986 - section 2.4.

        When encoding a String, the following rules apply:

        • The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
        • The unreserved characters ".", "-", "~", and "_" remain the same.
        • The general delimiter ":" remains the same.
        • The subdelimiters "!", "$", "&", "'", "(", ")", "*", ",", ";", and "=" remain the same.
        • The space character " " is converted into %20.
        • All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.

        Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
        "URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."

        Since:
        1.15
      • escapeUriQuery

        public static java.lang.String escapeUriQuery​(java.lang.String value)
        Escapes the string value so it can be safely included in URI query string segments. When the query string consists of a sequence of name=value pairs separated by &, the names and values should be individually encoded. If you escape an entire query string in one pass with this escaper, then the "=" and "&" characters used as separators will also be escaped.

        This escaper is also suitable for escaping fragment identifiers.

        For details on escaping URIs, see RFC 3986 - section 2.4.

        When encoding a String, the following rules apply:

        • The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
        • The unreserved characters ".", "-", "~", and "_" remain the same.
        • The general delimiters "@" and ":" remain the same.
        • The path delimiters "/" and "?" remain the same.
        • The subdelimiters "!", "$", "'", "(", ")", "*", ",", and ";" remain the same.
        • The space character " " is converted into %20.
        • The equals sign "=" is converted into %3D.
        • The ampersand "&" is converted into %26.
        • All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.

        Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences. From RFC 3986:
        "URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
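
The advice to encode names and values individually can be sketched as below: each component is escaped on its own, and the "=" and "&" separators are added afterwards so they stay literal. This follows the listed rules only and is not the library's code; QueryEscapeSketch, escapeComponent, and buildQuery are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryEscapeSketch {
  // Safe punctuation from the rule list; "=" and "&" are deliberately absent.
  private static final String SAFE = ".-~_" + "@:" + "/?" + "!$'()*,;";
  private static final char[] UPPER_HEX = "0123456789ABCDEF".toCharArray();

  static String escapeComponent(String value) {
    StringBuilder out = new StringBuilder();
    for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
      int c = b & 0xFF; // treat the byte as unsigned
      boolean alphanumeric =
          (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
      if (alphanumeric || SAFE.indexOf(c) >= 0) {
        out.append((char) c);
      } else {
        out.append('%').append(UPPER_HEX[c >> 4]).append(UPPER_HEX[c & 0x0F]);
      }
    }
    return out.toString();
  }

  // Escapes names and values separately, then joins them with literal separators.
  static String buildQuery(Map<String, String> params) {
    StringBuilder query = new StringBuilder();
    for (Map.Entry<String, String> entry : params.entrySet()) {
      if (query.length() > 0) {
        query.append('&');
      }
      query.append(escapeComponent(entry.getKey()))
          .append('=')
          .append(escapeComponent(entry.getValue()));
    }
    return query.toString();
  }

  public static void main(String[] args) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("q", "a&b=c");
    params.put("page", "1 2");
    System.out.println(buildQuery(params)); // q=a%26b%3Dc&page=1%202
  }
}
```

Escaping the assembled string "q=a&b=c" in one pass would instead produce "q%3Da%26b%3Dc", which no longer parses as a name=value pair.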