Class PercentEscaper
- com.google.common.escape.Escaper
- com.google.common.escape.UnicodeEscaper
- com.google.common.net.PercentEscaper
Escapes some set of Java characters using a UTF-8 based percent encoding scheme. The set of safe characters (those which remain unescaped) can be specified on construction.
This class is primarily used for creating URI escapers in UrlEscapers but can be used directly if required. While URI escapers impose specific semantics on which characters are considered 'safe', this class has a minimal set of restrictions.
When escaping a String, the following rules apply:
- All specified safe characters remain unchanged.
- If plusForSpace was specified, the space character " " is converted into a plus sign "+".
- All other characters are converted into one or more bytes using UTF-8 encoding, and each byte is then represented by the 3-character string "%XX", where "XX" is the two-digit, uppercase, hexadecimal representation of the byte value.
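The rules above can be illustrated with a minimal, standalone sketch. It does not call this internal class; `PercentEncodingSketch` and `percentEncode` are hypothetical names chosen for illustration, and the safe set is passed as a plain String. Unlike the class documented here, the sketch does not validate surrogate pairs.

```java
import java.nio.charset.StandardCharsets;

final class PercentEncodingSketch {
    private static final char[] UPPER_HEX = "0123456789ABCDEF".toCharArray();

    /** Hypothetical helper that applies the three escaping rules in order. */
    static String percentEncode(String input, String safeChars, boolean plusForSpace) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (safeChars.indexOf(cp) >= 0) {
                // Rule 1: safe characters remain unchanged.
                out.appendCodePoint(cp);
            } else if (cp == ' ' && plusForSpace) {
                // Rule 2: space becomes '+' when plusForSpace is requested.
                out.append('+');
            } else {
                // Rule 3: every UTF-8 byte becomes "%XX" with uppercase hex digits.
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%')
                       .append(UPPER_HEX[(b >> 4) & 0xF])
                       .append(UPPER_HEX[b & 0xF]);
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncode("a b", "ab", false));
        System.out.println(percentEncode("a b", "ab", true));
    }
}
```

Because the safe-character check comes first, a space listed in the safe set would stay a space even with plusForSpace enabled; that ordering is a choice of this sketch.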
For performance reasons the only currently supported character encoding of this class is UTF-8.
Note: This escaper produces uppercase hexadecimal sequences.
This class is internal and is hence not for public use. Its APIs are unstable and can change at any time.
- Since:
- 15.0
-
Field Summary
Fields (modifier and type, field, description):
- private static final int DEST_PAD: The amount of padding (chars) to use when growing the escape buffer.
- private static final String SAFE_CHARS
- private static final boolean[] safeOctets: An array of flags where for any char c, if safeOctets[c] is true then c should remain unmodified in the output.
- private static final char[] UPPER_HEX_DIGITS
-
Constructor Summary
Constructors -
Method Summary
Methods (modifier and type, method, description):
- private static int codePointAt(CharSequence seq, int index, int end): Returns the Unicode code point of the character at the given index.
- static PercentEscaper create(): The default PercentEscaper which will *not* replace spaces with plus signs.
- private static boolean[] createSafeOctets(String safeChars): Creates a boolean array with entries corresponding to the character values specified in safeChars set to true.
- private static char[] escape(int cp): Escapes the given Unicode code point in UTF-8.
- escape(String): Escape the provided String, using percent-style URL Encoding.
- private static String escapeSlow(String s, int index): Returns the escaped form of a given literal string, starting at the given index.
- private static char[] growBuffer(char[] dest, int index, int size): Helper method to grow the character buffer as needed. This only happens once in a while, so it's ok if it's in a method call.
- private static int nextEscapeIndex(CharSequence csq, int index, int end)
-
Field Details
-
DEST_PAD
private static final int DEST_PAD
The amount of padding (chars) to use when growing the escape buffer.
-
SAFE_CHARS
-
UPPER_HEX_DIGITS
private static final char[] UPPER_HEX_DIGITS
-
safeOctets
private static final boolean[] safeOctets
An array of flags where for any char c, if safeOctets[c] is true then c should remain unmodified in the output. If c >= safeOctets.length then it should be escaped.
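As a rough illustration of how such a flag array can be built and consulted, here is a minimal sketch. The class name `SafeOctetsSketch` and the helper `isSafe` are hypothetical; the array is sized to the largest safe character plus one, which is one way to keep it as small as the safe set allows.

```java
final class SafeOctetsSketch {
    /** Builds a flag array just large enough to index the largest safe character. */
    static boolean[] createSafeOctets(String safeChars) {
        int maxChar = -1;
        for (int i = 0; i < safeChars.length(); i++) {
            maxChar = Math.max(maxChar, safeChars.charAt(i));
        }
        boolean[] octets = new boolean[maxChar + 1];
        for (int i = 0; i < safeChars.length(); i++) {
            octets[safeChars.charAt(i)] = true;
        }
        return octets;
    }

    /** Hypothetical lookup: characters beyond the array are never safe. */
    static boolean isSafe(boolean[] safeOctets, char c) {
        return c < safeOctets.length && safeOctets[c];
    }

    public static void main(String[] args) {
        boolean[] octets = createSafeOctets("az");
        System.out.println(octets.length);
        System.out.println(isSafe(octets, 'a'));
        System.out.println(isSafe(octets, '~'));
    }
}
```

The bounds check in `isSafe` is what lets the array stay short: any character at or past `safeOctets.length` is treated as needing escaping without a lookup.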
-
-
Constructor Details
-
PercentEscaper
public PercentEscaper()
-
-
Method Details
-
create
The default PercentEscaper which will *not* replace spaces with plus signs.
-
createSafeOctets
Creates a boolean array with entries corresponding to the character values specified in safeChars set to true. The array is as small as is required to hold the given character information.
-
escape
Escape the provided String, using percent-style URL Encoding.
-
escapeSlow
Returns the escaped form of a given literal string, starting at the given index. This method is called by the escape(String) method when it discovers that escaping is required. It is protected to allow subclasses to override the fastpath escaping function to inline their escaping test.
This method is not reentrant and may only be invoked by the top level escape(String) method.
- Parameters:
s - the literal string to be escaped
index - the index to start escaping from
- Returns:
the escaped form of string
- Throws:
NullPointerException - if string is null
IllegalArgumentException - if invalid surrogate characters are encountered
-
nextEscapeIndex
-
escape
Escapes the given Unicode code point in UTF-8. -
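One way to picture this step is to encode the code point into UTF-8 bytes by hand and write each byte as "%XX". The sketch below is an illustration only, not this class's implementation; `CodePointEscapeSketch` and `escapeCodePoint` are hypothetical names, and rejecting surrogate code points is a choice made for the sketch.

```java
final class CodePointEscapeSketch {
    private static final char[] UPPER_HEX_DIGITS = "0123456789ABCDEF".toCharArray();

    /** Percent-encodes one code point via its 1 to 4 byte UTF-8 form. */
    static char[] escapeCodePoint(int cp) {
        if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw new IllegalArgumentException("Invalid code point: " + cp);
        }
        int[] bytes;
        if (cp <= 0x7F) {
            bytes = new int[] {cp};
        } else if (cp <= 0x7FF) {
            bytes = new int[] {0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)};
        } else if (cp <= 0xFFFF) {
            bytes = new int[] {
                0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)};
        } else {
            bytes = new int[] {
                0xF0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3F),
                0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)};
        }
        // Three output chars per byte: '%', high nibble, low nibble.
        char[] out = new char[bytes.length * 3];
        for (int i = 0; i < bytes.length; i++) {
            out[3 * i] = '%';
            out[3 * i + 1] = UPPER_HEX_DIGITS[bytes[i] >> 4];
            out[3 * i + 2] = UPPER_HEX_DIGITS[bytes[i] & 0xF];
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new String(escapeCodePoint(0x20AC)));
    }
}
```

For example, U+20AC (the euro sign) needs three UTF-8 bytes, so the result is the nine characters `%E2%82%AC`.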
codePointAt
Returns the Unicode code point of the character at the given index.
Unlike Character.codePointAt(CharSequence, int) or String.codePointAt(int), this method will never fail silently when encountering an invalid surrogate pair.
The behaviour of this method is as follows:
- If index >= end, IndexOutOfBoundsException is thrown.
- If the character at the specified index is not a surrogate, it is returned.
- If the first character was a high surrogate value, then an attempt is made to read the next character.
  - If the end of the sequence was reached, the negated value of the trailing high surrogate is returned.
  - If the next character was a valid low surrogate, the code point value of the high/low surrogate pair is returned.
  - If the next character was not a low surrogate value, then IllegalArgumentException is thrown.
- If the first character was a low surrogate value, IllegalArgumentException is thrown.
- Parameters:
seq - the sequence of characters from which to decode the code point
index - the index of the first character to decode
end - the index beyond the last valid character to decode
- Returns:
the Unicode code point for the given index or the negated value of the trailing high surrogate character at the end of the sequence
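The documented behaviour can be sketched with standard java.lang.Character helpers. This is a reading of the rules listed for codePointAt, not the class's actual source; `StrictCodePointSketch` is a hypothetical name and the exception messages are made up for the example.

```java
final class StrictCodePointSketch {
    /** Decodes one code point from seq[index, end), failing loudly on bad surrogates. */
    static int codePointAt(CharSequence seq, int index, int end) {
        if (index >= end) {
            throw new IndexOutOfBoundsException("index " + index + " is not below end " + end);
        }
        char c1 = seq.charAt(index);
        if (!Character.isSurrogate(c1)) {
            // Ordinary character: it is its own code point.
            return c1;
        }
        if (Character.isLowSurrogate(c1)) {
            throw new IllegalArgumentException("Unexpected low surrogate at index " + index);
        }
        // c1 is a high surrogate: try to read its partner.
        if (index + 1 == end) {
            // Trailing high surrogate: signal it with the negated char value.
            return -c1;
        }
        char c2 = seq.charAt(index + 1);
        if (Character.isLowSurrogate(c2)) {
            return Character.toCodePoint(c1, c2);
        }
        throw new IllegalArgumentException("Expected low surrogate at index " + (index + 1));
    }

    public static void main(String[] args) {
        String pair = new String(Character.toChars(0x1F600));
        System.out.println(Integer.toHexString(codePointAt(pair, 0, pair.length())));
    }
}
```

Returning a negative value for a trailing high surrogate lets a caller distinguish "input ended mid-pair" from a valid code point, since valid code points are never negative.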
-
growBuffer
private static char[] growBuffer(char[] dest, int index, int size)
Helper method to grow the character buffer as needed. This only happens once in a while, so it's ok if it's in a method call. If the index passed in is 0 then no copying will be done.
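A minimal sketch of such a helper, assuming that `index` counts the characters already written to `dest` and `size` is the new capacity (the description does not spell this out). `GrowBufferSketch` is a hypothetical name.

```java
final class GrowBufferSketch {
    /** Returns a new buffer of the requested size holding the first index chars of dest. */
    static char[] growBuffer(char[] dest, int index, int size) {
        if (size < index) {
            throw new IllegalArgumentException("new size must hold the existing content");
        }
        char[] copy = new char[size];
        if (index > 0) {
            // Only the portion written so far needs to be carried over.
            System.arraycopy(dest, 0, copy, 0, index);
        }
        return copy;
    }

    public static void main(String[] args) {
        char[] grown = growBuffer(new char[] {'a', 'b', 'c'}, 2, 8);
        System.out.println(grown.length);
    }
}
```

Skipping the copy when `index` is 0 matches the documented note that no copying is done in that case.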
-