Class QueryParser
- java.lang.Object
-
- org.simpleframework.common.parse.Parser
-
- org.simpleframework.common.parse.MapParser<java.lang.String>
-
- org.simpleframework.http.parse.QueryParser
-
- All Implemented Interfaces:
java.util.Map<java.lang.String,java.lang.String>, Query
- Direct Known Subclasses:
QueryCombiner
public class QueryParser extends MapParser<java.lang.String> implements Query
The ParameterParser is used to parse data encoded in the application/x-www-form-urlencoded MIME type. It is also used to parse a query string from an HTTP URL, see RFC 2616. The parsed parameters are available through the various methods of the org.simpleframework.http.net.Query interface. The syntax of the parsed parameters is described below in BNF.

    params  = *(pair [ "&" params])
    pair    = name "=" value
    name    = *(text | escaped)
    value   = *(text | escaped)
    escaped = % HEX HEX

This will consume all data found as a name or value; if the data is a "+" character then it is replaced with a space character. Only "=", "&", and "%" are regarded as having special meaning. The "=" character delimits the name from the value and the "&" delimits the name value pairs. The "%" character represents the start of an escaped sequence, which consists of two hex digits. All escaped sequences are converted to their character values.
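The grammar above can be illustrated with a small standalone sketch. It is not the QueryParser implementation: the class name QueryGrammarSketch and its methods are hypothetical, java.net.URLDecoder stands in for the parser's own escape handling, and the choices to let the last value win for a repeated name and to treat a pair without "=" as having an empty value are assumptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of the grammar above; this is not the QueryParser source.
final class QueryGrammarSketch {

    // Splits the text on "&" into pairs, then each pair on the first "=".
    static Map<String, String> parse(String text) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : text.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate input such as "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            // Assumption: a pair without "=" is a name with an empty value.
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Assumption: a repeated name overwrites the earlier value.
            params.put(decode(name), decode(value));
        }
        return params;
    }

    // Stand-in decoder: URLDecoder maps "+" to a space and
    // "%" HEX HEX sequences to octets that are read as UTF-8.
    static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("a=1&msg=hello+world&sym=%E2%82%AC"));
    }
}
```

Note that URLDecoder throws an IllegalArgumentException on a malformed escape such as "%G1"; whether QueryParser does the same is not stated here.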
-
-
Nested Class Summary
private class QueryParser.Token
This is used to mark regions within the buffer that represent a valid token for either the name of a parameter or its value.
-
Field Summary
private QueryParser.Token name
Used to accumulate the characters for the parameter name.
private QueryParser.Token value
Used to accumulate the characters for the parameter value.
-
Constructor Summary
QueryParser()
Constructor for the ParameterParser.
QueryParser(java.lang.String text)
Constructor for the ParameterParser.
-
Method Summary
private boolean binary(int peek)
This method determines, using a peek character, whether the sequence of escaped characters within the URI is binary data.
private char bits(int data)
Defines behaviour for UCS-2 versus UCS-4 conversion from four octets.
private int convert(char high, char low)
This will convert the two hexadecimal characters to a real integer value, which is returned.
private java.lang.String encode(java.lang.String text)
This encode method will escape the text that is provided.
private java.lang.String encode(java.lang.String name, java.lang.String value)
This encode method will escape the name=value pair provided using the UTF-8 character set.
private void escape()
This converts an encountered escaped sequence, that is all embedded hexadecimal characters, into a native UCS character value.
boolean getBoolean(java.lang.Object name)
This extracts a boolean parameter for the named value.
float getFloat(java.lang.Object name)
This extracts a float parameter for the named value.
int getInteger(java.lang.Object name)
This extracts an integer parameter for the named value.
private boolean hex(char ch)
This is used to determine whether a char is a hexadecimal char or not.
protected void init()
This initializes the parser so that it can be used several times.
private void insert()
This method adds the name and value to a map so that the next name and value can be collected.
private void insert(QueryParser.Token name, QueryParser.Token value)
This will add the given name and value to the parameters map.
private void name()
This extracts the name of the parameter from the character buffer.
private void param()
This is an expression that is defined by RFC 2396; it is used in the definition of a segment expression.
protected void parse()
This performs the actual parsing of the parameter text.
private int peek(int pos)
This will return the escape expression specified from the URI as an integer value of the hexadecimal sequence.
java.lang.String toString()
This toString method is used to compose a string in the application/x-www-form-urlencoded MIME type.
java.lang.String toString(java.util.Set set)
This toString method is used to compose a string in the application/x-www-form-urlencoded MIME type.
private boolean unicode(int peek)
This method determines, using a peek character, whether the sequence of escaped characters within the URI is in UTF-8.
private boolean unicode(int peek, int more)
This method will decode the specified amount of escaped characters from the URI and convert them into a single Java UCS-2 character.
private boolean unicode(int peek, int more, int pos)
This will decode the specified amount of trailing UTF-8 bits from the URI.
private void value()
This extracts a parameter value from a path segment.
-
Methods inherited from class org.simpleframework.common.parse.MapParser
clear, containsKey, containsValue, entrySet, get, getAll, isEmpty, keySet, put, putAll, remove, size, values
-
Methods inherited from class org.simpleframework.common.parse.Parser
digit, ensureCapacity, parse, skip, space, toLower
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
-
-
-
Field Detail
-
name
private QueryParser.Token name
Used to accumulate the characters for the parameter name.
-
value
private QueryParser.Token value
Used to accumulate the characters for the parameter value.
-
-
Constructor Detail
-
QueryParser
public QueryParser()
Constructor for the ParameterParser. This creates an instance that can be used to parse HTML form data and URL query strings encoded as application/x-www-form-urlencoded. The parsed parameters are made available through the interface org.simpleframework.util.net.Query.
-
QueryParser
public QueryParser(java.lang.String text)
Constructor for the ParameterParser. This creates an instance that can be used to parse HTML form data and URL query strings encoded as application/x-www-form-urlencoded. The parsed parameters are made available through the interface org.simpleframework.util.net.Query.
- Parameters:
text - this is the text to parse for the parameters
-
-
Method Detail
-
getInteger
public int getInteger(java.lang.Object name)
This extracts an integer parameter for the named value. If the named parameter does not exist this will return a zero value. If however the parameter exists but is not in the format of a decimal integer value then this will throw an exception.
- Specified by:
getInteger in interface Query
- Parameters:
name - the name of the parameter value to retrieve
- Returns:
this returns the named parameter value as an integer
-
getFloat
public float getFloat(java.lang.Object name)
This extracts a float parameter for the named value. If the named parameter does not exist this will return a zero value. If however the parameter exists but is not in the format of a floating point number then this will throw an exception.
-
getBoolean
public boolean getBoolean(java.lang.Object name)
This extracts a boolean parameter for the named value. If the named parameter does not exist this will return false, otherwise the value is evaluated. If it is either true or false then those boolean values are returned.
- Specified by:
getBoolean in interface Query
- Parameters:
name - the name of the parameter value to retrieve
- Returns:
this returns the named parameter value as a boolean
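The defaulting rules of the three typed accessors above can be sketched over a plain map. This is a hypothetical illustration, not the library code: the class TypedQuerySketch is invented here, the exception type (NumberFormatException, from the standard parse methods) is an assumption since the text only says "an exception", and treating any value other than "true" as false is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the typed accessors; not the QueryParser source.
final class TypedQuerySketch {

    private final Map<String, String> params = new HashMap<>();

    void put(String name, String value) {
        params.put(name, value);
    }

    // Missing parameter gives 0; a malformed value throws NumberFormatException.
    int getInteger(Object name) {
        String value = params.get(name);
        return value == null ? 0 : Integer.parseInt(value);
    }

    // Missing parameter gives 0.0f; a malformed value throws NumberFormatException.
    float getFloat(Object name) {
        String value = params.get(name);
        return value == null ? 0.0f : Float.parseFloat(value);
    }

    // Missing parameter gives false. Assumption: anything other than
    // "true" (ignoring case) is also false, as Boolean.parseBoolean does.
    boolean getBoolean(Object name) {
        String value = params.get(name);
        return value != null && Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        TypedQuerySketch query = new TypedQuerySketch();
        query.put("page", "3");
        System.out.println(query.getInteger("page"));    // prints 3
        System.out.println(query.getInteger("missing")); // prints 0
    }
}
```

A caller that cannot tolerate an exception on user input would wrap getInteger and getFloat in a try block or validate the value first.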
-
init
protected void init()
This initializes the parser so that it can be used several times. This clears any previous parameters extracted. This ensures that when the next parse(String) is invoked the status of the Query is empty.
-
parse
protected void parse()
This performs the actual parsing of the parameter text. The parameters parsed from this are taken as "name=value" pairs. Multiple pairs within the text are separated by an "&". This will parse and insert all parameters into a hashtable.
-
insert
private void insert()
This method adds the name and value to a map so that the next name and value can be collected. The name and value are added to the map as string objects. Once added to the map the Token objects are set to have zero length so they can be reused to collect further values. This will add the values to the map as an array of type string. This is done so that multiple values for the same name can be stored.
-
insert
private void insert(QueryParser.Token name, QueryParser.Token value)
This will add the given name and value to the parameters map. If any previous value of the given name has been inserted into the map then this will overwrite that value. This is used to ensure that the string value is inserted to the map.
- Parameters:
name - this is the name of the value to be inserted
value - this is the value that is to be inserted
-
param
private void param()
This is an expression that is defined by RFC 2396; it is used in the definition of a segment expression. This is basically a list of chars with escaped sequences. This method has to ensure that no escaped chars go unchecked. This ensures that the read offset does not go out of bounds and consequently throw an out of bounds exception.
-
name
private void name()
This extracts the name of the parameter from the character buffer. The name of a parameter is defined as a set of chars including escape sequences. This will extract the parameter name and buffer the chars. The name ends when an equals character, "=", is encountered.
-
value
private void value()
This extracts a parameter value from a path segment. The parameter value consists of a sequence of chars and some escape sequences. The parameter value is buffered so that the name and values can be paired. The end of the value is determined as the end of the buffer or an ampersand.
-
escape
private void escape()
This converts an encountered escaped sequence, that is all embedded hexadecimal characters, into a native UCS character value. This does not take any characters from the stream; it just prepares the buffer with the correct byte. The escaped sequence within the URI will be interpreted as UTF-8. If there is a fully valid escaped sequence, that is "%" HEX HEX, this will leave the next character to read from the buffer as the character decoded from the URI. This decodes the escaped sequence using UTF-8 encoding; all encoded sequences should be in UCS-2 to fit in a Java char.
-
binary
private boolean binary(int peek)
This method determines, using a peek character, whether the sequence of escaped characters within the URI is binary data. If the data within the escaped sequence is binary then this will ensure that the next character read from the URI is the binary octet. This is used strictly for backward compatible parsing of URI strings; binary data should never appear.
- Parameters:
peek - this is the first escaped character from the URI
- Returns:
currently this implementation always returns true
-
unicode
private boolean unicode(int peek)
This method determines, using a peek character, whether the sequence of escaped characters within the URI is in UTF-8. If a UTF-8 character can be successfully decoded from the URI it will be the next character read from the buffer. This can check for both UCS-2 and UCS-4 characters. However, because the Java char can only hold UCS-2, the UCS-4 characters will have only the low order octets stored.
The WWW Consortium provides a reference implementation of a UTF-8 decoding for Java; in this the low order octets in the UCS-4 sequence are used for the character. So, in the absence of a defined behaviour, the W3C behaviour is assumed.
- Parameters:
peek - this is the first escaped character from the URI
- Returns:
this returns true if a UTF-8 character is decoded
-
unicode
private boolean unicode(int peek, int more)
This method will decode the specified amount of escaped characters from the URI and convert them into a single Java UCS-2 character. If there are not enough characters within the URI then this will return false and leave the URI alone.
The number of characters left is determined from the first UTF-8 octet, as specified in RFC 2279, and because this is a URI there must be that number of "%" HEX HEX sequences left. If successful the next character read is the UTF-8 sequence decoded into a native UCS-2 character.
- Parameters:
peek - contains the bits read from the first UTF octet
more - this specifies the number of UTF octets left
- Returns:
this returns true if a UTF-8 character is decoded
-
unicode
private boolean unicode(int peek, int more, int pos)
This will decode the specified amount of trailing UTF-8 bits from the URI. The trailing bits are those following the first UTF-8 octet, which specifies the length, in octets, of the sequence. The trailing octets are of the form 10xxxxxx; for each of these octets only the last six bits are valid UCS bits. So a conversion is basically an accumulation of these.
If at any point during the accumulation of the UTF-8 bits there is a parsing error, then parsing is aborted and false is returned; as a result the URI is left unchanged.
- Parameters:
peek - bytes that have been accumulated from the URI
more - this specifies the number of UTF octets left
pos - this specifies the position the parsing begins
- Returns:
this returns true if a UTF-8 character is decoded
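The accumulation described for the unicode methods can be shown in a compact standalone form. The sketch below is an assumption-laden illustration, not the private methods themselves: the class Utf8EscapeSketch and the static signatures are hypothetical, only sequences of up to four octets are handled (RFC 2279 allows up to six), overlong encodings are not rejected, and failure is reported as -1 rather than by returning false.

```java
// Hypothetical sketch of decoding "%" HEX HEX octets as UTF-8; not the QueryParser source.
final class Utf8EscapeSketch {

    // Returns the octet encoded by "%" HEX HEX at pos, or -1 if none is there.
    static int peek(String uri, int pos) {
        if (pos < 0 || pos + 2 >= uri.length() || uri.charAt(pos) != '%') {
            return -1;
        }
        int high = Character.digit(uri.charAt(pos + 1), 16);
        int low = Character.digit(uri.charAt(pos + 2), 16);
        return (high < 0 || low < 0) ? -1 : (high << 4) | low;
    }

    // Decodes one UTF-8 sequence of escaped octets starting at pos.
    // Returns the decoded value, or -1 if the sequence is malformed or truncated.
    static int decode(String uri, int pos) {
        int first = peek(uri, pos);
        if (first < 0) {
            return -1;
        }
        int more; // number of trailing octets announced by the first octet
        int bits; // payload bits accumulated so far
        if ((first & 0x80) == 0x00) {
            more = 0;
            bits = first;
        } else if ((first & 0xE0) == 0xC0) {
            more = 1;
            bits = first & 0x1F;
        } else if ((first & 0xF0) == 0xE0) {
            more = 2;
            bits = first & 0x0F;
        } else if ((first & 0xF8) == 0xF0) {
            more = 3;
            bits = first & 0x07;
        } else {
            return -1; // a stray trailing octet or an unsupported lead octet
        }
        for (int i = 1; i <= more; i++) {
            int next = peek(uri, pos + 3 * i); // each escape is three chars wide
            if (next < 0 || (next & 0xC0) != 0x80) {
                return -1; // not of the form 10xxxxxx
            }
            bits = (bits << 6) | (next & 0x3F); // keep the last six bits
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(decode("%E2%82%AC", 0))); // prints 20ac
    }
}
```

Because nothing is consumed until the whole sequence has been validated, a failed decode leaves the input untouched, which mirrors the "URI is left unchanged" behaviour described above.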
-
bits
private char bits(int data)
Defines behaviour for UCS-2 versus UCS-4 conversion from four octets. The UTF-8 encoding scheme enables UCS-4 characters to be encoded and decoded. However, Java supports the 16-bit UCS-2 character set, and so the 32-bit UCS-4 character set is not compatible. This basically decides what to do with UCS-4.
- Parameters:
data - up to four octets to be converted to UCS-2 format
- Returns:
this returns a native UCS-2 character from the int
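The description of unicode(int) states that only the low order octets of a UCS-4 value are stored. Assuming that is what bits does, the conversion amounts to a narrowing cast, as in the hypothetical sketch below. The surrogates method is not described in this documentation; it is included only to contrast truncation with the lossless UTF-16 representation that the standard library provides.

```java
// Hypothetical sketch of UCS-4 to UCS-2 narrowing; not the QueryParser source.
final class Ucs4TruncationSketch {

    // Keeps only the low-order 16 bits of the decoded value.
    static char bits(int data) {
        return (char) data;
    }

    // Contrast (not described in the source): a lossless UTF-16 encoding.
    static char[] surrogates(int codePoint) {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args) {
        int euro = 0x20AC;      // fits in 16 bits, survives unchanged
        int emoji = 0x1F600;    // needs more than 16 bits, is truncated
        System.out.println(Integer.toHexString(bits(euro)));  // prints 20ac
        System.out.println(Integer.toHexString(bits(emoji))); // prints f600
    }
}
```

Truncation means that two distinct values above 0xFFFF can collapse to the same char, so round-tripping such characters through this scheme loses information.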
-
peek
private int peek(int pos)
This will return the escape expression specified from the URI as an integer value of the hexadecimal sequence. This does not make any changes to the buffer; it simply checks to see if the characters at the position specified are an escaped set of characters of the form "%" HEX HEX. If so, it will convert that hexadecimal string into an integer value, or return -1 if the expression is not hexadecimal.
- Parameters:
pos - this is the position the expression starts from
- Returns:
the integer value of the hexadecimal expression
-
convert
private int convert(char high, char low)
This will convert the two hexadecimal characters to a real integer value, which is returned. This requires characters within the range of 'A' to 'F' and 'a' to 'f', and also the digits '0' to '9'. The characters are encoded using the ISO-8859-1 encoding scheme; if the characters are not within the range specified then this returns -1.
- Parameters:
high - this is the high four bits within the integer
low - this is the low four bits within the integer
- Returns:
this returns the integer value of the conversion
-
hex
private boolean hex(char ch)
This is used to determine whether a char is a hexadecimal char or not. A hexadecimal character is considered to be a character within the range of 0 - 9 and between a - f and A - F. This will return true if the character is in this range.
- Parameters:
ch - this is the character which is to be determined here
- Returns:
true if the character given has a hexadecimal value
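The hex test and the two-character conversion described for hex and convert are small enough to sketch directly. The class HexSketch and the helper value are hypothetical names; the behaviour follows the descriptions above (ranges 0-9, a-f, A-F, and -1 for anything else).

```java
// Hypothetical sketch of the hex helpers; not the QueryParser source.
final class HexSketch {

    // True only for 0-9, a-f and A-F.
    static boolean hex(char ch) {
        return (ch >= '0' && ch <= '9')
            || (ch >= 'a' && ch <= 'f')
            || (ch >= 'A' && ch <= 'F');
    }

    // Combines two hex characters into one octet, or returns -1 if either is invalid.
    static int convert(char high, char low) {
        if (!hex(high) || !hex(low)) {
            return -1;
        }
        return (value(high) << 4) | value(low);
    }

    // Numeric value of a character already known to be a hex digit.
    private static int value(char ch) {
        if (ch <= '9') {
            return ch - '0';
        }
        if (ch >= 'a') {
            return ch - 'a' + 10;
        }
        return ch - 'A' + 10;
    }

    public static void main(String[] args) {
        System.out.println(convert('4', '1')); // prints 65
        System.out.println(convert('G', '0')); // prints -1
    }
}
```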
-
encode
private java.lang.String encode(java.lang.String text)
This encode method will escape the text that is provided. This is used so that the parameter pairs can be encoded in such a way that they can be transferred over HTTP/1.1 using the ISO-8859-1 character set.
- Parameters:
text - this is the text that is to be escaped
- Returns:
the text with % HEX HEX UTF-8 escape sequences
-
encode
private java.lang.String encode(java.lang.String name, java.lang.String value)
This encode method will escape the name=value pair provided using the UTF-8 character set. This method will ensure that the parameters are encoded in such a way that they can be transferred via HTTP in ISO-8859-1.
- Parameters:
name - this is the name that is to be escaped
value - this is the value that is to be escaped
- Returns:
the pair with % HEX HEX UTF-8 escape sequences
-
toString
public java.lang.String toString(java.util.Set set)
This toString method is used to compose a string in the application/x-www-form-urlencoded MIME type. This will encode the tokens specified in the Set. Each name=value pair acquired is converted into a UTF-8 escape sequence so that the parameters can be sent in the ISO-8859-1 format required via the HTTP/1.1 specification RFC 2616.
- Parameters:
set - this is the set of parameters to be encoded
- Returns:
returns a HTTP parameter encoding for the pairs
-
toString
public java.lang.String toString()
This toString method is used to compose a string in the application/x-www-form-urlencoded MIME type. This will iterate over all tokens that have been added to this object, either during parsing, or during use of the instance. Each name=value pair acquired is converted into a UTF-8 escape sequence so that the parameters can be sent in the ISO-8859-1 format required via the HTTP/1.1 specification RFC 2616.
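The composing direction can be sketched in the same spirit. This is not the library code: the class FormEncodeSketch and the method compose are hypothetical, java.net.URLEncoder is swapped in for the private encode methods, and a Map of pairs is used instead of the Set accepted by toString(Set). URLEncoder writes a space as "+" and other reserved or non-ASCII characters as UTF-8 "%" HEX HEX escapes, so the result contains only ASCII characters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of composing form-urlencoded text; not the QueryParser source.
final class FormEncodeSketch {

    // Escapes one token using UTF-8 based "%" HEX HEX sequences.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Escapes the name and the value separately so that "=" stays a delimiter.
    static String encode(String name, String value) {
        return encode(name) + "=" + encode(value);
    }

    // Joins the escaped pairs with "&" in iteration order.
    static String compose(Map<String, String> params) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (text.length() > 0) {
                text.append('&');
            }
            text.append(encode(entry.getKey(), entry.getValue()));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("a", "1");
        params.put("msg", "hello world");
        System.out.println(compose(params)); // prints a=1&msg=hello+world
    }
}
```

Escaping each token before joining is what keeps a literal "&" or "=" inside a value from being mistaken for a delimiter when the text is parsed again.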
-
-