Class TextFormat.Tokenizer
java.lang.Object
org.fusesource.hawtbuf.proto.compiler.TextFormat.Tokenizer
- Enclosing class:
TextFormat
Represents a stream of tokens parsed from a String.
The Java standard library provides many classes that you might think would be useful for implementing this, but aren't. For example:
- java.io.StreamTokenizer: This almost does what we want -- or, at least, something that would get us close to what we want -- except for one fatal flaw: It automatically un-escapes strings using Java escape sequences, which do not include all the escape sequences we need to support (e.g. '\x').
- java.util.Scanner: This seems like a great way at least to parse regular expressions out of a stream (so we wouldn't have to load the entire input into a single string before parsing). Sadly, Scanner requires that tokens be delimited with some delimiter. Thus, although the text "foo:" should parse to two tokens ("foo" and ":"), Scanner would recognize it only as a single token. Furthermore, Scanner provides no way to inspect the contents of delimiters, making it impossible to keep track of line and column numbers.
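The Scanner limitation described above is easy to reproduce with the standard library alone. The following is a small sketch; the class name ScannerDelimiterDemo and the helper scan are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class ScannerDelimiterDemo {
    /** Splits the input with a default Scanner, which separates tokens by whitespace only. */
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }
}
```

Calling scan("foo: 1") yields the two tokens "foo:" and "1", so the identifier and the colon are never separated, which is the behavior the class comment objects to.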
Luckily, Java's regular expression support does manage to be useful to us. (Barely: We need Matcher.usePattern(), which is new in Java 1.5.) So, we can use that, at least. Unfortunately, this implies that we need to have the entire input in one contiguous string.
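To illustrate why Matcher.usePattern() matters, here is a minimal sketch of a tokenizer that reuses one Matcher over the whole input and switches between a whitespace pattern and a token pattern. The patterns below are simplified assumptions; this page lists WHITESPACE and TOKEN fields but does not show their values, and the class name RegexTokenizerSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTokenizerSketch {
    // Assumed, simplified patterns (not the real field values).
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");
    private static final Pattern TOKEN = Pattern.compile("[a-zA-Z_][0-9a-zA-Z_]*|[0-9]+");

    static List<String> tokenize(CharSequence text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = WHITESPACE.matcher(text);
        int pos = 0;
        while (true) {
            // Skip whitespace so the region starts at the next token.
            matcher.usePattern(WHITESPACE);
            matcher.region(pos, text.length());
            if (matcher.lookingAt()) {
                pos = matcher.end();
            }
            if (pos == text.length()) {
                break;
            }
            // Switch the same matcher to the token pattern.
            matcher.usePattern(TOKEN);
            matcher.region(pos, text.length());
            if (matcher.lookingAt()) {
                tokens.add(matcher.group());
                pos = matcher.end();
            } else {
                // Anything else becomes a one-character token, e.g. ":" or "{".
                tokens.add(String.valueOf(text.charAt(pos)));
                pos++;
            }
        }
        return tokens;
    }
}
```

With this sketch, tokenize("foo: 42") produces "foo", ":" and "42". Because the matcher works on regions of a single CharSequence, the whole input has to be in memory, which is the trade-off the comment above mentions.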
-
Field Summary
Fields -
Constructor Summary
Constructors
Constructor | Description
Tokenizer(CharSequence text) | Construct a tokenizer that parses tokens from the given text.
-
Method Summary
Modifier and Type | Method | Description
boolean | atEnd() | Are we at the end of the input?
void | consume(String token) | If the next token exactly matches token, consume it.
boolean | consumeBoolean() | If the next token is a boolean, consume it and return its value.
Buffer | consumeBuffer() | If the next token is a string, consume it, unescape it as a Buffer, and return it.
double | consumeDouble() | If the next token is a double, consume it and return its value.
float | consumeFloat() | If the next token is a float, consume it and return its value.
String | consumeIdentifier() | If the next token is an identifier, consume it and return its value.
int | consumeInt32() | If the next token is a 32-bit signed integer, consume it and return its value.
long | consumeInt64() | If the next token is a 64-bit signed integer, consume it and return its value.
String | consumeString() | If the next token is a string, consume it and return its (unescaped) value.
int | consumeUInt32() | If the next token is a 32-bit unsigned integer, consume it and return its value.
long | consumeUInt64() | If the next token is a 64-bit unsigned integer, consume it and return its value.
private TextFormat.ParseException | floatParseException(NumberFormatException) | Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse a float or double.
private TextFormat.ParseException | integerParseException(NumberFormatException) | Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse an integer.
boolean | lookingAtInteger() | Returns true if the next token is an integer, but does not consume it.
void | nextToken() | Advance to the next token.
TextFormat.ParseException | parseException(String description) | Returns a TextFormat.ParseException with the current line and column numbers in the description, suitable for throwing.
TextFormat.ParseException | parseExceptionPreviousToken(String description) | Returns a TextFormat.ParseException with the line and column numbers of the previous token in the description, suitable for throwing.
private void | skipWhitespace() | Skip over any whitespace so that the matcher region starts at the next token.
boolean | tryConsume(String token) | If the next token exactly matches token, consume it and return true.
-
Field Details
-
text
-
matcher
-
currentToken
-
pos
private int pos -
line
private int line -
column
private int column -
previousLine
private int previousLine -
previousColumn
private int previousColumn -
WHITESPACE
-
TOKEN
-
DOUBLE_INFINITY
-
FLOAT_INFINITY
-
FLOAT_NAN
-
-
Constructor Details
-
Tokenizer
Construct a tokenizer that parses tokens from the given text.
-
-
Method Details
-
atEnd
public boolean atEnd()
Are we at the end of the input?
-
nextToken
public void nextToken()
Advance to the next token.
-
skipWhitespace
private void skipWhitespace()
Skip over any whitespace so that the matcher region starts at the next token.
-
tryConsume
If the next token exactly matches token, consume it and return true. Otherwise, return false without doing anything.
-
consume
If the next token exactly matches token, consume it. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
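The relationship between tryConsume and consume can be sketched as follows. This is an illustration only: it works on a pre-split token list instead of the real matcher-based stream, the class name ConsumeSketch is hypothetical, and an unchecked IllegalStateException with an assumed message stands in for TextFormat.ParseException.

```java
import java.util.Arrays;
import java.util.List;

final class ConsumeSketch {
    private final List<String> tokens; // stand-in for the real token stream
    private int index = 0;

    ConsumeSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    boolean atEnd() {
        return index >= tokens.size();
    }

    /** Consumes the next token only if it matches exactly; otherwise leaves the stream untouched. */
    boolean tryConsume(String token) {
        if (!atEnd() && tokens.get(index).equals(token)) {
            index++;
            return true;
        }
        return false;
    }

    /** Like tryConsume, but a mismatch is an error (stand-in exception type and message). */
    void consume(String token) {
        if (!tryConsume(token)) {
            throw new IllegalStateException("Expected \"" + token + "\".");
        }
    }
}
```

A parser would typically use tryConsume for optional syntax and consume where the grammar requires a specific token.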
-
lookingAtInteger
public boolean lookingAtInteger()
Returns true if the next token is an integer, but does not consume it.
-
consumeIdentifier
If the next token is an identifier, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
-
consumeInt32
If the next token is a 32-bit signed integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
-
consumeUInt32
If the next token is a 32-bit unsigned integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
-
consumeInt64
If the next token is a 64-bit signed integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
-
consumeUInt64
If the next token is a 64-bit unsigned integer, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
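Since Java has no unsigned primitive types, the unsigned variants return their value in an int or a long. The sketch below shows how that representation behaves, using only standard library parsing methods. It is not the class's actual parsing code, which may accept other notations; the class name IntegerTokenSketch and its method names are hypothetical.

```java
final class IntegerTokenSketch {
    /** Signed 32-bit: rejects anything outside [-2^31, 2^31 - 1]. */
    static int parseInt32(String token) {
        return Integer.parseInt(token);
    }

    /** Unsigned 32-bit: accepts [0, 2^32 - 1]; large values wrap to negative ints. */
    static int parseUInt32(String token) {
        return Integer.parseUnsignedInt(token);
    }

    /** Signed 64-bit. */
    static long parseInt64(String token) {
        return Long.parseLong(token);
    }

    /** Unsigned 64-bit: accepts [0, 2^64 - 1]; large values wrap to negative longs. */
    static long parseUInt64(String token) {
        return Long.parseUnsignedLong(token);
    }
}
```

For example, parseUInt32("4294967295") returns an int whose bits are all set (it compares equal to -1), and Integer.toUnsignedString recovers the original text. Out-of-range input raises NumberFormatException, which is the kind of error integerParseException is described as wrapping.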
-
consumeDouble
If the next token is a double, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
-
consumeFloat
If the next token is a float, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
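The DOUBLE_INFINITY, FLOAT_INFINITY and FLOAT_NAN fields suggest that special values are recognized by pattern before ordinary number parsing. The sketch below shows one way this could work. The pattern values are assumptions (this page does not show them), and the class name FloatLiteralSketch is hypothetical. The motivation is that Double.parseDouble only accepts the exact spellings "Infinity" and "NaN".

```java
import java.util.regex.Pattern;

final class FloatLiteralSketch {
    // Assumed patterns: "inf" or "infinity" with optional sign and "f" suffix, and "nan".
    private static final Pattern INFINITY =
            Pattern.compile("-?inf(inity)?f?", Pattern.CASE_INSENSITIVE);
    private static final Pattern NAN =
            Pattern.compile("nanf?", Pattern.CASE_INSENSITIVE);

    static double parseDouble(String token) {
        if (INFINITY.matcher(token).matches()) {
            return token.startsWith("-") ? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;
        }
        if (NAN.matcher(token).matches()) {
            return Double.NaN;
        }
        // Everything else goes through the standard parser, which may throw NumberFormatException.
        return Double.parseDouble(token);
    }
}
```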
-
consumeBoolean
If the next token is a boolean, consume it and return its value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
-
consumeString
If the next token is a string, consume it and return its (unescaped) value. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
-
consumeBuffer
If the next token is a string, consume it, unescape it as a Buffer, and return it. Otherwise, throw a TextFormat.ParseException.
- Throws:
TextFormat.ParseException
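Unescaping to raw bytes is where the '\x' escape mentioned in the class description comes in. The following is a simplified sketch of such an unescaper, not the class's own implementation: it supports only a handful of escapes, assumes ASCII input, returns a plain byte[] rather than a Buffer, and uses the hypothetical name UnescapeSketch with IllegalArgumentException as a stand-in error type.

```java
import java.io.ByteArrayOutputStream;

final class UnescapeSketch {
    /** Converts the body of a quoted string (quotes already removed) into bytes. */
    static byte[] unescapeBytes(String input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i++);
            if (c != '\\') {
                out.write(c); // sketch assumes ASCII, so one char is one byte
                continue;
            }
            if (i >= input.length()) {
                throw new IllegalArgumentException("Backslash at end of string.");
            }
            char escape = input.charAt(i++);
            switch (escape) {
                case 'n': out.write('\n'); break;
                case 'r': out.write('\r'); break;
                case 't': out.write('\t'); break;
                case '\\': out.write('\\'); break;
                case '\'': out.write('\''); break;
                case '"': out.write('"'); break;
                case 'x': {
                    // Up to two hex digits follow "\x".
                    int value = 0;
                    int digits = 0;
                    while (digits < 2 && i < input.length()
                            && Character.digit(input.charAt(i), 16) >= 0) {
                        value = value * 16 + Character.digit(input.charAt(i++), 16);
                        digits++;
                    }
                    if (digits == 0) {
                        throw new IllegalArgumentException("'\\x' with no digits.");
                    }
                    out.write(value);
                    break;
                }
                default:
                    throw new IllegalArgumentException("Unsupported escape: '\\" + escape + "'.");
            }
        }
        return out.toByteArray();
    }
}
```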
-
parseException
Returns a TextFormat.ParseException with the current line and column numbers in the description, suitable for throwing.
-
parseExceptionPreviousToken
Returns a TextFormat.ParseException with the line and column numbers of the previous token in the description, suitable for throwing.
-
integerParseException
Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse an integer.
-
floatParseException
Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse a float or double.
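The four exception helpers (parseException, parseExceptionPreviousToken, integerParseException and floatParseException) all come down to prefixing a description with a position. The sketch below shows how the line, column, previousLine and previousColumn fields could feed such messages. The "line:column: description" format, the one-based numbering, the message wording and the class name ParseErrorSketch are all assumptions, and IllegalStateException stands in for TextFormat.ParseException.

```java
final class ParseErrorSketch {
    // Zero-based position of the current token, and of the token before it.
    private int line = 0;
    private int column = 0;
    private int previousLine = 0;
    private int previousColumn = 0;

    /** Remembers the current position, then moves past the consumed text. */
    void advance(CharSequence consumed) {
        previousLine = line;
        previousColumn = column;
        for (int i = 0; i < consumed.length(); i++) {
            if (consumed.charAt(i) == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
        }
    }

    IllegalStateException parseException(String description) {
        return new IllegalStateException((line + 1) + ":" + (column + 1) + ": " + description);
    }

    IllegalStateException parseExceptionPreviousToken(String description) {
        return new IllegalStateException(
                (previousLine + 1) + ":" + (previousColumn + 1) + ": " + description);
    }

    /** Wraps a number-format failure in a position-prefixed error (assumed wording). */
    IllegalStateException integerParseException(NumberFormatException e) {
        return parseException("Couldn't parse integer: " + e.getMessage());
    }
}
```

Reporting the previous token's position is useful when an error is only detected after a token has already been consumed, for example when an identifier turns out to name an unknown field.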
-