Package java_cup
Class lexer
java.lang.Object
java_cup.lexer
This class implements a small scanner (aka lexical analyzer or lexer) for
the JavaCup specification. This scanner reads characters from standard
input (System.in) and returns integers corresponding to the terminal
number of the next token. Once end of input is reached the EOF token is
returned on every subsequent call.
Tokens currently returned include:
Symbol          Constant Returned       Symbol          Constant Returned
------          -----------------       ------          -----------------
"package"       PACKAGE                 "import"        IMPORT
"code"          CODE                    "action"        ACTION
"parser"        PARSER                  "terminal"      TERMINAL
"non"           NON                     "init"          INIT
"scan"          SCAN                    "with"          WITH
"start"         START                   ;               SEMI
,               COMMA                   *               STAR
.               DOT                     :               COLON
::=             COLON_COLON_EQUALS      |               BAR
identifier      ID                      {:...:}         CODE_STRING

All symbol constants are defined in sym.java, which is generated by JavaCup from parser.cup.
In addition to the scanner proper (called first via init(), then via next_token() to get each token), this class provides simple error and warning routines and keeps publicly accessible counts of errors and warnings.
This class is "static" (i.e., it has only static members and methods).
-
Field Summary
Fields
Modifier and Type               Field               Description
protected static Hashtable      char_symbols        Table of single character symbols.
protected static int            current_line        Current line number for use in error messages.
protected static int            current_position    Character position in current line.
protected static final int      EOF_CHAR            EOF constant.
static int                      error_count         Count of total errors detected so far.
protected static Hashtable      keywords            Table of keywords.
protected static int            next_char           First character of lookahead.
protected static int            next_char2          Second character of lookahead.
static int                      warning_count       Count of warnings issued so far.
-
Method Summary
Modifier and Type               Method                          Description
protected static void           advance()                       Advance the scanner one character in the input stream.
static token                    debug_next_token()              Debugging version of next_token().
protected static token          do_code_string()                Swallow up a code string.
protected static token          do_id()                         Process an identifier.
static void                     emit_error(String message)      Emit an error message.
static void                     emit_warn(String message)       Emit a warning message.
protected static int            find_single_char(int ch)        Try to look up a single character symbol; returns -1 if not found.
protected static boolean        id_char(int ch)                 Determine if a character is ok for the middle of an id.
protected static boolean        id_start_char(int ch)           Determine if a character is ok to start an id.
static void                     init()                          Initialize the scanner.
static token                    next_token()                    Return one token.
protected static token          real_next_token()               The actual routine to return one token.
protected static void           swallow_comment()               Handle swallowing up a comment.
-
Field Details
-
next_char
protected static int next_char
First character of lookahead.
-
next_char2
protected static int next_char2
Second character of lookahead.
-
EOF_CHAR
protected static final int EOF_CHAR
EOF constant.
-
keywords
protected static Hashtable keywords
Table of keywords. Keywords are initially treated as identifiers. Just before an identifier is returned, its name is looked up in this table to see whether it matches one of the keywords. The name string is the key, and each entry is an Integer object holding the symbol number.
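The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the class's actual code: the class name KeywordTableSketch, the classify helper, and the numeric symbol values are assumptions made for the example (the real constants come from sym.java), and generics are used for brevity.

```java
import java.util.Hashtable;

final class KeywordTableSketch {
    // Hypothetical symbol numbers; the real values are defined in sym.java.
    static final int ID = 0;
    static final int PACKAGE = 1;
    static final int IMPORT = 2;

    // Keyword name -> Integer holding the symbol number.
    static final Hashtable<String, Integer> keywords = new Hashtable<>();

    static {
        keywords.put("package", PACKAGE);
        keywords.put("import", IMPORT);
    }

    // Treat the name as an identifier unless the table lists it as a keyword.
    static int classify(String name) {
        Integer symbol = keywords.get(name);
        return symbol != null ? symbol : ID;
    }
}
```

With this arrangement the scanning code never needs keyword-specific branches: adding a keyword only means adding a table entry.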
char_symbols
protected static Hashtable char_symbols
Table of single character symbols. For ease of implementation, we store all unambiguous single character tokens in this table of Integer objects keyed by Integer objects with the numerical value of the appropriate char (currently Character objects have a bug which precludes their use in tables).
current_line
protected static int current_line
Current line number for use in error messages.
-
current_position
protected static int current_position
Character position in current line.
-
error_count
public static int error_count
Count of total errors detected so far.
-
warning_count
public static int warning_count
Count of warnings issued so far.
-
-
Method Details
-
init
public static void init() throws IOException
Initialize the scanner. This sets up the keywords and char_symbols tables and reads the first two characters of lookahead.
Throws:
IOException
-
advance
protected static void advance() throws IOException
Advance the scanner one character in the input stream. This moves next_char2 to next_char and then reads a new next_char2.
Throws:
IOException
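A minimal sketch of the two-character lookahead may help here. It is an illustration under assumptions, not the class's code: LookaheadSketch is a hypothetical instance-based class that reads from a String instead of System.in, wraps IOException in an unchecked exception, assumes EOF_CHAR is -1, and leaves out line and position tracking.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LookaheadSketch {
    // Assumed value; it matches what Reader.read() returns at end of input.
    static final int EOF_CHAR = -1;

    private final Reader in;
    int next_char;   // first character of lookahead
    int next_char2;  // second character of lookahead

    // Plays the role of init(): fill both lookahead slots.
    LookaheadSketch(String text) {
        in = new StringReader(text);
        next_char = read();
        next_char2 = (next_char == EOF_CHAR) ? EOF_CHAR : read();
    }

    // Shift the second lookahead character into the first slot, then refill.
    void advance() {
        next_char = next_char2;
        next_char2 = (next_char == EOF_CHAR) ? EOF_CHAR : read();
    }

    private int read() {
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two characters of lookahead are enough to recognize two-character openers such as "{:" or "/*" before committing to consume them.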
-
emit_error
public static void emit_error(String message)
Emit an error message. The message will be marked with both the current line number and the position in the line. Error messages are printed on standard error (System.err).
Parameters:
message - the message to print.
-
emit_warn
public static void emit_warn(String message)
Emit a warning message. The message will be marked with both the current line number and the position in the line. Warning messages are printed on standard error (System.err).
Parameters:
message - the message to print.
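The error and warning routines, together with their public counters, could look roughly like the sketch below. The class name DiagnosticsSketch, the format helper, and the exact message layout are assumptions; the documentation only states that each message carries the line number and position and goes to System.err.

```java
final class DiagnosticsSketch {
    static int current_line = 1;
    static int current_position = 1;
    static int error_count = 0;
    static int warning_count = 0;

    // Hypothetical message layout: kind, line, position, then the text.
    static String format(String kind, String message) {
        return kind + " at " + current_line + "(" + current_position + "): " + message;
    }

    // Report an error on standard error and count it.
    static void emit_error(String message) {
        System.err.println(format("Error", message));
        error_count++;
    }

    // Report a warning on standard error and count it.
    static void emit_warn(String message) {
        System.err.println(format("Warning", message));
        warning_count++;
    }
}
```

A caller can then check error_count after scanning to decide whether to continue with later processing.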
-
id_start_char
protected static boolean id_start_char(int ch)
Determine if a character is ok to start an id.
Parameters:
ch - the character in question.
-
id_char
protected static boolean id_char(int ch)
Determine if a character is ok for the middle of an id.
Parameters:
ch - the character in question.
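The two predicates can be sketched from the identifier rule given in the do_id description (a letter, underscore, or dollar sign, followed by letters, numbers, underscores, or dollar signs). Restricting "letter" to ASCII is an assumption of this sketch, as is the class name IdCharSketch.

```java
final class IdCharSketch {
    // A character may start an id if it is an ASCII letter, '_' or '$'.
    static boolean id_start_char(int ch) {
        return (ch >= 'a' && ch <= 'z')
            || (ch >= 'A' && ch <= 'Z')
            || ch == '_'
            || ch == '$';
    }

    // Inside an id, digits are allowed in addition to the start characters.
    static boolean id_char(int ch) {
        return id_start_char(ch) || (ch >= '0' && ch <= '9');
    }
}
```

Taking an int rather than a char lets the same predicates be applied directly to the lookahead values, including an end-of-input marker such as -1, which is simply rejected.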
-
find_single_char
protected static int find_single_char(int ch)
Try to look up a single character symbol; returns -1 if not found.
Parameters:
ch - the character in question.
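A rough sketch of such a lookup, using a table keyed by the character's numeric value, is shown below. The class name CharSymbolsSketch and the symbol numbers are hypothetical, and the set of characters placed in the table is an assumption: ':' is left out because it could also begin "::=" and so is not unambiguous on its own.

```java
import java.util.Hashtable;

final class CharSymbolsSketch {
    // Hypothetical symbol numbers; the real values are defined in sym.java.
    static final int SEMI = 10;
    static final int COMMA = 11;
    static final int STAR = 12;
    static final int DOT = 13;
    static final int BAR = 14;

    // Numeric character value -> Integer holding the symbol number.
    static final Hashtable<Integer, Integer> char_symbols = new Hashtable<>();

    static {
        char_symbols.put((int) ';', SEMI);
        char_symbols.put((int) ',', COMMA);
        char_symbols.put((int) '*', STAR);
        char_symbols.put((int) '.', DOT);
        char_symbols.put((int) '|', BAR);
    }

    // Return the symbol number for ch, or -1 when ch is not in the table.
    static int find_single_char(int ch) {
        Integer result = char_symbols.get(ch);
        return result == null ? -1 : result;
    }
}
```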
-
swallow_comment
protected static void swallow_comment() throws IOException
Handle swallowing up a comment. Both C-style (/* ... */) and C++-style (// ...) comments are handled.
Throws:
IOException
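Comment skipping for both styles can be sketched as below. This is a simplified illustration that works on a String index rather than on the scanner's character lookahead; the class name CommentSketch, the skipComment helper, and the choice to consume to end of input when a block comment is never closed are all assumptions.

```java
final class CommentSketch {
    // Return the index just past the comment beginning at 'start',
    // or 'start' itself when no comment begins there.
    static int skipComment(String text, int start) {
        if (text.startsWith("/*", start)) {
            // C-style comment: runs to the first "*/" after the opener.
            int end = text.indexOf("*/", start + 2);
            return end < 0 ? text.length() : end + 2;
        }
        if (text.startsWith("//", start)) {
            // C++-style comment: runs to the end of the current line.
            int end = start + 2;
            while (end < text.length()
                    && text.charAt(end) != '\n'
                    && text.charAt(end) != '\r') {
                end++;
            }
            return end;
        }
        return start;
    }
}
```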
-
do_code_string
protected static token do_code_string() throws IOException
Swallow up a code string. Code strings begin with "{:" and include all characters up to the first occurrence of ":}" (there is no way to include ":}" inside a code string). The routine returns an str_token object suitable for return by the scanner.
Throws:
IOException
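The delimiter rule can be illustrated with the small sketch below. It is not the scanner's code: CodeStringSketch and codeStringBody are hypothetical names, the sketch works on a String rather than the input stream, and both returning the body without its delimiters and returning null for an unterminated code string are assumptions.

```java
final class CodeStringSketch {
    // Return the text between "{:" at 'start' and the first following ":}".
    // Returns null when no code string starts there or it is never closed.
    static String codeStringBody(String text, int start) {
        if (!text.startsWith("{:", start)) {
            return null;
        }
        int end = text.indexOf(":}", start + 2);
        if (end < 0) {
            return null;
        }
        return text.substring(start + 2, end);
    }
}
```

Because the search stops at the first ":}", input such as "{: a :} b :}" yields the body " a " and leaves " b :}" for the scanner to handle next, which is why ":}" cannot appear inside a code string.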
-
do_id
protected static token do_id() throws IOException
Process an identifier. Identifiers begin with a letter, underscore, or dollar sign, which is followed by zero or more letters, numbers, underscores, or dollar signs. This routine returns an str_token suitable for return by the scanner.
Throws:
IOException
-
next_token
public static token next_token() throws IOException
Return one token. This is the main external interface to the scanner. It consumes sufficient characters to determine the next input token and returns it. To help with debugging, this routine actually calls real_next_token(), which does the work. If you need to debug the parser, this can be changed to call debug_next_token(), which prints a debugging message before returning the token.
Throws:
IOException
-
debug_next_token
public static token debug_next_token() throws IOException
Debugging version of next_token(). This routine calls the real scanning routine, prints a message on System.out indicating what the token is, then returns it.
Throws:
IOException
-
real_next_token
protected static token real_next_token() throws IOException
The actual routine to return one token. This is normally called from next_token(), but for debugging purposes it can be called indirectly via debug_next_token().
Throws:
IOException
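The relationship between next_token(), debug_next_token(), and real_next_token() can be sketched as follows. This is only an illustration of the delegation pattern: TokenDispatchSketch is a hypothetical class, tokens are plain int values rather than token objects, the EOF value and the debug message text are assumptions, and a pre-filled queue stands in for real scanning.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TokenDispatchSketch {
    // Hypothetical EOF symbol number.
    static final int EOF = 0;

    // Stand-in for the input: token numbers still waiting to be returned.
    static final Deque<Integer> pending = new ArrayDeque<>();

    // Does the actual work; once input is exhausted it keeps returning EOF.
    static int real_next_token() {
        Integer next = pending.poll();
        return next != null ? next : EOF;
    }

    // Same result as real_next_token(), but reports the token on System.out.
    static int debug_next_token() {
        int result = real_next_token();
        System.out.println("# next token => " + result);
        return result;
    }

    // Public entry point; switch the call to debug_next_token() when debugging.
    static int next_token() {
        return real_next_token();
    }
}
```

Keeping the public entry point as a one-line delegate means tracing can be switched on by editing a single call, without touching the scanning logic.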
-