Package net.sf.saxon.expr
Class Tokenizer
java.lang.Object
net.sf.saxon.expr.Tokenizer
Tokenizer for expressions and inputs.
This code was originally derived from James Clark's xt, though it has been greatly modified since.
See copyright notice at end of file.
-
Field Summary
Fields
Modifier and Type    Field    Description
static final int    BARE_NAME_STATE    State in which a name is NOT to be merged with what comes next, for example "("
int    currentToken    The number identifying the most recently read token
int    currentTokenStartOffset    The position in the input expression where the current token starts
String    currentTokenValue    The string value of the most recently read token
static final int    DEFAULT_STATE    Initial default state of the Tokenizer
String    input    The string being parsed
int    inputOffset    The current position within the input string
static final int    OPERATOR_STATE    State in which the next thing to be read is an operator
static final int    SEQUENCE_TYPE_STATE    State in which the next thing to be read is a SequenceType
int    startLineNumber    The starting line number (for XPath in XSLT, the line number in the stylesheet)
-
Constructor Summary
Constructors
Tokenizer()
-
Method Summary
Modifier and Type    Method    Description
int    getColumnNumber()    Get the column number of the current token
int    getColumnNumber(int offset)
long    getLineAndColumn(int offset)    Get the line and column number corresponding to a given offset in the input expression, as a long value with the line number in the top half and the column number in the lower half
int    getLineNumber()    Get the line number of the current token
int    getLineNumber(int offset)
int    getState()
void    lookAhead()    Look ahead by one token.
void    next()    Get the next token from the input expression.
char    nextChar()    Read next character directly.
String    recentText    Get the most recently read text (for use in an error message)
void    setState(int state)
void    tokenize    Prepare a string for tokenization.
void    treatCurrentAsOperator()    Force the current token to be treated as an operator if possible
void    unreadChar()    Step back one character.
-
Field Details
-
DEFAULT_STATE
public static final int DEFAULT_STATE
Initial default state of the Tokenizer
-
BARE_NAME_STATE
public static final int BARE_NAME_STATE
State in which a name is NOT to be merged with what comes next, for example "("
-
SEQUENCE_TYPE_STATE
public static final int SEQUENCE_TYPE_STATE
State in which the next thing to be read is a SequenceType
-
OPERATOR_STATE
public static final int OPERATOR_STATE
State in which the next thing to be read is an operator
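To illustrate why a tokenizer carries a state at all, the sketch below shows how the same characters can be classified differently depending on whether an operator is expected next. This is a hypothetical, self-contained illustration and not Saxon's implementation: the class name StateSketch, the classify method, the token labels, and the numeric values assigned to the state constants are all assumptions.

```java
// Hypothetical sketch: not Saxon code. Constant values are assumed for illustration.
final class StateSketch {
    static final int DEFAULT_STATE = 0;
    static final int OPERATOR_STATE = 3;

    // Classify a lexeme according to what the parser expects to read next.
    static String classify(String lexeme, int state) {
        if (state == OPERATOR_STATE) {
            switch (lexeme) {
                case "*":   return "MULTIPLY";
                case "div": return "DIVIDE";
                default:    return "UNKNOWN";
            }
        }
        // In this toy, outside operator state "*" is a wildcard
        // and anything else is treated as a name.
        return lexeme.equals("*") ? "WILDCARD" : "NAME";
    }

    public static void main(String[] args) {
        System.out.println(classify("*", DEFAULT_STATE));    // WILDCARD
        System.out.println(classify("*", OPERATOR_STATE));   // MULTIPLY
        System.out.println(classify("div", DEFAULT_STATE));  // NAME
        System.out.println(classify("div", OPERATOR_STATE)); // DIVIDE
    }
}
```

In an expression such as a * b the parser would expect an operator after a, whereas in child::* it would not, which is the kind of ambiguity a state flag can resolve.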
-
startLineNumber
public int startLineNumber
The starting line number (for XPath in XSLT, the line number in the stylesheet)
-
currentToken
public int currentToken
The number identifying the most recently read token
-
currentTokenValue
public String currentTokenValue
The string value of the most recently read token
-
currentTokenStartOffset
public int currentTokenStartOffset
The position in the input expression where the current token starts
-
input
public String input
The string being parsed
-
inputOffset
public int inputOffset
The current position within the input string
-
-
Constructor Details
-
Tokenizer
public Tokenizer()
-
-
Method Details
-
getState
public int getState()
-
setState
public void setState(int state)
-
tokenize
Prepare a string for tokenization. The actual tokens are obtained by calls on next().
Parameters:
input - the string to be tokenized
start - start point within the string
end - end point within the string (last character not read): -1 means end of string
Throws:
StaticError - if a lexical error occurs, e.g. unmatched string quotes
-
next
Get the next token from the input expression. The type of token is returned in the currentToken variable, the string value of the token in currentTokenValue.
Throws:
StaticError - if a lexical error is detected
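The calling protocol described for tokenize() and next() (prepare the input once, then call next() repeatedly and read currentToken and currentTokenValue) can be sketched with a toy tokenizer. This is a hypothetical stand-in, not Saxon's Tokenizer: the class MiniTokenizer, its token codes, the inputLength field, the allTokens helper, and the three-argument form of tokenize are assumptions made for illustration, and the lexical rules are far simpler than XPath's.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the tokenize()/next() calling protocol. Not Saxon code.
final class MiniTokenizer {
    static final int EOF = 0, NAME = 1, NUMBER = 2, SYMBOL = 3; // assumed token codes

    int currentToken = EOF;
    String currentTokenValue = null;
    private String input;
    private int inputOffset;
    private int inputLength;

    // Prepare a string for tokenization; end == -1 means "to end of string".
    void tokenize(String input, int start, int end) {
        this.input = input;
        this.inputOffset = start;
        this.inputLength = (end == -1) ? input.length() : end;
    }

    // Read the next token into currentToken and currentTokenValue.
    void next() {
        while (inputOffset < inputLength && Character.isWhitespace(input.charAt(inputOffset))) {
            inputOffset++;
        }
        if (inputOffset >= inputLength) {
            currentToken = EOF;
            currentTokenValue = null;
            return;
        }
        int startOffset = inputOffset;
        char c = input.charAt(inputOffset);
        if (Character.isLetter(c)) {
            while (inputOffset < inputLength && Character.isLetterOrDigit(input.charAt(inputOffset))) {
                inputOffset++;
            }
            currentToken = NAME;
        } else if (Character.isDigit(c)) {
            while (inputOffset < inputLength && Character.isDigit(input.charAt(inputOffset))) {
                inputOffset++;
            }
            currentToken = NUMBER;
        } else {
            inputOffset++; // any other character is a one-character symbol
            currentToken = SYMBOL;
        }
        currentTokenValue = input.substring(startOffset, inputOffset);
    }

    // Demo helper: collect the string values of all tokens up to EOF.
    static List<String> allTokens(String expression) {
        MiniTokenizer t = new MiniTokenizer();
        t.tokenize(expression, 0, -1);
        List<String> values = new ArrayList<>();
        for (t.next(); t.currentToken != EOF; t.next()) {
            values.add(t.currentTokenValue);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(allTokens("price * 12 + tax")); // prints [price, *, 12, +, tax]
    }
}
```

The loop in allTokens is the part that mirrors the documented usage: the caller never sees a returned token object, only the fields updated by each call to next().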
-
treatCurrentAsOperator
public void treatCurrentAsOperator()
Force the current token to be treated as an operator if possible
-
lookAhead
Look ahead by one token. This method does the real tokenization work. The method is normally called internally, but the XQuery parser also calls it to resume normal tokenization after dealing with pseudo-XML syntax.
Throws:
StaticError - if a lexical error occurs
-
nextChar
Read next character directly. Used by the XQuery parser when parsing pseudo-XML syntax.
Returns:
the next character from the input
Throws:
StringIndexOutOfBoundsException - if an attempt is made to read beyond the end of the string. This will only occur in the event of a syntax error in the input.
-
unreadChar
public void unreadChar()
Step back one character. If this steps back to a previous line, adjust the line number.
-
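The interplay between nextChar(), unreadChar(), and the line count can be sketched as follows. This is a minimal illustration under stated assumptions, not Saxon's code: the class CharReaderSketch is hypothetical, line numbers are assumed to start at 1, and only '\n' is treated as a line break.

```java
// Hypothetical sketch of character-level reading with line tracking. Not Saxon code.
final class CharReaderSketch {
    private final String input;
    private int inputOffset = 0;
    private int lineNumber = 1; // assumed to be 1-based in this sketch

    CharReaderSketch(String input) {
        this.input = input;
    }

    // Read the next character; charAt throws StringIndexOutOfBoundsException
    // if the offset is past the end of the input.
    char nextChar() {
        char c = input.charAt(inputOffset);
        inputOffset++;
        if (c == '\n') {
            lineNumber++;
        }
        return c;
    }

    // Step back one character, undoing the line increment if a newline is recrossed.
    void unreadChar() {
        if (inputOffset == 0) {
            throw new IllegalStateException("nothing to unread");
        }
        inputOffset--;
        if (input.charAt(inputOffset) == '\n') {
            lineNumber--;
        }
    }

    int getLineNumber() {
        return lineNumber;
    }

    public static void main(String[] args) {
        CharReaderSketch r = new CharReaderSketch("a\nb");
        r.nextChar();                          // reads 'a'
        r.nextChar();                          // reads '\n', now on line 2
        System.out.println(r.getLineNumber()); // prints 2
        r.unreadChar();                        // steps back onto line 1
        System.out.println(r.getLineNumber()); // prints 1
    }
}
```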
recentText
Get the most recently read text (for use in an error message)
-
getLineNumber
public int getLineNumber()
Get the line number of the current token
-
getColumnNumber
public int getColumnNumber()
Get the column number of the current token
-
getLineAndColumn
public long getLineAndColumn(int offset)
Get the line and column number corresponding to a given offset in the input expression, as a long value with the line number in the top half and the column number in the lower half
Returns:
the line and column number, packed together
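The packed return value of getLineAndColumn can be illustrated with a short sketch. It assumes that "top half" means the upper 32 bits of the long and "lower half" the lower 32 bits; the class LineColumnPacking and its pack, lineOf, and columnOf helpers are hypothetical names, not part of the documented API.

```java
// Hypothetical sketch of packing a line and column into one long. Not Saxon code.
final class LineColumnPacking {
    // Line number in the upper 32 bits, column number in the lower 32 bits (assumed layout).
    static long pack(int line, int column) {
        return ((long) line << 32) | (column & 0xFFFFFFFFL);
    }

    // Recover the line number from the upper 32 bits.
    static int lineOf(long packed) {
        return (int) (packed >>> 32);
    }

    // Recover the column number from the lower 32 bits.
    static int columnOf(long packed) {
        return (int) packed;
    }

    public static void main(String[] args) {
        long packed = pack(12, 34);
        System.out.println(lineOf(packed) + ":" + columnOf(packed)); // prints 12:34
    }
}
```

Under this assumed layout, a caller that needs both values can make one call and unpack the result, rather than calling getLineNumber(int offset) and getColumnNumber(int offset) separately.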
-
getLineNumber
public int getLineNumber(int offset)
-
getColumnNumber
public int getColumnNumber(int offset)
-