Package org.htmlunit.cyberneko
Class HTMLUnicodeEntitiesParser
- java.lang.Object
  - org.htmlunit.cyberneko.HTMLUnicodeEntitiesParser
public class HTMLUnicodeEntitiesParser extends java.lang.Object
Parser for the pre-defined named HTML entities. See 12.2.5.72, "Character reference state".
From the spec:
Consume the maximum number of characters possible, with the consumed characters matching one of the identifiers in the first column of the named character references table (in a case-sensitive manner). Append each character to the temporary buffer when it's consumed.
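The longest-match rule quoted above can be illustrated with a minimal sketch. Everything here is hypothetical: the class name LongestEntityMatchSketch, the method longestMatch, and the seven-entry excerpt of the named character references table are assumptions made for illustration, not code from this package.

```java
import java.util.Map;

public final class LongestEntityMatchSketch {

    // Small illustrative excerpt of the named character references table.
    // The real table is far larger; these seven entries are enough to show the rule.
    private static final Map<String, String> TABLE = Map.of(
            "amp;", "&", "amp", "&",
            "lt;", "<", "lt", "<",
            "not;", "\u00AC", "not", "\u00AC",
            "notin;", "\u2209");

    // Length of the longest identifier in the excerpt above ("notin;").
    private static final int MAX_NAME_LENGTH = 6;

    /**
     * Returns the longest identifier from TABLE that is a prefix of the input
     * (the text following the ampersand), or null if none matches.
     * Matching is case-sensitive.
     */
    public static String longestMatch(String afterAmpersand) {
        String best = null;
        StringBuilder buffer = new StringBuilder(); // plays the role of the temporary buffer
        int limit = Math.min(afterAmpersand.length(), MAX_NAME_LENGTH);
        for (int i = 0; i < limit; i++) {
            buffer.append(afterAmpersand.charAt(i));
            if (TABLE.containsKey(buffer.toString())) {
                // Keep going: a longer identifier may still match.
                best = buffer.toString();
            }
        }
        return best;
    }
}
```

With this excerpt, the input "notin; x" yields "notin;" rather than the shorter "not", while "notit;" falls back to "not" because no longer identifier matches.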
-
-
Field Summary
Fields
Modifier and Type           Field
private int                 code_
private int                 consumedCount_
private java.lang.String    match_
private int                 matchLength_
private int                 state_
private static int          STATE_ABSENCE_OF_DIGITS_IN_NUMERIC_CHARACTER_REFERENCE
private static int          STATE_DECIMAL_CHAR
private static int          STATE_HEXADECIMAL_CHAR
private static int          STATE_HEXADECIMAL_START
private static int          STATE_NUMERIC_CHAR_END_SEMICOLON_MISSING
static int                  STATE_START
-
Constructor Summary
Constructors
Constructor
HTMLUnicodeEntitiesParser()
-
Method Summary
Modifier and Type    Method                       Description
java.lang.String     getMatch()
int                  getRewindCount()
boolean              parseNumeric(int current)    Parses a numeric entity such as #x64; or #42;. The ampersand must not be present.
void                 setMatchFromCode()
-
-
-
Field Detail
-
STATE_START
public static final int STATE_START
- See Also:
- Constant Field Values
-
STATE_HEXADECIMAL_CHAR
private static final int STATE_HEXADECIMAL_CHAR
- See Also:
- Constant Field Values
-
STATE_DECIMAL_CHAR
private static final int STATE_DECIMAL_CHAR
- See Also:
- Constant Field Values
-
STATE_HEXADECIMAL_START
private static final int STATE_HEXADECIMAL_START
- See Also:
- Constant Field Values
-
STATE_NUMERIC_CHAR_END_SEMICOLON_MISSING
private static final int STATE_NUMERIC_CHAR_END_SEMICOLON_MISSING
- See Also:
- Constant Field Values
-
STATE_ABSENCE_OF_DIGITS_IN_NUMERIC_CHARACTER_REFERENCE
private static final int STATE_ABSENCE_OF_DIGITS_IN_NUMERIC_CHARACTER_REFERENCE
- See Also:
- Constant Field Values
-
state_
private int state_
-
consumedCount_
private int consumedCount_
-
match_
private java.lang.String match_
-
code_
private int code_
-
matchLength_
private int matchLength_
-
-
Method Detail
-
getMatch
public java.lang.String getMatch()
-
getRewindCount
public int getRewindCount()
-
setMatchFromCode
public void setMatchFromCode()
-
parseNumeric
public boolean parseNumeric(int current)
Parses a numeric entity such as #x64; or #42;. The ampersand must not be present.
- Parameters:
- current - the next character to check
- Returns:
- whether we have reached the end of the parsing
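To make the input format concrete, here is a minimal, self-contained sketch of decoding a numeric character reference. It is not this class's implementation and does not call it: the names NumericReferenceSketch and decode are hypothetical, it works on a whole string rather than one character at a time, and it assumes the input starts at the '#' as in the examples above. Out-of-range, surrogate and zero code points are mapped to U+FFFD as the HTML specification describes; the specification's replacement table for certain control codes is deliberately left out.

```java
public final class NumericReferenceSketch {

    /**
     * Decodes text such as "#x64;" or "#42;" (no leading ampersand).
     * Returns the decoded character as a String, or null when no digits
     * follow the '#' (or "#x") prefix. A missing trailing ';' is tolerated.
     */
    public static String decode(String ref) {
        if (ref.isEmpty() || ref.charAt(0) != '#') {
            return null;
        }
        int pos = 1;
        int radix = 10;
        if (pos < ref.length() && (ref.charAt(pos) == 'x' || ref.charAt(pos) == 'X')) {
            radix = 16; // hexadecimal form
            pos++;
        }
        int code = 0;
        int digits = 0;
        while (pos < ref.length()) {
            int value = digitValue(ref.charAt(pos), radix);
            if (value < 0) {
                break; // first non-digit ends the number (';' or anything else)
            }
            // Clamp so that very long digit runs cannot overflow the int.
            code = Math.min(code * radix + value, 0x110000);
            digits++;
            pos++;
        }
        if (digits == 0) {
            return null; // absence of digits: nothing to decode
        }
        boolean surrogate = code >= 0xD800 && code <= 0xDFFF;
        if (code == 0 || code > 0x10FFFF || surrogate) {
            return "\uFFFD"; // replacement character
        }
        return new String(Character.toChars(code));
    }

    // Accepts ASCII digits only; Character.digit alone would also accept
    // other Unicode digits.
    private static int digitValue(char c, int radix) {
        return c < 128 ? Character.digit(c, radix) : -1;
    }
}
```

Under these assumptions, decode("#x64;") gives "d" and decode("#42;") gives "*", while decode("#;") and decode("#x;") give null because no digits follow the prefix.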
-
-