Package jflex.core.unicode
Class UnicodeProperties
- java.lang.Object
  - jflex.core.unicode.UnicodeProperties
public class UnicodeProperties extends java.lang.Object
-
-
Nested Class Summary
Nested Classes
static class UnicodeProperties.UnsupportedUnicodeVersionException
-
Field Summary
Fields
private IntCharSet[] caselessMatches
private java.lang.String caselessMatchPartitions
private int caselessMatchPartitionSize
private static java.lang.String DEFAULT_UNICODE_VERSION
private int maximumCodePoint
private java.util.Map<java.lang.String,IntCharSet> propertyValueIntervals
static java.lang.String UNICODE_VERSIONS
private static java.util.regex.Pattern WORD_SEP_PATTERN
-
Constructor Summary
Constructors
UnicodeProperties()
Unpacks the Unicode data corresponding to the default Unicode version: "12.1".
UnicodeProperties(java.lang.String version)
Unpacks the Unicode data corresponding to the given version.
-
Method Summary
Methods
private void bind(java.lang.String[] propertyValues, java.lang.String[] intervals, java.lang.String[] propertyValueAliases, int maximumCodePoint, java.lang.String caselessMatchPartitions, int caselessMatchPartitionSize)
Unpacks data for the selected Unicode version, populating propertyValueIntervals.
private void bindInvariantIntervals()
Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
IntCharSet getCaselessMatches(int c)
Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.
IntCharSet getIntCharSet(java.lang.String propertyValue)
Returns the character interval set associated with the given property value for the selected Unicode version.
int getMaximumCodePoint()
Returns the maximum code point for the selected Unicode version.
java.util.Set<java.lang.String> getPropertyValues()
Returns the set of all properties, property values, and their aliases supported by the specified Unicode version.
private void init(java.lang.String version)
Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
private void initCaselessMatches()
Unpacks the caseless match data.
private static java.lang.String normalize(java.lang.String identifier)
Normalizes the given identifier, by: downcasing; removing whitespace, underscores, hyphens, and parentheses; and substituting '=' for every ':'.
-
-
-
Field Detail
-
UNICODE_VERSIONS
public static final java.lang.String UNICODE_VERSIONS
- See Also:
- Constant Field Values
-
DEFAULT_UNICODE_VERSION
private static final java.lang.String DEFAULT_UNICODE_VERSION
- See Also:
- Constant Field Values
-
WORD_SEP_PATTERN
private static final java.util.regex.Pattern WORD_SEP_PATTERN
-
maximumCodePoint
private int maximumCodePoint
-
propertyValueIntervals
private final java.util.Map<java.lang.String,IntCharSet> propertyValueIntervals
-
caselessMatchPartitions
private java.lang.String caselessMatchPartitions
-
caselessMatchPartitionSize
private int caselessMatchPartitionSize
-
caselessMatches
private IntCharSet[] caselessMatches
-
-
Constructor Detail
-
UnicodeProperties
public UnicodeProperties() throws UnicodeProperties.UnsupportedUnicodeVersionException
Unpacks the Unicode data corresponding to the default Unicode version: "12.1".
- Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the default version is not supported.
-
UnicodeProperties
public UnicodeProperties(java.lang.String version) throws UnicodeProperties.UnsupportedUnicodeVersionException
Unpacks the Unicode data corresponding to the given version.
- Parameters:
version - The Unicode version for which to unpack data.
- Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
-
-
Method Detail
-
getMaximumCodePoint
public int getMaximumCodePoint()
Returns the maximum code point for the selected Unicode version.
- Returns:
- The maximum code point for the selected Unicode version.
-
getIntCharSet
public IntCharSet getIntCharSet(java.lang.String propertyValue)
Returns the character interval set associated with the given property value for the selected Unicode version.
- Parameters:
propertyValue - The Unicode property or property value (or alias for one of these) for which to return the corresponding character intervals.
- Returns:
- The character interval set corresponding to the given property value, if a match exists, and null otherwise.
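Because the lookup returns null when no match exists, callers need a null check before using the result. The following is a minimal, self-contained sketch of that contract. It does not use the JFlex classes: `PropertyLookupSketch`, `getIntervals`, `matches`, and the `int[][]` interval representation are hypothetical stand-ins, and the sample data is illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

final class PropertyLookupSketch {
  // Hypothetical stand-in for the property-value-to-intervals mapping.
  private final Map<String, int[][]> intervalsByName = new HashMap<>();

  PropertyLookupSketch() {
    // Illustrative data only: the ASCII digits as one inclusive interval.
    intervalsByName.put("nd", new int[][] {{'0', '9'}});
  }

  // Mirrors the documented contract: null when no match exists.
  int[][] getIntervals(String propertyValue) {
    return intervalsByName.get(propertyValue);
  }

  // Null-safe membership test built on top of the lookup.
  boolean matches(String propertyValue, int codePoint) {
    int[][] intervals = getIntervals(propertyValue);
    if (intervals == null) {
      return false; // unknown property value: nothing can match
    }
    for (int[] interval : intervals) {
      if (codePoint >= interval[0] && codePoint <= interval[1]) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    PropertyLookupSketch props = new PropertyLookupSketch();
    System.out.println(props.matches("nd", '7'));               // true
    System.out.println(props.matches("no_such_property", '7')); // false
  }
}
```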
-
getPropertyValues
public java.util.Set<java.lang.String> getPropertyValues()
Returns the set of all properties, property values, and their aliases supported by the specified Unicode version.
- Returns:
- The set of all properties, property values, and aliases supported by the specified Unicode version.
-
getCaselessMatches
public IntCharSet getCaselessMatches(int c)
Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself. The first call to this method lazily initializes the backing data.
- Parameters:
c - The character for which to return case-insensitive equivalents.
- Returns:
- All case-insensitively equivalent characters, or null if the given character is case-insensitively equivalent only to itself.
-
initCaselessMatches
private void initCaselessMatches()
Unpacks the caseless match data. Called from getCaselessMatches(int) to lazily initialize.
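The lazy initialization described here can be sketched as follows. This is a standalone illustration, not the JFlex implementation: the packing format (fixed-length records of code points, padded with U+0000) is an assumption, and `CaselessMatchSketch` with its `SortedSet<Integer>` result type is a hypothetical stand-in for the real class and IntCharSet.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class CaselessMatchSketch {
  // ASSUMED packing: fixed-length records of code points, padded with U+0000.
  private final String partitions;
  private final int partitionSize;
  // Stays null until the first lookup triggers unpacking.
  private Map<Integer, SortedSet<Integer>> matches;

  CaselessMatchSketch(String partitions, int partitionSize) {
    this.partitions = partitions;
    this.partitionSize = partitionSize;
  }

  // Returns the partition containing c, or null if c matches only itself.
  SortedSet<Integer> getCaselessMatches(int c) {
    if (matches == null) {
      initCaselessMatches();
    }
    return matches.get(c);
  }

  private void initCaselessMatches() {
    matches = new HashMap<>();
    int index = 0;
    while (index < partitions.length()) {
      SortedSet<Integer> partition = new TreeSet<>();
      for (int n = 0; n < partitionSize && index < partitions.length(); n++) {
        int codePoint = partitions.codePointAt(index);
        index += Character.charCount(codePoint);
        if (codePoint != 0) { // skip padding
          partition.add(codePoint);
        }
      }
      // Every member of a partition shares the same set.
      for (int member : partition) {
        matches.put(member, partition);
      }
    }
  }

  public static void main(String[] args) {
    // Two records of length 3: {k, K, KELVIN SIGN} and {a, A, padding}.
    CaselessMatchSketch sketch = new CaselessMatchSketch("kK\u212AaA\0", 3);
    System.out.println(sketch.getCaselessMatches('k')); // [75, 107, 8490]
    System.out.println(sketch.getCaselessMatches('1')); // null
  }
}
```

Deferring the unpacking keeps construction cheap for callers that never request case-insensitive matching.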
-
init
private void init(java.lang.String version) throws UnicodeProperties.UnsupportedUnicodeVersionException
Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
- Parameters:
version - The Unicode version for which to bind data.
- Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
-
bind
private void bind(java.lang.String[] propertyValues, java.lang.String[] intervals, java.lang.String[] propertyValueAliases, int maximumCodePoint, java.lang.String caselessMatchPartitions, int caselessMatchPartitionSize)
Unpacks data for the selected Unicode version, populating propertyValueIntervals.
- Parameters:
propertyValues - The list of property values for the selected Unicode version, in the same order as the corresponding packed data in the given intervals.
intervals - The packed character intervals corresponding to, and in the same order as, the given propertyValues, for the selected Unicode version.
propertyValueAliases - Key/value pairs mapping property value aliases to property values, for the selected Unicode version.
maximumCodePoint - The maximum code point for the selected Unicode version.
caselessMatchPartitions - The packed caseless match partition data for the selected Unicode version.
caselessMatchPartitionSize - The partition data record length (the maximum number of elements in a caseless match partition) for the selected Unicode version.
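To illustrate how parallel arrays of property values and packed intervals, plus alias pairs, could populate a single lookup map, here is a rough sketch. It is not the JFlex implementation. Two layouts are assumed because this page does not specify them: each packed string is taken to be a sequence of (start, end) code point pairs, and the alias array is taken to be flat, with an alias at each even index and its target property value at the following odd index. `BindSketch`, `unpack`, and the `List<int[]>` representation are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BindSketch {
  // Hypothetical stand-in for propertyValueIntervals.
  final Map<String, List<int[]>> propertyValueIntervals = new HashMap<>();

  // ASSUMED packing: a sequence of (start, end) code point pairs.
  static List<int[]> unpack(String packed) {
    List<int[]> result = new ArrayList<>();
    int index = 0;
    while (index < packed.length()) {
      int start = packed.codePointAt(index);
      index += Character.charCount(start);
      if (index >= packed.length()) {
        throw new IllegalArgumentException("interval start without an end");
      }
      int end = packed.codePointAt(index);
      index += Character.charCount(end);
      result.add(new int[] {start, end});
    }
    return result;
  }

  void bind(String[] propertyValues, String[] intervals, String[] propertyValueAliases) {
    // propertyValues[n] names the set packed in intervals[n].
    for (int n = 0; n < propertyValues.length; n++) {
      propertyValueIntervals.put(propertyValues[n], unpack(intervals[n]));
    }
    // ASSUMED layout: alias at even index, target value at the next odd index.
    for (int n = 0; n + 1 < propertyValueAliases.length; n += 2) {
      List<int[]> target = propertyValueIntervals.get(propertyValueAliases[n + 1]);
      if (target != null) {
        propertyValueIntervals.put(propertyValueAliases[n], target);
      }
    }
  }

  public static void main(String[] args) {
    BindSketch sketch = new BindSketch();
    // "09" packs the single interval ['0', '9'].
    sketch.bind(new String[] {"nd"}, new String[] {"09"}, new String[] {"digit", "nd"});
    int[] first = sketch.propertyValueIntervals.get("digit").get(0);
    System.out.println(first[0] + ".." + first[1]); // 48..57
  }
}
```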
-
bindInvariantIntervals
private void bindInvariantIntervals()
Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
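As an illustration of what these two invariant sets cover, the sketch below builds them as inclusive ranges. ASCII is U+0000 through U+007F by definition; treating \p{Any} as U+0000 through the selected version's maximum code point is an assumption about how the range is bounded here. `InvariantIntervalsSketch`, the lowercase map keys, and the `int[]` representation are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InvariantIntervalsSketch {
  // Builds the two version-independent sets as inclusive {start, end} ranges.
  static Map<String, int[]> invariantIntervals(int maximumCodePoint) {
    Map<String, int[]> intervals = new LinkedHashMap<>();
    intervals.put("ascii", new int[] {0x00, 0x7F});
    // ASSUMPTION: "any" spans everything up to the version's maximum code point.
    intervals.put("any", new int[] {0x00, maximumCodePoint});
    return intervals;
  }

  public static void main(String[] args) {
    Map<String, int[]> intervals = invariantIntervals(0x10FFFF);
    for (Map.Entry<String, int[]> entry : intervals.entrySet()) {
      int[] range = entry.getValue();
      System.out.printf("%s: U+%04X..U+%04X%n", entry.getKey(), range[0], range[1]);
    }
  }
}
```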
-
normalize
private static java.lang.String normalize(java.lang.String identifier)
Normalizes the given identifier, by: downcasing; removing whitespace, underscores, hyphens, and parentheses; and substituting '=' for every ':'.
- Parameters:
identifier - The identifier to normalize.
- Returns:
- The normalized identifier.
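The normalization steps listed above can be sketched in a few lines. This is an approximation written from the description only: the `WORD_SEP` pattern is a guess at what WORD_SEP_PATTERN might contain (its value is not shown on this page), the null check is an addition, and `NormalizeSketch` is a hypothetical class name.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class NormalizeSketch {
  // ASSUMED stand-in for WORD_SEP_PATTERN: whitespace, '_', '-', '(' and ')'.
  private static final Pattern WORD_SEP = Pattern.compile("[\\s_\\-()]");

  static String normalize(String identifier) {
    if (identifier == null) {
      return null; // not part of the documented contract; added for safety
    }
    // 1. Downcase, using a fixed locale so results do not vary by platform.
    String lower = identifier.toLowerCase(Locale.ENGLISH);
    // 2. Remove whitespace, underscores, hyphens, and parentheses.
    String stripped = WORD_SEP.matcher(lower).replaceAll("");
    // 3. Substitute '=' for every ':'.
    return stripped.replace(':', '=');
  }

  public static void main(String[] args) {
    System.out.println(normalize("General_Category: Lu")); // generalcategory=lu
    System.out.println(normalize("Old-Italic"));           // olditalic
  }
}
```

Normalizing both the stored keys and the lookup key in the same way lets spellings such as `General_Category:Lu` and `generalcategory=lu` resolve to the same entry.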
-
-