Class UnicodeProperties


  • public class UnicodeProperties
    extends java.lang.Object
    • Constructor Summary

      Constructors 
      Constructor Description
      UnicodeProperties()
      Unpacks the Unicode data corresponding to the default Unicode version: "12.1".
      UnicodeProperties(java.lang.String version)
      Unpacks the Unicode data corresponding to the given version.
    • Method Summary

      Modifier and Type Method Description
      private void bind(java.lang.String[] propertyValues, java.lang.String[] intervals, java.lang.String[] propertyValueAliases, int maximumCodePoint, java.lang.String caselessMatchPartitions, int caselessMatchPartitionSize)
      Unpacks data for the selected Unicode version, populating propertyValueIntervals.
      private void bindInvariantIntervals()
      Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
      IntCharSet getCaselessMatches(int c)
      Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.
      IntCharSet getIntCharSet(java.lang.String propertyValue)
      Returns the character interval set associated with the given property value for the selected Unicode version.
      int getMaximumCodePoint()
      Returns the maximum code point for the selected Unicode version.
      java.util.Set<java.lang.String> getPropertyValues()
      Returns the set of all properties, property values, and their aliases supported by the selected Unicode version.
      private void init(java.lang.String version)
      Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
      private void initCaselessMatches()
      Unpacks the caseless match data.
      private static java.lang.String normalize(java.lang.String identifier)
      Normalizes the given identifier by lowercasing it; removing whitespace, underscores, hyphens, and parentheses; and replacing every ':' with '='.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • UNICODE_VERSIONS

        public static final java.lang.String UNICODE_VERSIONS
        See Also:
        Constant Field Values
      • DEFAULT_UNICODE_VERSION

        private static final java.lang.String DEFAULT_UNICODE_VERSION
        See Also:
        Constant Field Values
      • WORD_SEP_PATTERN

        private static final java.util.regex.Pattern WORD_SEP_PATTERN
      • maximumCodePoint

        private int maximumCodePoint
      • propertyValueIntervals

        private final java.util.Map<java.lang.String,IntCharSet> propertyValueIntervals
      • caselessMatchPartitions

        private java.lang.String caselessMatchPartitions
      • caselessMatchPartitionSize

        private int caselessMatchPartitionSize
      • caselessMatches

        private IntCharSet[] caselessMatches
    • Method Detail

      • getMaximumCodePoint

        public int getMaximumCodePoint()
        Returns the maximum code point for the selected Unicode version.
        Returns:
        The maximum code point for the selected Unicode version.
      • getIntCharSet

        public IntCharSet getIntCharSet(java.lang.String propertyValue)
        Returns the character interval set associated with the given property value for the selected Unicode version.
        Parameters:
        propertyValue - The Unicode property or property value (or alias for one of these) for which to return the corresponding character intervals.
        Returns:
        The character interval set corresponding to the given property value, if a match exists, and null otherwise.
      • getPropertyValues

        public java.util.Set<java.lang.String> getPropertyValues()
        Returns the set of all properties, property values, and their aliases supported by the selected Unicode version.
        Returns:
        The set of all properties, property values, and their aliases supported by the selected Unicode version.
      • getCaselessMatches

        public IntCharSet getCaselessMatches(int c)
        Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.

        The first call to this method lazily initializes the backing data.

        Parameters:
        c - The character for which to return case-insensitive equivalents.
        Returns:
        All case-insensitively equivalent characters, or null if the given character is case-insensitively equivalent only to itself.
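        The lazy initialization and the null return value described above can be illustrated with a minimal sketch. It is not the class's actual implementation: the class name CaselessMatchesSketch is hypothetical, java.util.Set<Integer> stands in for IntCharSet, and the packed format (fixed-length records of code points, padded with U+0000) is an assumption based only on the parameter descriptions of bind.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the caseless-match lookup; not the real class.
final class CaselessMatchesSketch {
  // Assumed format: consecutive records of exactly partitionSize code points,
  // with unused slots in a record filled with U+0000.
  private final String packedPartitions;
  private final int partitionSize;

  // Stays null until the first lookup (lazy initialization).
  private Map<Integer, Set<Integer>> caselessMatches;

  CaselessMatchesSketch(String packedPartitions, int partitionSize) {
    this.packedPartitions = packedPartitions;
    this.partitionSize = partitionSize;
  }

  /** Returns the partition containing c, or null if c matches only itself. */
  Set<Integer> getCaselessMatches(int c) {
    if (caselessMatches == null) {
      initCaselessMatches();
    }
    return caselessMatches.get(c);
  }

  private void initCaselessMatches() {
    caselessMatches = new HashMap<>();
    int index = 0;
    // Assumes the packed data holds only complete records.
    while (index < packedPartitions.length()) {
      Set<Integer> partition = new TreeSet<>();
      for (int n = 0; n < partitionSize; n++) {
        int member = packedPartitions.codePointAt(index);
        index += Character.charCount(member);
        if (member > 0) { // skip U+0000 padding
          partition.add(member);
        }
      }
      // Every member of a partition shares the same set instance.
      for (int member : partition) {
        caselessMatches.put(member, partition);
      }
    }
  }
}
```

        With the packed string "kK\u212AaA\u0000" and a partition size of 3, a lookup of 'k' yields the set containing U+004B, U+006B, and U+212A (KELVIN SIGN), while a lookup of '1' yields null.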
      • initCaselessMatches

        private void initCaselessMatches()
        Unpacks the caseless match data. Called from getCaselessMatches(int) to lazily initialize.
      • bind

        private void bind(java.lang.String[] propertyValues,
                          java.lang.String[] intervals,
                          java.lang.String[] propertyValueAliases,
                          int maximumCodePoint,
                          java.lang.String caselessMatchPartitions,
                          int caselessMatchPartitionSize)
        Unpacks data for the selected Unicode version, populating propertyValueIntervals.
        Parameters:
        propertyValues - The property values for the selected Unicode version, in the same order as their packed data in the given intervals.
        intervals - The packed character intervals corresponding to and in the same order as the given propertyValues, for the selected Unicode version.
        propertyValueAliases - Key/value pairs mapping property value aliases to property values, for the selected Unicode version.
        maximumCodePoint - The maximum code point for the selected Unicode version.
        caselessMatchPartitions - The packed caseless match partition data for the selected Unicode version.
        caselessMatchPartitionSize - The partition data record length (the maximum number of elements in a caseless match partition) for the selected Unicode version.
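        How the parallel propertyValues and intervals arrays and the alias pairs might be combined into a single lookup table can be sketched as follows. This is an illustration under stated assumptions, not the actual unpacking code: the class name BindSketch is hypothetical, each packed string is assumed to be a sequence of (start, end) code point pairs, the alias array is assumed to alternate alias and target property value, and List<int[]> stands in for IntCharSet.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of binding packed interval data; not the real class.
final class BindSketch {
  static Map<String, List<int[]>> bind(
      String[] propertyValues, String[] intervals, String[] propertyValueAliases) {
    Map<String, List<int[]>> propertyValueIntervals = new HashMap<>();

    // propertyValues[n] is described by the packed string intervals[n].
    for (int n = 0; n < propertyValues.length; n++) {
      String packed = intervals[n];
      List<int[]> ranges = new ArrayList<>();
      int index = 0;
      // Assumed format: alternating start and end code points (inclusive).
      while (index < packed.length()) {
        int start = packed.codePointAt(index);
        index += Character.charCount(start);
        int end = packed.codePointAt(index);
        index += Character.charCount(end);
        ranges.add(new int[] {start, end});
      }
      propertyValueIntervals.put(propertyValues[n], ranges);
    }

    // Assumed layout: [alias0, target0, alias1, target1, ...].
    for (int n = 0; n + 1 < propertyValueAliases.length; n += 2) {
      String alias = propertyValueAliases[n];
      String target = propertyValueAliases[n + 1];
      List<int[]> ranges = propertyValueIntervals.get(target);
      if (ranges == null) {
        throw new IllegalStateException("Alias '" + alias + "' has unknown target '" + target + "'");
      }
      // An alias shares the interval list of its target property value.
      propertyValueIntervals.put(alias, ranges);
    }
    return propertyValueIntervals;
  }
}
```

        Sharing one interval list between a property value and its aliases keeps the table small, since many names can resolve to the same set of characters.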
      • bindInvariantIntervals

        private void bindInvariantIntervals()
        Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
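        The two invariant intervals can be illustrated with a short sketch. \p{ASCII} covers U+0000 through U+007F; \p{Any} is assumed here to run from U+0000 through the maximum code point of the selected Unicode version. The class name, the int[] pair representation, and the map keys "ascii" and "any" are hypothetical choices for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the version-independent intervals; not the real class.
final class InvariantIntervalsSketch {
  /** Returns inclusive {start, end} pairs for \p{ASCII} and \p{Any}. */
  static Map<String, int[]> invariantIntervals(int maximumCodePoint) {
    Map<String, int[]> intervals = new HashMap<>();
    // ASCII is always U+0000..U+007F, whatever the Unicode version.
    intervals.put("ascii", new int[] {0x00, 0x7F});
    // Assumption: Any spans every code point the selected version allows.
    intervals.put("any", new int[] {0x00, maximumCodePoint});
    return intervals;
  }
}
```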
      • normalize

        private static java.lang.String normalize(java.lang.String identifier)
        Normalizes the given identifier by lowercasing it; removing whitespace, underscores, hyphens, and parentheses; and replacing every ':' with '='.
        Parameters:
        identifier - The identifier to normalize.
        Returns:
        The normalized identifier.
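        The normalization steps listed above can be sketched as follows. This is a minimal illustration, not the class's actual code: the class name NormalizeSketch, the exact separator pattern, the use of Locale.ENGLISH for lowercasing, and the null handling are all assumptions.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical sketch of identifier normalization; not the real class.
final class NormalizeSketch {
  // Assumed separator set: hyphens, underscores, whitespace, and parentheses.
  private static final Pattern WORD_SEP = Pattern.compile("[-_\\s()]");

  static String normalize(String identifier) {
    if (identifier == null) {
      return null; // assumption: null passes through unchanged
    }
    // A fixed locale avoids surprises such as the Turkish dotless i.
    String lower = identifier.toLowerCase(Locale.ENGLISH);
    // Drop separators, then turn "property:value" into "property=value".
    return WORD_SEP.matcher(lower).replaceAll("").replace(':', '=');
  }

  public static void main(String[] args) {
    System.out.println(normalize("General_Category: Lu")); // generalcategory=lu
    System.out.println(normalize("Script=Old-Italic"));    // script=olditalic
  }
}
```

        Normalizing both the stored names and the lookup key in this way lets differently spelled forms of the same property name resolve to one entry.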