Class SpoofChecker.Builder

java.lang.Object
com.ibm.icu.text.SpoofChecker.Builder
Enclosing class:
SpoofChecker

public static class SpoofChecker.Builder extends Object
SpoofChecker Builder. To create a SpoofChecker, first instantiate a SpoofChecker.Builder, set the desired checking options on the builder, then call the build() method to create a SpoofChecker instance.
  • Constructor Details

    • Builder

      public Builder()
      Constructor: Create a default Unicode Spoof Checker Builder, configured to perform all checks except for LOCALE_LIMIT and CHAR_LIMIT. Note that additional checks may be added in the future, which would change the default checking behavior.
    • Builder

      public Builder(SpoofChecker src)
      Constructor: Create a Spoof Checker Builder, and set the configuration from an existing SpoofChecker.
      Parameters:
      src - The existing checker.
  • Method Details

    • build

      public SpoofChecker build()
      Create a SpoofChecker with the current configuration.
      Returns:
      SpoofChecker
    • setData

      public SpoofChecker.Builder setData(Reader confusables) throws ParseException, IOException
      Specify the source form of the spoof data for the Spoof Checker. The input corresponds to the Unicode data file confusables.txt as described in Unicode UTS 39. The syntax of the source data is as described in UTS 39 for this file, and the content of the file is acceptable input.
      Parameters:
      confusables - the Reader of confusable characters definitions, as found in file confusables.txt from unicode.org.
      Throws:
      ParseException - To report syntax errors in the input.
      IOException - If an I/O error occurs while reading the input.
    • setData

      @Deprecated public SpoofChecker.Builder setData(Reader confusables, Reader confusablesWholeScript) throws ParseException, IOException
      Deprecated.
      As of ICU 58; use setData(Reader confusables) instead.
      Parameters:
      confusables - the Reader of confusable characters definitions, as found in file confusables.txt from unicode.org.
      confusablesWholeScript - No longer supported.
      Throws:
      ParseException - To report syntax errors in the input.
      IOException - If an I/O error occurs while reading the input.
    • setChecks

      public SpoofChecker.Builder setChecks(int checks)
      Specify the bitmask of checks that will be performed by SpoofChecker.failsChecks(java.lang.String, com.ibm.icu.text.SpoofChecker.CheckResult). Calling this method overwrites any checks that may have already been enabled. By default, all checks are enabled. To enable specific checks and disable all others, OR together only the bit constants for the desired checks. For example, to fail strings containing characters outside of the set specified by setAllowedChars(com.ibm.icu.text.UnicodeSet) and also strings that contain digits from mixed numbering systems:
       
       builder.setChecks(SpoofChecker.CHAR_LIMIT | SpoofChecker.MIXED_NUMBERS);
       
       
      To disable specific checks and enable all others, start with ALL_CHECKS and "AND away" the unwanted checks. For example, if you are not planning to use the SpoofChecker.areConfusable(java.lang.String, java.lang.String) functionality, it is good practice to disable the CONFUSABLE check:
       
       builder.setChecks(SpoofChecker.ALL_CHECKS & ~SpoofChecker.CONFUSABLE);
       
       
      Note that methods such as setAllowedChars(com.ibm.icu.text.UnicodeSet), setAllowedLocales(java.util.Set<com.ibm.icu.util.ULocale>), and setRestrictionLevel(com.ibm.icu.text.SpoofChecker.RestrictionLevel) will enable certain checks when called. Those methods will OR the check they enable onto the existing bitmask specified by this method. For more details, see the documentation of those methods.
      Parameters:
      checks - The set of checks that this spoof checker will perform. The value is a bitwise OR of the desired checks.
      Returns:
      self
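      The mask arithmetic described for setChecks can be tried in isolation. The following is a minimal sketch using only the Java standard library; the class ChecksMaskSketch, its helper methods, and the numeric values of the constants are stand-ins chosen for illustration and are not taken from this page. Real code should pass the SpoofChecker constants to setChecks(int) directly.

```java
final class ChecksMaskSketch {
    // Stand-in bit values (assumptions for illustration only). In real code,
    // reference SpoofChecker.CONFUSABLE, SpoofChecker.CHAR_LIMIT, and so on.
    public static final int CONFUSABLE = 0x07;
    public static final int CHAR_LIMIT = 0x40;
    public static final int MIXED_NUMBERS = 0x80;
    public static final int ALL_CHECKS = 0xFFFFFFFF;

    /** Enable only the listed checks: OR the desired bits together. */
    public static int enableOnly(int... checks) {
        int mask = 0;
        for (int check : checks) {
            mask |= check;
        }
        return mask;
    }

    /** Start from every check and "AND away" the unwanted bits. */
    public static int allExcept(int unwanted) {
        return ALL_CHECKS & ~unwanted;
    }

    /** True if every bit of the given check is set in the mask. */
    public static boolean isEnabled(int mask, int check) {
        return (mask & check) == check;
    }

    public static void main(String[] args) {
        int only = enableOnly(CHAR_LIMIT, MIXED_NUMBERS);
        int most = allExcept(CONFUSABLE);
        // A setter that enables a check ORs its bit onto the existing mask.
        int afterSetter = enableOnly(MIXED_NUMBERS) | CHAR_LIMIT;

        System.out.println(isEnabled(only, CONFUSABLE));        // false
        System.out.println(isEnabled(most, CONFUSABLE));        // false
        System.out.println(isEnabled(most, MIXED_NUMBERS));     // true
        System.out.println(isEnabled(afterSetter, CHAR_LIMIT)); // true
    }
}
```

      Because setChecks overwrites the whole mask, calling it after a setter that enabled a check will drop that check unless its bit is included in the new value.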
    • setAllowedLocales

      public SpoofChecker.Builder setAllowedLocales(Set<ULocale> locales)
      Limit characters that are acceptable in identifiers being checked to those normally used with the languages associated with the specified locales. Any previously specified list of locales is replaced by the new settings.
      A set of languages is determined from the locale(s), and from those a set of acceptable Unicode scripts is determined. Characters from this set of scripts, along with characters from the "common" and "inherited" Unicode Script categories, will be permitted. Supplying an empty set removes all restrictions; characters from any script will be allowed.
      The SpoofChecker.CHAR_LIMIT test is automatically enabled for this SpoofChecker when calling this method with a non-empty set of locales.
      The Unicode Set of characters that will be allowed is accessible via the SpoofChecker.getAllowedChars() method. setAllowedLocales() will replace any previously applied set of allowed characters. Adjustments, such as additions or deletions of certain classes of characters, can be made to the result of setAllowedLocales() by fetching the resulting set with SpoofChecker.getAllowedChars(), manipulating it with the Unicode Set API, then resetting the spoof detector's limits with setAllowedChars(com.ibm.icu.text.UnicodeSet).
      Parameters:
      locales - A Set of ULocales, from which the language and associated script are extracted. If the locales Set is null, no restrictions will be placed on the allowed characters.
      Returns:
      self
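      The script-based limit described for setAllowedLocales can be approximated with the standard library's java.lang.Character.UnicodeScript. The sketch below is a conceptual illustration only, not how SpoofChecker is implemented: the class ScriptLimitSketch and its method are hypothetical, the caller supplies the scripts directly instead of deriving them from locales, and the "empty set removes all restrictions" rule is not modeled.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

final class ScriptLimitSketch {
    /**
     * Returns true if every code point of the identifier belongs to one of
     * the given scripts, or to the COMMON or INHERITED script.
     */
    public static boolean allAllowed(String identifier, Set<UnicodeScript> scripts) {
        // COMMON and INHERITED are always permitted alongside the chosen scripts.
        EnumSet<UnicodeScript> allowed =
                EnumSet.of(UnicodeScript.COMMON, UnicodeScript.INHERITED);
        allowed.addAll(scripts);
        return identifier.codePoints()
                .allMatch(cp -> allowed.contains(UnicodeScript.of(cp)));
    }

    public static void main(String[] args) {
        Set<UnicodeScript> latin = EnumSet.of(UnicodeScript.LATIN);

        // Latin letters, plus '_' and '1' from the COMMON script.
        System.out.println(allAllowed("paypal_1", latin));     // true

        // U+0430 is CYRILLIC SMALL LETTER A, which is outside the Latin script.
        System.out.println(allAllowed("p\u0430ypal", latin));  // false
    }
}
```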
    • setAllowedJavaLocales

      public SpoofChecker.Builder setAllowedJavaLocales(Set<Locale> locales)
      Limit characters that are acceptable in identifiers being checked to those normally used with the languages associated with the specified locales. Any previously specified list of locales is replaced by the new settings.
      Parameters:
      locales - A Set of Locales, from which the language and associated script are extracted. If the locales Set is null, no restrictions will be placed on the allowed characters.
      Returns:
      self
    • setAllowedChars

      public SpoofChecker.Builder setAllowedChars(UnicodeSet chars)
      Limit the acceptable characters to those specified by a Unicode Set. Any previously specified character limit is replaced by the new settings. This includes limits on characters that were set with the setAllowedLocales() function. Note that the RESTRICTED set is useful. The SpoofChecker.CHAR_LIMIT test is automatically enabled for this SpoofChecker by this function.
      Parameters:
      chars - A Unicode Set containing the list of characters that are permitted. The incoming set is cloned by this function, so there are no restrictions on modifying or deleting the UnicodeSet after calling this function. Note that this clears the allowedLocales set.
      Returns:
      self
    • setRestrictionLevel

      public SpoofChecker.Builder setRestrictionLevel(SpoofChecker.RestrictionLevel restrictionLevel)
      Set the loosest restriction level allowed for strings. The default if this is not called is SpoofChecker.RestrictionLevel.HIGHLY_RESTRICTIVE. Calling this method enables the SpoofChecker.RESTRICTION_LEVEL and SpoofChecker.MIXED_NUMBERS checks, corresponding to Sections 5.1 and 5.2 of UTS 39. To customize which checks are to be performed by SpoofChecker.failsChecks(java.lang.String, com.ibm.icu.text.SpoofChecker.CheckResult), see setChecks(int).
      Parameters:
      restrictionLevel - The loosest restriction level allowed.
      Returns:
      self
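      The phrase "loosest restriction level allowed" can be read as an upper bound on an ordered scale. The sketch below illustrates that reading with a stand-in enum; the class RestrictionLevelSketch, the enum Level, and the method passes are hypothetical, and the ordering from strictest to loosest is an assumption about SpoofChecker.RestrictionLevel that should be confirmed against its own documentation.

```java
final class RestrictionLevelSketch {
    // Stand-in for SpoofChecker.RestrictionLevel, assumed to be ordered
    // from strictest (ASCII) to loosest (UNRESTRICTIVE).
    public enum Level {
        ASCII,
        SINGLE_SCRIPT_RESTRICTIVE,
        HIGHLY_RESTRICTIVE,
        MODERATELY_RESTRICTIVE,
        MINIMALLY_RESTRICTIVE,
        UNRESTRICTIVE
    }

    /**
     * A string passes when the level it satisfies is no looser than the
     * loosest level the checker has been configured to allow.
     */
    public static boolean passes(Level stringLevel, Level loosestAllowed) {
        return stringLevel.compareTo(loosestAllowed) <= 0;
    }

    public static void main(String[] args) {
        Level configured = Level.HIGHLY_RESTRICTIVE;
        System.out.println(passes(Level.ASCII, configured));                  // true
        System.out.println(passes(Level.HIGHLY_RESTRICTIVE, configured));     // true
        System.out.println(passes(Level.MINIMALLY_RESTRICTIVE, configured));  // false
    }
}
```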