Class RuleManager


  • public class RuleManager
    extends java.lang.Object
    Represents a segmentation rule manager. Responsible for constructing and storing break rules and exception rules.
    • Field Detail

      • maxLookbehindConstructLength

        private int maxLookbehindConstructLength
      • breakRuleList

        private java.util.List<Rule> breakRuleList
      • exceptionPatternMap

        private java.util.Map<Rule,java.util.regex.Pattern> exceptionPatternMap
    • Constructor Detail

      • RuleManager

        public RuleManager(SrxDocument document,
                           java.util.List<LanguageRule> languageRuleList,
                           int maxLookbehindConstructLength)
        Constructor. Retrieves the rules for the given list of language rules from the SRX document, constructs patterns and stores them in a quickly accessible format. Adds break rules to breakRuleList and stores the corresponding exception patterns in exceptionPatternMap. Uses the document cache to store rules and patterns.
        Parameters:
        document - SRX document
        languageRuleList - list of language rules
        maxLookbehindConstructLength - maximum length of a regular expression construct in lookbehind (see Util.finitize(String, int))
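
As a rough illustration of the bookkeeping described for the constructor, the sketch below walks an ordered rule list, collects break rules, and maps each break rule to a pattern built from the exception rules seen before it. The names RuleIndexSketch, SimpleRule and toLookaround are hypothetical, and the idea that only preceding exception rules apply to a break rule is an assumption rather than something this page states. SRX document access, caching and lookbehind finitization are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch; not the library's RuleManager.
class RuleIndexSketch {

    // Simplified stand-in for the library's Rule class (assumption).
    static final class SimpleRule {
        final boolean isBreak;
        final String before;
        final String after;

        SimpleRule(boolean isBreak, String before, String after) {
            this.isBreak = isBreak;
            this.before = before;
            this.after = after;
        }
    }

    final List<SimpleRule> breakRuleList = new ArrayList<>();
    final Map<SimpleRule, Pattern> exceptionPatternMap = new HashMap<>();

    RuleIndexSketch(List<SimpleRule> orderedRules) {
        // Alternation of all exception rules seen so far.
        StringBuilder exceptions = new StringBuilder();
        for (SimpleRule rule : orderedRules) {
            if (rule.isBreak) {
                breakRuleList.add(rule);
                // Assumption: a break rule with no preceding exceptions gets no map entry.
                if (exceptions.length() > 0) {
                    exceptionPatternMap.put(rule, Pattern.compile(exceptions.toString()));
                }
            } else {
                if (exceptions.length() > 0) {
                    exceptions.append('|');
                }
                exceptions.append(toLookaround(rule));
            }
        }
    }

    // Assumes rule.before is already finite, so it is valid inside lookbehind.
    private static String toLookaround(SimpleRule rule) {
        StringBuilder sb = new StringBuilder("(?:");
        if (!rule.before.isEmpty()) {
            sb.append("(?<=").append(rule.before).append(')');
        }
        if (!rule.after.isEmpty()) {
            sb.append("(?=").append(rule.after).append(')');
        }
        return sb.append(')').toString();
    }
}
```

Compiling each exception pattern once, at construction time, means a caller only needs a map lookup when a break rule matches; SimpleRule relies on identity equality, so the same rule instance must be used as the key.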
    • Method Detail

      • getBreakRuleList

        public java.util.List<Rule> getBreakRuleList()
        Returns:
        break rule list
      • getExceptionPattern

        public java.util.regex.Pattern getExceptionPattern(Rule breakRule)
        Parameters:
        breakRule - break rule
        Returns:
        exception pattern corresponding to the given break rule
      • createExceptionPatternString

        private java.lang.String createExceptionPatternString(Rule rule)
        Creates an exception pattern string that can be matched at the position where a break rule was matched. Both parts of the rule (beforePattern and afterPattern) are incorporated into one pattern. Because beforePattern is used in a lookbehind, it must be modified so that it matches only finite-length strings (contains no *, + or {n,}).
        Parameters:
        rule - exception rule
        Returns:
        string containing exception pattern
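
One way to picture the combined pattern is as a zero-width expression: a lookbehind for the finitized beforePattern followed by a lookahead for afterPattern. The sketch below is a minimal illustration under that assumption. The class ExceptionPatternSketch, its finitize stand-in and the matchesAt helper are hypothetical; the real Util.finitize(String, int) may behave differently, and this naive version does not handle escaped quantifier characters or character classes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the library's implementation.
class ExceptionPatternSketch {

    // Naive stand-in for Util.finitize(String, int) (assumption): caps the
    // unbounded quantifiers {n,}, * and + at maxLength repetitions.
    static String finitize(String pattern, int maxLength) {
        return pattern
                .replaceAll("\\{(\\d+),\\}", "{$1," + maxLength + "}")
                .replace("*", "{0," + maxLength + "}")
                .replace("+", "{1," + maxLength + "}");
    }

    // Joins both rule parts into one zero-width pattern: a lookbehind for the
    // finitized before part and a lookahead for the after part.
    static String createExceptionPatternString(String before, String after, int maxLength) {
        StringBuilder sb = new StringBuilder();
        if (!before.isEmpty()) {
            sb.append("(?<=").append(finitize(before, maxLength)).append(')');
        }
        if (!after.isEmpty()) {
            sb.append("(?=").append(after).append(')');
        }
        return sb.toString();
    }

    // True if the zero-width pattern matches exactly at position pos.
    static boolean matchesAt(String patternString, CharSequence text, int pos) {
        Matcher m = Pattern.compile(patternString).matcher(text);
        return m.find(pos) && m.start() == pos;
    }

    public static void main(String[] args) {
        String p = createExceptionPatternString("Mr\\.\\s*", "[A-Z]", 100);
        System.out.println(p);                             // (?<=Mr\.\s{0,100})(?=[A-Z])
        System.out.println(matchesAt(p, "Mr. Smith", 4));  // true
        System.out.println(matchesAt(p, "Hi. Smith", 4));  // false
    }
}
```

Because the pattern consumes no characters, it can be tested at the exact position where a break rule matched without changing where segmentation resumes.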