Package net.loomchild.segment.srx.legacy
Class MergedPattern
java.lang.Object
  net.loomchild.segment.srx.legacy.MergedPattern
public class MergedPattern extends java.lang.Object
Represents a merged splitting pattern. Responsible for merging breaking rules into one large pattern and for creating the non-breaking rule patterns.
-
-
Field Summary
Fields (Modifier and Type, Field)
private java.util.regex.Pattern breakingPattern
private java.util.List<java.lang.Integer> breakingRuleIndexList
private int maxLookbehindConstructLength
private java.util.List<java.util.regex.Pattern> nonBreakingPatternList
-
Constructor Summary
Constructors
MergedPattern(java.util.List<LanguageRule> languageRuleList, int maxLookbehindConstructLength)
-
Method Summary
Methods (Modifier and Type, Method, Description)
private java.lang.String createBreakingPattern(java.util.List<Rule> ruleList)
Merges all breaking rules in the list into one pattern.
private java.lang.String createNonBreakingPattern(java.util.List<Rule> ruleList)
Creates a non-breaking pattern by merging the given rules.
private java.util.List<Rule> extractRules(java.util.List<LanguageRule> languageRuleList)
java.util.regex.Pattern getBreakingPattern()
java.util.List<java.util.regex.Pattern> getNonBreakingPatternList(int breakingRuleIndex)
Returns all non-breaking patterns that apply when the breaking rule with the given index has matched (the non-breaking rules that occur before that breaking rule in the SRX file).
private java.util.List<java.util.List<Rule>> groupRules(java.util.List<Rule> ruleList)
Divides rules into groups in which all rules are either breaking or non-breaking.
-
-
-
Field Detail
-
maxLookbehindConstructLength
private int maxLookbehindConstructLength
-
breakingPattern
private java.util.regex.Pattern breakingPattern
-
nonBreakingPatternList
private java.util.List<java.util.regex.Pattern> nonBreakingPatternList
-
breakingRuleIndexList
private java.util.List<java.lang.Integer> breakingRuleIndexList
-
-
Constructor Detail
-
MergedPattern
public MergedPattern(java.util.List<LanguageRule> languageRuleList, int maxLookbehindConstructLength)
-
-
Method Detail
-
getBreakingPattern
public java.util.regex.Pattern getBreakingPattern()
-
getNonBreakingPatternList
public java.util.List<java.util.regex.Pattern> getNonBreakingPatternList(int breakingRuleIndex)
Returns all non-breaking patterns that apply when the breaking rule with the given index has matched (the non-breaking rules that occur before that breaking rule in the SRX file).
Parameters:
breakingRuleIndex - index of the matched breaking rule
Returns:
active non-breaking patterns for the given breaking rule
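The selection described above can be illustrated with a minimal sketch. It is not the library's implementation: `SimpleRule` and `nonBreakingBefore` are hypothetical names, each rule is reduced to a single regex, and the index is assumed to be a position in the full, ordered rule list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class NonBreakingLookupSketch {

    /** Hypothetical stand-in for a rule: a breaking flag plus one regex. */
    static final class SimpleRule {
        final boolean breaking;
        final String regex;

        SimpleRule(boolean breaking, String regex) {
            this.breaking = breaking;
            this.regex = regex;
        }
    }

    /**
     * Collects the non-breaking patterns located before position
     * breakingRuleIndex (assumed here to index the full rule list).
     */
    static List<Pattern> nonBreakingBefore(List<SimpleRule> rules, int breakingRuleIndex) {
        List<Pattern> result = new ArrayList<>();
        int limit = Math.min(breakingRuleIndex, rules.size());
        for (int i = 0; i < limit; i++) {
            SimpleRule rule = rules.get(i);
            if (!rule.breaking) {
                result.add(Pattern.compile(rule.regex));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<SimpleRule> rules = List.of(
                new SimpleRule(false, "Mr\\."),
                new SimpleRule(true, "\\."),
                new SimpleRule(false, "Dr\\."),
                new SimpleRule(true, "\\?"));
        // Only the first non-breaking rule precedes the breaking rule at position 1.
        System.out.println(nonBreakingBefore(rules, 1));
    }
}
```

Because rule order is preserved, a non-breaking rule listed after a breaking rule never suppresses that break in this sketch.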
-
extractRules
private java.util.List<Rule> extractRules(java.util.List<LanguageRule> languageRuleList)
Parameters:
languageRuleList - language rules to merge
Returns:
merged list of rules from the given language rules
-
groupRules
private java.util.List<java.util.List<Rule>> groupRules(java.util.List<Rule> ruleList)
Divides rules into groups in which all rules are either breaking or non-breaking. Does not change the rule order.
Parameters:
ruleList - rules to group
Returns:
list of grouped rules
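One plausible reading of this grouping is that consecutive rules of the same kind form a group, so the order is kept and a new group starts whenever the kind changes. The sketch below shows that reading only. `SimpleRule` and the class name are hypothetical, and the actual method may group differently.

```java
import java.util.ArrayList;
import java.util.List;

final class GroupRulesSketch {

    /** Hypothetical stand-in for a rule: a name plus a breaking flag. */
    static final class SimpleRule {
        final String name;
        final boolean breaking;

        SimpleRule(String name, boolean breaking) {
            this.name = name;
            this.breaking = breaking;
        }
    }

    /** Splits the list into runs of consecutive rules of the same kind. */
    static List<List<SimpleRule>> groupRules(List<SimpleRule> rules) {
        List<List<SimpleRule>> groups = new ArrayList<>();
        List<SimpleRule> current = null;
        for (SimpleRule rule : rules) {
            // Start a new group at the first rule and whenever the kind flips.
            if (current == null || current.get(0).breaking != rule.breaking) {
                current = new ArrayList<>();
                groups.add(current);
            }
            current.add(rule);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<SimpleRule> rules = List.of(
                new SimpleRule("a", false),
                new SimpleRule("b", false),
                new SimpleRule("c", true),
                new SimpleRule("d", false));
        for (List<SimpleRule> group : groupRules(rules)) {
            System.out.println(group.size() + " rule(s), breaking=" + group.get(0).breaking);
        }
    }
}
```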
-
createBreakingPattern
private java.lang.String createBreakingPattern(java.util.List<Rule> ruleList)
Merges all breaking rules in the list into one pattern.
Parameters:
ruleList - rules to merge
Returns:
breaking pattern
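A common way to merge several rules into one pattern is a regex alternation with one capturing group per rule, so that the matching group identifies the rule. The sketch below illustrates that general technique with `java.util.regex`. It is not taken from the library: the class, `merge`, and `matchedRule` are hypothetical, rules are given as `{before, after}` regex pairs, and the rule regexes are assumed to contain no capturing groups of their own.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BreakingPatternSketch {

    /**
     * Joins {before, after} regex pairs into one alternation. Each
     * alternative is wrapped in exactly one capturing group; the "after"
     * part is a lookahead, so a match ends at the break position.
     */
    static String merge(List<String[]> breakingRules) {
        StringBuilder merged = new StringBuilder();
        for (String[] rule : breakingRules) {
            if (merged.length() > 0) {
                merged.append('|');
            }
            merged.append("((?:").append(rule[0]).append(")(?=").append(rule[1]).append("))");
        }
        return merged.toString();
    }

    /**
     * Returns the zero-based index of the rule whose group took part in
     * the current match, or -1. Call only after a successful find().
     */
    static int matchedRule(Matcher matcher) {
        for (int group = 1; group <= matcher.groupCount(); group++) {
            if (matcher.group(group) != null) {
                return group - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String[]> rules = List.of(
                new String[] {"\\.", "\\s"},
                new String[] {"\\?", "\\s"});
        Matcher matcher = Pattern.compile(merge(rules)).matcher("Hi. Ok? Yes");
        while (matcher.find()) {
            System.out.println("break at " + matcher.end() + ", rule " + matchedRule(matcher));
        }
    }
}
```

Knowing which rule matched is what makes it possible to look up the non-breaking patterns that apply to that particular break.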
-
createNonBreakingPattern
private java.lang.String createNonBreakingPattern(java.util.List<Rule> ruleList)
Creates a non-breaking pattern by merging the given rules.
Parameters:
ruleList - rules to merge
Returns:
non-breaking pattern
-
-