Package org.languagetool.rules.patterns
Class Unifier
java.lang.Object
  org.languagetool.rules.patterns.Unifier
public class Unifier extends java.lang.Object
Implements unification of features over tokens.
-
-
Field Summary
Fields (modifier and type, field name, then description where one exists):
private boolean allFeatsIn
private java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures
  A Map that stores all possible equivalence types listed for features.
private java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>> equivalencesMatched
  List of maps of sets of matched equivalences in the unified sequence.
private java.util.Map<java.lang.String,java.util.Set<java.lang.String>> equivalencesToBeKept
private java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes
  A Map for storing the equivalence types for features.
private java.util.List<java.lang.Boolean> featuresFound
private boolean inUnification
private int readingsCounter
private java.util.List<java.lang.Boolean> tmpFeaturesFound
private int tokCnt
private java.util.List<AnalyzedTokenReadings> tokSequence
private java.util.List<java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>>> tokSequenceEquivalences
  List of all equivalences matched per token in the sequence, kept exactly in sync with the list in tokSequence, so that reading 2 of token 1 has its equivalence map addressable as tokSequenceEquivalences.get(1).get(2).
private boolean uniAllMatched
private java.util.Map<java.lang.String,java.util.List<java.lang.String>> unificationFeats
private static java.lang.String UNIFY_IGNORE
private boolean uniMatched
-
Constructor Summary
Constructors (constructor, then description):
Unifier(java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes, java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures)
  Instantiates the unifier.
-
Method Summary
Methods (modifier and type, method, then description where one exists):
void addNeutralElement(AnalyzedTokenReadings analyzedTokenReadings)
  Used to add neutral elements (AnalyzedTokenReadings) to the unified sequence.
private void addTokenToSequence(java.util.List<AnalyzedTokenReadings> tokenSequence, AnalyzedToken token, int pos)
private boolean checkNext(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
boolean getFinalUnificationValue(java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
  Makes sure that all the required features of the unification were really matched.
@Nullable AnalyzedTokenReadings[] getFinalUnified()
  Used for getting a unified sequence when the simple test method isUnified(AnalyzedToken, Map, boolean) was used.
@Nullable AnalyzedTokenReadings[] getUnifiedTokens()
  Gets a full sequence of filtered tokens.
protected boolean isSatisfied(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
  Tests if a token has shared features with other tokens.
boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading)
boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading, boolean isMatched)
  Tests if the token sequence is unified.
void reset()
  Resets after use of unification.
void startNextToken()
  Call after every complete token (AnalyzedTokenReadings) has been checked.
void startUnify()
  Starts testing only those equivalences that were previously matched.
-
-
-
Field Detail
-
UNIFY_IGNORE
private static final java.lang.String UNIFY_IGNORE
- See Also:
- Constant Field Values
-
tokSequence
private final java.util.List<AnalyzedTokenReadings> tokSequence
-
tokSequenceEquivalences
private final java.util.List<java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>>> tokSequenceEquivalences
List of all equivalences matched per token in the sequence, kept exactly in sync with the list in tokSequence, so that reading 2 of token 1 has its equivalence map addressable as tokSequenceEquivalences.get(1).get(2).
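The indexing convention described here (outer index is the token, inner index is the reading) can be sketched with plain java.util collections. This is only an illustration of the data shape, not code from Unifier itself: the class EquivalenceIndexingSketch, its helper methods, and the feature and type strings "number" and "plural" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EquivalenceIndexingSketch {

  // Builds an empty structure with the same shape as tokSequenceEquivalences:
  // outer list = tokens in the sequence, inner list = readings of each token.
  static List<List<Map<String, Set<String>>>> emptyEquivalences(int tokens, int readingsPerToken) {
    List<List<Map<String, Set<String>>>> perToken = new ArrayList<>();
    for (int t = 0; t < tokens; t++) {
      List<Map<String, Set<String>>> perReading = new ArrayList<>();
      for (int r = 0; r < readingsPerToken; r++) {
        perReading.add(new HashMap<>());
      }
      perToken.add(perReading);
    }
    return perToken;
  }

  // Records one matched equivalence (hypothetical feature/type names)
  // for a given reading of a given token.
  static void record(List<List<Map<String, Set<String>>>> equivalences,
                     int token, int reading, String feature, String type) {
    equivalences.get(token).get(reading)
        .computeIfAbsent(feature, k -> new HashSet<>())
        .add(type);
  }

  public static void main(String[] args) {
    List<List<Map<String, Set<String>>>> equivalences = emptyEquivalences(2, 3);
    // Reading 2 of token 1, addressed exactly as in the field description.
    record(equivalences, 1, 2, "number", "plural");
    System.out.println(equivalences.get(1).get(2));
  }
}
```

Keeping this list aligned index-for-index with tokSequence is what makes the get(token).get(reading) lookup meaningful.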
-
equivalenceTypes
private final java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes
A Map for storing the equivalence types for features. Features are specified as Strings, and map into types defined as maps from Strings to Elements.
-
equivalenceFeatures
private final java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures
A Map that stores all possible equivalence types listed for features.
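As a rough illustration of the shape of this map (and of the uFeatures arguments, which share its type), the sketch below maps feature names to lists of equivalence type names. The concrete strings are assumptions chosen to echo the gender and number example in the isUnified usage note; they are not taken from any real LanguageTool rule file, and EquivalenceFeaturesSketch is a hypothetical class.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EquivalenceFeaturesSketch {

  // Feature name -> equivalence types listed for that feature.
  // All names here are illustrative assumptions.
  static Map<String, List<String>> exampleFeatures() {
    Map<String, List<String>> features = new LinkedHashMap<>();
    features.put("number", Arrays.asList("singular", "plural"));
    features.put("gender", Arrays.asList("masculine", "feminine", "neuter"));
    return features;
  }

  public static void main(String[] args) {
    exampleFeatures().forEach(
        (feature, types) -> System.out.println(feature + " -> " + types));
  }
}
```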
-
equivalencesMatched
private final java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>> equivalencesMatched
List of maps of sets of matched equivalences in the unified sequence.
-
allFeatsIn
private boolean allFeatsIn
-
tokCnt
private int tokCnt
-
readingsCounter
private int readingsCounter
-
featuresFound
private java.util.List<java.lang.Boolean> featuresFound
-
tmpFeaturesFound
private java.util.List<java.lang.Boolean> tmpFeaturesFound
-
equivalencesToBeKept
private final java.util.Map<java.lang.String,java.util.Set<java.lang.String>> equivalencesToBeKept
-
unificationFeats
private java.util.Map<java.lang.String,java.util.List<java.lang.String>> unificationFeats
-
inUnification
private boolean inUnification
-
uniMatched
private boolean uniMatched
-
uniAllMatched
private boolean uniAllMatched
-
-
Constructor Detail
-
Unifier
public Unifier(java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes, java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures)
Instantiates the unifier.
-
-
Method Detail
-
isSatisfied
protected final boolean isSatisfied(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
Tests if a token has shared features with other tokens.
- Parameters:
- aToken - token to be tested
- uFeatures - features to be tested
- Returns:
- true if the token shares this type of feature with other tokens
-
checkNext
private boolean checkNext(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
-
startNextToken
public final void startNextToken()
Call after every complete token (AnalyzedTokenReadings) has been checked.
-
startUnify
public final void startUnify()
Starts testing only those equivalences that were previously matched.
-
getFinalUnificationValue
public final boolean getFinalUnificationValue(java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
Makes sure that all the required features of the unification were really matched.
- Parameters:
- uFeatures - features to be checked
- Returns:
- true if the token sequence has been found
- Since:
- 2.5
-
reset
public final void reset()
Resets after use of unification. Required.
-
getUnifiedTokens
@Nullable public final @Nullable AnalyzedTokenReadings[] getUnifiedTokens()
Gets a full sequence of filtered tokens.
- Returns:
- Array of AnalyzedTokenReadings that match the equivalence relation defined for the features tested, or null
-
addTokenToSequence
private void addTokenToSequence(java.util.List<AnalyzedTokenReadings> tokenSequence, AnalyzedToken token, int pos)
-
isUnified
public final boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading, boolean isMatched)
Tests if the token sequence is unified.
Usage note: to test if the sequence of tokens is unified (i.e., shares a group of features, such as the same gender, number, grammatical case etc.), you need to test all tokens but the last one in the following way: call isUnified() for every reading of a token, and set lastReading to true for the token's last reading. For the last token, check the truth value returned by this method. For the earlier tokens, the returned value may be discarded before the final check. See AbstractPatternRule for an example.
To make it work in XML rules, the Elements built based on <token>s inside the unify block have to be processed in a special way: namely the last Element has to be marked as the last one (by using PatternToken.setLastInUnification()).
- Parameters:
- matchToken - AnalyzedToken token to unify
- lastReading - true when the matchToken is the last reading in the AnalyzedTokenReadings
- isMatched - true if the reading matches the element in the pattern rule, otherwise the reading is not considered in the unification
- Returns:
- true if the tokens in the sequence are unified
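The calling protocol from the usage note can be sketched without the LanguageTool classes. The code below is a toy model under stated assumptions, not the library's implementation: ReadingUnifier is a hypothetical stand-in whose isUnified() drops the uFeatures argument, each reading is reduced to a single tag string for one feature, and ToyUnifier treats a sequence as unified when all tokens share at least one tag. Only the driver loop in unifySequence mirrors the documented protocol.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnifyProtocolSketch {

  /** Hypothetical stand-in for isUnified(); not LanguageTool's Unifier. */
  interface ReadingUnifier {
    boolean isUnified(String readingTag, boolean lastReading);
  }

  /** Toy rule: the sequence is unified if all tokens share at least one tag. */
  static class ToyUnifier implements ReadingUnifier {
    private Set<String> shared = null;                    // tags common to all finished tokens
    private final Set<String> current = new HashSet<>();  // tags of the token in progress

    @Override
    public boolean isUnified(String readingTag, boolean lastReading) {
      current.add(readingTag);
      Set<String> candidate = new HashSet<>(current);
      if (shared != null) {
        candidate.retainAll(shared);  // keep only tags seen in every earlier token
      }
      if (lastReading) {              // token complete: fold its readings into the shared set
        shared = candidate;
        current.clear();
      }
      return !candidate.isEmpty();
    }
  }

  /** Calls isUnified() for every reading, flagging each token's last reading. */
  static boolean unifySequence(ReadingUnifier unifier, List<List<String>> tokens) {
    boolean result = false;
    for (List<String> readings : tokens) {
      for (int i = 0; i < readings.size(); i++) {
        boolean lastReading = i == readings.size() - 1;
        // Intermediate results are overwritten; only the last token's value is kept.
        result = unifier.isUnified(readings.get(i), lastReading);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<List<String>> agreeing = Arrays.asList(Arrays.asList("sg", "pl"), Arrays.asList("pl"));
    List<List<String>> clashing = Arrays.asList(Arrays.asList("sg"), Arrays.asList("pl"));
    System.out.println(unifySequence(new ToyUnifier(), agreeing));
    System.out.println(unifySequence(new ToyUnifier(), clashing));
  }
}
```

In this model an intermediate call can return false (for example, the "sg" reading of a token that also has a "pl" reading), which is why only the value returned for the final reading of the final token is used.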
-
isUnified
public final boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading)
-
addNeutralElement
public final void addNeutralElement(AnalyzedTokenReadings analyzedTokenReadings)
Used to add neutral elements (AnalyzedTokenReadings) to the unified sequence. Useful if the sequence contains punctuation or connectives, for example.
- Parameters:
- analyzedTokenReadings - a neutral element to be added
- Since:
- 2.5
-
getFinalUnified
@Nullable public final @Nullable AnalyzedTokenReadings[] getFinalUnified()
Used for getting a unified sequence when the simple test method isUnified(AnalyzedToken, Map, boolean) was used.
- Returns:
- An array of AnalyzedTokenReadings, or null when not in unification
-
-