Class Unifier


  • public class Unifier
    extends java.lang.Object
    Implements unification of features over tokens.
    • Field Detail

      • tokSequenceEquivalences

        private final java.util.List<java.util.List<java.util.Map<java.lang.String,​java.util.Set<java.lang.String>>>> tokSequenceEquivalences
        List of all equivalences matched for each token in the sequence, kept exactly in sync with the list in tokSequence, so that the equivalence map for reading 2 of token 1 is addressable as tokSequenceEquivalences.get(1).get(2).
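        The token-first, reading-second addressing described above can be illustrated with a small stand-alone sketch. It only mirrors the declared type of the field; the class name EquivalenceIndexSketch, the helper methods, and the feature and type names are hypothetical and are not part of LanguageTool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch; not part of LanguageTool.
class EquivalenceIndexSketch {

    // One reading's equivalence map: feature name -> set of matched types.
    static Map<String, Set<String>> reading(String feature, String... types) {
        Map<String, Set<String>> map = new HashMap<>();
        map.put(feature, new HashSet<>(Arrays.asList(types)));
        return map;
    }

    // Outer list: tokens in the sequence. Inner list: readings of one token.
    static List<List<Map<String, Set<String>>>> build() {
        List<List<Map<String, Set<String>>>> sequence = new ArrayList<>();

        // Token 0 has a single reading.
        List<Map<String, Set<String>>> token0 = new ArrayList<>();
        token0.add(reading("number", "singular"));
        sequence.add(token0);

        // Token 1 has three readings (indexes 0, 1 and 2).
        List<Map<String, Set<String>>> token1 = new ArrayList<>();
        token1.add(reading("number", "singular"));
        token1.add(reading("gender", "feminine"));
        token1.add(reading("number", "plural"));
        sequence.add(token1);

        return sequence;
    }

    public static void main(String[] args) {
        // Reading 2 of token 1: first index selects the token, second the reading.
        System.out.println(build().get(1).get(2)); // prints {number=[plural]}
    }
}
```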
      • equivalenceTypes

        private final java.util.Map<EquivalenceTypeLocator,​PatternToken> equivalenceTypes
        A Map that stores the equivalence types for features. Each EquivalenceTypeLocator key identifies a type of a feature and maps to the PatternToken that defines that type.
      • equivalenceFeatures

        private final java.util.Map<java.lang.String,​java.util.List<java.lang.String>> equivalenceFeatures
        A Map that stores all possible equivalence types listed for features.
      • equivalencesMatched

        private final java.util.List<java.util.Map<java.lang.String,​java.util.Set<java.lang.String>>> equivalencesMatched
        List of maps of sets of matched equivalences in the unified sequence.
      • allFeatsIn

        private boolean allFeatsIn
      • tokCnt

        private int tokCnt
      • readingsCounter

        private int readingsCounter
      • featuresFound

        private java.util.List<java.lang.Boolean> featuresFound
      • tmpFeaturesFound

        private java.util.List<java.lang.Boolean> tmpFeaturesFound
      • equivalencesToBeKept

        private final java.util.Map<java.lang.String,​java.util.Set<java.lang.String>> equivalencesToBeKept
      • unificationFeats

        private java.util.Map<java.lang.String,​java.util.List<java.lang.String>> unificationFeats
      • inUnification

        private boolean inUnification
      • uniMatched

        private boolean uniMatched
      • uniAllMatched

        private boolean uniAllMatched
    • Constructor Detail

      • Unifier

        public Unifier​(java.util.Map<EquivalenceTypeLocator,​PatternToken> equivalenceTypes,
                       java.util.Map<java.lang.String,​java.util.List<java.lang.String>> equivalenceFeatures)
        Instantiates the unifier.
    • Method Detail

      • isSatisfied

        protected final boolean isSatisfied​(AnalyzedToken aToken,
                                            java.util.Map<java.lang.String,​java.util.List<java.lang.String>> uFeatures)
        Tests if a token has shared features with other tokens.
        Parameters:
        aToken - token to be tested
        uFeatures - features to be tested
        Returns:
        true if the token shares this type of feature with other tokens
      • checkNext

        private boolean checkNext​(AnalyzedToken aToken,
                                  java.util.Map<java.lang.String,​java.util.List<java.lang.String>> uFeatures)
      • startNextToken

        public final void startNextToken()
        Call after every complete token (AnalyzedTokenReadings) has been checked.
      • startUnify

        public final void startUnify()
        Starts testing only those equivalences that were previously matched.
      • getFinalUnificationValue

        public final boolean getFinalUnificationValue​(java.util.Map<java.lang.String,​java.util.List<java.lang.String>> uFeatures)
        Verifies that all the required features of the unification were actually matched.
        Parameters:
        uFeatures - Features to be checked
        Returns:
        True if the token sequence has been found.
        Since:
        2.5
      • reset

        public final void reset()
        Resets after use of unification. Required.
      • getUnifiedTokens

        @Nullable
        public final @Nullable AnalyzedTokenReadings[] getUnifiedTokens()
        Gets a full sequence of filtered tokens.
        Returns:
        Array of AnalyzedTokenReadings that match equivalence relation defined for features tested, or null
      • isUnified

        public final boolean isUnified​(AnalyzedToken matchToken,
                                       java.util.Map<java.lang.String,​java.util.List<java.lang.String>> uFeatures,
                                       boolean lastReading,
                                       boolean isMatched)
        Tests if the token sequence is unified.

        Usage note: to test whether a sequence of tokens is unified (i.e., shares a group of features, such as the same gender, number, or grammatical case), process every token but the last as follows: call isUnified() for every reading of the token, setting lastReading to true for the token's final reading. The value returned for these earlier tokens may be discarded before the final check. For the last token, check the truth value returned by this method. See AbstractPatternRule for an example.

        To make this work in XML rules, the elements (PatternToken objects) built from the <token>s inside the unify block have to be processed in a special way: the last element has to be marked as the last one (by using PatternToken.setLastInUnification()).
        Parameters:
        matchToken - AnalyzedToken token to unify
        uFeatures - features to be tested
        lastReading - true when the matchToken is the last reading in the AnalyzedTokenReadings
        isMatched - true if the reading matches the element in the pattern rule, otherwise the reading is not considered in the unification
        Returns:
        true if the tokens in the sequence are unified
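        The calling pattern from the usage note (feed every reading, flag the last reading of each token, and check the result only for the last token) can be sketched with a toy model. This is not LanguageTool's implementation and does not use its classes: ToyUnifier, its reading() helper, and the simplified two-argument isUnified() are hypothetical, and a reading is reduced to a map of gender and number values.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of the calling pattern; not LanguageTool's Unifier.
class ToyUnifier {

    // Feature combinations still shared by all completed tokens (null before the first one).
    private Set<Map<String, String>> candidates;
    // Readings collected for the token currently being processed.
    private final Set<Map<String, String>> currentReadings = new HashSet<>();

    // Builds one reading as a feature -> value map.
    static Map<String, String> reading(String gender, String number) {
        Map<String, String> map = new HashMap<>();
        map.put("gender", gender);
        map.put("number", number);
        return map;
    }

    // Feed one reading. The result is only meaningful when lastReading is true.
    boolean isUnified(Map<String, String> reading, boolean lastReading) {
        currentReadings.add(new HashMap<>(reading));
        if (!lastReading) {
            return false; // token not complete yet
        }
        if (candidates == null) {
            candidates = new HashSet<>(currentReadings); // first token: everything is a candidate
        } else {
            candidates.retainAll(currentReadings); // keep only combinations this token also has
        }
        currentReadings.clear();
        return !candidates.isEmpty();
    }

    // Clears all state so the instance can be reused for another sequence.
    void reset() {
        candidates = null;
        currentReadings.clear();
    }

    public static void main(String[] args) {
        ToyUnifier unifier = new ToyUnifier();
        // First token: two readings; the returned values are discarded.
        unifier.isUnified(reading("masc", "sg"), false);
        unifier.isUnified(reading("fem", "sg"), true);
        // Last token: one reading; this result is the one that counts.
        boolean unified = unifier.isUnified(reading("fem", "sg"), true);
        System.out.println(unified); // prints true
        unifier.reset();
    }
}
```

        In this sketch a sequence counts as unified when at least one complete feature combination survives in every token, which is why whole readings are intersected rather than individual feature values.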
      • isUnified

        public final boolean isUnified​(AnalyzedToken matchToken,
                                       java.util.Map<java.lang.String,​java.util.List<java.lang.String>> uFeatures,
                                       boolean lastReading)
      • addNeutralElement

        public final void addNeutralElement​(AnalyzedTokenReadings analyzedTokenReadings)
        Adds neutral elements (AnalyzedTokenReadings) to the unified sequence. Useful if the sequence contains punctuation or connectives, for example.
        Parameters:
        analyzedTokenReadings - A neutral element to be added.
        Since:
        2.5
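        The role of a neutral element can be sketched with a toy model in which such a token is kept in the output sequence but ignored by the agreement check. This is an illustration under assumptions, not LanguageTool's implementation: NeutralElementSketch, addToken(), addNeutral(), and the "gender:number" reading strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model; not LanguageTool's Unifier.
class NeutralElementSketch {

    // Surface forms of every token added so far, neutral ones included.
    private final List<String> sequence = new ArrayList<>();
    // Readings shared by all non-neutral tokens so far (null before the first one).
    private Set<String> shared;

    // Adds a token whose readings (e.g. "fem:sg") take part in the agreement check.
    boolean addToken(String surface, Set<String> readings) {
        if (shared == null) {
            shared = new HashSet<>(readings);
        } else {
            shared.retainAll(readings);
        }
        sequence.add(surface);
        return !shared.isEmpty();
    }

    // Adds a neutral token: kept in the sequence, ignored by the agreement check.
    void addNeutral(String surface) {
        sequence.add(surface);
    }

    List<String> tokens() {
        return Collections.unmodifiableList(sequence);
    }

    public static void main(String[] args) {
        NeutralElementSketch sketch = new NeutralElementSketch();
        sketch.addToken("adj1", new HashSet<>(Arrays.asList("fem:sg", "masc:sg")));
        sketch.addNeutral(","); // punctuation does not affect agreement
        boolean unified = sketch.addToken("adj2", Collections.singleton("fem:sg"));
        System.out.println(unified + " " + sketch.tokens().size()); // prints true 3
    }
}
```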