Class PartialPosTagFilter


  • public abstract class PartialPosTagFilter
    extends RuleFilter
    Filters rule matches so that only matches are kept where a part of the token has a given POS tag. Expects these arguments:
    • no: the position (an integer, starting at 1) of the matched 'token' to be considered.
    • regexp: the regular expression that specifies the part of the token to be considered. For example, (?:in|un)(.*) will consider the part of the token that comes after 'in' or 'un'. Note that the first group is always the one considered, so if you need more parentheses, use non-capturing groups (?:...), as in the example.
    • postag_regexp: a regular expression to match the POS tag of the part of the word, e.g. VB.? to match any verb in English.
    • negate_postag: if the value is yes, the postag_regexp match is negated (not negated if not specified).
    • two_groups_regexp: if the value is yes, the regexp must contain two groups (one group if not specified).
    Since:
    2.8
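
    The role of the first capturing group in regexp can be illustrated with plain java.util.regex. This is a minimal sketch, not the class's actual implementation: the class name PartialTokenDemo and the helper extractPart are hypothetical, and it assumes the regexp must match the whole token.

    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class PartialTokenDemo {

        // Hypothetical helper: returns the text captured by the first group,
        // or null if the regexp does not match the whole token (an assumption).
        static String extractPart(String token, String regexp) {
            Matcher m = Pattern.compile(regexp).matcher(token);
            return m.matches() ? m.group(1) : null;
        }

        public static void main(String[] args) {
            String regexp = "(?:in|un)(.*)"; // the example from the argument list
            System.out.println(extractPart("undo", regexp));     // do
            System.out.println(extractPart("inactive", regexp)); // active
            System.out.println(extractPart("table", regexp));    // null
        }
    }
    ```

    Because (?:in|un) is non-capturing, group 1 is the remainder of the word, which is the part whose POS tag would then be checked against postag_regexp.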
    • Constructor Detail

      • PartialPosTagFilter

        public PartialPosTagFilter()
    • Method Detail

      • tag

        protected abstract @Nullable java.util.List<AnalyzedTokenReadings> tag(java.lang.String token)
      • acceptRuleMatch

        public RuleMatch acceptRuleMatch(RuleMatch match,
                                         java.util.Map<java.lang.String,java.lang.String> args,
                                         int patternTokenPos,
                                         AnalyzedTokenReadings[] patternTokens)
        Description copied from class: RuleFilter
        Returns the original rule match or a modified one, or null if the rule match is filtered out.
        Specified by:
        acceptRuleMatch in class RuleFilter
        Parameters:
        args - the resolved arguments from the args attribute in the XML. Resolved means that e.g. \1 has been replaced with the actual string at that match position.
        patternTokens - the tokens of the text that correspond to the matched pattern
        Returns:
        null if this rule match should be removed, or any other RuleMatch (e.g. the one from the arguments) that properly describes the detected error
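
        The keep-or-filter contract described above can be sketched without the LanguageTool types. Everything here is a stand-in chosen for illustration: FilterContractDemo, accept, and the TOY_TAGS lookup are hypothetical, the match is an arbitrary object, and returning null when the token does not match regexp is an assumption about the behavior.

        ```java
        import java.util.Map;
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        class FilterContractDemo {

            // Hypothetical stand-in for a tagger: word -> single POS tag.
            static final Map<String, String> TOY_TAGS = Map.of("do", "VB", "happy", "JJ");

            // Returns the match unchanged if it is kept, or null if it is filtered out.
            static <M> M accept(M match, String token, Map<String, String> args) {
                Matcher m = Pattern.compile(args.get("regexp")).matcher(token);
                if (!m.matches()) {
                    return null; // assumption: a non-matching token filters the match out
                }
                String part = m.group(1);
                String tag = (part == null) ? null : TOY_TAGS.get(part);
                boolean ok = tag != null && tag.matches(args.get("postag_regexp"));
                return ok ? match : null;
            }

            public static void main(String[] unused) {
                Map<String, String> args =
                    Map.of("regexp", "(?:in|un)(.*)", "postag_regexp", "VB.?");
                System.out.println(accept("match", "undo", args));    // match
                System.out.println(accept("match", "unhappy", args)); // null
            }
        }
        ```

        A caller would treat a null result as "drop this rule match" and any non-null result as the match to report.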
      • partialTagHasRequiredTag

        private boolean partialTagHasRequiredTag(java.util.List<AnalyzedTokenReadings> tags,
                                                 java.lang.String requiredTagRegexp,
                                                 boolean negatePos)
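
        This method is undocumented here, so the following is only a guess at the kind of check its signature suggests. Plain strings stand in for the POS tags carried by AnalyzedTokenReadings, the names TagCheckDemo and hasRequiredTag are hypothetical, and the "any reading satisfies the condition" semantics and the skipping of null tags are assumptions.

        ```java
        import java.util.Arrays;
        import java.util.List;

        class TagCheckDemo {

            // Returns true if any non-null tag matches requiredTagRegexp,
            // or, when negatePos is set, if any non-null tag does NOT match it.
            static boolean hasRequiredTag(List<String> posTags,
                                          String requiredTagRegexp,
                                          boolean negatePos) {
                for (String posTag : posTags) {
                    if (posTag == null) {
                        continue; // untagged readings are ignored in this sketch
                    }
                    boolean matches = posTag.matches(requiredTagRegexp);
                    if (matches != negatePos) {
                        return true;
                    }
                }
                return false;
            }

            public static void main(String[] args) {
                List<String> tags = Arrays.asList("VBD", null);
                System.out.println(hasRequiredTag(tags, "VB.?", false)); // true
                System.out.println(hasRequiredTag(tags, "VB.?", true));  // false
                System.out.println(hasRequiredTag(tags, "NN.?", true));  // true
            }
        }
        ```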