Class MorfologikSpellerRule
java.lang.Object
  org.languagetool.rules.Rule
    org.languagetool.rules.spelling.SpellingCheckRule
      org.languagetool.rules.spelling.morfologik.MorfologikSpellerRule
-
public abstract class MorfologikSpellerRule extends SpellingCheckRule
-
-
Field Summary
Modifier and Type, Field
private boolean checkCompound
private java.util.regex.Pattern compoundRegex
protected java.util.Locale conversionLocale
private boolean ignoreTaggedWords
(package private) static int MAX_FREQUENCY_FOR_SPLITTING
private boolean runningExperiment
protected MorfologikMultiSpeller speller1
protected MorfologikMultiSpeller speller2
protected MorfologikMultiSpeller speller3
private SuggestionsOrderer suggestionsOrderer
private UserConfig userConfig
-
Fields inherited from class org.languagetool.rules.spelling.SpellingCheckRule
ignoreWordsWithLength, language, languageModel, LANGUAGETOOL, LANGUAGETOOLER, wordListLoader
-
-
Constructor Summary
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language)
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig)
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages)
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages, LanguageModel languageModel)
-
Method Summary
Modifier and Type, Method, Description
private boolean canBeIgnored(AnalyzedTokenReadings[] tokens, int idx, AnalyzedTokenReadings token)
java.lang.String getDescription()
    A short description of the error this rule can detect, usually in the language of the text that is checked.
abstract java.lang.String getFileName()
    Get the filename, e.g., /resource/pl/spelling.dict.
protected int getFrequency(MorfologikMultiSpeller speller, java.lang.String word)
abstract java.lang.String getId()
    A string used to identify the rule in e.g. configuration files.
protected java.util.List<RuleMatch> getRuleMatches(java.lang.String word, int startPos, AnalyzedSentence sentence, java.util.List<RuleMatch> ruleMatchesSoFar, int idx, AnalyzedTokenReadings[] tokens)
protected boolean ignoreWord(java.lang.String word)
    Ignore surrogate pairs (emojis).
private void initSpeller(java.lang.String binaryDict)
private boolean initSpellers()
boolean isMisspelled(java.lang.String word)
protected boolean isMisspelled(MorfologikMultiSpeller speller, java.lang.String word)
protected boolean isSurrogatePairCombination(java.lang.String word)
    Checks whether a given String consists only of surrogate pairs.
private java.util.List<java.lang.String> joinBeforeAfterSuggestions(java.util.List<java.lang.String> suggestionsList, java.lang.String beforeSuggestionStr, java.lang.String afterSuggestionStr)
    Join strings before and after a suggestion.
RuleMatch[] match(AnalyzedSentence sentence)
    Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule.
protected java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
private java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word, AnalyzedSentence sentence, int startPos)
protected void setCheckCompound(boolean checkCompound)
protected void setCompoundRegex(java.lang.String compoundRegex)
void setIgnoreTaggedWords()
    Skip words that are known in the POS tagging dictionary, assuming they cannot be incorrect.
void setLocale(java.util.Locale locale)
@Nullable java.util.regex.Pattern tokenizingPattern()
    Get the regular expression pattern used to tokenize the words as in the source dictionary.
-
Methods inherited from class org.languagetool.rules.spelling.SpellingCheckRule
acceptedInAlternativeLanguage, acceptPhrases, addIgnoreTokens, addIgnoreWords, addProhibitedWords, addSuggestionsToRuleMatch, createWrongSplitMatch, expandLine, filterDupes, filterSuggestions, getAdditionalProhibitFileNames, getAdditionalSpellingFileNames, getAdditionalSuggestions, getAdditionalTopSuggestions, getAlternativeLangSpellingRules, getAntiPatterns, getIgnoreFileName, getLanguageVariantSpellingFileName, getProhibitFileName, getSpellingFileName, ignoreToken, ignoreWord, init, isDictionaryBasedSpellingRule, isEMail, isProhibited, isUrl, reorderSuggestions, setConsiderIgnoreWords, setConvertsCase, startsWithIgnoredWord
-
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, estimateContextForSureMatch, getCategory, getConfigureText, getCorrectExamples, getDefaultValue, getErrorTriggeringExamples, getIncorrectExamples, getLocQualityIssueType, getMaxConfigurableValue, getMinConfigurableValue, getSentenceWithImmunization, getUrl, hasConfigurableValue, isDefaultOff, isDefaultTempOff, isOfficeDefaultOff, isOfficeDefaultOn, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setErrorTriggeringExamples, setIncorrectExamples, setLocQualityIssueType, setOfficeDefaultOff, setOfficeDefaultOn, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
-
-
-
-
Field Detail
-
speller1
protected MorfologikMultiSpeller speller1
-
speller2
protected MorfologikMultiSpeller speller2
-
speller3
protected MorfologikMultiSpeller speller3
-
conversionLocale
protected java.util.Locale conversionLocale
-
suggestionsOrderer
private final SuggestionsOrderer suggestionsOrderer
-
runningExperiment
private final boolean runningExperiment
-
ignoreTaggedWords
private boolean ignoreTaggedWords
-
checkCompound
private boolean checkCompound
-
compoundRegex
private java.util.regex.Pattern compoundRegex
-
userConfig
private final UserConfig userConfig
-
MAX_FREQUENCY_FOR_SPLITTING
static final int MAX_FREQUENCY_FOR_SPLITTING
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language) throws java.io.IOException
- Throws:
java.io.IOException
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig) throws java.io.IOException
- Throws:
java.io.IOException
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages) throws java.io.IOException
- Throws:
java.io.IOException
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages, LanguageModel languageModel) throws java.io.IOException
- Throws:
java.io.IOException
-
-
Method Detail
-
getFileName
public abstract java.lang.String getFileName()
Get the filename, e.g., /resource/pl/spelling.dict.
-
getId
public abstract java.lang.String getId()
Description copied from class: Rule
A string used to identify the rule in e.g. configuration files. This string is supposed to be unique and to stay the same in all upcoming versions of LanguageTool. It's supposed to contain only the characters A-Z and the underscore.
- Specified by:
getId in class SpellingCheckRule
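The character constraint on rule IDs can be checked mechanically. The following is a minimal sketch, not LanguageTool code: the class `RuleIdCheck`, the method `isValidRuleId`, and the requirement that an ID be non-empty are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of LanguageTool: checks a candidate rule ID
// against the documented character set (A-Z and the underscore).
final class RuleIdCheck {
  // Assumption: an ID must contain at least one character.
  private static final Pattern VALID_ID = Pattern.compile("[A-Z_]+");

  static boolean isValidRuleId(String id) {
    return id != null && VALID_ID.matcher(id).matches();
  }
}
```

A subclass could run such a check in a unit test against the value returned by its own getId() implementation.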
-
getDescription
public java.lang.String getDescription()
Description copied from class: Rule
A short description of the error this rule can detect, usually in the language of the text that is checked.
- Specified by:
getDescription in class SpellingCheckRule
-
setLocale
public void setLocale(java.util.Locale locale)
-
setIgnoreTaggedWords
public void setIgnoreTaggedWords()
Skip words that are known in the POS tagging dictionary, assuming they cannot be incorrect.
-
match
public RuleMatch[] match(AnalyzedSentence sentence) throws java.io.IOException
Description copied from class: Rule
Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule. Note that the order in which this method is called is not always guaranteed, i.e. the sentence order in the text may differ from the order in which you get the sentences (this may be the case when LanguageTool is used as a LibreOffice/OpenOffice add-on, for example).
- Specified by:
match in class SpellingCheckRule
- Parameters:
sentence - a pre-analyzed sentence
- Returns:
- an array of RuleMatch objects
- Throws:
java.io.IOException
-
initSpellers
private boolean initSpellers() throws java.io.IOException
- Throws:
java.io.IOException
-
initSpeller
private void initSpeller(java.lang.String binaryDict) throws java.io.IOException
- Throws:
java.io.IOException
-
canBeIgnored
private boolean canBeIgnored(AnalyzedTokenReadings[] tokens, int idx, AnalyzedTokenReadings token) throws java.io.IOException
- Throws:
java.io.IOException
-
isMisspelled
@Experimental public boolean isMisspelled(java.lang.String word) throws java.io.IOException
- Specified by:
isMisspelled in class SpellingCheckRule
- Throws:
java.io.IOException
- Since:
- 4.8
-
isMisspelled
protected boolean isMisspelled(MorfologikMultiSpeller speller, java.lang.String word)
- Returns:
- true if the word is misspelled
- Since:
- 2.4
-
getFrequency
protected int getFrequency(MorfologikMultiSpeller speller, java.lang.String word)
-
getRuleMatches
protected java.util.List<RuleMatch> getRuleMatches(java.lang.String word, int startPos, AnalyzedSentence sentence, java.util.List<RuleMatch> ruleMatchesSoFar, int idx, AnalyzedTokenReadings[] tokens) throws java.io.IOException
- Throws:
java.io.IOException
-
tokenizingPattern
@Nullable public java.util.regex.Pattern tokenizingPattern()
Get the regular expression pattern used to tokenize the words as in the source dictionary. For example, it may contain a hyphen if words with hyphens are not included in the dictionary.
- Returns:
- A compiled Pattern that is used to tokenize words, or null.
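To illustrate the hyphen case, here is a minimal sketch of how a nullable tokenizing pattern might be applied before dictionary lookup. The class `TokenizingPatternSketch`, the method `splitForLookup`, and the hyphen pattern are hypothetical; how MorfologikSpellerRule itself consumes the pattern is not stated on this page.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration, not LanguageTool code.
final class TokenizingPatternSketch {
  // Assumed example pattern: the dictionary stores no hyphenated words,
  // so words are split on hyphens before lookup.
  static final Pattern HYPHEN = Pattern.compile("-");

  // Returns the parts to look up; a null pattern means "do not split".
  static List<String> splitForLookup(String word, Pattern tokenizingPattern) {
    if (tokenizingPattern == null) {
      return Collections.singletonList(word);
    }
    return Arrays.asList(tokenizingPattern.split(word));
  }
}
```

Because the return value is nullable, callers need the null check shown here before using the pattern.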
-
orderSuggestions
protected java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
-
orderSuggestions
private java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word, AnalyzedSentence sentence, int startPos)
-
setCheckCompound
protected void setCheckCompound(boolean checkCompound)
- Parameters:
checkCompound - If true and the word is not in the dictionary, it will be split (see setCompoundRegex(String)) and each component will be checked separately.
- Since:
- 2.4
-
setCompoundRegex
protected void setCompoundRegex(java.lang.String compoundRegex)
- Parameters:
compoundRegex - see setCheckCompound(boolean)
- Since:
- 2.4
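The interaction between setCheckCompound(boolean) and setCompoundRegex(String) can be sketched as follows. This is an illustration under stated assumptions, not the rule's actual code: the class `CompoundCheckSketch` is hypothetical, and a plain `Set<String>` stands in for the Morfologik dictionary.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical illustration, not LanguageTool code.
final class CompoundCheckSketch {
  private final Set<String> dictionary;  // stand-in for the real speller
  private final boolean checkCompound;
  private final Pattern compoundRegex;

  CompoundCheckSketch(Set<String> dictionary, boolean checkCompound, String compoundRegex) {
    this.dictionary = dictionary;
    this.checkCompound = checkCompound;
    this.compoundRegex = Pattern.compile(compoundRegex);
  }

  boolean isMisspelled(String word) {
    if (dictionary.contains(word)) {
      return false;
    }
    // Word is unknown: if compound checking is on and the regex occurs in
    // the word, split it and check every component separately.
    if (checkCompound && compoundRegex.matcher(word).find()) {
      for (String part : compoundRegex.split(word)) {
        if (!dictionary.contains(part)) {
          return true;
        }
      }
      return false;
    }
    return true;
  }
}
```

With a dictionary containing "well" and "known" and the regex "-", the word "well-known" is accepted only when compound checking is enabled.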
-
isSurrogatePairCombination
protected boolean isSurrogatePairCombination(java.lang.String word)
Checks whether a given String consists only of surrogate pairs.
- Parameters:
word - to be checked
- Since:
- 4.2
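One way to implement such a check with the standard library is shown below. This is a sketch, not the rule's source: the class `SurrogatePairSketch` and the decision to return false for the empty string are assumptions.

```java
// Hypothetical illustration, not LanguageTool code.
final class SurrogatePairSketch {
  // True if the string is a non-empty sequence of UTF-16 surrogate pairs,
  // i.e. every code point lies outside the Basic Multilingual Plane
  // (as many emojis do).
  static boolean consistsOnlyOfSurrogatePairs(String word) {
    if (word.isEmpty() || word.length() % 2 != 0) {
      return false;
    }
    for (int i = 0; i < word.length(); i += 2) {
      if (!Character.isSurrogatePair(word.charAt(i), word.charAt(i + 1))) {
        return false;
      }
    }
    return true;
  }
}
```

Stepping through the string two chars at a time works because each surrogate pair occupies exactly two `char` values.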
-
ignoreWord
protected boolean ignoreWord(java.lang.String word) throws java.io.IOException
Ignore surrogate pairs (emojis).
- Overrides:
ignoreWord in class SpellingCheckRule
- Throws:
java.io.IOException
- Since:
- 4.3
- See Also:
SpellingCheckRule.ignoreWord(java.lang.String)
-
joinBeforeAfterSuggestions
private java.util.List<java.lang.String> joinBeforeAfterSuggestions(java.util.List<java.lang.String> suggestionsList, java.lang.String beforeSuggestionStr, java.lang.String afterSuggestionStr)
Join strings before and after a suggestion. Used when there is also a suggestion for split words. Example: to thow > tot how | to throw
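One plausible reading of this description is that every suggestion is wrapped with the same text before and after it. The sketch below illustrates that reading only; the class `JoinSuggestionsSketch` and the method `joinBeforeAfter` are hypothetical, and the method being documented is private, so its body is not visible here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not LanguageTool code.
final class JoinSuggestionsSketch {
  // Wraps each suggestion with the given text before and after it.
  static List<String> joinBeforeAfter(List<String> suggestions, String before, String after) {
    List<String> joined = new ArrayList<>(suggestions.size());
    for (String suggestion : suggestions) {
      joined.add(before + suggestion + after);
    }
    return joined;
  }
}
```

In the documented example, the single-word suggestion "throw" joined with the preceding text "to " would yield "to throw", which can then be listed next to the split-word suggestion "tot how".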
-
-