Class CompoundAwareHunspellRule
- java.lang.Object
-
- org.languagetool.rules.Rule
-
- org.languagetool.rules.spelling.SpellingCheckRule
-
- org.languagetool.rules.spelling.hunspell.HunspellRule
-
- org.languagetool.rules.spelling.hunspell.CompoundAwareHunspellRule
-
public abstract class CompoundAwareHunspellRule extends HunspellRule
A spell checker that combines Hunspell and Morfologik spell checking to support compound words and offer fast suggestions for some misspelled compound words.
-
-
Field Summary
Fields
Modifier and Type Field Description
private CompoundWordTokenizer
compoundSplitter
private static int
MAX_SUGGESTIONS
private MorfologikMultiSpeller
morfoSpeller
-
Fields inherited from class org.languagetool.rules.spelling.hunspell.HunspellRule
FILE_EXTENSION, hunspellDict, needsInit, nonWordPattern, RULE_ID, suggestionsOrderer
-
Fields inherited from class org.languagetool.rules.spelling.SpellingCheckRule
ignoreWordsWithLength, language, languageModel, LANGUAGETOOL, LANGUAGETOOLER, wordListLoader
-
-
Constructor Summary
Constructors
Constructor Description
CompoundAwareHunspellRule(java.util.ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig)
CompoundAwareHunspellRule(java.util.ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, java.util.List<Language> altLanguages)
CompoundAwareHunspellRule(java.util.ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, java.util.List<Language> altLanguages, LanguageModel languageModel)
-
Method Summary
Modifier and Type Method Description
protected abstract void
filterForLanguage(java.util.List<java.lang.String> suggestions)
protected java.util.List<java.lang.String>
getCandidates(java.lang.String word)
Find potential corrections. It's okay if some of these are not valid words; this list will be filtered against the spell checker before being returned to the user.
protected java.util.List<java.lang.String>
getCandidates(java.util.List<java.lang.String> parts)
private java.util.List<java.lang.String>
getCorrectWords(java.util.List<java.lang.String> wordsOrPhrases)
protected java.util.List<java.lang.String>
getFilteredSuggestions(java.util.List<java.lang.String> wordsOrPhrases)
java.util.List<java.lang.String>
getSuggestions(java.lang.String word)
As a Hunspell-based approach is too slow, we use Morfologik to create suggestions.
private void
handleWordEndPunctuation(java.lang.String punct, java.lang.String word, java.util.List<java.lang.String> noSplitSuggestions)
protected java.util.List<java.lang.String>
sortSuggestionByQuality(java.lang.String misspelling, java.util.List<java.lang.String> suggestions)
-
Methods inherited from class org.languagetool.rules.spelling.hunspell.HunspellRule
getActiveChecks, getDescription, getDictFilenameInResources, getId, getSentenceTextWithoutUrlsAndImmunizedTokens, init, isAcceptedWordFromLanguage, isMisspelled, isQuotedCompound, match, tokenizeText
-
Methods inherited from class org.languagetool.rules.spelling.SpellingCheckRule
acceptedInAlternativeLanguage, acceptPhrases, addIgnoreTokens, addIgnoreWords, addProhibitedWords, addSuggestionsToRuleMatch, createWrongSplitMatch, expandLine, filterDupes, filterSuggestions, getAdditionalProhibitFileNames, getAdditionalSpellingFileNames, getAdditionalSuggestions, getAdditionalTopSuggestions, getAlternativeLangSpellingRules, getAntiPatterns, getIgnoreFileName, getLanguageVariantSpellingFileName, getProhibitFileName, getSpellingFileName, ignoreToken, ignoreWord, ignoreWord, isDictionaryBasedSpellingRule, isEMail, isProhibited, isUrl, reorderSuggestions, setConsiderIgnoreWords, setConvertsCase, startsWithIgnoredWord
-
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, estimateContextForSureMatch, getCategory, getConfigureText, getCorrectExamples, getDefaultValue, getErrorTriggeringExamples, getIncorrectExamples, getLocQualityIssueType, getMaxConfigurableValue, getMinConfigurableValue, getSentenceWithImmunization, getUrl, hasConfigurableValue, isDefaultOff, isDefaultTempOff, isOfficeDefaultOff, isOfficeDefaultOn, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setErrorTriggeringExamples, setIncorrectExamples, setLocQualityIssueType, setOfficeDefaultOff, setOfficeDefaultOn, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
-
-
-
-
Field Detail
-
MAX_SUGGESTIONS
private static final int MAX_SUGGESTIONS
- See Also:
- Constant Field Values
-
compoundSplitter
private final CompoundWordTokenizer compoundSplitter
-
morfoSpeller
private final MorfologikMultiSpeller morfoSpeller
-
-
Constructor Detail
-
CompoundAwareHunspellRule
public CompoundAwareHunspellRule(java.util.ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig)
-
CompoundAwareHunspellRule
public CompoundAwareHunspellRule(java.util.ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, java.util.List<Language> altLanguages)
- Since:
- 4.3
-
CompoundAwareHunspellRule
public CompoundAwareHunspellRule(java.util.ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, java.util.List<Language> altLanguages, LanguageModel languageModel)
-
-
Method Detail
-
filterForLanguage
protected abstract void filterForLanguage(java.util.List<java.lang.String> suggestions)
-
getSuggestions
public java.util.List<java.lang.String> getSuggestions(java.lang.String word) throws java.io.IOException
As a Hunspell-based approach is too slow, we use Morfologik to create suggestions. As this won't work for compounds not in the dictionary, we split the word and also get suggestions on the compound parts. In the end, all candidates are filtered against Hunspell again (which supports compounds).
- Overrides:
getSuggestions
in class HunspellRule
- Throws:
java.io.IOException
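The three steps described for getSuggestions (fast suggestions for the whole word, suggestions on the compound parts, a final compound-aware filter) can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the class CompoundSuggestionSketch and the parameters fastSuggester, splitter and compoundAwareChecker are hypothetical stand-ins for MorfologikMultiSpeller, CompoundWordTokenizer and the Hunspell check, and replacing one part at a time is an assumption about how part-level suggestions are recombined.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative only: none of these names are part of the LanguageTool API.
class CompoundSuggestionSketch {

  static List<String> suggest(String word,
                              Function<String, List<String>> fastSuggester,
                              Function<String, List<String>> splitter,
                              Predicate<String> compoundAwareChecker) {
    // Step 1: fast suggestions for the whole word.
    // LinkedHashSet keeps insertion order and drops duplicates.
    Set<String> candidates = new LinkedHashSet<>(fastSuggester.apply(word));

    // Step 2: split the word and try corrections for each part in turn
    // (assumption: only one part is replaced per candidate).
    List<String> parts = splitter.apply(word);
    for (int i = 0; i < parts.size(); i++) {
      for (String replacement : fastSuggester.apply(parts.get(i))) {
        List<String> patched = new ArrayList<>(parts);
        patched.set(i, replacement);
        candidates.add(String.join("", patched));
      }
    }

    // Step 3: keep only candidates the compound-aware checker accepts.
    List<String> accepted = new ArrayList<>();
    for (String candidate : candidates) {
      if (compoundAwareChecker.test(candidate)) {
        accepted.add(candidate);
      }
    }
    return accepted;
  }

  // Toy data: the "dictionaries" below are invented for the demonstration.
  static List<String> demo() {
    Function<String, List<String>> fastSuggester = w -> {
      if ("Hous".equals(w)) {
        return List.of("Haus", "Hose");
      }
      return List.of();
    };
    Function<String, List<String>> splitter = w -> {
      if ("Housboot".equals(w)) {
        return List.of("Hous", "boot");
      }
      return List.of(w);
    };
    Set<String> validCompounds = Set.of("Hausboot");
    return suggest("Housboot", fastSuggester, splitter, validCompounds::contains);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In the toy data the whole-word lookup finds nothing for "Housboot", the part-level lookup proposes "Hausboot" and "Hoseboot", and the final filter keeps only the candidate that the compound-aware check accepts.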
-
handleWordEndPunctuation
private void handleWordEndPunctuation(java.lang.String punct, java.lang.String word, java.util.List<java.lang.String> noSplitSuggestions)
-
getCandidates
protected java.util.List<java.lang.String> getCandidates(java.lang.String word)
Find potential corrections. It's okay if some of these are not valid words; this list will be filtered against the spell checker before being returned to the user.
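The contract stated here, that candidates may include invalid words because a later step filters them, could look like the sketch below. The heuristics (case variants and hyphenated splits) are invented purely for illustration and are not claimed to match what getCandidates does; CandidateSketch, candidates and keepValid are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

// Illustrative only: not part of the LanguageTool API.
class CandidateSketch {

  // Deliberately over-generates: many results will not be real words.
  static List<String> candidates(String word) {
    List<String> out = new ArrayList<>();
    if (word.isEmpty()) {
      return out;
    }
    String lower = word.toLowerCase(Locale.ROOT);
    String capitalized = Character.toUpperCase(lower.charAt(0)) + lower.substring(1);
    out.add(capitalized);
    out.add(lower);
    // Hypothetical heuristic: every two-way split, joined with a hyphen.
    for (int i = 1; i < word.length(); i++) {
      out.add(word.substring(0, i) + "-" + word.substring(i));
    }
    return out;
  }

  // The later filtering step: only words the spell checker accepts survive.
  static List<String> keepValid(List<String> candidates, Predicate<String> isCorrect) {
    List<String> kept = new ArrayList<>();
    for (String candidate : candidates) {
      if (isCorrect.test(candidate)) {
        kept.add(candidate);
      }
    }
    return kept;
  }
}
```

Keeping generation permissive and validation separate means the generator can stay cheap, since correctness is decided in one place.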
-
getCandidates
protected java.util.List<java.lang.String> getCandidates(java.util.List<java.lang.String> parts)
-
sortSuggestionByQuality
protected java.util.List<java.lang.String> sortSuggestionByQuality(java.lang.String misspelling, java.util.List<java.lang.String> suggestions)
- Overrides:
sortSuggestionByQuality
in class HunspellRule
-
getCorrectWords
private java.util.List<java.lang.String> getCorrectWords(java.util.List<java.lang.String> wordsOrPhrases)
-
getFilteredSuggestions
protected java.util.List<java.lang.String> getFilteredSuggestions(java.util.List<java.lang.String> wordsOrPhrases)
- Since:
- 4.7
-
-