Package org.languagetool.rules.spelling
Class SpellingCheckRule
- java.lang.Object
-
- org.languagetool.rules.Rule
-
- org.languagetool.rules.spelling.SpellingCheckRule
-
- Direct Known Subclasses:
HunspellRule, MorfologikSpellerRule, SymSpellRule
public abstract class SpellingCheckRule extends Rule
An abstract base class for spell-checking rules.
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
private java.util.List<RuleWithLanguage>
altRules
private java.util.List<DisambiguationPatternRule>
antiPatterns
private boolean
considerIgnoreWords
private boolean
convertsCase
private static java.lang.String
CUSTOM_SPELLING_FILE
private static java.lang.String
CUSTOM_SPELLING_PROHIBIT_FILE
private static java.lang.String
GLOBAL_SPELLING_FILE
protected int
ignoreWordsWithLength
protected Language
language
protected @Nullable LanguageModel
languageModel
static java.lang.String
LANGUAGETOOL
The string LanguageTool.
static java.lang.String
LANGUAGETOOLER
The string LanguageTooler.
private static java.lang.String
SPELLING_FILE
private static java.lang.String
SPELLING_FILE_VARIANT
private static java.lang.String
SPELLING_IGNORE_FILE
private static java.lang.String
SPELLING_PROHIBIT_FILE
private static java.util.Comparator<java.lang.String>
STRING_LENGTH_COMPARATOR
private UserConfig
userConfig
protected CachingWordListLoader
wordListLoader
private java.util.Set<java.lang.String>
wordsToBeIgnored
private java.util.Map<java.lang.String,java.util.Set<java.lang.String>>
wordsToBeIgnoredDictionary
private java.util.Map<java.lang.String,java.util.Set<java.lang.String>>
wordsToBeIgnoredDictionaryIgnoreCase
private java.util.Set<java.lang.String>
wordsToBeProhibited
-
Constructor Summary
Constructors (Constructor, Description)
SpellingCheckRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig)
SpellingCheckRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages)
SpellingCheckRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages, @Nullable LanguageModel languageModel)
-
Method Summary
Methods (Modifier and Type, Method, Description)
protected Language
acceptedInAlternativeLanguage(java.lang.String word)
void
acceptPhrases(java.util.List<java.lang.String> phrases)
Accept (case-sensitively, unless at the start of a sentence) the given phrases even though they are not in the built-in dictionary.
void
addIgnoreTokens(java.util.List<java.lang.String> tokens)
Add the given words to the list of words to be ignored during spell check.
protected void
addIgnoreWords(java.lang.String line)
protected void
addProhibitedWords(java.util.List<java.lang.String> words)
protected static void
addSuggestionsToRuleMatch(java.lang.String word, java.util.List<java.lang.String> userCandidates, java.util.List<java.lang.String> candidates, @Nullable SuggestionsOrderer orderer, RuleMatch match)
protected RuleMatch
createWrongSplitMatch(AnalyzedSentence sentence, java.util.List<RuleMatch> ruleMatchesSoFar, int pos, java.lang.String coveredWord, java.lang.String suggestion1, java.lang.String suggestion2, int prevPos)
protected java.util.List<java.lang.String>
expandLine(java.lang.String line)
Expand suffixes in a line.
protected void
filterDupes(java.util.List<java.lang.String> words)
protected java.util.List<java.lang.String>
filterSuggestions(java.util.List<java.lang.String> suggestions, AnalyzedSentence sentence, int i)
Remove prohibited words from suggestions.
protected java.util.List<java.lang.String>
getAdditionalProhibitFileNames()
Get the names of additional prohibit files, which list words not to be accepted, even when the spell checker would accept them.
java.util.List<java.lang.String>
getAdditionalSpellingFileNames()
Get the names of additional spelling files, which list words to be accepted and used for suggestions, even when the spell checker would not accept them.
protected java.util.List<java.lang.String>
getAdditionalSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
Get additional suggestions added after other suggestions (note the rule may choose to re-order the suggestions anyway).
protected java.util.List<java.lang.String>
getAdditionalTopSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
Get additional suggestions added before other suggestions (note the rule may choose to re-order the suggestions anyway).
protected java.util.List<RuleWithLanguage>
getAlternativeLangSpellingRules(java.util.List<Language> alternativeLanguages)
java.util.List<DisambiguationPatternRule>
getAntiPatterns()
Override this to avoid false alarms by ignoring these patterns. Note that your Rule.match(AnalyzedSentence) method needs to call Rule.getSentenceWithImmunization(org.languagetool.AnalyzedSentence) for this to be used, and you need to check AnalyzedTokenReadings.isImmunized().
abstract java.lang.String
getDescription()
A short description of the error this rule can detect, usually in the language of the text that is checked.
abstract java.lang.String
getId()
A string used to identify the rule in e.g. configuration files.
protected java.lang.String
getIgnoreFileName()
Get the name of the ignore file, which lists words to be accepted, even when the spell checker would not accept them.
java.lang.String
getLanguageVariantSpellingFileName()
Get the name of the spelling file for a language variant (e.g., en-US or de-AT), which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
protected java.lang.String
getProhibitFileName()
Get the name of the prohibit file, which lists words not to be accepted, even when the spell checker would accept them.
java.lang.String
getSpellingFileName()
Get the name of the spelling file, which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
private java.util.List<PatternToken>
getTokensForSentenceStart(java.lang.String[] parts)
protected boolean
ignoreToken(AnalyzedTokenReadings[] tokens, int idx)
Returns true iff the token at the given position should be ignored by the spell checker.
protected boolean
ignoreWord(java.lang.String word)
Returns true iff the word should be ignored by the spell checker.
protected boolean
ignoreWord(java.util.List<java.lang.String> words, int idx)
Returns true iff the word at the given position should be ignored by the spell checker.
protected void
init()
boolean
isDictionaryBasedSpellingRule()
Whether this is a spelling rule that uses a dictionary.
protected boolean
isEMail(java.lang.String token)
private boolean
isIgnoredNoCase(java.lang.String word)
abstract boolean
isMisspelled(java.lang.String word)
protected boolean
isProhibited(java.lang.String word)
Whether the word is prohibited, i.e. whether it should be marked as a spelling error even if the spell checker would accept it.
private boolean
isProperNoun(java.lang.String wordWithoutS)
protected boolean
isUrl(java.lang.String token)
abstract RuleMatch[]
match(AnalyzedSentence sentence)
Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule.
protected java.util.List<java.lang.String>
reorderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
void
setConsiderIgnoreWords(boolean considerIgnoreWords)
Set whether the list of words to be explicitly ignored (set with addIgnoreTokens(List)) is considered at all.
void
setConvertsCase(boolean convertsCase)
Used to determine whether the dictionary will use case conversions for spell checking.
protected int
startsWithIgnoredWord(java.lang.String word, boolean caseSensitive)
Checks whether a word starts with an ignored word.
private void
updateIgnoredWordDictionary()
-
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, estimateContextForSureMatch, getCategory, getConfigureText, getCorrectExamples, getDefaultValue, getErrorTriggeringExamples, getIncorrectExamples, getLocQualityIssueType, getMaxConfigurableValue, getMinConfigurableValue, getSentenceWithImmunization, getUrl, hasConfigurableValue, isDefaultOff, isDefaultTempOff, isOfficeDefaultOff, isOfficeDefaultOn, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setErrorTriggeringExamples, setIncorrectExamples, setLocQualityIssueType, setOfficeDefaultOff, setOfficeDefaultOn, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
-
-
-
-
Field Detail
-
LANGUAGETOOL
public static final java.lang.String LANGUAGETOOL
The string LanguageTool.
- Since:
- 2.3
- See Also:
- Constant Field Values
-
LANGUAGETOOLER
public static final java.lang.String LANGUAGETOOLER
The string LanguageTooler.
- Since:
- 4.4
- See Also:
- Constant Field Values
-
language
protected final Language language
-
languageModel
@Experimental protected @Nullable LanguageModel languageModel
For rules from Language.getRelevantLanguageModelCapableRules. Optional; when set, it allows e.g. better suggestions.
- Since:
- 4.5
-
wordListLoader
protected final CachingWordListLoader wordListLoader
-
SPELLING_IGNORE_FILE
private static final java.lang.String SPELLING_IGNORE_FILE
- See Also:
- Constant Field Values
-
SPELLING_FILE
private static final java.lang.String SPELLING_FILE
- See Also:
- Constant Field Values
-
CUSTOM_SPELLING_FILE
private static final java.lang.String CUSTOM_SPELLING_FILE
- See Also:
- Constant Field Values
-
GLOBAL_SPELLING_FILE
private static final java.lang.String GLOBAL_SPELLING_FILE
- See Also:
- Constant Field Values
-
SPELLING_PROHIBIT_FILE
private static final java.lang.String SPELLING_PROHIBIT_FILE
- See Also:
- Constant Field Values
-
CUSTOM_SPELLING_PROHIBIT_FILE
private static final java.lang.String CUSTOM_SPELLING_PROHIBIT_FILE
- See Also:
- Constant Field Values
-
SPELLING_FILE_VARIANT
private static final java.lang.String SPELLING_FILE_VARIANT
-
STRING_LENGTH_COMPARATOR
private static final java.util.Comparator<java.lang.String> STRING_LENGTH_COMPARATOR
-
userConfig
private final UserConfig userConfig
-
wordsToBeIgnored
private final java.util.Set<java.lang.String> wordsToBeIgnored
-
wordsToBeProhibited
private final java.util.Set<java.lang.String> wordsToBeProhibited
-
altRules
private final java.util.List<RuleWithLanguage> altRules
-
wordsToBeIgnoredDictionary
private java.util.Map<java.lang.String,java.util.Set<java.lang.String>> wordsToBeIgnoredDictionary
-
wordsToBeIgnoredDictionaryIgnoreCase
private java.util.Map<java.lang.String,java.util.Set<java.lang.String>> wordsToBeIgnoredDictionaryIgnoreCase
-
antiPatterns
private java.util.List<DisambiguationPatternRule> antiPatterns
-
considerIgnoreWords
private boolean considerIgnoreWords
-
convertsCase
private boolean convertsCase
-
ignoreWordsWithLength
protected int ignoreWordsWithLength
-
-
Constructor Detail
-
SpellingCheckRule
public SpellingCheckRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig)
-
SpellingCheckRule
public SpellingCheckRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages)
- Since:
- 4.4
-
SpellingCheckRule
@Experimental public SpellingCheckRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages, @Nullable LanguageModel languageModel)
- Since:
- 4.5
-
-
Method Detail
-
addSuggestionsToRuleMatch
protected static void addSuggestionsToRuleMatch(java.lang.String word, java.util.List<java.lang.String> userCandidates, java.util.List<java.lang.String> candidates, @Nullable SuggestionsOrderer orderer, RuleMatch match)
- Parameters:
word - misspelled word that suggestions should be generated for
userCandidates - candidates from personal dictionary
candidates - candidates from default dictionary
orderer - model to rank suggestions / extract features, or null
match - rule match to add suggestions to
-
createWrongSplitMatch
protected RuleMatch createWrongSplitMatch(AnalyzedSentence sentence, java.util.List<RuleMatch> ruleMatchesSoFar, int pos, java.lang.String coveredWord, java.lang.String suggestion1, java.lang.String suggestion2, int prevPos)
-
getId
public abstract java.lang.String getId()
Description copied from class: Rule
A string used to identify the rule in e.g. configuration files. This string is supposed to be unique and to stay the same in all upcoming versions of LanguageTool. It's supposed to contain only the characters A-Z and the underscore.
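The character constraint above can be checked mechanically. The following is a minimal sketch, not LanguageTool code: the class RuleIdCheckSketch and its method isValidId are hypothetical names, and the regular expression is only a literal reading of "the characters A-Z and the underscore".

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of LanguageTool.
final class RuleIdCheckSketch {

  // Literal reading of the documented constraint: one or more of A-Z or '_'.
  private static final Pattern VALID_ID = Pattern.compile("[A-Z_]+");

  // Returns true if the id is non-null and consists only of A-Z and '_'.
  static boolean isValidId(String id) {
    return id != null && VALID_ID.matcher(id).matches();
  }

  public static void main(String[] args) {
    System.out.println(isValidId("MY_SPELLING_RULE"));
    System.out.println(isValidId("my-spelling-rule"));
  }
}
```

A check like this could run in a unit test for a subclass, so that an ID that drifts from the convention is caught early.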
-
getDescription
public abstract java.lang.String getDescription()
Description copied from class: Rule
A short description of the error this rule can detect, usually in the language of the text that is checked.
- Specified by:
getDescription in class Rule
-
match
public abstract RuleMatch[] match(AnalyzedSentence sentence) throws java.io.IOException
Description copied from class: Rule
Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule. Note that the order in which this method is called is not always guaranteed, i.e. the sentence order in the text may be different than the order in which you get the sentences (this may be the case when LanguageTool is used as a LibreOffice/OpenOffice add-on, for example).
-
isMisspelled
@Experimental public abstract boolean isMisspelled(java.lang.String word) throws java.io.IOException
- Throws:
java.io.IOException
- Since:
- 4.8
-
isDictionaryBasedSpellingRule
public boolean isDictionaryBasedSpellingRule()
Description copied from class: Rule
Whether this is a spelling rule that uses a dictionary. Rules that return true here are basically rules that work like a simple hunspell-like spellchecker: they check words without considering the words' context.
- Overrides:
isDictionaryBasedSpellingRule in class Rule
-
addIgnoreTokens
public void addIgnoreTokens(java.util.List<java.lang.String> tokens)
Add the given words to the list of words to be ignored during spell check. You might want to use acceptPhrases(List) instead, as only that can also deal with phrases.
-
updateIgnoredWordDictionary
private void updateIgnoredWordDictionary()
-
setConsiderIgnoreWords
public void setConsiderIgnoreWords(boolean considerIgnoreWords)
Set whether the list of words to be explicitly ignored (set with addIgnoreTokens(List)) is considered at all.
-
getAdditionalTopSuggestions
protected java.util.List<java.lang.String> getAdditionalTopSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word) throws java.io.IOException
Get additional suggestions added before other suggestions (note the rule may choose to re-order the suggestions anyway). Only add suggestions here that you know are spelled correctly; they will not be checked again before being shown to the user.
- Throws:
java.io.IOException
-
getAdditionalSuggestions
protected java.util.List<java.lang.String> getAdditionalSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
Get additional suggestions added after other suggestions (note the rule may choose to re-order the suggestions anyway).
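The intended placement of the two hook results relative to the regular suggestions can be sketched as follows. This is a simplified model, not the rule's actual code: SuggestionOrderSketch and combine are hypothetical names, and dropping duplicates while keeping first occurrences is an assumption made for the example. As noted above, a concrete rule may still re-order the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of suggestion placement; not LanguageTool's implementation.
final class SuggestionOrderSketch {

  // Top suggestions come first, regular ones next, additional ones last.
  // Assumption: duplicates are dropped, keeping the earliest position.
  static List<String> combine(List<String> top, List<String> regular, List<String> additional) {
    Set<String> ordered = new LinkedHashSet<>();
    ordered.addAll(top);
    ordered.addAll(regular);
    ordered.addAll(additional);
    return new ArrayList<>(ordered);
  }

  public static void main(String[] args) {
    System.out.println(combine(
        List.of("GitHub"),
        List.of("github", "GitHub"),
        List.of("git hub")));
  }
}
```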
-
ignoreToken
protected boolean ignoreToken(AnalyzedTokenReadings[] tokens, int idx) throws java.io.IOException
Returns true iff the token at the given position should be ignored by the spell checker.
- Throws:
java.io.IOException
-
ignoreWord
protected boolean ignoreWord(java.lang.String word) throws java.io.IOException
Returns true iff the word should be ignored by the spell checker. If possible, use ignoreToken(AnalyzedTokenReadings[], int) instead.
- Throws:
java.io.IOException
-
isIgnoredNoCase
private boolean isIgnoredNoCase(java.lang.String word)
-
ignoreWord
protected boolean ignoreWord(java.util.List<java.lang.String> words, int idx) throws java.io.IOException
Returns true iff the word at the given position should be ignored by the spell checker. If possible, use ignoreToken(AnalyzedTokenReadings[], int) instead.
- Throws:
java.io.IOException
- Since:
- 2.6
-
setConvertsCase
public void setConvertsCase(boolean convertsCase)
Used to determine whether the dictionary will use case conversions for spell checking.
- Parameters:
convertsCase - if true, then conversions are used.
- Since:
- 2.5
-
isUrl
protected boolean isUrl(java.lang.String token)
-
isEMail
protected boolean isEMail(java.lang.String token)
-
filterDupes
protected void filterDupes(java.util.List<java.lang.String> words)
-
init
protected void init() throws java.io.IOException
- Throws:
java.io.IOException
-
getIgnoreFileName
protected java.lang.String getIgnoreFileName()
Get the name of the ignore file, which lists words to be accepted, even when the spell checker would not accept them. Unlike with getSpellingFileName(), the words in this file will not be used for creating suggestions for misspelled words.
- Since:
- 2.7
-
getSpellingFileName
public java.lang.String getSpellingFileName()
Get the name of the spelling file, which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
- Since:
- 2.9, public since 3.5
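The difference between the ignore file (getIgnoreFileName()) and the spelling file (getSpellingFileName()) can be sketched with two plain sets. This is an illustrative model only: WordListRolesSketch, isAccepted and mayBeSuggested are hypothetical names, the built-in dictionary is left out, and the real class loads these lists from files rather than receiving them as sets.

```java
import java.util.Set;

// Hypothetical model of the two word lists; not LanguageTool's implementation.
final class WordListRolesSketch {
  private final Set<String> spellingWords; // stands in for the spelling file
  private final Set<String> ignoreWords;   // stands in for the ignore file

  WordListRolesSketch(Set<String> spellingWords, Set<String> ignoreWords) {
    this.spellingWords = spellingWords;
    this.ignoreWords = ignoreWords;
  }

  // Words from either list are accepted by the spell check.
  boolean isAccepted(String word) {
    return spellingWords.contains(word) || ignoreWords.contains(word);
  }

  // Only words from the spelling list may be offered as suggestions.
  boolean mayBeSuggested(String word) {
    return spellingWords.contains(word);
  }

  public static void main(String[] args) {
    WordListRolesSketch lists = new WordListRolesSketch(Set.of("LanguageTool"), Set.of("asdf"));
    System.out.println(lists.isAccepted("asdf") + " " + lists.mayBeSuggested("asdf"));
  }
}
```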
-
getAdditionalSpellingFileNames
public java.util.List<java.lang.String> getAdditionalSpellingFileNames()
Get the names of additional spelling files, which list words to be accepted and used for suggestions, even when the spell checker would not accept them.
- Since:
- 4.8
-
getLanguageVariantSpellingFileName
public java.lang.String getLanguageVariantSpellingFileName()
Get the name of the spelling file for a language variant (e.g., en-US or de-AT), which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
- Since:
- 4.3
-
getProhibitFileName
protected java.lang.String getProhibitFileName()
Get the name of the prohibit file, which lists words not to be accepted, even when the spell checker would accept them.
- Since:
- 2.8
-
getAdditionalProhibitFileNames
protected java.util.List<java.lang.String> getAdditionalProhibitFileNames()
Get the names of additional prohibit files, which list words not to be accepted, even when the spell checker would accept them.
- Since:
- 2.8
-
isProhibited
protected boolean isProhibited(java.lang.String word)
Whether the word is prohibited, i.e. whether it should be marked as a spelling error even if the spell checker would accept it. (This is useful to improve our spell checker without waiting for the upstream checker to be updated.)
- Since:
- 2.8
-
filterSuggestions
protected java.util.List<java.lang.String> filterSuggestions(java.util.List<java.lang.String> suggestions, AnalyzedSentence sentence, int i)
Remove prohibited words from suggestions.
- Since:
- 2.8
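The filtering step can be sketched as below. This is a simplified stand-in, not the real method: ProhibitFilterSketch is a hypothetical class, the prohibited words are passed in as a set instead of being read from the prohibit file, and the AnalyzedSentence and position parameters of the actual signature are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for the documented filtering step.
final class ProhibitFilterSketch {

  // Returns the suggestions in their original order, minus prohibited words.
  static List<String> filterSuggestions(List<String> suggestions, Set<String> prohibited) {
    List<String> result = new ArrayList<>();
    for (String suggestion : suggestions) {
      if (!prohibited.contains(suggestion)) {
        result.add(suggestion);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(filterSuggestions(List.of("teh", "the", "ten"), Set.of("teh")));
  }
}
```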
-
isProperNoun
private boolean isProperNoun(java.lang.String wordWithoutS)
-
addIgnoreWords
protected void addIgnoreWords(java.lang.String line)
- Parameters:
line - the line as read from spelling.txt.
- Since:
- 2.9, signature modified in 3.9
-
addProhibitedWords
protected void addProhibitedWords(java.util.List<java.lang.String> words)
- Parameters:
words - list of words to be prohibited.
- Since:
- 4.2
-
expandLine
protected java.util.List<java.lang.String> expandLine(java.lang.String line)
Expand suffixes in a line. By default, the line is not expanded. Implementations might e.g. turn bicycle/S into [bicycle, bicycles].
- Since:
- 3.0
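A language-specific override might expand the bicycle/S example along the following lines. This is a minimal sketch under stated assumptions: SuffixExpansionSketch is a hypothetical standalone class rather than a SpellingCheckRule subclass, and it supports only a single made-up flag, "/S", meaning "also accept the form with a trailing s". Lines without that flag are returned unchanged, matching the documented default.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical standalone version of a suffix-expanding override.
final class SuffixExpansionSketch {

  // Expands "word/S" to [word, words]; any other line is returned as-is.
  static List<String> expandLine(String line) {
    int slash = line.lastIndexOf('/');
    if (slash > 0 && line.substring(slash + 1).equals("S")) {
      String base = line.substring(0, slash);
      return Arrays.asList(base, base + "s");
    }
    return Collections.singletonList(line);
  }

  public static void main(String[] args) {
    System.out.println(expandLine("bicycle/S"));
    System.out.println(expandLine("house"));
  }
}
```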
-
getAlternativeLangSpellingRules
protected java.util.List<RuleWithLanguage> getAlternativeLangSpellingRules(java.util.List<Language> alternativeLanguages)
-
acceptedInAlternativeLanguage
protected Language acceptedInAlternativeLanguage(java.lang.String word) throws java.io.IOException
- Throws:
java.io.IOException
-
acceptPhrases
public void acceptPhrases(java.util.List<java.lang.String> phrases)
Accept (case-sensitively, unless at the start of a sentence) the given phrases even though they are not in the built-in dictionary. Use this to avoid false alarms on e.g. names and technical terms. Unlike addIgnoreTokens(List), this can deal with phrases. One way to call it:
rule.acceptPhrases(Arrays.asList("duodenal atresia"))
This way, checking would not create an error for "duodenal atresia", but it would still create an error for "duodenal" or "atresia" if they appear on their own.
- Since:
- 3.3
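The effect described above, where a phrase is accepted as a whole while its words are still flagged on their own, can be modelled without LanguageTool. The sketch below is an assumption-laden illustration: PhraseAcceptSketch and misspelled are hypothetical names, tokens are pre-split strings, the dictionary is a plain set, matching is strictly case-sensitive, and the sentence-start exception is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical model of phrase acceptance; not LanguageTool's implementation.
final class PhraseAcceptSketch {

  // Returns the tokens that are neither in the dictionary nor part of an accepted phrase.
  static List<String> misspelled(List<String> tokens, Set<String> dictionary, List<String> acceptedPhrases) {
    boolean[] covered = new boolean[tokens.size()];
    for (String phrase : acceptedPhrases) {
      List<String> parts = Arrays.asList(phrase.split(" "));
      // Mark every position where the whole phrase occurs in sequence.
      for (int i = 0; i + parts.size() <= tokens.size(); i++) {
        if (tokens.subList(i, i + parts.size()).equals(parts)) {
          Arrays.fill(covered, i, i + parts.size(), true);
        }
      }
    }
    List<String> result = new ArrayList<>();
    for (int i = 0; i < tokens.size(); i++) {
      if (!covered[i] && !dictionary.contains(tokens.get(i))) {
        result.add(tokens.get(i));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> dictionary = Set.of("a", "case", "of", "the");
    List<String> phrases = List.of("duodenal atresia");
    System.out.println(misspelled(List.of("a", "case", "of", "duodenal", "atresia"), dictionary, phrases));
    System.out.println(misspelled(List.of("the", "duodenal", "case"), dictionary, phrases));
  }
}
```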
-
getTokensForSentenceStart
private java.util.List<PatternToken> getTokensForSentenceStart(java.lang.String[] parts)
-
getAntiPatterns
public java.util.List<DisambiguationPatternRule> getAntiPatterns()
Description copied from class: Rule
Override this to avoid false alarms by ignoring these patterns. Note that your Rule.match(AnalyzedSentence) method needs to call Rule.getSentenceWithImmunization(org.languagetool.AnalyzedSentence) for this to be used, and you need to check AnalyzedTokenReadings.isImmunized().
- Overrides:
getAntiPatterns in class Rule
-
startsWithIgnoredWord
protected int startsWithIgnoredWord(java.lang.String word, boolean caseSensitive)
Checks whether a word starts with an ignored word. Note that a minimum word length of 4 characters is expected. (This is for better performance. Moreover, such short words are most likely contained in the dictionary.)
- Parameters:
word - entire word
caseSensitive - determines whether the check is case-sensitive
- Returns:
- length of the ignored word (i.e., the return value is 0 if the word does not start with an ignored word). If there are several matches from the set of ignored words, the length of the longest matching word is returned.
- Since:
- 3.5
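The documented contract (0 for no match, otherwise the length of the longest matching ignored word) can be sketched as follows. This is not the actual implementation: IgnoredPrefixSketch is a hypothetical class, the ignored words are passed in as a set, and a linear scan replaces whatever lookup structure the real class uses. Treating the 4-character minimum as applying to both the input word and the ignored entries is an assumption.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the documented contract; not LanguageTool's implementation.
final class IgnoredPrefixSketch {
  private static final int MIN_LENGTH = 4; // assumption: applies to word and ignored entries

  // Returns the length of the longest ignored word that the word starts with, or 0.
  static int startsWithIgnoredWord(String word, boolean caseSensitive, Set<String> ignoredWords) {
    if (word.length() < MIN_LENGTH) {
      return 0;
    }
    String candidate = caseSensitive ? word : word.toLowerCase(Locale.ROOT);
    int longest = 0;
    for (String ignored : ignoredWords) {
      if (ignored.length() < MIN_LENGTH) {
        continue;
      }
      String prefix = caseSensitive ? ignored : ignored.toLowerCase(Locale.ROOT);
      if (prefix.length() > longest && candidate.startsWith(prefix)) {
        longest = prefix.length();
      }
    }
    return longest;
  }

  public static void main(String[] args) {
    Set<String> ignored = Set.of("Language", "LanguageTool");
    System.out.println(startsWithIgnoredWord("LanguageToolPlus", true, ignored));
    System.out.println(startsWithIgnoredWord("languagetoolplus", false, ignored));
  }
}
```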
-
reorderSuggestions
@Experimental protected java.util.List<java.lang.String> reorderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
-
-