Package org.languagetool
Class JLanguageTool
- java.lang.Object
-
- org.languagetool.JLanguageTool
-
- Direct Known Subclasses:
MultiThreadedJLanguageTool
public class JLanguageTool extends java.lang.Object
The main class used for checking text against different rules:
- built-in Java rules (for English: a vs. an, whitespace after commas, ...)
- built-in pattern rules loaded from external XML files (usually called grammar.xml)
- your own implementation of the abstract Rule classes added with addRule(Rule)
You will probably want to use the subclass MultiThreadedJLanguageTool for best performance.
Thread-safety: this class is not thread-safe. Create one instance per thread, but create the language only once (e.g. new AmericanEnglish()) and use it for all instances of JLanguageTool.
- See Also:
MultiThreadedJLanguageTool
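The thread-safety advice (one checker per thread, one shared language object) can be sketched with a ThreadLocal. This is only an illustration of the pattern: SharedLanguage and PerThreadChecker are hypothetical stand-ins for Language and JLanguageTool, used here so the example compiles without the LanguageTool dependency.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a Language implementation: created once and shared.
final class SharedLanguage {
    final String code;
    SharedLanguage(String code) { this.code = code; }
}

// Hypothetical stand-in for a checker that is not thread-safe: one per thread.
final class PerThreadChecker {
    final SharedLanguage language;
    PerThreadChecker(SharedLanguage language) { this.language = language; }
}

final class PerThreadCheckerDemo {
    // The language object is built once for the whole application.
    static final SharedLanguage LANGUAGE = new SharedLanguage("en-US");

    // Each thread lazily receives its own checker; all of them share LANGUAGE.
    static final ThreadLocal<PerThreadChecker> CHECKER =
            ThreadLocal.withInitial(() -> new PerThreadChecker(LANGUAGE));

    // Returns the checker instance seen by a separate worker thread.
    static PerThreadChecker checkerFromWorkerThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<PerThreadChecker> task = CHECKER::get;
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With the real classes, the initializer would presumably construct a JLanguageTool around a single shared Language instance instead of the stand-ins.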
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
static class JLanguageTool.Mode
static class JLanguageTool.ParagraphHandling
Constants for correct paragraph-rule handling.
(package private) class JLanguageTool.TextCheckCallable
-
Field Summary
Fields (modifier and type, field, description):
private java.util.List<Language> altLanguages
static @Nullable java.lang.String BUILD_DATE
LanguageTool build date and time like 2013-10-17 16:10 or null if not run from JAR.
private java.util.List<Rule> builtinRules
private ResultCache cache
private boolean cleanOverlappingMatches
private static ResourceDataBroker dataBroker
private ShortDescriptionProvider descProvider
static java.lang.String DICTIONARY_FILENAME_EXTENSION
Extension of dictionary files read by Spellers.
private java.util.Set<CategoryId> disabledRuleCategories
private java.util.Set<java.lang.String> disabledRules
private java.util.Set<CategoryId> enabledRuleCategories
private java.util.Set<java.lang.String> enabledRules
static java.lang.String FALSE_FRIEND_FILE
The name of the file with false friend information.
static @Nullable java.lang.String GIT_SHORT_ID
Abbreviated git id or null if not available.
private Language language
private boolean listUnknownWords
private java.util.List<RuleMatchFilter> matchFilters
private float maxErrorsPerWordRate
static java.lang.String MESSAGE_BUNDLE
Name of the message bundle for translations.
private Language motherTongue
private java.util.Set<java.lang.String> optionalLanguageModelRules
static java.lang.String PARAGRAPH_END_TAGNAME
The internal tag used to mark the end of a paragraph.
static java.lang.String PATTERN_FILE
The name of the file with error patterns.
private java.io.PrintStream printStream
static java.lang.String SENTENCE_END_TAGNAME
The internal tag used to mark the end of a sentence.
static java.lang.String SENTENCE_START_TAGNAME
The internal tag used to mark the beginning of a sentence.
private static java.util.List<java.io.File> temporaryFiles
private java.util.Set<java.lang.String> unknownWords
private UserConfig userConfig
private java.util.List<Rule> userRules
static java.lang.String VERSION
LanguageTool version as a string like 2.3 or 2.4-SNAPSHOT.
-
Constructor Summary
Constructors (constructor, description):
JLanguageTool(Language language)
Create a JLanguageTool and set up the built-in Java rules for the given language.
JLanguageTool(Language language, java.util.List<Language> altLanguages, Language motherTongue, ResultCache cache, GlobalConfig globalConfig, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language lang, Language motherTongue)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language language, Language motherTongue, ResultCache cache)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language language, Language motherTongue, ResultCache cache, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language language, ResultCache cache, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
-
Method Summary
Methods (modifier and type, method, description):
private void activateDefaultFalseFriendRules()
Loads and activates the false friend rules from rules/false-friends.xml.
private void activateDefaultPatternRules()
Loads and activates the pattern rules from org/languagetool/rules/<languageCode>/grammar.xml.
void activateLanguageModelRules(java.io.File indexDir)
Activate rules that depend on a language model.
void activateNeuralNetworkRules(java.io.File modelDir)
Activate rules that depend on pretrained neural network models.
void activateWord2VecModelRules(java.io.File indexDir)
Activate rules that depend on a word2vec language model.
void addMatchFilter(@NotNull RuleMatchFilter filter)
Add a RuleMatchFilter for post-processing of rule matches. Filters are called sequentially in the same order as added.
void addRule(Rule rule)
Add a rule to be used by the next call to the check methods like check(String).
static void addTemporaryFile(java.io.File file)
Adds a temporary file to the internal list (internal method, you should never need to call this as a user of LanguageTool).
RuleMatch adjustRuleMatchPos(RuleMatch match, int charCount, int columnCount, int lineCount, java.lang.String sentence, AnnotatedText annotatedText)
Change RuleMatch positions so they are relative to the complete text, not just to the sentence.
protected java.util.List<AnalyzedSentence> analyzeSentences(java.util.List<java.lang.String> sentences)
java.util.List<AnalyzedSentence> analyzeText(java.lang.String text)
Use this method if you want to access LanguageTool's otherwise internal analysis of the text.
protected java.util.List<RuleMatch> applyCustomFilters(java.util.List<RuleMatch> matches, AnnotatedText text)
Should be called just once with the complete list of matches, before returning them to the caller.
java.util.List<RuleMatch> check(java.lang.String text)
The main check method.
java.util.List<RuleMatch> check(java.lang.String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode)
java.util.List<RuleMatch> check(java.lang.String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener)
java.util.List<RuleMatch> check(java.lang.String text, RuleMatchListener listener)
The main check method.
java.util.List<RuleMatch> check(AnnotatedText text)
The main check method.
java.util.List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode)
The main check method.
java.util.List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener)
The main check method.
java.util.List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener, JLanguageTool.Mode mode)
The main check method.
java.util.List<RuleMatch> check(AnnotatedText text, RuleMatchListener listener)
java.util.List<RuleMatch> checkAnalyzedSentence(JLanguageTool.ParagraphHandling paraMode, java.util.List<Rule> rules, AnalyzedSentence analyzedSentence)
This is an internal method that's public only for technical reasons, please use one of the check(String) methods instead.
(package private) static int countLineBreaks(java.lang.String s)
void disableCategory(CategoryId id)
Disable the given rule category so the check methods like check(String) won't use it.
void disableRule(java.lang.String ruleId)
Disable a given rule so the check methods like check(String) won't use it.
void disableRules(java.util.List<java.lang.String> ruleIds)
Disable the given rules so the check methods like check(String) won't use them.
void enableRule(java.lang.String ruleId)
Enable a given rule so the check methods like check(String) will use it.
void enableRuleCategory(CategoryId id)
Enable all rules of the given category so the check methods like check(String) will use them.
private java.util.List<SuggestedReplacement> extendSuggestions(java.util.List<SuggestedReplacement> replacements)
java.util.List<Rule> getAllActiveOfficeRules()
Works like getAllActiveRules but overrides defaults by office defaults.
java.util.List<Rule> getAllActiveRules()
Get all active (not disabled) rules for the current language that are built-in or that have been added using e.g.
private java.util.List<Rule> getAllBuiltinRules(Language language, java.util.ResourceBundle messages, UserConfig userConfig, GlobalConfig globalConfig)
java.util.List<Rule> getAllRules()
Get all rules for the current language that are built-in or that have been added using addRule(Rule).
AnalyzedSentence getAnalyzedSentence(java.lang.String sentence)
Tokenizes the given sentence into words and analyzes it, and then disambiguates POS tags.
private static @Nullable java.lang.String getBuildDate()
Returns the build date or null if not run from JAR.
java.util.Map<CategoryId,Category> getCategories()
Get all rule categories for the current language.
static ResourceDataBroker getDataBroker()
The grammar checker needs resources from the following directories: /resource, /rules
java.util.Set<java.lang.String> getDisabledRules()
Get rule ids of the rules that have been explicitly disabled.
Language getLanguage()
Get the language that was used to configure this instance.
static java.util.ResourceBundle getMessageBundle()
Gets the ResourceBundle (i18n strings) for the default language of the user's system.
static java.util.ResourceBundle getMessageBundle(Language lang)
Gets the ResourceBundle (i18n strings) for the given user interface language.
java.util.List<AbstractPatternRule> getPatternRulesByIdAndSubId(java.lang.String id, java.lang.String subId)
Get pattern rules by Id and SubId.
AnalyzedSentence getRawAnalyzedSentence(java.lang.String sentence)
Tokenizes the given sentence into words and analyzes it.
private static @Nullable java.lang.String getShortGitId()
Returns the abbreviated git id or null.
java.util.List<java.lang.String> getUnknownWords()
Get the alphabetically sorted list of unknown words in the latest run of one of the check(String) methods.
private boolean ignoreRule(Rule rule)
boolean isCategoryDisabled(CategoryId id)
Returns true if a category is explicitly disabled.
static boolean isPremiumVersion()
java.util.List<AbstractPatternRule> loadFalseFriendRules(java.lang.String filename)
Load false friend rules from an XML file.
java.util.List<AbstractPatternRule> loadPatternRules(java.lang.String filename)
Load pattern rules from an XML file.
protected java.util.List<RuleMatch> performCheck(java.util.List<AnalyzedSentence> analyzedSentences, java.util.List<java.lang.String> sentences, java.util.List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, JLanguageTool.Mode mode)
protected java.util.List<RuleMatch> performCheck(java.util.List<AnalyzedSentence> analyzedSentences, java.util.List<java.lang.String> sentences, java.util.List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, RuleMatchListener listener, JLanguageTool.Mode mode)
protected void printIfVerbose(java.lang.String s)
protected void printSentenceInfo(AnalyzedSentence analyzedSentence)
protected void rememberUnknownWords(AnalyzedSentence analyzedText)
static void removeTemporaryFiles()
Clean up all temporary files, if there are any.
private java.util.Map<java.lang.Integer,java.lang.String> replaceSoftHyphens(java.util.List<java.lang.String> tokens)
java.util.List<java.lang.String> sentenceTokenize(java.lang.String text)
Tokenizes the given text into sentences.
void setCleanOverlappingMatches(boolean cleanOverlappingMatches)
Whether the check(String) methods return overlapping errors.
void setConfigValues(java.util.Map<java.lang.String,java.lang.Integer> v)
static void setDataBroker(ResourceDataBroker broker)
The grammar checker needs resources from the following directories: /resource, /rules
void setListUnknownWords(boolean listUnknownWords)
Whether the check(String) methods store unknown words.
void setMaxErrorsPerWordRate(float maxErrorsPerWordRate)
Maximum errors per word rate; checking will stop with an exception if the rate is higher.
void setOutput(java.io.PrintStream printStream)
Set a PrintStream that will receive verbose output.
private void updateOptionalLanguageModelRules(@Nullable LanguageModel lm)
Remove rules that can profit from a language model, recreate them with the given model and add them again.
-
-
-
Field Detail
-
VERSION
public static final java.lang.String VERSION
LanguageTool version as a string like 2.3 or 2.4-SNAPSHOT.
- See Also:
- Constant Field Values
-
BUILD_DATE
@Nullable public static final java.lang.String BUILD_DATE
LanguageTool build date and time like 2013-10-17 16:10 or null if not run from JAR.
-
GIT_SHORT_ID
@Nullable public static final java.lang.String GIT_SHORT_ID
Abbreviated git id or null if not available.
- Since:
- 4.5
-
PATTERN_FILE
public static final java.lang.String PATTERN_FILE
The name of the file with error patterns.
- See Also:
- Constant Field Values
-
FALSE_FRIEND_FILE
public static final java.lang.String FALSE_FRIEND_FILE
The name of the file with false friend information.
- See Also:
- Constant Field Values
-
SENTENCE_START_TAGNAME
public static final java.lang.String SENTENCE_START_TAGNAME
The internal tag used to mark the beginning of a sentence.
- See Also:
- Constant Field Values
-
SENTENCE_END_TAGNAME
public static final java.lang.String SENTENCE_END_TAGNAME
The internal tag used to mark the end of a sentence.
- See Also:
- Constant Field Values
-
PARAGRAPH_END_TAGNAME
public static final java.lang.String PARAGRAPH_END_TAGNAME
The internal tag used to mark the end of a paragraph.
- See Also:
- Constant Field Values
-
MESSAGE_BUNDLE
public static final java.lang.String MESSAGE_BUNDLE
Name of the message bundle for translations.
- See Also:
- Constant Field Values
-
DICTIONARY_FILENAME_EXTENSION
public static final java.lang.String DICTIONARY_FILENAME_EXTENSION
Extension of dictionary files read by Spellers.
- See Also:
- Constant Field Values
-
cache
private final ResultCache cache
-
userConfig
private final UserConfig userConfig
-
descProvider
private final ShortDescriptionProvider descProvider
-
maxErrorsPerWordRate
private float maxErrorsPerWordRate
-
dataBroker
private static ResourceDataBroker dataBroker
-
builtinRules
private final java.util.List<Rule> builtinRules
-
userRules
private final java.util.List<Rule> userRules
-
optionalLanguageModelRules
private final java.util.Set<java.lang.String> optionalLanguageModelRules
-
disabledRules
private final java.util.Set<java.lang.String> disabledRules
-
disabledRuleCategories
private final java.util.Set<CategoryId> disabledRuleCategories
-
enabledRules
private final java.util.Set<java.lang.String> enabledRules
-
enabledRuleCategories
private final java.util.Set<CategoryId> enabledRuleCategories
-
language
private final Language language
-
altLanguages
private final java.util.List<Language> altLanguages
-
motherTongue
private final Language motherTongue
-
matchFilters
private final java.util.List<RuleMatchFilter> matchFilters
-
printStream
private java.io.PrintStream printStream
-
listUnknownWords
private boolean listUnknownWords
-
unknownWords
private java.util.Set<java.lang.String> unknownWords
-
cleanOverlappingMatches
private boolean cleanOverlappingMatches
-
temporaryFiles
private static final java.util.List<java.io.File> temporaryFiles
-
-
Constructor Detail
-
JLanguageTool
public JLanguageTool(Language lang, Language motherTongue)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
lang - the language of the text to be checked
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
-
JLanguageTool
public JLanguageTool(Language language)
Create a JLanguageTool and set up the built-in Java rules for the given language.
- Parameters:
language - the language of the text to be checked
-
JLanguageTool
public JLanguageTool(Language language, Language motherTongue, ResultCache cache)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes
- Since:
- 3.7
-
JLanguageTool
@Experimental public JLanguageTool(Language language, ResultCache cache, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes. Use null to deactivate the cache.
- Since:
- 4.2
-
JLanguageTool
@Experimental public JLanguageTool(Language language, java.util.List<Language> altLanguages, Language motherTongue, ResultCache cache, GlobalConfig globalConfig, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
altLanguages - The languages that are accepted as alternative languages - currently this means words are accepted if they are in an alternative language and not similar to a word from language. If there's a similar word in language, there will be an error of type RuleMatch.Type.Hint (EXPERIMENTAL)
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes
- Since:
- 4.3
-
JLanguageTool
@Experimental public JLanguageTool(Language language, Language motherTongue, ResultCache cache, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes
- Since:
- 4.2
-
-
Method Detail
-
getBuildDate
@Nullable private static java.lang.String getBuildDate()
Returns the build date or null if not run from JAR.
-
getShortGitId
@Nullable private static java.lang.String getShortGitId()
Returns the abbreviated git id or null.
-
isPremiumVersion
public static boolean isPremiumVersion()
- Since:
- 4.2
-
getDataBroker
public static ResourceDataBroker getDataBroker()
The grammar checker needs resources from the following directories:
/resource
/rules
- Returns:
- The currently set data broker, which gives access to resources from the directories mentioned above. If no data broker was set, a new DefaultResourceDataBroker will be instantiated and returned.
- Since:
- 1.0.1
-
setDataBroker
public static void setDataBroker(ResourceDataBroker broker)
The grammar checker needs resources from the following directories:
/resource
/rules
- Parameters:
broker - The new resource broker to be used.
- Since:
- 1.0.1
-
setListUnknownWords
public void setListUnknownWords(boolean listUnknownWords)
Whether the check(String) methods store unknown words. If set to true (default: false), you can get the list of unknown words using getUnknownWords().
-
setCleanOverlappingMatches
public void setCleanOverlappingMatches(boolean cleanOverlappingMatches)
Whether the check(String) methods return overlapping errors. If set to true (default: true), it removes overlapping errors according to the priorities established for the language.
- Since:
- 3.6
-
setMaxErrorsPerWordRate
@Experimental public void setMaxErrorsPerWordRate(float maxErrorsPerWordRate)
Maximum errors per word rate; checking will stop with an exception if the rate is higher. For example, with a rate of 0.33, the checking would stop if the user's text has so many errors that more than every 3rd word causes a rule match. Note that this may not apply for very short texts.
- Since:
- 4.0
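The rate described above is a ratio of rule matches to words. The following is a minimal sketch of that arithmetic only, not LanguageTool's implementation: ErrorRateSketch and exceedsRate are hypothetical names, and the special handling of very short texts mentioned above is not modeled.

```java
// Hypothetical helper, not part of LanguageTool: shows only the ratio arithmetic.
final class ErrorRateSketch {
    // Returns true when matches per word exceed the configured maximum rate.
    static boolean exceedsRate(int matchCount, int wordCount, float maxErrorsPerWordRate) {
        if (wordCount == 0) {
            return false; // nothing to measure on an empty text
        }
        return (float) matchCount / wordCount > maxErrorsPerWordRate;
    }
}
```

With a limit of 0.33, a text of 100 words with 40 matches would exceed the rate in this sketch, while one with 30 matches would not.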
-
getMessageBundle
public static java.util.ResourceBundle getMessageBundle()
Gets the ResourceBundle (i18n strings) for the default language of the user's system.
-
getMessageBundle
public static java.util.ResourceBundle getMessageBundle(Language lang)
Gets the ResourceBundle (i18n strings) for the given user interface language.
- Since:
- 2.4 (public since 2.4)
-
getAllBuiltinRules
private java.util.List<Rule> getAllBuiltinRules(Language language, java.util.ResourceBundle messages, UserConfig userConfig, GlobalConfig globalConfig)
-
setOutput
public void setOutput(java.io.PrintStream printStream)
Set a PrintStream that will receive verbose output. Set to null (which is the default) to disable verbose output.
-
loadPatternRules
public java.util.List<AbstractPatternRule> loadPatternRules(java.lang.String filename) throws java.io.IOException
Load pattern rules from an XML file. Use addRule(Rule) to add these rules to the checking process.
- Parameters:
filename - path to an XML file in the classpath or in the filesystem - the classpath is checked first
- Returns:
- a List of PatternRule objects
- Throws:
java.io.IOException
-
loadFalseFriendRules
public java.util.List<AbstractPatternRule> loadFalseFriendRules(java.lang.String filename) throws javax.xml.parsers.ParserConfigurationException, org.xml.sax.SAXException, java.io.IOException
Load false friend rules from an XML file. Only those pairs will be loaded that match the current text language and the mother tongue specified in the JLanguageTool constructor. Use addRule(Rule) to add these rules to the checking process.
- Parameters:
filename - path to an XML file in the classpath or in the filesystem - the classpath is checked first
- Returns:
- a List of PatternRule objects, or an empty list if mother tongue is not set
- Throws:
javax.xml.parsers.ParserConfigurationException
org.xml.sax.SAXException
java.io.IOException
-
updateOptionalLanguageModelRules
private void updateOptionalLanguageModelRules(@Nullable LanguageModel lm)
Remove rules that can profit from a language model, recreate them with the given model and add them again.
- Parameters:
lm - the language model or null if none is available
-
activateNeuralNetworkRules
public void activateNeuralNetworkRules(java.io.File modelDir) throws java.io.IOException
Activate rules that depend on pretrained neural network models.
- Parameters:
modelDir - root dir of exported models
- Throws:
java.io.IOException
- Since:
- 4.4
-
activateLanguageModelRules
public void activateLanguageModelRules(java.io.File indexDir) throws java.io.IOException
Activate rules that depend on a language model. The language model currently consists of Lucene indexes with ngram occurrence counts.
- Parameters:
indexDir - directory with a '3grams' sub directory which contains a Lucene index with 3gram occurrence counts
- Throws:
java.io.IOException
- Since:
- 2.7
-
activateWord2VecModelRules
public void activateWord2VecModelRules(java.io.File indexDir) throws java.io.IOException
Activate rules that depend on a word2vec language model.
- Parameters:
indexDir - directory with subdirectories like 'en', each containing dictionary.txt and final_embeddings.txt
- Throws:
java.io.IOException
- Since:
- 4.0
-
activateDefaultPatternRules
private void activateDefaultPatternRules() throws java.io.IOException
Loads and activates the pattern rules from org/languagetool/rules/<languageCode>/grammar.xml.
- Throws:
java.io.IOException
-
activateDefaultFalseFriendRules
private void activateDefaultFalseFriendRules() throws javax.xml.parsers.ParserConfigurationException, org.xml.sax.SAXException, java.io.IOException
Loads and activates the false friend rules from rules/false-friends.xml.
- Throws:
javax.xml.parsers.ParserConfigurationException
org.xml.sax.SAXException
java.io.IOException
-
addMatchFilter
public void addMatchFilter(@NotNull RuleMatchFilter filter)
Add a RuleMatchFilter for post-processing of rule matches. Filters are called sequentially in the same order as added.
- Parameters:
filter - filter to add
- Since:
- 4.7
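The ordering guarantee above (filters run one after another, in the order they were added) can be sketched as a simple chain. This is an illustration under stated assumptions, not LanguageTool code: FilterChainSketch is a hypothetical class, matches are represented as plain strings, and each filter is modeled as a UnaryOperator over the list of matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in, not LanguageTool code: matches are plain strings here.
final class FilterChainSketch {
    private final List<UnaryOperator<List<String>>> filters = new ArrayList<>();

    // Filters are kept in insertion order.
    void addFilter(UnaryOperator<List<String>> filter) {
        filters.add(filter);
    }

    // Each filter receives the output of the previous one.
    List<String> apply(List<String> matches) {
        List<String> current = matches;
        for (UnaryOperator<List<String>> filter : filters) {
            current = filter.apply(current);
        }
        return current;
    }
}
```

Because each filter sees the previous filter's output, a filter that drops matches should be added before any filter that is expensive to run on them.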
-
addRule
public void addRule(Rule rule)
Add a rule to be used by the next call to the check methods like check(String).
-
disableRule
public void disableRule(java.lang.String ruleId)
Disable a given rule so the check methods like check(String) won't use it.
- Parameters:
ruleId - the id of the rule to disable - no error will be thrown if the id does not exist
- See Also:
enableRule(String)
-
disableRules
public void disableRules(java.util.List<java.lang.String> ruleIds)
Disable the given rules so the check methods like check(String) won't use them.
- Parameters:
ruleIds - the ids of the rules to disable - no error will be thrown if an id does not exist
- Since:
- 2.4
-
disableCategory
public void disableCategory(CategoryId id)
Disable the given rule category so the check methods like check(String) won't use it.
- Parameters:
id - the id of the category to disable - no error will be thrown if the id does not exist
- Since:
- 3.3
- See Also:
enableRuleCategory(CategoryId)
-
isCategoryDisabled
public boolean isCategoryDisabled(CategoryId id)
Returns true if a category is explicitly disabled.
- Parameters:
id - the id of the category to check - no error will be thrown if the id does not exist
- Returns:
- true if this category is explicitly disabled.
- Since:
- 3.5
- See Also:
disableCategory(org.languagetool.rules.CategoryId)
-
getLanguage
public Language getLanguage()
Get the language that was used to configure this instance.
-
getDisabledRules
public java.util.Set<java.lang.String> getDisabledRules()
Get rule ids of the rules that have been explicitly disabled.
-
enableRule
public void enableRule(java.lang.String ruleId)
Enable a given rule so the check methods like check(String) will use it. This will not throw an exception if the given rule id doesn't exist.
- Parameters:
ruleId - the id of the rule to enable
- See Also:
disableRule(String)
-
enableRuleCategory
public void enableRuleCategory(CategoryId id)
Enable all rules of the given category so the check methods like check(String) will use them. This will not throw an exception if the given category id doesn't exist.
- Since:
- 3.3
- See Also:
disableCategory(org.languagetool.rules.CategoryId)
-
sentenceTokenize
public java.util.List<java.lang.String> sentenceTokenize(java.lang.String text)
Tokenizes the given text into sentences.
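To illustrate what sentence tokenization produces, here is a rough sketch that uses the JDK's java.text.BreakIterator as a stand-in. It is not the tokenizer LanguageTool uses, and its boundaries may differ from those of sentenceTokenize; SentenceSplitSketch is a hypothetical name.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in built on the JDK, not LanguageTool's sentence tokenizer.
final class SentenceSplitSketch {
    // Splits text at the sentence boundaries reported by BreakIterator.
    static List<String> split(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }
}
```

Note that this sketch trims whitespace from each sentence, which a checker that needs exact character positions would probably avoid.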
-
check
public java.util.List<RuleMatch> check(java.lang.String text) throws java.io.IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Parameters:
text - the text to be checked
- Returns:
- a List of RuleMatch objects
- Throws:
java.io.IOException
-
check
public java.util.List<RuleMatch> check(java.lang.String text, RuleMatchListener listener) throws java.io.IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Parameters:
text - the text to be checked
- Returns:
- a List of RuleMatch objects
- Throws:
java.io.IOException
- Since:
- 3.7
-
check
public java.util.List<RuleMatch> check(java.lang.String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode) throws java.io.IOException
- Throws:
java.io.IOException
-
check
public java.util.List<RuleMatch> check(java.lang.String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener) throws java.io.IOException
- Throws:
java.io.IOException
- Since:
- 3.7
-
check
public java.util.List<RuleMatch> check(AnnotatedText text) throws java.io.IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules, adjusting error positions so they refer to the original text including markup.
- Throws:
java.io.IOException
- Since:
- 2.3
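The position adjustment for marked-up text can be pictured as a lookup from each character of the plain text back to its index in the original. The sketch below is a simplified illustration, not how AnnotatedText is implemented: MarkupOffsetSketch is a hypothetical class, and it assumes that angle brackets occur only as tag delimiters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not LanguageTool code: maps plain-text positions back
// to positions in the original text that still contains markup.
final class MarkupOffsetSketch {
    final StringBuilder plain = new StringBuilder();
    private final List<Integer> originalIndex = new ArrayList<>();

    // Assumes '<' and '>' appear only as tag delimiters.
    MarkupOffsetSketch(String markedUp) {
        boolean inTag = false;
        for (int i = 0; i < markedUp.length(); i++) {
            char c = markedUp.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            } else if (!inTag) {
                plain.append(c);       // text that would be checked
                originalIndex.add(i);  // where that character sits in the original
            }
        }
    }

    // Converts a position in the plain text to a position in the original text.
    int toOriginal(int plainPos) {
        return originalIndex.get(plainPos);
    }
}
```

A match found at a plain-text offset can then be reported at the offset a user would see in the original document.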
-
check
public java.util.List<RuleMatch> check(AnnotatedText text, RuleMatchListener listener) throws java.io.IOException
- Throws:
java.io.IOException
- Since:
- 3.9
-
check
public java.util.List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode) throws java.io.IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Parameters:
annotatedText - The text to be checked, created with AnnotatedTextBuilder. Call this method with the complete text to be checked. If you call it repeatedly with smaller chunks like paragraphs or sentences, those rules that work across paragraphs/sentences won't work (their status gets reset whenever this method is called).
tokenizeText - If true, then the text is tokenized into sentences. Otherwise, it is assumed it's already tokenized, i.e. it is only one sentence
paraMode - Uses paragraph-level rules only if true.
- Returns:
- a List of RuleMatch objects, describing potential errors in the text
- Throws:
java.io.IOException
- Since:
- 2.3
-
check
public java.util.List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener) throws java.io.IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Throws:
java.io.IOException
- Since:
- 3.7
-
check
public java.util.List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener, JLanguageTool.Mode mode) throws java.io.IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules depending on mode.
- Throws:
java.io.IOException
- Since:
- 4.3
-
analyzeText
public java.util.List<AnalyzedSentence> analyzeText(java.lang.String text) throws java.io.IOException
Use this method if you want to access LanguageTool's otherwise internal analysis of the text. For actual text checking, use the check... methods instead.
- Parameters:
text - The text to be analyzed
- Throws:
java.io.IOException
- Since:
- 2.5
-
analyzeSentences
protected java.util.List<AnalyzedSentence> analyzeSentences(java.util.List<java.lang.String> sentences) throws java.io.IOException
- Throws:
java.io.IOException
-
printSentenceInfo
protected void printSentenceInfo(AnalyzedSentence analyzedSentence)
-
performCheck
protected java.util.List<RuleMatch> performCheck(java.util.List<AnalyzedSentence> analyzedSentences, java.util.List<java.lang.String> sentences, java.util.List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, JLanguageTool.Mode mode) throws java.io.IOException
- Throws:
java.io.IOException
-
performCheck
protected java.util.List<RuleMatch> performCheck(java.util.List<AnalyzedSentence> analyzedSentences, java.util.List<java.lang.String> sentences, java.util.List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, RuleMatchListener listener, JLanguageTool.Mode mode) throws java.io.IOException
- Throws:
java.io.IOException
- Since:
- 3.7
-
checkAnalyzedSentence
public java.util.List<RuleMatch> checkAnalyzedSentence(JLanguageTool.ParagraphHandling paraMode, java.util.List<Rule> rules, AnalyzedSentence analyzedSentence) throws java.io.IOException
This is an internal method that is public only for technical reasons. Please use one of the check(String) methods instead.
- Throws:
java.io.IOException
- Since:
- 2.3
-
ignoreRule
private boolean ignoreRule(Rule rule)
-
adjustRuleMatchPos
public RuleMatch adjustRuleMatchPos(RuleMatch match, int charCount, int columnCount, int lineCount, java.lang.String sentence, AnnotatedText annotatedText)
Change RuleMatch positions so they are relative to the complete text, not just to the sentence.
- Parameters:
charCount - Count of characters in the sentences before
columnCount - Current column number
lineCount - Current line number
sentence - The text being checked
- Returns:
- The RuleMatch object with adjustments
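As a rough illustration of the idea (not LanguageTool's actual implementation), the sketch below shifts a sentence-relative character span by the number of characters in the preceding sentences. `Span`, `OffsetShiftSketch` and `shiftToText` are hypothetical names introduced here. The real method additionally receives column and line counts and an AnnotatedText, which this sketch ignores.

```java
// Hypothetical stand-in for a match position; not a LanguageTool class.
final class Span {
    final int from;
    final int to;

    Span(int from, int to) {
        this.from = from;
        this.to = to;
    }
}

final class OffsetShiftSketch {

    // Moves a sentence-relative span so that it is relative to the whole text.
    // charCount is the number of characters in the sentences before this one.
    static Span shiftToText(Span sentenceSpan, int charCount) {
        return new Span(sentenceSpan.from + charCount, sentenceSpan.to + charCount);
    }

    public static void main(String[] args) {
        String first = "This is fine. ";
        String second = "This are wrong.";
        String text = first + second;

        // Assume a rule flagged the word at positions 5..8 of the second sentence.
        Span inSentence = new Span(5, 8);
        Span inText = shiftToText(inSentence, first.length());

        // Both spans select the same word, once per coordinate system.
        System.out.println(second.substring(inSentence.from, inSentence.to));
        System.out.println(text.substring(inText.from, inText.to));
    }
}
```

Keeping the shift in one small function means the per-sentence rules can work with local offsets only; the caller only has to track the running character count.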
-
extendSuggestions
private java.util.List<SuggestedReplacement> extendSuggestions(java.util.List<SuggestedReplacement> replacements)
-
rememberUnknownWords
protected void rememberUnknownWords(AnalyzedSentence analyzedText)
-
getUnknownWords
public java.util.List<java.lang.String> getUnknownWords()
Get the alphabetically sorted list of unknown words in the latest run of one of the check(String) methods.
- Throws:
java.lang.IllegalStateException - if setListUnknownWords(boolean) has been set to false
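The contract described here (a sorted list, and an IllegalStateException when collection is switched off) can be modelled with a small stand-alone class. This is only a sketch under assumptions: `UnknownWordCollector` and `remember` are hypothetical names, and natural String ordering is used as an approximation of "alphabetically sorted". It is not how LanguageTool itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical collector that mimics the documented contract; not LanguageTool code.
final class UnknownWordCollector {
    // TreeSet keeps the words in natural String order and drops duplicates.
    private final Set<String> words = new TreeSet<>();
    private boolean listUnknownWords = false;

    void setListUnknownWords(boolean enabled) {
        listUnknownWords = enabled;
    }

    // Records a word only while collection is switched on.
    void remember(String word) {
        if (listUnknownWords) {
            words.add(word);
        }
    }

    // Returns the sorted words, or fails if collection is switched off.
    List<String> getUnknownWords() {
        if (!listUnknownWords) {
            throw new IllegalStateException("listUnknownWords is set to false");
        }
        return new ArrayList<>(words);
    }
}
```

A caller following this pattern would enable collection before checking and read the list afterwards; reading it without enabling collection fails fast instead of returning a misleading empty list.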
-
countLineBreaks
static int countLineBreaks(java.lang.String s)
-
getAnalyzedSentence
public AnalyzedSentence getAnalyzedSentence(java.lang.String sentence) throws java.io.IOException
Tokenizes the given sentence into words and analyzes it, and then disambiguates POS tags.
- Parameters:
sentence - sentence to be analyzed
- Throws:
java.io.IOException
-
getRawAnalyzedSentence
public AnalyzedSentence getRawAnalyzedSentence(java.lang.String sentence) throws java.io.IOException
Tokenizes the given sentence into words and analyzes it. This is the same as getAnalyzedSentence(String) but it does not run the disambiguator.
- Parameters:
sentence - sentence to be analyzed
- Throws:
java.io.IOException
- Since:
- 0.9.8
-
replaceSoftHyphens
private java.util.Map<java.lang.Integer,java.lang.String> replaceSoftHyphens(java.util.List<java.lang.String> tokens)
-
getCategories
public java.util.Map<CategoryId,Category> getCategories()
Get all rule categories for the current language.
- Returns:
- a map of Categories, keyed by their id.
- Since:
- 3.5
-
getAllRules
public java.util.List<Rule> getAllRules()
Get all rules for the current language that are built-in or that have been added using addRule(Rule). Please note that XML rules that are grouped will appear as multiple rules with the same id. To tell them apart, check if they are of type AbstractPatternRule, cast them to that type and call their AbstractPatternRule.getSubId() method.
- Returns:
- a List of Rule objects
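The type check and cast described above might look roughly like the following. To keep the sketch compilable without the library, `SketchRule` and `SketchPatternRule` are hypothetical stand-ins for Rule and AbstractPatternRule, and the `id[subId]` label format is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Rule.
class SketchRule {
    private final String id;

    SketchRule(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }
}

// Hypothetical stand-in for AbstractPatternRule, which adds a sub id.
class SketchPatternRule extends SketchRule {
    private final String subId;

    SketchPatternRule(String id, String subId) {
        super(id);
        this.subId = subId;
    }

    String getSubId() {
        return subId;
    }
}

final class RuleLabelSketch {

    // Builds a label that stays unique for grouped rules sharing one id.
    static String label(SketchRule rule) {
        if (rule instanceof SketchPatternRule) {
            SketchPatternRule patternRule = (SketchPatternRule) rule;
            return rule.getId() + "[" + patternRule.getSubId() + "]";
        }
        return rule.getId();
    }

    // Labels every rule in the list, preserving order.
    static List<String> labels(List<SketchRule> rules) {
        List<String> result = new ArrayList<>();
        for (SketchRule rule : rules) {
            result.add(label(rule));
        }
        return result;
    }
}
```

Against the real library, the same shape would apply to the list returned by getAllRules(), with the stand-in types replaced by Rule and AbstractPatternRule.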
-
getAllActiveRules
public java.util.List<Rule> getAllActiveRules()
Get all active (not disabled) rules for the current language that are built-in or that have been added using e.g. addRule(Rule). See getAllRules() for hints about rule ids.
- Returns:
- a List of Rule objects
-
getAllActiveOfficeRules
public java.util.List<Rule> getAllActiveOfficeRules()
Works like getAllActiveRules() but overrides the defaults with the office defaults.
- Returns:
- a List of Rule objects
- Since:
- 4.0
-
getPatternRulesByIdAndSubId
public java.util.List<AbstractPatternRule> getPatternRulesByIdAndSubId(java.lang.String id, java.lang.String subId)
Get pattern rules by Id and SubId. This returns a list because rules that use <or>...</or> are internally expanded into several rules.
- Returns:
- a List of Rule objects
- Since:
- 2.3
-
printIfVerbose
protected void printIfVerbose(java.lang.String s)
-
addTemporaryFile
public static void addTemporaryFile(java.io.File file)
Adds a temporary file to the internal list (internal method, you should never need to call this as a user of LanguageTool).
- Parameters:
file - the file to be added.
-
removeTemporaryFiles
public static void removeTemporaryFiles()
Clean up all temporary files, if there are any.
-
applyCustomFilters
protected java.util.List<RuleMatch> applyCustomFilters(java.util.List<RuleMatch> matches, AnnotatedText text)
Should be called just once with the complete list of matches, before returning them to the caller.
- Parameters:
matches - matches after applying rules and default filters
text - text that matches refer to
- Returns:
- transformed matches (after applying filters in matchFilters)
- Since:
- 4.7
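The "apply once over the complete list" idea can be pictured as a simple filter chain. The sketch below is an assumption about the general pattern, not the library's code: `FilterChainSketch` and `addFilter` are hypothetical, and plain strings stand in for RuleMatch objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical filter chain over plain strings; the real method works on RuleMatch objects.
final class FilterChainSketch {
    private final List<UnaryOperator<List<String>>> matchFilters = new ArrayList<>();

    void addFilter(UnaryOperator<List<String>> filter) {
        matchFilters.add(filter);
    }

    // Runs every filter in registration order over the complete list of matches.
    // Each filter sees the output of the previous one.
    List<String> applyCustomFilters(List<String> matches) {
        List<String> current = matches;
        for (UnaryOperator<List<String>> filter : matchFilters) {
            current = filter.apply(current);
        }
        return current;
    }
}
```

Because each filter receives the whole list, a filter can drop or rewrite matches based on their neighbors, which would not be possible if filters ran sentence by sentence.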
-
setConfigValues
public void setConfigValues(java.util.Map<java.lang.String,java.lang.Integer> v)
-
-