Package org.languagetool.tools
Class SynthDictionaryBuilder
- java.lang.Object
  - org.languagetool.tools.DictionaryBuilder
    - org.languagetool.tools.SynthDictionaryBuilder
final class SynthDictionaryBuilder extends DictionaryBuilder
Creates a Morfologik binary synthesizer dictionary from plain text data.
-
Field Summary
private static java.lang.String POLISH_IGNORE_REGEX
It makes sense to remove from the synthesizer dictionary all forms whose POS tags indicate "unknown form", "foreign word", etc., as they only take up space.
-
Constructor Summary
SynthDictionaryBuilder(java.io.File infoFile)
-
Method Summary
(package private) java.io.File build(java.io.File plainTextDictFile, java.io.File infoFile)
private java.util.Set<java.lang.String> collectTags(java.io.File plainTextDictFile)
private java.util.Set<java.lang.String> getIgnoreItems(java.io.File file)
private @Nullable java.util.regex.Pattern getPosTagIgnoreRegex(java.io.File infoFile)
private java.io.File getTagFile(java.io.File tempFile)
static void main(java.lang.String[] args)
private java.io.File reverseLineContent(java.io.File plainTextDictFile, java.util.Set<java.lang.String> itemsToBeIgnored, java.util.regex.Pattern ignorePosRegex)
private void writePosTagsToFile(java.io.File plainTextDictFile, java.io.File tagFile)
-
Methods inherited from class org.languagetool.tools.DictionaryBuilder
addFreqData, buildDict, buildFSA, convertTabToSeparator, getOption, getOutputFilename, hasOption, isOptionTrue, readFreqList, setOutputFilename
-
Field Detail
-
POLISH_IGNORE_REGEX
private static final java.lang.String POLISH_IGNORE_REGEX
It makes sense to remove from the synthesizer dictionary all forms whose POS tags indicate "unknown form", "foreign word", etc., as they only take up space. Probably nobody will ever use them.
- See Also:
Constant Field Values
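The filtering idea described for this field can be sketched as follows. This is a minimal illustration, not the class's actual code: the three-column tab-separated layout (inflected form, lemma, POS tag) is an assumption, the class and method names are hypothetical, and the pattern `unknown|foreign` is a made-up stand-in, since the real value of POLISH_IGNORE_REGEX is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class PosTagIgnoreSketch {

    // Hypothetical pattern for illustration only; NOT the real POLISH_IGNORE_REGEX value.
    static final Pattern IGNORE = Pattern.compile("unknown|foreign");

    /**
     * Keeps only the entries whose POS tag (assumed to be the third
     * tab-separated column) does not match the ignore pattern.
     * A null pattern means "ignore nothing".
     */
    static List<String> dropIgnored(List<String> lines, Pattern ignorePosRegex) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split("\t");
            if (parts.length != 3) {
                continue; // this sketch simply skips malformed lines
            }
            if (ignorePosRegex != null && ignorePosRegex.matcher(parts[2]).find()) {
                continue; // tag marks a form not worth storing
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "houses\thouse\tNNS",
            "xyzzy\txyzzy\tunknown");
        // Only the "houses" entry survives the filter.
        System.out.println(dropIgnored(lines, IGNORE));
    }
}
```

Accepting a null pattern mirrors the @Nullable return type of getPosTagIgnoreRegex, on the assumption that a missing regex means no tag-based filtering.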
-
Method Detail
-
main
public static void main(java.lang.String[] args) throws java.lang.Exception
- Throws:
java.lang.Exception
-
build
java.io.File build(java.io.File plainTextDictFile, java.io.File infoFile) throws java.lang.Exception
- Throws:
java.lang.Exception
-
getIgnoreItems
private java.util.Set<java.lang.String> getIgnoreItems(java.io.File file) throws java.io.FileNotFoundException
- Throws:
java.io.FileNotFoundException
-
getPosTagIgnoreRegex
@Nullable private java.util.regex.Pattern getPosTagIgnoreRegex(java.io.File infoFile)
-
reverseLineContent
private java.io.File reverseLineContent(java.io.File plainTextDictFile, java.util.Set<java.lang.String> itemsToBeIgnored, java.util.regex.Pattern ignorePosRegex) throws java.io.IOException
- Throws:
java.io.IOException
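This page does not document what reverseLineContent does. One plausible reading, given that a synthesizer looks up an inflected form from a lemma and a POS tag, is that each input line is reordered so that lemma and tag become the key. The sketch below illustrates that reading only: the input layout (form, lemma, tag separated by tabs), the `|` joiner, and the class and method names are all assumptions.

```java
class ReverseLineSketch {

    /**
     * Turns "form<TAB>lemma<TAB>tag" into "lemma|tag<TAB>form".
     * Returns null for lines that do not have exactly three columns.
     */
    static String reverse(String line) {
        String[] parts = line.split("\t");
        if (parts.length != 3) {
            return null;
        }
        // Assumed output layout: lookup key first, inflected form last.
        return parts[1] + "|" + parts[2] + "\t" + parts[0];
    }

    public static void main(String[] args) {
        System.out.println(reverse("walked\twalk\tVBD"));
    }
}
```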
-
getTagFile
private java.io.File getTagFile(java.io.File tempFile)
-
writePosTagsToFile
private void writePosTagsToFile(java.io.File plainTextDictFile, java.io.File tagFile) throws java.io.IOException
- Throws:
java.io.IOException
-
collectTags
private java.util.Set<java.lang.String> collectTags(java.io.File plainTextDictFile) throws java.io.IOException
- Throws:
java.io.IOException
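Judging by its name and return type, collectTags gathers the distinct POS tags that occur in the plain text dictionary. A minimal sketch of that idea follows; it works on in-memory lines rather than a java.io.File, assumes the tag is the third tab-separated column, and uses hypothetical class and method names.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CollectTagsSketch {

    /** Gathers the distinct POS tags from the (assumed) third tab-separated column. */
    static Set<String> collectTags(List<String> lines) {
        Set<String> tags = new TreeSet<>(); // sorted, so the result order is stable
        for (String line : lines) {
            String[] parts = line.split("\t");
            if (parts.length == 3) {
                tags.add(parts[2]);
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "walked\twalk\tVBD",
            "walks\twalk\tVBZ",
            "talked\ttalk\tVBD");
        System.out.println(collectTags(lines)); // prints [VBD, VBZ]
    }
}
```

A set of this kind could then be written out one tag per line, which is presumably the role of writePosTagsToFile.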