Package edu.berkeley.nlp.lm.io
Class TextReader<W>
- java.lang.Object
  - edu.berkeley.nlp.lm.io.TextReader<W>
- Type Parameters: W
- All Implemented Interfaces: LmReader<LongRef,LmReaderCallback<LongRef>>
public class TextReader<W> extends java.lang.Object implements LmReader<LongRef,LmReaderCallback<LongRef>>
Class for reading raw text files.
- Author: adampauls
Constructor Summary
Constructors
- TextReader(java.lang.Iterable<java.lang.String> lineIterator, WordIndexer<W> wordIndexer)
- TextReader(java.util.List<java.lang.String> inputFiles, WordIndexer<W> wordIndexer)
Method Summary
- void parse(LmReaderCallback<LongRef> callback)
  Reads newline-separated plain text from inputFiles, and writes an ARPA lm file to outputFile.
Constructor Detail
TextReader
public TextReader(java.util.List<java.lang.String> inputFiles, WordIndexer<W> wordIndexer)
TextReader
public TextReader(java.lang.Iterable<java.lang.String> lineIterator, WordIndexer<W> wordIndexer)
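The two constructors accept either a list of input file names or an iterable of lines. The following is a minimal sketch, using only the Java standard library, of how a list of file names might be turned into an in-memory sequence of lines. The class LineSourceSketch and its methods are hypothetical helpers, not part of edu.berkeley.nlp.lm.io, and the page does not say whether TextReader itself works this way.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of edu.berkeley.nlp.lm.io. */
class LineSourceSketch {

    /** Reads every line of every file, in list order, into memory (assumes UTF-8). */
    static List<String> readAllLines(List<String> inputFiles) {
        List<String> lines = new ArrayList<>();
        for (String file : inputFiles) {
            try {
                lines.addAll(Files.readAllLines(Path.of(file), StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return lines;
    }

    /** Writes two small temporary files, reads them back, then deletes them. */
    static List<String> demo() {
        try {
            Path first = Files.createTempFile("corpus-a", ".txt");
            Path second = Files.createTempFile("corpus-b", ".txt");
            try {
                Files.write(first, List.of("the cat sat", "on the mat"));
                Files.write(second, List.of("the end"));
                return readAllLines(List.of(first.toString(), second.toString()));
            } finally {
                Files.deleteIfExists(first);
                Files.deleteIfExists(second);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because a List<String> is also an Iterable<String>, the result has the type that the lineIterator parameter expects. This sketch loads everything eagerly; for a large corpus a lazily streaming Iterable would likely be preferable.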
Method Detail
parse
public void parse(LmReaderCallback<LongRef> callback)
Reads newline-separated plain text from inputFiles, and writes an ARPA lm file to outputFile. If files have a .gz suffix, then they will be (un)zipped as necessary.
- Specified by: parse in interface LmReader<LongRef,LmReaderCallback<LongRef>>
- Parameters:
  inputFiles -
  outputFile -
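The .gz handling described above can be illustrated with a short sketch that picks a decompressing stream based on the file-name suffix. It uses only java.util.zip and java.nio.file from the standard library. GzAwareReaderSketch and its methods are hypothetical names, and the code shows the general technique under the assumption of UTF-8 input, not TextReader's actual implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration; not part of edu.berkeley.nlp.lm.io. */
class GzAwareReaderSketch {

    /** Opens a file for reading, decompressing it when the name ends in ".gz". */
    static BufferedReader open(String fileName) throws IOException {
        InputStream in = Files.newInputStream(Path.of(fileName));
        if (fileName.endsWith(".gz")) {
            try {
                in = new GZIPInputStream(in);
            } catch (IOException e) {
                in.close(); // do not leak the raw stream if the gzip header is invalid
                throw e;
            }
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    /** Reads all lines of a plain or gzip-compressed text file. */
    static List<String> readLines(String fileName) {
        try (BufferedReader reader = open(fileName)) {
            return reader.lines().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a small gzip-compressed temporary file and reads it back. */
    static List<String> demo() {
        try {
            Path gz = Files.createTempFile("corpus", ".txt.gz");
            try {
                try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(gz))) {
                    out.write("hello world\nsecond line\n".getBytes(StandardCharsets.UTF_8));
                }
                return readLines(gz.toString());
            } finally {
                Files.deleteIfExists(gz);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Checking the suffix rather than the file's magic bytes keeps the sketch simple, at the cost of misreading a compressed file that lacks the .gz extension.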