Package it.unimi.dsi.io
Class LineWordReader
- java.lang.Object
  - it.unimi.dsi.io.LineWordReader
- All Implemented Interfaces:
  WordReader, java.io.Serializable
public class LineWordReader extends java.lang.Object implements WordReader, java.io.Serializable
A trivial WordReader that considers each line of a document a single word.

The intended use of this class is indexing material such as lists of document identifiers: if the identifiers contain nonalphabetical characters, the default FastBufferedReader might do a poor job.

Note that the non-word returned by next(MutableString, MutableString) is always empty.
- See Also:
  Serialized Form
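To illustrate why a line-based reader suits identifier lists, the sketch below contrasts the two tokenizations using only java.io. The class LineVersusTokenSketch and both helper methods are hypothetical and are not part of this package. The alnumWords helper assumes, for illustration only, a tokenizer that treats maximal runs of letters or digits as words.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; neither method calls the real LineWordReader.
class LineVersusTokenSketch {
    // One word per line, as this class's description specifies.
    static List<String> lineWords(String text) {
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            for (String line; (line = in.readLine()) != null; ) words.add(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Assumed stand-in for a letter/digit tokenizer: maximal runs of letters or digits.
    static List<String> alnumWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) words.add(current.toString());
        return words;
    }

    public static void main(String[] args) {
        String ids = "doc-001/a\ndoc-002/b\n";
        System.out.println(lineWords(ids));  // [doc-001/a, doc-002/b]
        System.out.println(alnumWords(ids)); // [doc, 001, a, doc, 002, b]
    }
}
```

With the line-based approach each identifier survives as a single term, whereas the letter/digit split breaks it into fragments that no longer identify a document.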
Constructor Summary
Constructors
  Constructor         Description
  LineWordReader()
Method Summary
  Modifier and Type   Method                                            Description
  LineWordReader      copy()                                            Returns a copy of this word reader.
  boolean             next(MutableString word, MutableString nonWord)   Extracts the next word and non-word.
  LineWordReader      setReader(java.io.Reader reader)                  Resets the internal state of this word reader, which will start again reading from the given reader.
Method Detail
next
public boolean next(MutableString word, MutableString nonWord) throws java.io.IOException
Description copied from interface: WordReader
Extracts the next word and non-word.

If this method returns true, a new non-empty word, and possibly a new non-word, have been extracted. It is acceptable that the first call to this method after creation or after a call to WordReader.setReader(Reader) returns an empty word. In other words, both word and nonWord are maximal.
- Specified by:
  next in interface WordReader
- Parameters:
  word - the next word returned by the underlying reader.
  nonWord - the nonword following the next word returned by the underlying reader.
- Returns:
  true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
- Throws:
  java.io.IOException
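The contract above lends itself to a simple while loop. The following is a minimal sketch of that loop against a stand-in class written with java.io only: LineWordSketch and its collect helper are hypothetical, and StringBuilder takes the place of MutableString. The stand-in follows the behaviour as documented here (the word is the line, the non-word is always empty, buffers are untouched on false) and makes no claim about how the real class is implemented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the dsiutils class.
class LineWordSketch {
    private BufferedReader in;

    LineWordSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    // Returns true and fills word with the next line; nonWord is emptied.
    // Returns false at end of input, leaving both buffers unchanged.
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        String line = in.readLine();
        if (line == null) return false;
        word.setLength(0);
        word.append(line);
        nonWord.setLength(0);
        return true;
    }

    // Drains every word from text with the usual while (next(...)) loop.
    static List<String> collect(String text) {
        LineWordSketch reader = new LineWordSketch().setReader(new StringReader(text));
        StringBuilder word = new StringBuilder(), nonWord = new StringBuilder();
        List<String> words = new ArrayList<>();
        try {
            // The same two buffers are reused on every call, so copy the word out.
            while (reader.next(word, nonWord)) words.add(word.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(collect("id:17\nid:42\n")); // [id:17, id:42]
    }
}
```

Because the caller's buffers are overwritten on each successful call, any word that must outlive the loop iteration has to be copied, as collect does with toString().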
setReader
public LineWordReader setReader(java.io.Reader reader)
Description copied from interface: WordReader
Resets the internal state of this word reader, which will start again reading from the given reader.
- Specified by:
  setReader in interface WordReader
- Parameters:
  reader - the new reader providing characters.
- Returns:
  this word reader.
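Since setReader returns the word reader itself, one instance can be pointed at a sequence of sources and the call can be chained. The sketch below shows that reset-and-chain pattern with a hypothetical java.io stand-in. ResettableLineSketch and nextLine are illustrative names, not part of this package, and the assumption that unread input from the previous source is simply dropped belongs to the sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in, not the dsiutils class.
class ResettableLineSketch {
    private BufferedReader in;

    // Drops the previous source and starts reading from the new one; returns this for chaining.
    ResettableLineSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    // Returns the next line, or null at end of input.
    String nextLine() {
        try {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResettableLineSketch r = new ResettableLineSketch();
        System.out.println(r.setReader(new StringReader("a-1\na-2\n")).nextLine()); // a-1
        // Switching sources mid-stream: the unread "a-2" is never returned.
        System.out.println(r.setReader(new StringReader("b-1\n")).nextLine());      // b-1
    }
}
```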
copy
public LineWordReader copy()
Description copied from interface: WordReader
Returns a copy of this word reader.

This method must return a word reader with a behaviour that matches exactly that of this word reader.
- Specified by:
  copy in interface WordReader
- Returns:
  a copy of this word reader.