Class LineWordReader

  • All Implemented Interfaces:
    WordReader, java.io.Serializable

    public class LineWordReader
    extends java.lang.Object
    implements WordReader, java.io.Serializable
    A trivial WordReader that considers each line of a document a single word.

    This class is intended for indexing data such as lists of document identifiers: if the identifiers contain non-alphabetical characters, the default FastBufferedReader might break them into several words.

    Note that the non-word returned by next(MutableString, MutableString) is always empty.
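
    The difference between the two tokenization policies can be sketched with the standard library alone. This is an illustration, not MG4J code: the class IdentifierTokenizationSketch and both of its methods are hypothetical, and the letter-or-digit rule is an assumed simplification of how a default word reader splits text.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: contrasts two tokenization policies on one identifier line.
class IdentifierTokenizationSketch {

    // Assumed simplification of a default word reader: a word is a maximal
    // run of characters accepted by Character.isLetterOrDigit.
    static List<String> alphanumericWords(String line) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A separator closes the word collected so far.
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) words.add(current.toString());
        return words;
    }

    // Line-as-word policy: the whole line is a single word.
    static List<String> lineWords(String line) {
        return List.of(line);
    }

    public static void main(String[] args) {
        String id = "doc-0042/part_1";
        System.out.println(alphanumericWords(id)); // [doc, 0042, part, 1]
        System.out.println(lineWords(id));         // [doc-0042/part_1]
    }
}
```

    With the first policy the identifier is scattered over four index terms; with the second it stays searchable as one term.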

    See Also:
    Serialized Form
    • Constructor Summary

      Constructors 
      Constructor Description
      LineWordReader()  
    • Method Summary

      Modifier and Type Method Description
      LineWordReader copy()
      Returns a copy of this word reader.
      boolean next(MutableString word, MutableString nonWord)
      Extracts the next word and non-word.
      LineWordReader setReader(java.io.Reader reader)
      Resets the internal state of this word reader, which will start again reading from the given reader.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • LineWordReader

        public LineWordReader()
    • Method Detail

      • next

        public boolean next(MutableString word,
                            MutableString nonWord)
                     throws java.io.IOException
        Description copied from interface: WordReader
        Extracts the next word and non-word.

        If this method returns true, a new non-empty word, and possibly a new non-word, have been extracted. The first call to this method after creation or after a call to WordReader.setReader(Reader) may, however, return an empty word. In other words, both word and nonWord are maximal.

        Specified by:
        next in interface WordReader
        Parameters:
        word - the next word returned by the underlying reader.
        nonWord - the nonword following the next word returned by the underlying reader.
        Returns:
        true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
        Throws:
        java.io.IOException
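
        The calling pattern implied by this contract is a loop that runs until next returns false. The sketch below mimics it with a hypothetical stand-in, LineWordSketch, built only on java.io: StringBuilder takes the place of MutableString, and the readAll helper is an assumption added for the example. It is not the MG4J implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the MG4J class: StringBuilder replaces MutableString.
class LineWordSketch {
    private BufferedReader in;

    // Starts reading from the given reader and returns this instance for chaining.
    LineWordSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    // On success, fills word with the next line and empties nonWord, then returns true.
    // At end of input, returns false and leaves both buffers untouched.
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        String line = in.readLine();
        if (line == null) return false;
        word.setLength(0);
        word.append(line);
        nonWord.setLength(0);
        return true;
    }

    // Hypothetical helper: drains the text with the usual next() loop.
    static List<String> readAll(String text) {
        LineWordSketch wordReader = new LineWordSketch().setReader(new StringReader(text));
        StringBuilder word = new StringBuilder();
        StringBuilder nonWord = new StringBuilder();
        List<String> words = new ArrayList<>();
        try {
            while (wordReader.next(word, nonWord)) {
                words.add(word.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(readAll("id:1/a\nid:2/b\n")); // [id:1/a, id:2/b]
    }
}
```

        The two buffers are reused across iterations, so a caller that wants to keep a word has to copy it out before the next call.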
      • setReader

        public LineWordReader setReader(java.io.Reader reader)
        Description copied from interface: WordReader
        Resets the internal state of this word reader, which will start again reading from the given reader.
        Specified by:
        setReader in interface WordReader
        Parameters:
        reader - the new reader providing characters.
        Returns:
        this word reader.
      • copy

        public LineWordReader copy()
        Description copied from interface: WordReader
        Returns a copy of this word reader.

        This method must return a word reader with a behaviour that matches exactly that of this word reader.

        Specified by:
        copy in interface WordReader
        Returns:
        a copy of this word reader.
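
        One way to read this requirement is that a copy behaves identically but shares no reading state with the original, so each instance can be attached to its own reader. The sketch below illustrates that reading with a hypothetical stand-in, CopyableLineReaderSketch; the independence of the copy and the nextLine helper are assumptions of the example, not documented behaviour of LineWordReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in, NOT the MG4J class.
class CopyableLineReaderSketch {
    private BufferedReader in;

    // Starts reading from the given reader and returns this instance for chaining.
    CopyableLineReaderSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    // Assumption: the copy has the same behaviour but no shared reader state,
    // so it must be given its own reader before use.
    CopyableLineReaderSketch copy() {
        return new CopyableLineReaderSketch();
    }

    // Hypothetical helper: returns the next line, or null at end of input.
    String nextLine() {
        try {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CopyableLineReaderSketch original =
                new CopyableLineReaderSketch().setReader(new StringReader("a-1\na-2\n"));
        CopyableLineReaderSketch duplicate =
                original.copy().setReader(new StringReader("b-1\n"));

        System.out.println(original.nextLine());  // a-1
        System.out.println(duplicate.nextLine()); // b-1
        System.out.println(original.nextLine());  // a-2
    }
}
```

        Reading from the duplicate does not advance the original, which is what makes a per-thread copy safe in this sketch.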