Class FileLinesBigList

java.lang.Object
java.util.AbstractCollection<it.unimi.dsi.lang.MutableString>
it.unimi.dsi.fastutil.objects.AbstractObjectCollection<it.unimi.dsi.lang.MutableString>
it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>
it.unimi.dsi.sux4j.io.FileLinesBigList
All Implemented Interfaces:
it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.objects.ObjectBigList<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.objects.ObjectCollection<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.objects.ObjectIterable<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.Size64, it.unimi.dsi.fastutil.Stack<it.unimi.dsi.lang.MutableString>, Serializable, Comparable<it.unimi.dsi.fastutil.BigList<? extends it.unimi.dsi.lang.MutableString>>, Iterable<it.unimi.dsi.lang.MutableString>, Collection<it.unimi.dsi.lang.MutableString>, RandomAccess

public class FileLinesBigList extends it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString> implements RandomAccess, Serializable
A wrapper exhibiting the lines of a file as a big list.

An instance of this class allows accessing the lines of a file as a BigList. Unlike a FileLinesMutableStringIterable, it supports direct access, which is reasonably efficient, in particular when accessing nearby lines, and all returned mutable strings are separate, independent instances.

As with FileLinesMutableStringIterable, however, AbstractObjectBigList.iterator() can be called any number of times, as it opens an independent input stream at each call. For the same reason, the returned iterator type (FileLinesBigList.FileLinesIterator) is Closeable, and should be closed after use.

Note that toString() will return a single string containing all file lines separated by the string associated with the system property line.separator.

Warning: this class is not synchronised. Separate iterators use separate input streams, and can be accessed concurrently, but all calls to get(long) refer to the same input stream.
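
The following is a minimal usage sketch based only on the signatures listed on this page. The class name FileLinesBigListUsage, the helper methods countNonEmpty and lastLine, and the choice of "UTF-8" as the encoding are illustrative assumptions. The synchronized block is one possible way to serialise get(long) calls when a list is shared between threads, not something the class does or requires by itself.

```java
import it.unimi.dsi.lang.MutableString;
import it.unimi.dsi.sux4j.io.FileLinesBigList;

import java.io.IOException;

public class FileLinesBigListUsage {
    /** Counts the non-empty lines of a file by iterating over it (hypothetical helper). */
    static long countNonEmpty(String filename) throws IOException {
        FileLinesBigList lines = new FileLinesBigList(filename, "UTF-8");
        long count = 0;
        // Each iterator opens its own input stream, so it is closed when iteration ends.
        try (FileLinesBigList.FileLinesIterator it = lines.listIterator(0)) {
            while (it.hasNext()) {
                MutableString line = it.next();
                if (line.length() > 0) count++;
            }
        }
        return count;
    }

    /** Returns the last line as a String, or null if the file has no lines (hypothetical helper). */
    static String lastLine(String filename) throws IOException {
        FileLinesBigList lines = new FileLinesBigList(filename, "UTF-8");
        long n = lines.size64();
        if (n == 0) return null;
        // All get(long) calls share one input stream; locking on the list is an
        // assumption about how callers might serialise concurrent access.
        synchronized (lines) {
            return lines.get(n - 1).toString();
        }
    }
}
```

Using size64() rather than the deprecated size() keeps the line count correct for files with more lines than an int can hold.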

Implementation details

Instances of this class perform a full scan of the specified file at construction time, representing the list of pointers to the start of each line using the Elias–Fano representation. The memory occupation per line is thus bounded by 2 + log ℓ bits, where ℓ is the average line length.
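
To make the idea of a line-start index concrete, here is a small self-contained sketch. It is not the implementation used by this class: it stores offsets in a plain long[] (64 bits per line) instead of an Elias–Fano structure, works on an in-memory byte array, and treats only '\n' as a terminator. The class and method names are hypothetical. The boundBitsPerLine method evaluates the 2 + log ℓ bound quoted above, assuming the logarithm is base 2.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LineIndexSketch {
    /** Scans the data once and records the byte offset at which each line starts. */
    static long[] lineStarts(byte[] data) {
        List<Long> starts = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            starts.add((long) pos);
            while (pos < data.length && data[pos] != '\n') pos++;
            pos++; // skip the terminator
        }
        return starts.stream().mapToLong(Long::longValue).toArray();
    }

    /** Direct access to line i: jump to its start offset and read up to the next terminator. */
    static String line(byte[] data, long[] starts, int i) {
        int from = (int) starts[i];
        int to = from;
        while (to < data.length && data[to] != '\n') to++;
        return new String(data, from, to - from, StandardCharsets.UTF_8);
    }

    /** Evaluates 2 + log2(averageLineLength), the per-line bound in bits stated in the text. */
    static double boundBitsPerLine(double averageLineLength) {
        return 2 + Math.log(averageLineLength) / Math.log(2);
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        long[] starts = lineStarts(data);
        System.out.println(line(data, starts, 1));
        System.out.println(boundBitsPerLine(64) + " bits per line vs. 64 for a long[]");
    }
}
```

For an average line length of 64 bytes the bound works out to about 8 bits per line, which is why a compressed representation of the offsets is attractive for very large files.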

Since:
2.1
Author:
Sebastiano Vigna
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static final class 
    FileLinesBigList.FileLinesIterator
    An iterator over the lines of a FileLinesBigList.

    Nested classes/interfaces inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

    it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectRandomAccessSubList<K>, it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectSubList<K>
  • Constructor Summary

    Constructors
    Constructor
    Description
    FileLinesBigList(CharSequence filename, String encoding)
    Creates a file-lines collection for the specified filename with the specified encoding, default buffer size and with all terminators.
    FileLinesBigList(CharSequence filename, String encoding, int bufferSize)
    Creates a file-lines collection for the specified filename with the specified encoding, buffer size and with all terminators.
    FileLinesBigList(CharSequence filename, String encoding, int bufferSize, EnumSet<it.unimi.dsi.fastutil.io.FastBufferedInputStream.LineTerminator> terminators)
    Creates a file-lines collection for the specified filename with the specified encoding, buffer size and terminator set.
  • Method Summary

    Modifier and Type
    Method
    Description
    it.unimi.dsi.lang.MutableString
    get(long index)
     
    it.unimi.dsi.lang.MutableString
    get(long index, it.unimi.dsi.fastutil.io.FastBufferedInputStream fastBufferedInputStream, ByteBuffer byteBuffer, CharBuffer charBuffer, CharsetDecoder decoder)
     
    FileLinesBigList.FileLinesIterator
    listIterator(long index)
     
    int
    size()
    Deprecated.
    long
    size64()
     
    String
    toString()
     

    Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

    add, add, addAll, addAll, addElements, addElements, clear, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, forEach, getElements, hashCode, indexOf, iterator, lastIndexOf, listIterator, peek, pop, push, remove, removeElements, set, setElements, size, subList, top

    Methods inherited from class java.util.AbstractCollection

    containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait

    Methods inherited from interface it.unimi.dsi.fastutil.objects.ObjectBigList

    addAll, addAll, addAll, addAll, getElements, setElements, setElements, spliterator

    Methods inherited from interface it.unimi.dsi.fastutil.Stack

    isEmpty
  • Constructor Details

    • FileLinesBigList

      public FileLinesBigList(CharSequence filename, String encoding, int bufferSize, EnumSet<it.unimi.dsi.fastutil.io.FastBufferedInputStream.LineTerminator> terminators) throws IOException
      Creates a file-lines collection for the specified filename with the specified encoding, buffer size and terminator set.
      Parameters:
      filename - a filename.
      encoding - an encoding.
      bufferSize - the buffer size for FastBufferedInputStream.
      terminators - a set of line terminators.
      Throws:
      IOException
    • FileLinesBigList

      public FileLinesBigList(CharSequence filename, String encoding, int bufferSize) throws IOException
      Creates a file-lines collection for the specified filename with the specified encoding, buffer size and with all terminators.
      Parameters:
      filename - a filename.
      encoding - an encoding.
      bufferSize - the buffer size for FastBufferedInputStream.
      Throws:
      IOException
    • FileLinesBigList

      public FileLinesBigList(CharSequence filename, String encoding) throws IOException
      Creates a file-lines collection for the specified filename with the specified encoding, default buffer size and with all terminators.
      Parameters:
      filename - a filename.
      encoding - an encoding.
      Throws:
      IOException
  • Method Details

    • size64

      public long size64()
      Specified by:
      size64 in interface it.unimi.dsi.fastutil.Size64
    • size

      @Deprecated public int size()
      Deprecated.
      Specified by:
      size in interface it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>
      Specified by:
      size in interface Collection<it.unimi.dsi.lang.MutableString>
      Specified by:
      size in interface it.unimi.dsi.fastutil.Size64
      Overrides:
      size in class it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>
    • get

      public it.unimi.dsi.lang.MutableString get(long index)
      Specified by:
      get in interface it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>
    • get

      public it.unimi.dsi.lang.MutableString get(long index, it.unimi.dsi.fastutil.io.FastBufferedInputStream fastBufferedInputStream, ByteBuffer byteBuffer, CharBuffer charBuffer, CharsetDecoder decoder)
    • listIterator

      public FileLinesBigList.FileLinesIterator listIterator(long index)
      Specified by:
      listIterator in interface it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>
      Specified by:
      listIterator in interface it.unimi.dsi.fastutil.objects.ObjectBigList<it.unimi.dsi.lang.MutableString>
      Overrides:
      listIterator in class it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>
    • toString

      public String toString()
      Overrides:
      toString in class it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>