Class FileLinesBigList

  • All Implemented Interfaces:
    it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.objects.ObjectBigList<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.objects.ObjectCollection<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.objects.ObjectIterable<it.unimi.dsi.lang.MutableString>, it.unimi.dsi.fastutil.Size64, it.unimi.dsi.fastutil.Stack<it.unimi.dsi.lang.MutableString>, java.io.Serializable, java.lang.Comparable<it.unimi.dsi.fastutil.BigList<? extends it.unimi.dsi.lang.MutableString>>, java.lang.Iterable<it.unimi.dsi.lang.MutableString>, java.util.Collection<it.unimi.dsi.lang.MutableString>, java.util.RandomAccess

    public class FileLinesBigList
    extends it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>
    implements java.util.RandomAccess, java.io.Serializable
    A wrapper exhibiting the lines of a file as a big list.

    An instance of this class provides access to the lines of a file as a BigList. Unlike a FileLinesMutableStringIterable, it supports direct access, which is reasonably efficient, in particular when accessing nearby lines, and all returned mutable strings are separate, independent instances.

    As with FileLinesMutableStringIterable, however, AbstractObjectBigList.iterator() can be called any number of times, as it opens an independent input stream at each call. For the same reason, the returned iterator type (FileLinesBigList.FileLinesIterator) is Closeable and should be closed after use.
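
The pattern described above (one independent stream per iterator, closed by the caller) can be illustrated with a standard-library analogue. This is only a sketch under stated assumptions: the class CloseableLinesSketch, its nested LineIterator and the demo() helper are hypothetical names invented for illustration, it reads String lines through a BufferedReader, and it is not how FileLinesBigList is implemented.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stdlib-only analogue of a closeable line iterator. */
final class CloseableLinesSketch {

    /** An iterator that owns its own reader and must be closed by the caller. */
    static final class LineIterator implements Iterator<String>, Closeable {
        private final BufferedReader reader;
        private String next;

        LineIterator(Path file) {
            try {
                reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
                next = reader.readLine(); // read ahead so hasNext() is cheap
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override
        public boolean hasNext() {
            return next != null;
        }

        @Override
        public String next() {
            if (next == null) throw new NoSuchElementException();
            String current = next;
            try {
                next = reader.readLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return current;
        }

        @Override
        public void close() throws IOException {
            reader.close();
        }
    }

    /** Each call opens a fresh reader, so the returned iterators are independent. */
    static LineIterator iterator(Path file) {
        return new LineIterator(file);
    }

    /** Iterates over a small temporary file twice, closing each iterator after use. */
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("lines", ".txt");
            try {
                Files.write(file, List.of("alpha", "beta"), StandardCharsets.UTF_8);
                List<String> seen = new ArrayList<>();
                for (int pass = 0; pass < 2; pass++) {
                    // try-with-resources releases the underlying stream after each pass
                    try (LineIterator it = iterator(file)) {
                        while (it.hasNext()) seen.add(it.next());
                    }
                }
                return seen;
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With FileLinesBigList itself, the same try-with-resources shape should apply to the value returned by listIterator(long), since that method is declared to return FileLinesBigList.FileLinesIterator, which is Closeable.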

    Note that toString() will return a single string containing all file lines separated by the string associated with the system property line.separator.

    Warning: this class is not synchronised. Separate iterators use separate input streams and can be accessed concurrently, but all calls to get(long) share the same input stream, so concurrent calls to get(long) must be synchronised externally.

    Implementation details

    Instances of this class perform a full scan of the specified file at construction time, representing the list of pointers to the start of each line using the Elias–Fano representation. The memory occupation per line is thus bounded by 2 + log ℓ bits, where ℓ is the average line length.
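
The idea of scanning once, recording where each line starts, and then slicing lines on demand can be sketched with the standard library alone. The following is a sketch under stated assumptions, not the class's actual code: LineOffsetIndexSketch and its methods are hypothetical names, a plain long[] stands in for the Elias–Fano representation, only LF is treated as a terminator, the data is held in memory rather than read from a file, and the logarithm in the space bound is assumed to be base 2.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stdlib-only sketch; a plain long[] stands in for Elias–Fano. */
final class LineOffsetIndexSketch {

    /** Scans the data once and returns the byte offset at which each line starts. */
    static long[] lineStarts(byte[] data) {
        long[] starts = new long[16];
        int n = 0;
        int pos = 0;
        while (pos < data.length) {
            if (n == starts.length) starts = Arrays.copyOf(starts, n * 2);
            starts[n++] = pos;
            // Advance to the next line feed, or to the end of the data.
            while (pos < data.length && data[pos] != '\n') pos++;
            pos++; // skip the terminator itself
        }
        return Arrays.copyOf(starts, n);
    }

    /** Returns line i by slicing between its start and the start of the next line. */
    static String line(byte[] data, long[] starts, int i) {
        int from = (int) starts[i];
        int to = i + 1 < starts.length ? (int) starts[i + 1] : data.length;
        // Drop the trailing line feed, if present.
        if (to > from && data[to - 1] == '\n') to--;
        return new String(data, from, to - from, StandardCharsets.UTF_8);
    }

    /** The quoted bound, 2 + log2(average line length), in bits per line. */
    static double bitsPerLine(long fileLength, long lines) {
        double averageLength = (double) fileLength / lines;
        return 2 + Math.log(averageLength) / Math.log(2);
    }

    public static void main(String[] args) {
        byte[] data = "alpha\n\nbeta\ngamma".getBytes(StandardCharsets.UTF_8);
        long[] starts = lineStarts(data);
        System.out.println(Arrays.toString(starts));
        System.out.println(line(data, starts, 2));
        System.out.println(bitsPerLine(data.length, starts.length));
    }
}
```

For a sense of scale, a file whose lines average 64 bytes would need about 2 + 6 = 8 bits per line under this bound, compared with the 64 bits per line used by the long[] stand-in above.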

    Since:
    2.1
    Author:
    Sebastiano Vigna
    See Also:
    Serialized Form
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  FileLinesBigList.FileLinesIterator
      An iterator over the lines of a FileLinesBigList.
      • Nested classes/interfaces inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

        it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectRandomAccessSubList<K extends java.lang.Object>, it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectSubList<K extends java.lang.Object>
    • Constructor Summary

      Constructors 
      Constructor Description
      FileLinesBigList(java.lang.CharSequence filename, java.lang.String encoding)
      Creates a file-lines collection for the specified filename with the specified encoding, default buffer size and with all terminators.
      FileLinesBigList(java.lang.CharSequence filename, java.lang.String encoding, int bufferSize)
      Creates a file-lines collection for the specified filename with the specified encoding, buffer size and with all terminators.
      FileLinesBigList(java.lang.CharSequence filename, java.lang.String encoding, int bufferSize, java.util.EnumSet<it.unimi.dsi.fastutil.io.FastBufferedInputStream.LineTerminator> terminators)
      Creates a file-lines collection for the specified filename with the specified encoding, buffer size and terminator set.
    • Method Summary

      Modifier and Type Method Description
      it.unimi.dsi.lang.MutableString get(long index)
      it.unimi.dsi.lang.MutableString get(long index, it.unimi.dsi.fastutil.io.FastBufferedInputStream fastBufferedInputStream, java.nio.ByteBuffer byteBuffer, java.nio.CharBuffer charBuffer, java.nio.charset.CharsetDecoder decoder)
      FileLinesBigList.FileLinesIterator listIterator(long index)
      int size()
      Deprecated.
      long size64()  
      java.lang.String toString()  
      • Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

        add, add, addAll, addAll, addElements, addElements, clear, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, forEach, getElements, hashCode, indexOf, iterator, lastIndexOf, listIterator, peek, pop, push, remove, removeElements, set, setElements, size, subList, top
      • Methods inherited from class java.util.AbstractCollection

        containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
      • Methods inherited from interface java.util.Collection

        containsAll, isEmpty, parallelStream, remove, removeAll, removeIf, retainAll, stream, toArray, toArray, toArray
      • Methods inherited from interface it.unimi.dsi.fastutil.objects.ObjectBigList

        addAll, addAll, addAll, addAll, getElements, setElements, setElements, spliterator
      • Methods inherited from interface it.unimi.dsi.fastutil.Stack

        isEmpty
    • Constructor Detail

      • FileLinesBigList

        public FileLinesBigList(java.lang.CharSequence filename,
                                java.lang.String encoding,
                                int bufferSize,
                                java.util.EnumSet<it.unimi.dsi.fastutil.io.FastBufferedInputStream.LineTerminator> terminators)
                         throws java.io.IOException
        Creates a file-lines collection for the specified filename with the specified encoding, buffer size and terminator set.
        Parameters:
        filename - a filename.
        encoding - an encoding.
        bufferSize - the buffer size for FastBufferedInputStream.
        terminators - a set of line terminators.
        Throws:
        java.io.IOException
      • FileLinesBigList

        public FileLinesBigList(java.lang.CharSequence filename,
                                java.lang.String encoding,
                                int bufferSize)
                         throws java.io.IOException
        Creates a file-lines collection for the specified filename with the specified encoding, buffer size and with all terminators.
        Parameters:
        filename - a filename.
        encoding - an encoding.
        bufferSize - the buffer size for FastBufferedInputStream.
        Throws:
        java.io.IOException
      • FileLinesBigList

        public FileLinesBigList(java.lang.CharSequence filename,
                                java.lang.String encoding)
                         throws java.io.IOException
        Creates a file-lines collection for the specified filename with the specified encoding, default buffer size and with all terminators.
        Parameters:
        filename - a filename.
        encoding - an encoding.
        Throws:
        java.io.IOException
    • Method Detail

      • size64

        public long size64()
        Specified by:
        size64 in interface it.unimi.dsi.fastutil.Size64
      • size

        @Deprecated
        public int size()
        Deprecated. Use size64() instead.
        Specified by:
        size in interface it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>
        Specified by:
        size in interface java.util.Collection<it.unimi.dsi.lang.MutableString>
        Specified by:
        size in interface it.unimi.dsi.fastutil.Size64
        Overrides:
        size in class it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>
      • get

        public it.unimi.dsi.lang.MutableString get(long index)
        Specified by:
        get in interface it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>
      • get

        public it.unimi.dsi.lang.MutableString get(long index,
                                                   it.unimi.dsi.fastutil.io.FastBufferedInputStream fastBufferedInputStream,
                                                   java.nio.ByteBuffer byteBuffer,
                                                   java.nio.CharBuffer charBuffer,
                                                   java.nio.charset.CharsetDecoder decoder)
      • listIterator

        public FileLinesBigList.FileLinesIterator listIterator(long index)
        Specified by:
        listIterator in interface it.unimi.dsi.fastutil.BigList<it.unimi.dsi.lang.MutableString>
        Specified by:
        listIterator in interface it.unimi.dsi.fastutil.objects.ObjectBigList<it.unimi.dsi.lang.MutableString>
        Overrides:
        listIterator in class it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class it.unimi.dsi.fastutil.objects.AbstractObjectBigList<it.unimi.dsi.lang.MutableString>