Class FileLinesByteArrayIterable

java.lang.Object
it.unimi.dsi.io.FileLinesByteArrayIterable
All Implemented Interfaces:
it.unimi.dsi.fastutil.Size64, Iterable<byte[]>

public class FileLinesByteArrayIterable extends Object implements Iterable<byte[]>, it.unimi.dsi.fastutil.Size64
A wrapper exhibiting the lines of a file as an Iterable of byte arrays.

An instance of this class makes it possible to access the lines of a file as an Iterable of byte arrays. Reading is performed using FastBufferedInputStream.readLine(byte[], EnumSet), and follows the rules defined therein. No decoding is performed.

The result of a call to iterator() can be used to scan the file; each call will open an independent input stream. The returned iterator type (FileLinesIterator) is Closeable, and should be closed after usage. Exhausted iterators, however, will be closed automagically.
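The closing contract described above can be illustrated with a JDK-only sketch. This is not the library's FileLinesIterator: the class name ClosingLineIterator, the closed flag, and the LF-only line splitting are assumptions made purely for illustration. It shows an iterator over byte-array lines that is Closeable and also closes its stream by itself once exhausted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in, NOT the library class: lines are split on LF only.
final class ClosingLineIterator implements Iterator<byte[]>, Closeable {
    private final InputStream in;
    private byte[] next;   // prefetched line, or null when there are no more lines
    boolean closed;        // exposed only so the behaviour can be observed

    ClosingLineIterator(InputStream in) {
        this.in = in;
        advance();
    }

    // Prefetches the next line; closes the stream when end of input is reached.
    private void advance() {
        try {
            int b = in.read();
            if (b == -1) {
                next = null;
                close();
                return;
            }
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            while (b != -1 && b != '\n') {
                line.write(b);
                b = in.read();
            }
            next = line.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public byte[] next() {
        if (next == null) throw new NoSuchElementException();
        byte[] result = next;
        advance();
        return result;
    }

    // Idempotent: safe to call after the iterator has already closed itself.
    @Override
    public void close() throws IOException {
        if (!closed) {
            closed = true;
            next = null;
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.US_ASCII);
        // try-with-resources guarantees closing even if the scan stops early.
        try (ClosingLineIterator lines = new ClosingLineIterator(new ByteArrayInputStream(data))) {
            while (lines.hasNext()) {
                System.out.println(lines.next().length);
            }
        }
    }
}
```

Wrapping the iterator in try-with-resources is the safe default: an explicit close is harmless on an exhausted iterator, and it is the only thing that releases the stream when a scan is abandoned midway.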

Using a suitable constructor it is possible to specify a decompression class, which must extend InputStream and provide a constructor accepting an InputStream (e.g., GZIPInputStream if the file is compressed in gzip format).
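The requirement on the decompression class can be sketched with plain JDK reflection. The code below is an assumption about how such a class might be used, not the library's implementation: DecompressorSketch, wrap, gzip, and decode are hypothetical names. It looks up the constructor accepting an InputStream, which is also why a class lacking that constructor leads to a NoSuchMethodException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
final class DecompressorSketch {
    // Wraps raw in an instance of decompressor, or returns raw unchanged if decompressor is null.
    static InputStream wrap(InputStream raw, Class<? extends InputStream> decompressor)
            throws ReflectiveOperationException {
        if (decompressor == null) return raw;
        // getConstructor throws NoSuchMethodException if no (InputStream) constructor exists.
        return decompressor.getConstructor(InputStream.class).newInstance(raw);
    }

    // Compresses text in gzip format, to obtain test data without touching the file system.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Reads all bytes through the optional decompressor and returns them as a string.
    static String decode(byte[] data, Class<? extends InputStream> decompressor) throws Exception {
        try (InputStream in = wrap(new ByteArrayInputStream(data), decompressor)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(decode(gzip("first\nsecond\n"), GZIPInputStream.class));
        System.out.print(decode("plain\n".getBytes(StandardCharsets.UTF_8), null));
    }
}
```

Any stream class with a public single-argument InputStream constructor fits this pattern, which is why GZIPInputStream qualifies.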

Convenience static methods make it possible to build, on the fly, an iterator over an input stream using the same conventions.

This class implements size64(), which returns the number of lines in the file, computed with a full scan at the first invocation. It is also possible to specify the number of lines at construction time, in which case the initial scan is skipped. It is the responsibility of the caller to ensure that the specified size matches the actual number of lines in the file.
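The trade-off between a lazy full scan and a caller-supplied size can be sketched as follows. This is a JDK-only illustration under stated assumptions, not the library's code: LazyLineCount, its scans counter, and the LF-only counting rule are hypothetical. The count is computed once, on the first call, unless a size was given up front.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

// Hypothetical stand-in, for illustration only: lines are terminated by LF.
final class LazyLineCount {
    private final Supplier<InputStream> source; // opens a fresh stream for each scan
    private long size;                          // -1 means "not yet known"
    int scans;                                  // number of full scans performed so far

    LazyLineCount(Supplier<InputStream> source) {
        this(source, -1);
    }

    // The caller vouches for size; it is never checked against the data.
    LazyLineCount(Supplier<InputStream> source, long size) {
        this.source = source;
        this.size = size;
    }

    long size64() {
        if (size < 0) size = countLines(); // full scan on first invocation only
        return size;
    }

    private long countLines() {
        scans++;
        long lines = 0;
        int last = '\n';
        try (InputStream in = source.get()) {
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') lines++;
                last = b;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // A final line without a trailing terminator still counts as a line.
        return last == '\n' ? lines : lines + 1;
    }

    public static void main(String[] args) {
        byte[] data = "a\nb\nc".getBytes(StandardCharsets.US_ASCII);
        LazyLineCount counted = new LazyLineCount(() -> new ByteArrayInputStream(data));
        System.out.println(counted.size64() + " lines after " + counted.scans + " scan(s)");
        LazyLineCount trusted = new LazyLineCount(() -> new ByteArrayInputStream(data), 3);
        System.out.println(trusted.size64() + " lines after " + trusted.scans + " scan(s)");
    }
}
```

Because the supplied size is trusted without verification, a wrong value silently yields a wrong size64() result, which is exactly the caller's responsibility noted above.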

Since:
2.6.17
Author:
Sebastiano Vigna
  • Constructor Details

    • FileLinesByteArrayIterable

      public FileLinesByteArrayIterable(String filename)
      Creates a file-lines byte-array iterable for the specified filename.
      Parameters:
      filename - a filename.
    • FileLinesByteArrayIterable

      public FileLinesByteArrayIterable(String filename, long size)
      Creates a file-lines byte-array iterable for the specified filename and size.
      Parameters:
      filename - a filename.
      size - the number of lines in the file.
    • FileLinesByteArrayIterable

      public FileLinesByteArrayIterable(String filename, long size, EnumSet<it.unimi.dsi.fastutil.io.FastBufferedInputStream.LineTerminator> terminators)
      Creates a file-lines byte-array iterable for the specified filename and size using the given line terminators.
      Parameters:
      filename - a filename.
      size - the number of lines in the file.
      terminators - line terminators for the underlying FastBufferedInputStream.
    • FileLinesByteArrayIterable

      public FileLinesByteArrayIterable(String filename, Class<? extends InputStream> decompressor) throws NoSuchMethodException, SecurityException
      Creates a file-lines byte-array iterable for the specified filename, optionally assuming that the file is compressed.
      Parameters:
      filename - a filename.
      decompressor - a class extending InputStream that will be used as a decompressor, or null for no decompression.
      Throws:
      NoSuchMethodException
      SecurityException
    • FileLinesByteArrayIterable

      public FileLinesByteArrayIterable(String filename, long size, Class<? extends InputStream> decompressor) throws NoSuchMethodException, SecurityException
      Creates a file-lines byte-array iterable for the specified filename and size, optionally assuming that the file is compressed.
      Parameters:
      filename - a filename.
      size - the number of lines in the file.
      decompressor - a class extending InputStream that will be used as a decompressor, or null for no decompression.
      Throws:
      NoSuchMethodException
      SecurityException
    • FileLinesByteArrayIterable

      public FileLinesByteArrayIterable(String filename, long size, EnumSet<it.unimi.dsi.fastutil.io.FastBufferedInputStream.LineTerminator> terminators, Class<? extends InputStream> decompressor) throws NoSuchMethodException, SecurityException
      Creates a file-lines byte-array iterable for the specified filename and size using the given line terminators and optionally assuming that the file is compressed.
      Parameters:
      filename - a filename.
      size - the number of lines in the file.
      terminators - line terminators for the underlying FastBufferedInputStream.
      decompressor - a class extending InputStream that will be used as a decompressor, or null for no decompression.
      Throws:
      NoSuchMethodException
      SecurityException
  • Method Details

    • iterator

      public FileLinesByteArrayIterable.FileLinesIterator iterator()
      Specified by:
      iterator in interface Iterable<byte[]>
    • iterator

      public static FileLinesByteArrayIterable.FileLinesIterator iterator(InputStream inputStream)
      A convenience method returning a one-off FileLinesByteArrayIterable.FileLinesIterator reading from an input stream.
      Parameters:
      inputStream - an input stream.
      Returns:
      an iterator returning the lines contained in the provided input stream.
    • iterator

      public static FileLinesByteArrayIterable.FileLinesIterator iterator(InputStream inputStream, Class<? extends InputStream> decompressor)
      A convenience method returning a one-off FileLinesByteArrayIterable.FileLinesIterator reading from an input stream, optionally assuming that the stream is compressed.
      Parameters:
      inputStream - an input stream.
      decompressor - a class extending InputStream that will be used as a decompressor, or null for no decompression.
      Returns:
      an iterator returning the lines contained in the provided input stream.
    • iterator

      public static FileLinesByteArrayIterable.FileLinesIterator iterator(InputStream inputStream, Class<? extends InputStream> decompressor, EnumSet<it.unimi.dsi.fastutil.io.FastBufferedInputStream.LineTerminator> terminators)
      A convenience method returning a one-off FileLinesByteArrayIterable.FileLinesIterator reading from an input stream using the given line terminators and optionally assuming that the stream is compressed.
      Parameters:
      inputStream - an input stream.
      decompressor - a class extending InputStream that will be used as a decompressor, or null for no decompression.
      terminators - line terminators for the underlying FastBufferedInputStream.
      Returns:
      an iterator returning the lines contained in the provided input stream.
    • size64

      public long size64()
      Specified by:
      size64 in interface it.unimi.dsi.fastutil.Size64
    • allLines

      public it.unimi.dsi.fastutil.objects.ObjectList<byte[]> allLines()
      Returns all lines as a list.
      Returns:
      all lines of the file wrapped by this file-lines byte-array iterable.
      Implementation Specification:
      This method iterates over the lines of the file and accumulates the resulting byte arrays in a standard list. Thus, it will throw an exception on files with more than Integer.MAX_VALUE lines.
    • allLinesBig

      public it.unimi.dsi.fastutil.objects.ObjectBigArrayBigList<byte[]> allLinesBig()
      Returns all lines as a big list.
      Returns:
      all lines of the file wrapped by this file-lines byte-array iterable.
      Implementation Specification:
      This method iterates over the lines of the file and accumulates the resulting byte arrays in a big list. Thus, it supports files with more than Integer.MAX_VALUE lines.