Class EliasFanoMonotoneLongBigList

  • All Implemented Interfaces:
    it.unimi.dsi.fastutil.BigList<java.lang.Long>, it.unimi.dsi.fastutil.longs.LongBigList, it.unimi.dsi.fastutil.longs.LongCollection, it.unimi.dsi.fastutil.longs.LongIterable, it.unimi.dsi.fastutil.longs.LongStack, it.unimi.dsi.fastutil.Size64, it.unimi.dsi.fastutil.Stack<java.lang.Long>, java.io.Serializable, java.lang.Comparable<it.unimi.dsi.fastutil.BigList<? extends java.lang.Long>>, java.lang.Iterable<java.lang.Long>, java.util.Collection<java.lang.Long>
    Direct Known Subclasses:
    EliasFanoIndexedMonotoneLongBigList, EliasFanoPrefixSumLongBigList, SparseSelect

    public class EliasFanoMonotoneLongBigList
    extends it.unimi.dsi.fastutil.longs.AbstractLongBigList
    implements java.io.Serializable
    An implementation of Elias–Fano's representation of monotone sequences; an element occupies a number of bits bounded by two plus the logarithm of the average gap.

    Instances of this class represent in a highly compacted form a nondecreasing sequence of natural numbers. Instances are built by providing either an iterator returning the (nondecreasing) sequence, or an iterable object that provides such an iterator. In the first case, you must also provide in advance the number of elements that will be returned and an upper bound to their values (see below), and at the end of the construction the iterator will be exhausted.

    An additional bulk method makes it possible to extract several consecutive entries at high speed, and getDelta(long) computes directly the difference between two consecutive elements. Moreover, the nextLong() method of an iterator will read read consecutive data much faster than repeated calls to getLong(long).

    Methods to not usually perform bound checks on the arguments. Bounds checks can be enabled, however, by enabling assertions.

    Because Java array are limited in size, it might not be possible to build certain instances: you can use the fits(long, long) methods to check is this might happen. In this case, please use EliasFanoMonotoneBigLongBigList, which is slightly slower but has no such limitations.

    This class is thread safe.

    Memory mapping

    Instances of this class can be dumped and then loaded uses MappedEliasFanoMonotoneLongBigList.

    Implementation details

    Given a monotone sequence 0 ≤ x0x1 ≤ … ≤ xn − 1 < u, where u is a given upper bound (the size of the universe), the Elias–Fano representation makes it possible to store it using at most 2 + log(u/n) bits per element, which is very close to the information-theoretical lower bound ≈ log e + log(u/n). A typical example is a list of pointer into records of a large file: instead of using, for each pointer, a number of bit sufficient to express the length of the file, the Elias–Fano representation makes it possible to use, for each pointer, a number of bits roughly equal to the logarithm of the average length of a record. The representation was introduced in Peter Elias, “Efficient storage and retrieval by content and address of static files”, J. Assoc. Comput. Mach., 21(2):246−260, 1974, and also independently by Robert Fano, “On the number of bits required to implement an associative memory”, Memorandum 61, Computer Structures Group, Project MAC, MIT, Cambridge, Mass., n.d., 1971.

    The elements of the sequence are recorded by storing separately the lower s = ⌊log(u/n)⌋ bits and the remaining upper bits. The lower bits are stored contiguously, whereas the upper bits are stored in an array of n + u / 2s bits by setting, for each 0 ≤ i < n, the bit of index xi / 2s + i; the value can then be recovered by selecting the i-th bit of the resulting bit array and subtracting i (note that this will work because the upper bits are nondecreasing).

    This implementation uses SimpleSelect to support selection inside the upper-bits array, and exploits SimpleSelect.select(long, long[], int, int) to implement get(long, long[], int, int).

    See Also:
    EliasFanoMonotoneBigLongBigList, EliasFanoIndexedMonotoneLongBigList, Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected int l
      The number of lower bits.
      protected long length
      The length of the sequence.
      protected long[] lowerBits
      The list of lower bits of each element, stored explicitly.
      protected long lowerBitsMask
      The mask for the lower bits.
      protected SimpleSelect selectUpper
      The select structure used to extract the upper bits.
      protected long[] upperBits
      The upper bits, stored as unary gaps.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected EliasFanoMonotoneLongBigList​(long[] a, it.unimi.dsi.fastutil.longs.LongIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.
      protected EliasFanoMonotoneLongBigList​(long length, int l, long[] upperBits, long[] lowerBits, SimpleSelect selectUpper)  
        EliasFanoMonotoneLongBigList​(long n, long upperBound, it.unimi.dsi.fastutil.bytes.ByteIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.
        EliasFanoMonotoneLongBigList​(long n, long upperBound, it.unimi.dsi.fastutil.ints.IntIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.
        EliasFanoMonotoneLongBigList​(long n, long upperBound, it.unimi.dsi.fastutil.longs.LongIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.
        EliasFanoMonotoneLongBigList​(long n, long upperBound, it.unimi.dsi.fastutil.shorts.ShortIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.
        EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.bytes.ByteIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
        EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.ints.IntIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
        EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.longs.LongIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
        EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.shorts.ShortIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void dump​(java.lang.String basename)
      Dumps this list's lower bits in native order so that it can be used with MappedEliasFanoMonotoneLongBigList.
      void dump​(java.lang.String basename, java.nio.ByteOrder byteOrder)
      Dumps this list's lower bits so that it can be used with MappedEliasFanoMonotoneLongBigList.
      static boolean fits​(long length, long upperBound)
      Returns true if this class can accommodate a list with the given number of elements and upper bound.
      long[] get​(long index, long[] dest)
      Extracts a number of consecutive entries into a given array.
      long[] get​(long index, long[] dest, int offset, int length)
      Extracts a number of consecutive entries into a given array fragment.
      long getDelta​(long index)
      Returns the difference between two consecutive elements of the sequence.
      long getLong​(long index)
      Returns the element at the specified position.
      EliasFanoMonotoneLongBigList.EliasFanoMonotoneLongBigListIterator iterator()
      Returns a list iterator over the values of this EliasFanoMonotoneLongBigList.
      EliasFanoMonotoneLongBigList.EliasFanoMonotoneLongBigListIterator listIterator()
      Returns a list iterator over the values of this EliasFanoMonotoneLongBigList.
      EliasFanoMonotoneLongBigList.EliasFanoMonotoneLongBigListIterator listIterator​(long from)
      Returns a list iterator over the values of this EliasFanoMonotoneLongBigList.
      long numBits()  
      long size64()  
      • Methods inherited from class it.unimi.dsi.fastutil.longs.AbstractLongBigList

        add, add, add, addAll, addAll, addAll, addAll, addElements, addElements, clear, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, forEach, get, getElements, hashCode, indexOf, indexOf, lastIndexOf, lastIndexOf, peek, peekLong, pop, popLong, push, push, rem, remove, removeElements, removeLong, set, set, setElements, size, size, subList, top, topLong, toString
      • Methods inherited from class it.unimi.dsi.fastutil.longs.AbstractLongCollection

        add, contains, containsAll, containsAll, forEach, remove, removeAll, removeAll, removeIf, retainAll, retainAll, toArray, toLongArray, toLongArray
      • Methods inherited from class java.util.AbstractCollection

        isEmpty, toArray, toArray
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
      • Methods inherited from interface java.util.Collection

        containsAll, isEmpty, removeAll, retainAll, toArray, toArray, toArray
      • Methods inherited from interface it.unimi.dsi.fastutil.longs.LongBigList

        addAll, addAll, addAll, addAll, getElements, setElements, setElements, spliterator
      • Methods inherited from interface it.unimi.dsi.fastutil.longs.LongCollection

        add, contains, containsAll, longIterator, longParallelStream, longSpliterator, longStream, parallelStream, remove, removeAll, removeIf, removeIf, removeIf, retainAll, stream, toArray, toLongArray, toLongArray
      • Methods inherited from interface it.unimi.dsi.fastutil.longs.LongIterable

        forEach, forEach
      • Methods inherited from interface it.unimi.dsi.fastutil.Stack

        isEmpty
    • Field Detail

      • length

        protected final long length
        The length of the sequence.
      • l

        protected final int l
        The number of lower bits.
      • upperBits

        protected transient long[] upperBits
        The upper bits, stored as unary gaps.
      • lowerBits

        protected long[] lowerBits
        The list of lower bits of each element, stored explicitly.
      • selectUpper

        protected final SimpleSelect selectUpper
        The select structure used to extract the upper bits.
      • lowerBitsMask

        protected final long lowerBitsMask
        The mask for the lower bits.
    • Constructor Detail

      • EliasFanoMonotoneLongBigList

        protected EliasFanoMonotoneLongBigList​(long length,
                                               int l,
                                               long[] upperBits,
                                               long[] lowerBits,
                                               SimpleSelect selectUpper)
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.ints.IntIterable list)
        Creates an Elias–Fano representation of the values returned by the given iterable object.
        Parameters:
        list - an iterable object returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.shorts.ShortIterable list)
        Creates an Elias–Fano representation of the values returned by the given iterable object.
        Parameters:
        list - an iterable object returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.bytes.ByteIterable list)
        Creates an Elias–Fano representation of the values returned by the given iterable object.
        Parameters:
        list - an iterable object returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(it.unimi.dsi.fastutil.longs.LongIterable list)
        Creates an Elias–Fano representation of the values returned by the given iterable object.
        Parameters:
        list - an iterable object returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(long n,
                                            long upperBound,
                                            it.unimi.dsi.fastutil.bytes.ByteIterator iterator)
        Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

        This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

        Parameters:
        n - the number of elements returned by iterator.
        upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
        iterator - an iterator returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(long n,
                                            long upperBound,
                                            it.unimi.dsi.fastutil.shorts.ShortIterator iterator)
        Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

        This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

        Parameters:
        n - the number of elements returned by iterator.
        upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
        iterator - an iterator returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(long n,
                                            long upperBound,
                                            it.unimi.dsi.fastutil.ints.IntIterator iterator)
        Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

        This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

        Parameters:
        n - the number of elements returned by iterator.
        upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
        iterator - an iterator returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        public EliasFanoMonotoneLongBigList​(long n,
                                            long upperBound,
                                            it.unimi.dsi.fastutil.longs.LongIterator iterator)
        Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

        This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

        Parameters:
        n - the number of elements returned by iterator.
        upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
        iterator - an iterator returning nondecreasing natural numbers.
      • EliasFanoMonotoneLongBigList

        protected EliasFanoMonotoneLongBigList​(long[] a,
                                               it.unimi.dsi.fastutil.longs.LongIterator iterator)
        Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

        This constructor is used only internally, to work around the usual problems caused by the obligation to call this() before anything else.

        Parameters:
        a - an array containing the number of elements returned by iterator and a strict upper bound to the values returned by iterator (note that it used to be non-strict).
        iterator - an iterator returning nondecreasing natural numbers.
    • Method Detail

      • fits

        public static boolean fits​(long length,
                                   long upperBound)
        Returns true if this class can accommodate a list with the given number of elements and upper bound.
        Parameters:
        length - the length of the list.
        upperBound - a strict upper bound to the values of the list.
        Returns:
        true if this class can accommodate a list with the given number of elements and upper bound.
      • numBits

        public long numBits()
      • getLong

        public long getLong​(long index)
        Returns the element at the specified position.
        Specified by:
        getLong in interface it.unimi.dsi.fastutil.longs.LongBigList
        Parameters:
        index - a position in the list.
        Returns:
        the element at the specified position; if index is out of bounds, behavior is undefined.
      • getDelta

        public long getDelta​(long index)
        Returns the difference between two consecutive elements of the sequence.
        Parameters:
        index - the index of an element (smaller then size64() - 1).
        Returns:
        the difference between the element of position index + 1 and that of position index; if index is out of bounds, behavior is undefined.
        See Also:
        get(long, long[])
      • get

        public long[] get​(long index,
                          long[] dest,
                          int offset,
                          int length)
        Extracts a number of consecutive entries into a given array fragment.
        Parameters:
        index - the index of the first entry returned.
        dest - the destination array; it will be filled with length consecutive entries starting at position offset; must be of length greater than offset.
        offset - the first position written in dest.
        length - the number of elements written in dest starting at offset.
        Returns:
        dest; if the arguments are out of bounds, behavior is undefined.
        See Also:
        get(long, long[])
      • get

        public long[] get​(long index,
                          long[] dest)
        Extracts a number of consecutive entries into a given array.
        Parameters:
        index - the index of the first entry returned.
        dest - the destination array, of nonzero length; it will be filled with consecutive entries.
        Returns:
        dest; if index is out of bounds or dest has length zero, behavior is undefined.
        See Also:
        get(long, long[], int, int)
      • size64

        public long size64()
        Specified by:
        size64 in interface it.unimi.dsi.fastutil.Size64
      • dump

        public void dump​(java.lang.String basename)
                  throws java.io.IOException
        Dumps this list's lower bits in native order so that it can be used with MappedEliasFanoMonotoneLongBigList.
        Parameters:
        basename - the basename of the generated files.
        Throws:
        java.io.IOException