Class MappedFrontCodedStringBigList

java.lang.Object
java.util.AbstractCollection<MutableString>
it.unimi.dsi.fastutil.objects.AbstractObjectCollection<MutableString>
it.unimi.dsi.fastutil.objects.AbstractObjectBigList<MutableString>
it.unimi.dsi.big.util.MappedFrontCodedStringBigList
All Implemented Interfaces:
it.unimi.dsi.fastutil.BigList<MutableString>, it.unimi.dsi.fastutil.objects.ObjectBigList<MutableString>, it.unimi.dsi.fastutil.objects.ObjectCollection<MutableString>, it.unimi.dsi.fastutil.objects.ObjectIterable<MutableString>, it.unimi.dsi.fastutil.Size64, it.unimi.dsi.fastutil.Stack<MutableString>, FlyweightPrototype<MappedFrontCodedStringBigList>, Closeable, AutoCloseable, Comparable<it.unimi.dsi.fastutil.BigList<? extends MutableString>>, Iterable<MutableString>, Collection<MutableString>, RandomAccess

public class MappedFrontCodedStringBigList extends it.unimi.dsi.fastutil.objects.AbstractObjectBigList<MutableString> implements RandomAccess, Closeable, FlyweightPrototype<MappedFrontCodedStringBigList>
A memory-mapped version of FrontCodedStringBigList.

This class is functionally identical to FrontCodedStringBigList, but its data is memory-mapped from disk. Only UTF-8 encoding is supported.

To use this class, one first invokes the build(String, int, Iterator) method to generate a property file containing metadata, and two files containing strings and string pointers, respectively. Then, the load(String) method (invoked with the same basename) will return an instance of this class accessing strings and pointers by memory mapping.

Note that for consistency with other classes in this package this class implements a big list of mutable strings; however, for greater flexibility it also implements a getString(long) method and a getArray(long) method.

If you need to build an instance from a (possibly compressed) stream, we suggest adapting it using FileLinesByteArrayIterable.iterator().
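
As an alternative to the FileLinesByteArrayIterable route suggested above, the following sketch uses only the standard library to expose the lines of a stream as the Iterator<byte[]> that build(String, int, Iterator) expects. The class name Utf8LineIterator is hypothetical and not part of this package; treat this as an illustration under that assumption rather than the library's own adapter.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper: exposes the lines of a stream as UTF-8 byte arrays. */
final class Utf8LineIterator implements Iterator<byte[]> {
    private final BufferedReader reader;
    private String next; // Look-ahead line, or null at end of stream.

    public Utf8LineIterator(InputStream in) {
        // For a compressed stream, wrap `in` in java.util.zip.GZIPInputStream before passing it here.
        this.reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        advance();
    }

    /** Reads the next line into the look-ahead slot. */
    private void advance() {
        try {
            next = reader.readLine();
        } catch (IOException e) {
            // Iterator methods cannot throw checked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public byte[] next() {
        if (next == null) throw new NoSuchElementException();
        byte[] result = next.getBytes(StandardCharsets.UTF_8);
        advance();
        return result;
    }
}
```

An instance of such an iterator could then be passed as the third argument of build(String, int, Iterator), with the basename and ratio chosen by the caller.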

  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static enum 
     

    Nested classes/interfaces inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

    it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectRandomAccessSubList<K>, it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectSubList<K>
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
    BYTE_ARRAY_EXTENSION
     
    protected it.unimi.dsi.fastutil.bytes.ByteBigList
    byteList
    The underlying byte array.
    protected final long
    n
    The number of strings in the list.
    protected it.unimi.dsi.fastutil.longs.LongBigList
    pointers
    The pointers to entire arrays in the list.
    static final String
    POINTERS_EXTENSION
     
    static final String
    PROPERTIES_EXTENSION
     
    protected final int
    ratio
    The ratio of this front-coded list.
    static final long
    serialVersionUID
     
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    protected
    MappedFrontCodedStringBigList(long n, int ratio, String byteBigList, String pointers)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static void
    build(String basename, int ratio, Iterator<byte[]> arrays)
    Builds and stores a new memory-mapped front-coded big string list.
    void
    close()
     
    MappedFrontCodedStringBigList
    copy()
    Returns a copy of this object, sharing state with this object as much as possible.
    protected static int
    countUTF8Chars(byte[] a)
     
    MutableString
    get(long index)
    Returns the element at the specified position in this front-coded big list as a mutable string.
    void
    get(long index, MutableString s)
    Returns the element at the specified position in this front-coded big list by storing it in a mutable string.
    byte[]
    getArray(long index)
    Returns the element at the specified position in this front-coded big list as a UTF-8-coded byte array.
    String
    getString(long index)
    Returns the element at the specified position in this front-coded big list as a string.
    static MappedFrontCodedStringBigList
    load(String basename)
    Maps in memory a front-coded string big list starting from a basename.
    static void
    main(String[] arg)
     
    long
    size64()
     

    Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

    add, add, addAll, addAll, addElements, addElements, clear, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, forEach, getElements, hashCode, indexOf, iterator, lastIndexOf, listIterator, listIterator, peek, pop, push, remove, removeElements, set, setElements, size, size, subList, top, toString

    Methods inherited from class java.util.AbstractCollection

    containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait

    Methods inherited from interface it.unimi.dsi.fastutil.objects.ObjectBigList

    addAll, addAll, addAll, addAll, getElements, setElements, setElements, spliterator

    Methods inherited from interface it.unimi.dsi.fastutil.Stack

    isEmpty
  • Field Details

    • serialVersionUID

      public static final long serialVersionUID
    • PROPERTIES_EXTENSION

      public static final String PROPERTIES_EXTENSION
    • BYTE_ARRAY_EXTENSION

      public static final String BYTE_ARRAY_EXTENSION
    • POINTERS_EXTENSION

      public static final String POINTERS_EXTENSION
    • n

      protected final long n
      The number of strings in the list.
    • ratio

      protected final int ratio
      The ratio of this front-coded list.
    • byteList

      protected it.unimi.dsi.fastutil.bytes.ByteBigList byteList
      The underlying byte array.
    • pointers

      protected it.unimi.dsi.fastutil.longs.LongBigList pointers
      The pointers to entire arrays in the list.
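
To make the roles of ratio, byteList and pointers more concrete, here is a simplified in-memory sketch of front coding. It assumes the usual scheme in which every ratio-th string is stored whole, while each other string stores only the length of the prefix it shares with its predecessor plus the remaining suffix. The on-disk layout used by this class is not described on this page and may differ; the class FrontCodingSketch and its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified illustration of front coding with a ratio. */
final class FrontCodingSketch {
    private final int ratio;
    // For each string: length of the prefix shared with the previous string (0 for whole entries).
    private final List<Integer> prefixLengths = new ArrayList<>();
    // For each string: the part that is actually stored.
    private final List<String> suffixes = new ArrayList<>();

    FrontCodingSketch(List<String> strings, int ratio) {
        if (ratio < 1) throw new IllegalArgumentException("ratio must be positive");
        this.ratio = ratio;
        String previous = "";
        for (int i = 0; i < strings.size(); i++) {
            String s = strings.get(i);
            int common = 0;
            // Every ratio-th string is stored whole, so that decoding can start there.
            if (i % ratio != 0) {
                int max = Math.min(previous.length(), s.length());
                while (common < max && previous.charAt(common) == s.charAt(common)) common++;
            }
            prefixLengths.add(common);
            suffixes.add(s.substring(common));
            previous = s;
        }
    }

    /** Decodes the string at the given index, starting from the closest preceding whole entry. */
    String get(int index) {
        int start = index - index % ratio;
        StringBuilder sb = new StringBuilder(suffixes.get(start));
        for (int i = start + 1; i <= index; i++) {
            sb.setLength(prefixLengths.get(i)); // Keep only the shared prefix.
            sb.append(suffixes.get(i));
        }
        return sb.toString();
    }

    /** Total number of characters actually stored. */
    int storedChars() {
        int total = 0;
        for (String s : suffixes) total += s.length();
        return total;
    }
}
```

In this sketch a larger ratio stores fewer whole strings, which tends to save space on sorted input, but a lookup may have to decode up to ratio - 1 intermediate strings.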
  • Constructor Details

    • MappedFrontCodedStringBigList

      protected MappedFrontCodedStringBigList(long n, int ratio, String byteBigList, String pointers) throws IOException
      Throws:
      IOException
  • Method Details

    • copy

      public MappedFrontCodedStringBigList copy()
      Description copied from interface: FlyweightPrototype
      Returns a copy of this object, sharing state with this object as much as possible.
      Specified by:
      copy in interface FlyweightPrototype<MappedFrontCodedStringBigList>
      Returns:
      a copy of this object, sharing state with this object as much as possible.
    • build

      public static void build(String basename, int ratio, Iterator<byte[]> arrays) throws IOException, org.apache.commons.configuration2.ex.ConfigurationException
      Builds and stores a new memory-mapped front-coded big string list.

      Given a basename, three files with extensions PROPERTIES_EXTENSION, BYTE_ARRAY_EXTENSION and POINTERS_EXTENSION will be generated.

      After building a list, you can load it using the same basename.

      Parameters:
      basename - the basename of the list.
      ratio - the ratio.
      arrays - an iterator over byte arrays containing UTF-8-encoded strings.
      Throws:
      IOException
      org.apache.commons.configuration2.ex.ConfigurationException
    • load

      public static MappedFrontCodedStringBigList load(String basename) throws org.apache.commons.configuration2.ex.ConfigurationException, IOException
      Maps in memory a front-coded string big list starting from a basename.
      Parameters:
      basename - the basename of a memory-mapped front-coded string big list.
      Returns:
      a memory-mapped front-coded string big list.
      Throws:
      org.apache.commons.configuration2.ex.ConfigurationException
      IOException
    • get

      public MutableString get(long index)
      Returns the element at the specified position in this front-coded big list as a mutable string.
      Specified by:
      get in interface it.unimi.dsi.fastutil.BigList<MutableString>
      Parameters:
      index - an index in the list.
      Returns:
      a MutableString that will contain the string at the specified position. The string may be freely modified.
    • get

      public void get(long index, MutableString s)
      Returns the element at the specified position in this front-coded big list by storing it in a mutable string.
      Parameters:
      index - an index in the list.
      s - a mutable string that will contain the string at the specified position.
    • getString

      public String getString(long index)
      Returns the element at the specified position in this front-coded big list as a string.
      Parameters:
      index - an index in the list.
      Returns:
      a String that will contain the string at the specified position.
    • getArray

      public byte[] getArray(long index)
      Returns the element at the specified position in this front-coded big list as a UTF-8-coded byte array.
      Parameters:
      index - an index in the list.
      Returns:
      a byte array containing the UTF-8 encoding of the string at the specified position.
    • countUTF8Chars

      protected static int countUTF8Chars(byte[] a)
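
      This page gives no description for countUTF8Chars. One plausible reading, stated here as an assumption, is that it returns the number of Java char values encoded by a UTF-8 byte array. A minimal sketch of that computation follows; the class Utf8CharCount is hypothetical, it is not the library's code, and it assumes well-formed UTF-8 input.

```java
/** Hypothetical sketch: counts the Java chars encoded by a well-formed UTF-8 byte array. */
final class Utf8CharCount {
    static int countChars(byte[] a) {
        int count = 0;
        for (byte b : a) {
            int v = b & 0xFF;
            // Continuation bytes (10xxxxxx) do not start a new character.
            if ((v & 0xC0) == 0x80) continue;
            // A 4-byte sequence (lead byte 11110xxx) encodes a supplementary
            // code point, which takes two chars (a surrogate pair) in Java.
            count += (v & 0xF8) == 0xF0 ? 2 : 1;
        }
        return count;
    }
}
```

      Knowing the char count in advance lets a decoder allocate its output buffer at the right size before decoding.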
    • size64

      public long size64()
      Specified by:
      size64 in interface it.unimi.dsi.fastutil.Size64
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
    • main

      public static void main(String[] arg) throws IOException, com.martiansoftware.jsap.JSAPException, org.apache.commons.configuration2.ex.ConfigurationException, ClassNotFoundException, IllegalArgumentException, SecurityException
      Throws:
      IOException
      com.martiansoftware.jsap.JSAPException
      org.apache.commons.configuration2.ex.ConfigurationException
      ClassNotFoundException
      IllegalArgumentException
      SecurityException