Class FrontCodedStringBigList

java.lang.Object
java.util.AbstractCollection<MutableString>
it.unimi.dsi.fastutil.objects.AbstractObjectCollection<MutableString>
it.unimi.dsi.fastutil.objects.AbstractObjectBigList<MutableString>
it.unimi.dsi.big.util.FrontCodedStringBigList
All Implemented Interfaces:
it.unimi.dsi.fastutil.BigList<MutableString>, it.unimi.dsi.fastutil.objects.ObjectBigList<MutableString>, it.unimi.dsi.fastutil.objects.ObjectCollection<MutableString>, it.unimi.dsi.fastutil.objects.ObjectIterable<MutableString>, it.unimi.dsi.fastutil.Size64, it.unimi.dsi.fastutil.Stack<MutableString>, Serializable, Comparable<it.unimi.dsi.fastutil.BigList<? extends MutableString>>, Iterable<MutableString>, Collection<MutableString>, RandomAccess

public class FrontCodedStringBigList extends it.unimi.dsi.fastutil.objects.AbstractObjectBigList<MutableString> implements RandomAccess, Serializable
Compact storage of strings using front-coding compression (also known as compression by prefix omission).

This class is functionally identical to FrontCodedStringList, except that it implements BigList and is therefore indexed by long, which allows larger sizes.
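To make the idea of prefix omission concrete, here is a minimal standalone sketch using only the JDK. It is not this class's implementation: the name FrontCodingSketch and its methods are hypothetical, and it assumes the usual front-coding scheme in which one string in every ratio is stored whole and each other string is stored as the length of the prefix it shares with its predecessor plus the remaining suffix.

```java
import java.util.List;

/** Illustrative front coding over strings (hypothetical class, not the library's code). */
final class FrontCodingSketch {
    private final int ratio;
    private final int[] prefixLengths; // chars shared with the previous string
    private final String[] suffixes;   // what remains after the shared prefix

    FrontCodingSketch(List<String> words, int ratio) {
        if (ratio < 1) throw new IllegalArgumentException("ratio must be at least 1");
        this.ratio = ratio;
        prefixLengths = new int[words.size()];
        suffixes = new String[words.size()];
        String previous = "";
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            // One string in every ratio is stored whole, so decoding can restart there.
            prefixLengths[i] = (i % ratio == 0) ? 0 : commonPrefixLength(previous, word);
            suffixes[i] = word.substring(prefixLengths[i]);
            previous = word;
        }
    }

    /** Length of the longest common prefix of a and b. */
    static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    /** Rebuilds the string at index by replaying entries from the nearest whole string. */
    String get(int index) {
        int start = index - index % ratio;
        StringBuilder current = new StringBuilder();
        for (int i = start; i <= index; i++) {
            current.setLength(prefixLengths[i]); // keep only the shared prefix
            current.append(suffixes[i]);
        }
        return current.toString();
    }

    int prefixLength(int index) { return prefixLengths[index]; }

    String suffix(int index) { return suffixes[index]; }
}
```

In this sketch a larger ratio omits more prefixes but makes get replay more entries, and the savings depend on neighbouring strings sharing prefixes, which is typically the case for sorted input.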

  • Nested Class Summary

    Nested classes/interfaces inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

    it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectRandomAccessSubList<K>, it.unimi.dsi.fastutil.objects.AbstractObjectBigList.ObjectSubList<K>
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected final it.unimi.dsi.fastutil.bytes.ByteArrayFrontCodedBigList
    byteFrontCodedBigList
    The underlying ByteArrayFrontCodedBigList, or null.
    protected final it.unimi.dsi.fastutil.chars.CharArrayFrontCodedBigList
    charFrontCodedBigList
    The underlying CharArrayFrontCodedBigList, or null.
    static final long
    serialVersionUID
     
    protected final boolean
    utf8
    Whether this front-coded list is UTF-8 encoded.
  • Constructor Summary

    Constructors
    Constructor
    Description
    FrontCodedStringBigList(Collection<? extends CharSequence> c, int ratio, boolean utf8)
    Creates a new front-coded string list containing the character sequences contained in the given collection.
    FrontCodedStringBigList(Iterator<? extends CharSequence> words, int ratio, boolean utf8)
    Creates a new front-coded string list containing the character sequences returned by the given iterator.
  • Method Summary

    Modifier and Type
    Method
    Description
    protected static char[]
    byte2Char(byte[] a, char[] s)
     
    protected static int
    countUTF8Chars(byte[] a)
     
    void
    dump(String basename)
     
    MutableString
    get(long index)
    Returns the element at the specified position in this front-coded string big list as a mutable string.
    void
    get(long index, MutableString s)
    Stores the element at the specified position in this front-coded string big list in the given mutable string.
    it.unimi.dsi.fastutil.objects.ObjectBigListIterator<MutableString>
    listIterator(long k)
     
    static void
    main(String[] arg)
     
    int
    ratio()
    Returns the ratio of the underlying front-coded list.
    long
    size64()
     
    boolean
    utf8()
    Returns whether this front-coded string list is storing its strings as UTF-8 encoded bytes.

    Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList

    add, add, addAll, addAll, addElements, addElements, clear, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, forEach, getElements, hashCode, indexOf, iterator, lastIndexOf, listIterator, peek, pop, push, remove, removeElements, set, setElements, size, size, subList, top, toString

    Methods inherited from class java.util.AbstractCollection

    containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait

    Methods inherited from interface it.unimi.dsi.fastutil.objects.ObjectBigList

    addAll, addAll, addAll, addAll, getElements, setElements, setElements, spliterator

    Methods inherited from interface it.unimi.dsi.fastutil.Stack

    isEmpty
  • Field Details

    • serialVersionUID

      public static final long serialVersionUID
    • byteFrontCodedBigList

      protected final it.unimi.dsi.fastutil.bytes.ByteArrayFrontCodedBigList byteFrontCodedBigList
      The underlying ByteArrayFrontCodedBigList, or null.
    • charFrontCodedBigList

      protected final it.unimi.dsi.fastutil.chars.CharArrayFrontCodedBigList charFrontCodedBigList
      The underlying CharArrayFrontCodedBigList, or null.
    • utf8

      protected final boolean utf8
      Whether this front-coded list is UTF-8 encoded.
  • Constructor Details

    • FrontCodedStringBigList

      public FrontCodedStringBigList(Iterator<? extends CharSequence> words, int ratio, boolean utf8)
      Creates a new front-coded string list containing the character sequences returned by the given iterator.
      Parameters:
      words - an iterator returning character sequences.
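      ratio - the desired ratio: one string in every ratio is stored in full, and the others are stored as differences from their predecessor.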
      utf8 - if true, the strings will be stored as UTF-8 byte arrays.
    • FrontCodedStringBigList

      public FrontCodedStringBigList(Collection<? extends CharSequence> c, int ratio, boolean utf8)
      Creates a new front-coded string list containing the character sequences contained in the given collection.
      Parameters:
      c - a collection containing character sequences.
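      ratio - the desired ratio: one string in every ratio is stored in full, and the others are stored as differences from their predecessor.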
      utf8 - if true, the strings will be stored as UTF-8 byte arrays.
  • Method Details

    • utf8

      public boolean utf8()
      Returns whether this front-coded string list is storing its strings as UTF-8 encoded bytes.
      Returns:
      true if this front-coded string list is keeping its data as an array of UTF-8 encoded bytes.
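The choice between UTF-8 bytes and 16-bit characters mainly affects space. The following standalone sketch compares the raw size of a string in the two representations, before any front coding is applied. The class Utf8StorageSketch and its methods are hypothetical helpers written for illustration and are not part of this library.

```java
import java.nio.charset.StandardCharsets;

/** Compares raw storage cost of a string as UTF-8 bytes and as 16-bit chars (illustrative only). */
final class Utf8StorageSketch {
    /** Number of bytes needed to hold s encoded as UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes needed to hold s as an array of 16-bit chars. */
    static int charBytes(String s) {
        return s.length() * Character.BYTES;
    }
}
```

Under these assumptions, ASCII-heavy text takes about half the space as UTF-8, while text made mostly of characters that need three UTF-8 bytes can be smaller as 16-bit chars.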
    • ratio

      public int ratio()
      Returns the ratio of the underlying front-coded list.
      Returns:
      the ratio of the underlying front-coded list.
    • get

      public MutableString get(long index)
      Returns the element at the specified position in this front-coded string big list as a mutable string.
      Specified by:
      get in interface it.unimi.dsi.fastutil.BigList<MutableString>
      Parameters:
      index - an index in the list.
      Returns:
      a MutableString that will contain the string at the specified position. The string may be freely modified.
    • get

      public void get(long index, MutableString s)
      Stores the element at the specified position in this front-coded string big list in the given mutable string.
      Parameters:
      index - an index in the list.
      s - a mutable string that will contain the string at the specified position.
    • countUTF8Chars

      protected static int countUTF8Chars(byte[] a)
    • byte2Char

      protected static char[] byte2Char(byte[] a, char[] s)
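This page gives no description for countUTF8Chars or byte2Char. Judging by the names alone, they presumably count the characters encoded in a UTF-8 byte array and decode such an array into a char array. The sketch below is an independent illustration of the counting step under that assumption, not the library's code: the class Utf8CharCountSketch is hypothetical, and it assumes well-formed UTF-8 input.

```java
/** Counts the UTF-16 chars a well-formed UTF-8 array decodes to (hypothetical helper). */
final class Utf8CharCountSketch {
    static int countChars(byte[] a) {
        int count = 0;
        int i = 0;
        while (i < a.length) {
            int lead = a[i] & 0xFF; // the lead byte determines the sequence length
            if (lead < 0x80) {        // 0xxxxxxx: one byte, one char
                i += 1;
                count += 1;
            } else if (lead < 0xE0) { // 110xxxxx: two bytes, one char
                i += 2;
                count += 1;
            } else if (lead < 0xF0) { // 1110xxxx: three bytes, one char
                i += 3;
                count += 1;
            } else {                  // 11110xxx: four bytes, a surrogate pair (two chars)
                i += 4;
                count += 2;
            }
        }
        return count;
    }
}
```

Knowing the count in advance lets a decoder allocate, or reuse, a char array of exactly the right length before filling it.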
    • listIterator

      public it.unimi.dsi.fastutil.objects.ObjectBigListIterator<MutableString> listIterator(long k)
      Specified by:
      listIterator in interface it.unimi.dsi.fastutil.BigList<MutableString>
      Specified by:
      listIterator in interface it.unimi.dsi.fastutil.objects.ObjectBigList<MutableString>
      Overrides:
      listIterator in class it.unimi.dsi.fastutil.objects.AbstractObjectBigList<MutableString>
    • size64

      public long size64()
      Specified by:
      size64 in interface it.unimi.dsi.fastutil.Size64
    • dump

      public void dump(String basename) throws org.apache.commons.configuration2.ex.ConfigurationException, IOException
      Throws:
      org.apache.commons.configuration2.ex.ConfigurationException
      IOException
    • main

      public static void main(String[] arg) throws IOException, com.martiansoftware.jsap.JSAPException, NoSuchMethodException
      Throws:
      IOException
      com.martiansoftware.jsap.JSAPException
      NoSuchMethodException