Class EquiDepthHistogram

java.lang.Object
cern.colt.PersistentObject
cern.jet.stat.quantile.EquiDepthHistogram
All Implemented Interfaces:
Serializable, Cloneable

public class EquiDepthHistogram extends PersistentObject
Read-only equi-depth histogram for selectivity estimation. Assume you have collected statistics over a data set, among them a one-dimensional equi-depth histogram (quantiles). Then an applications or DBMS might want to estimate the selectivity of some range query [from,to], i.e. the percentage of data set elements contained in the query range. This class does not collect equi-depth histograms but only space efficiently stores already produced histograms and provides operations for selectivity estimation. Uses linear interpolation.

This class stores a list l of float values for which holds:

  • Let v be a list of values (sorted ascending) an equi-depth histogram has been computed over.
  • Let s=l.length.
  • Let p=(0, 1/s-1), 2/s-1,..., s-1/s-1=1.0) be a list of the s percentages.
  • Then for each i=0..s-1: l[i] = e : v.contains(e) invalid input: '&'invalid input: '&' v[0],..., v[p[i]*v.length] <= e.
  • (In particular: l[0]=min(v)=v[0] and l[s-1]=max(v)=v[s-1].)
  • Version:
    1.0, 09/24/99
    See Also:
    • Field Summary

      Fields
      Modifier and Type
      Field
      Description
      protected float[]
       

      Fields inherited from class cern.colt.PersistentObject

      serialVersionUID
    • Constructor Summary

      Constructors
      Constructor
      Description
      EquiDepthHistogram(float[] quantileElements)
      Constructs an equi-depth histogram with the given quantile elements.
    • Method Summary

      Modifier and Type
      Method
      Description
      int
      binOfElement(float element)
      Returns the bin index of the given element.
      int
      Returns the number of bins.
      float
      endOfBin(int binIndex)
      Returns the end of the range associated with the given bin.
      double
      percentFromTo(float from, float to)
      Returns the percentage of elements in the range (from,to].
      double
      phi(float element)
      Returns how many percent of the elements contained in the receiver are <= element.
      int
      Deprecated.
      Deprecated.
      float
      startOfBin(int binIndex)
      Returns the start of the range associated with the given bin.
      static void
      test(float element)
      Not yet commented.

      Methods inherited from class cern.colt.PersistentObject

      clone

      Methods inherited from class java.lang.Object

      equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Details

      • binBoundaries

        protected float[] binBoundaries
    • Constructor Details

      • EquiDepthHistogram

        public EquiDepthHistogram(float[] quantileElements)
        Constructs an equi-depth histogram with the given quantile elements. Quantile elements must be sorted ascending and have the form specified in the class documentation.
    • Method Details

      • binOfElement

        public int binOfElement(float element)
        Returns the bin index of the given element. In other words, returns a handle to the range the element falls into.
        Parameters:
        element - the element to search for.
        Throws:
        IllegalArgumentException - if the element is not contained in any bin.
      • bins

        public int bins()
        Returns the number of bins. In other words, returns the number of subdomains partitioning the entire value domain.
      • endOfBin

        public float endOfBin(int binIndex)
        Returns the end of the range associated with the given bin.
        Throws:
        ArrayIndexOutOfBoundsException - if binIndex < 0 || binIndex >= bins().
      • percentFromTo

        public double percentFromTo(float from, float to)
        Returns the percentage of elements in the range (from,to]. Does linear interpolation.
        Parameters:
        from - the start point (exclusive).
        to - the end point (inclusive).
      • phi

        public double phi(float element)
        Returns how many percent of the elements contained in the receiver are <= element. Does linear interpolation.
        Parameters:
        the - element to search for.
      • size

        public int size()
        Deprecated.
        Deprecated. Returns the number of bin boundaries.
      • startOfBin

        public float startOfBin(int binIndex)
        Returns the start of the range associated with the given bin.
        Throws:
        ArrayIndexOutOfBoundsException - if binIndex < 0 || binIndex >= bins().
      • test

        public static void test(float element)
        Not yet commented.