Class CKMSQuantiles


  • class CKMSQuantiles
    extends java.lang.Object
    Implementation of the Cormode, Korn, Muthukrishnan, and Srivastava algorithm for streaming calculation of targeted high-percentile epsilon-approximate quantiles. This is a generalization of the earlier work by Greenwald and Khanna (GK), which essentially allows different error bounds on the targeted quantiles, which allows for far more efficient calculation of high-percentiles. See: Cormode, Korn, Muthukrishnan, and Srivastava "Effective Computation of Biased Quantiles over Data Streams" in ICDE 2005 Greenwald and Khanna, "Space-efficient online computation of quantile summaries" in SIGMOD 2001
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private double[] buffer
      Buffers incoming items to be inserted in batch.
      private int bufferCount  
      private int compressIdx
      Used for tracking incremental compression.
      private int count
      Total number of items in stream.
      private CKMSQuantiles.Quantile[] quantiles
      Array of Quantiles that we care about, along with desired error.
      protected java.util.LinkedList<CKMSQuantiles.Item> sample
      Current list of sampled items, maintained in sorted order with error bounds.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      private double allowableError​(int rank)
      Specifies the allowable error for this rank, depending on which quantiles are being targeted.
      private void compress()
      Try to remove extraneous items from the set of sampled items.
      double get​(double q)
      Get the estimated value at the specified quantile.
      void insert​(double value)
      Add a new value from the stream.
      private boolean insertBatch()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • count

        private int count
        Total number of items in stream.
      • compressIdx

        private int compressIdx
        Used for tracking incremental compression.
      • sample

        protected java.util.LinkedList<CKMSQuantiles.Item> sample
        Current list of sampled items, maintained in sorted order with error bounds.
      • buffer

        private double[] buffer
        Buffers incoming items to be inserted in batch.
      • bufferCount

        private int bufferCount
      • quantiles

        private final CKMSQuantiles.Quantile[] quantiles
        Array of Quantiles that we care about, along with desired error.
    • Method Detail

      • insert

        public void insert​(double value)
        Add a new value from the stream.
        Parameters:
        value -
      • get

        public double get​(double q)
        Get the estimated value at the specified quantile.
        Parameters:
        q - Queried quantile, e.g. 0.50 or 0.99.
        Returns:
        Estimated value at that quantile.
      • allowableError

        private double allowableError​(int rank)
        Specifies the allowable error for this rank, depending on which quantiles are being targeted. This is the f(r_i, n) function from the CKMS paper. It's basically how wide the range of this rank can be.
        Parameters:
        rank - the index in the list of samples
      • insertBatch

        private boolean insertBatch()
      • compress

        private void compress()
        Try to remove extraneous items from the set of sampled items. This checks if an item is unnecessary based on the desired error bounds, and merges it with the adjacent item if it is.