Class QuantileCalc

java.lang.Object
cern.jet.stat.quantile.QuantileCalc

class QuantileCalc extends Object
Computes b and k vor various parameters.
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static double
    binomial(long n, long k)
    Efficiently computes the binomial coefficient, often also referred to as "n over k" or "n choose k".
    static long
    ceiling(double value)
    Returns the smallest long >= value.
    static long[]
    known_N_compute_B_and_K(long N, double epsilon, double delta, int quantiles, double[] returnSamplingRate)
    Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.
    protected static long[]
    known_N_compute_B_and_K_quick(long N, double epsilon)
    Computes the number of buffers and number of values per buffer such that quantiles can be determined with a guaranteed approximation error no more than epsilon.
    protected static long[]
    known_N_compute_B_and_K_slow(long N, double epsilon, double delta, int quantiles, double[] returnSamplingRate)
    Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.
    static void
    main(String[] args)
     
    static void
    Computes b and k for different parameters.
    static long[]
    unknown_N_compute_B_and_K(double epsilon, double delta, int quantiles)
    Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • QuantileCalc

      QuantileCalc()
  • Method Details

    • binomial

      public static double binomial(long n, long k)
      Efficiently computes the binomial coefficient, often also referred to as "n over k" or "n choose k". The binomial coefficient is defined as n!/((n-k)!*k!). Tries to avoid numeric overflows.
      Returns:
      the binomial coefficient.
    • ceiling

      public static long ceiling(double value)
      Returns the smallest long >= value.
      Examples: 1.0 -> 1, 1.2 -> 2, 1.9 -> 2. This method is safer than using (long) Math.ceil(value), because of possible rounding error.
    • known_N_compute_B_and_K

      public static long[] known_N_compute_B_and_K(long N, double epsilon, double delta, int quantiles, double[] returnSamplingRate)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability. Assumes that quantiles are to be computed over N values. The required sampling rate is computed and stored in the first element of the provided returnSamplingRate array, which, therefore must be at least of length 1.
      Parameters:
      N - the number of values over which quantiles shall be computed (e.g 10^6).
      epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact result, set epsilon=0.0;
      delta - the probability that the approximation error is more than than epsilon (e.g. 0.0001) (0 <= delta <= 1). To avoid probabilistic answers, set delta=0.0.
      quantiles - the number of quantiles to be computed (e.g. 100) (quantiles >= 1). If unknown in advance, set this number large, e.g. quantiles >= 10000.
      samplingRate - a double[1] where the sampling rate is to be filled in.
      Returns:
      long[2] - long[0]=the number of buffers, long[1]=the number of elements per buffer, returnSamplingRate[0]=the required sampling rate.
    • known_N_compute_B_and_K_quick

      protected static long[] known_N_compute_B_and_K_quick(long N, double epsilon)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with a guaranteed approximation error no more than epsilon. Assumes that quantiles are to be computed over N values.
      Parameters:
      N - the anticipated number of values over which quantiles shall be determined.
      epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact result, set epsilon=0.0;
      Returns:
      long[2] - long[0]=the number of buffers, long[1]=the number of elements per buffer.
    • known_N_compute_B_and_K_slow

      protected static long[] known_N_compute_B_and_K_slow(long N, double epsilon, double delta, int quantiles, double[] returnSamplingRate)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability. Assumes that quantiles are to be computed over N values. The required sampling rate is computed and stored in the first element of the provided returnSamplingRate array, which, therefore must be at least of length 1.
      Parameters:
      N - the anticipated number of values over which quantiles shall be computed (e.g 10^6).
      epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact result, set epsilon=0.0;
      delta - the probability that the approximation error is more than than epsilon (e.g. 0.0001) (0 <= delta <= 1). To avoid probabilistic answers, set delta=0.0.
      quantiles - the number of quantiles to be computed (e.g. 100) (quantiles >= 1). If unknown in advance, set this number large, e.g. quantiles >= 10000.
      samplingRate - a double[1] where the sampling rate is to be filled in.
      Returns:
      long[2] - long[0]=the number of buffers, long[1]=the number of elements per buffer, returnSamplingRate[0]=the required sampling rate.
    • main

      public static void main(String[] args)
    • test_B_and_K_Calculation

      public static void test_B_and_K_Calculation(String[] args)
      Computes b and k for different parameters.
    • unknown_N_compute_B_and_K

      public static long[] unknown_N_compute_B_and_K(double epsilon, double delta, int quantiles)
      Computes the number of buffers and number of values per buffer such that quantiles can be determined with an approximation error no more than epsilon with a certain probability.
      Parameters:
      epsilon - the approximation error which is guaranteed not to be exceeded (e.g. 0.001) (0 <= epsilon <= 1). To get exact results, set epsilon=0.0;
      delta - the probability that the approximation error is more than than epsilon (e.g. 0.0001) (0 <= delta <= 1). To get exact results, set delta=0.0.
      quantiles - the number of quantiles to be computed (e.g. 100) (quantiles >= 1). If unknown in advance, set this number large, e.g. quantiles >= 10000.
      Returns:
      long[3] - long[0]=the number of buffers, long[1]=the number of elements per buffer, long[2]=the tree height where sampling shall start.