Package cern.jet.stat

Class Descriptive

java.lang.Object
cern.jet.stat.Descriptive

public class Descriptive extends Object
Basic descriptive statistics.
Version:
0.91, 08-Dec-99
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    protected
    Descriptive()
    Makes this class non-instantiable, but still lets others inherit from it.
  • Method Summary

    Modifier and Type
    Method
    Description
    static double
    autoCorrelation(DoubleArrayList data, int lag, double mean, double variance)
    Returns the auto-correlation of a data sequence.
    protected static void
    checkRangeFromTo(int from, int to, int theSize)
    Checks if the given range is within the contained array's bounds.
    static double
    correlation(DoubleArrayList data1, double standardDev1, DoubleArrayList data2, double standardDev2)
    Returns the correlation of two data sequences.
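    static double
    covariance(DoubleArrayList data1, DoubleArrayList data2)
    Returns the covariance of two data sequences, which is cov(x,y) = (1/(size()-1)) * Sum((x[i]-mean(x)) * (y[i]-mean(y))).
    private static double
    covariance2(DoubleArrayList data1, DoubleArrayList data2)
     
    static double
    durbinWatson(DoubleArrayList data)
    Durbin-Watson computation.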
    static void
    frequencies(DoubleArrayList sortedData, DoubleArrayList distinctValues, IntArrayList frequencies)
    Computes the frequency (number of occurrences, count) of each distinct value in the given sorted data.
    static double
    geometricMean(int size, double sumOfLogarithms)
    Returns the geometric mean of a data sequence.
    static double
    geometricMean(DoubleArrayList data)
    Returns the geometric mean of a data sequence.
    static double
    harmonicMean(int size, double sumOfInversions)
    Returns the harmonic mean of a data sequence.
    static void
    incrementalUpdate(DoubleArrayList data, int from, int to, double[] inOut)
    Incrementally maintains and updates minimum, maximum, sum and sum of squares of a data sequence.
    static void
    incrementalUpdateSumsOfPowers(DoubleArrayList data, int from, int to, int fromSumIndex, int toSumIndex, double[] sumOfPowers)
    Incrementally maintains and updates various sums of powers of the form Sum( data[i]^k ).
    static void
    incrementalWeightedUpdate(DoubleArrayList data, DoubleArrayList weights, int from, int to, double[] inOut)
    Incrementally maintains and updates sum and sum of squares of a weighted data sequence.
    static double
    kurtosis(double moment4, double standardDeviation)
    Returns the kurtosis (aka excess) of a data sequence.
    static double
    kurtosis(DoubleArrayList data, double mean, double standardDeviation)
    Returns the kurtosis (aka excess) of a data sequence, which is -3 + moment(data,4,mean) / standardDeviation^4.
    static double
    lag1(DoubleArrayList data, double mean)
    Returns the lag-1 autocorrelation of a dataset. Note that this method has semantics different from autoCorrelation(..., 1).
    static double
    max(DoubleArrayList data)
    Returns the largest member of a data sequence.
    static double
    mean(DoubleArrayList data)
    Returns the arithmetic mean of a data sequence. That is Sum( data[i] ) / data.size().
    static double
    meanDeviation(DoubleArrayList data, double mean)
    Returns the mean deviation of a dataset.
    static double
    median(DoubleArrayList sortedData)
    Returns the median of a sorted data sequence.
    static double
    min(DoubleArrayList data)
    Returns the smallest member of a data sequence.
    static double
    moment(int k, double c, int size, double[] sumOfPowers)
    Returns the moment of k-th order with constant c of a data sequence, which is Sum( (data[i]-c)^k ) / data.size().
    static double
    moment(DoubleArrayList data, int k, double c)
    Returns the moment of k-th order with constant c of a data sequence, which is Sum( (data[i]-c)^k ) / data.size().
    static double
    pooledMean(int size1, double mean1, int size2, double mean2)
    Returns the pooled mean of two data sequences.
    static double
    pooledVariance(int size1, double variance1, int size2, double variance2)
    Returns the pooled variance of two data sequences.
    static double
    product(int size, double sumOfLogarithms)
    Returns the product, which is Prod( data[i] ).
    static double
    product(DoubleArrayList data)
    Returns the product of a data sequence, which is Prod( data[i] ).
    static double
    quantile(DoubleArrayList sortedData, double phi)
    Returns the phi-quantile; that is, an element elem for which holds that phi percent of data elements are less than elem.
    static double
    quantileInverse(DoubleArrayList sortedList, double element)
    Returns how many percent of the elements contained in the receiver are <= element.
    static DoubleArrayList
    quantiles(DoubleArrayList sortedData, DoubleArrayList percentages)
    Returns the quantiles of the specified percentages.
    static double
    rankInterpolated(DoubleArrayList sortedList, double element)
    Returns the linearly interpolated number of elements in a list less than or equal to a given element.
    static double
    rms(int size, double sumOfSquares)
    Returns the RMS (Root-Mean-Square) of a data sequence.
    static double
    sampleKurtosis(int size, double moment4, double sampleVariance)
    Returns the sample kurtosis (aka excess) of a data sequence.
    static double
    sampleKurtosis(DoubleArrayList data, double mean, double sampleVariance)
    Returns the sample kurtosis (aka excess) of a data sequence.
    static double
    sampleKurtosisStandardError(int size)
    Returns the standard error of the sample kurtosis.
    static double
    sampleSkew(int size, double moment3, double sampleVariance)
    Returns the sample skew of a data sequence.
    static double
    sampleSkew(DoubleArrayList data, double mean, double sampleVariance)
    Returns the sample skew of a data sequence.
    static double
    sampleSkewStandardError(int size)
    Returns the standard error of the sample skew.
    static double
    sampleStandardDeviation(int size, double sampleVariance)
    Returns the sample standard deviation.
    static double
    sampleVariance(int size, double sum, double sumOfSquares)
    Returns the sample variance of a data sequence.
    static double
    sampleVariance(DoubleArrayList data, double mean)
    Returns the sample variance of a data sequence.
    static double
    sampleWeightedVariance(double sumOfWeights, double sumOfProducts, double sumOfSquaredProducts)
    Returns the sample weighted variance of a data sequence.
    static double
    skew(double moment3, double standardDeviation)
    Returns the skew of a data sequence.
    static double
    skew(DoubleArrayList data, double mean, double standardDeviation)
    Returns the skew of a data sequence, which is moment(data,3,mean) / standardDeviation^3.
    static DoubleArrayList[]
    split(DoubleArrayList sortedList, DoubleArrayList splitters)
    Splits (partitions) a list into sublists such that each sublist contains the elements within a given range.
    static double
    standardDeviation(double variance)
    Returns the standard deviation from a variance.
    static double
    standardError(int size, double variance)
    Returns the standard error of a data sequence.
    static void
    standardize(DoubleArrayList data, double mean, double standardDeviation)
    Modifies a data sequence to be standardized.
    static double
    sum(DoubleArrayList data)
    Returns the sum of a data sequence.
    static double
    sumOfInversions(DoubleArrayList data, int from, int to)
    Returns the sum of inversions of a data sequence, which is Sum( 1.0 / data[i]).
    static double
    sumOfLogarithms(DoubleArrayList data, int from, int to)
    Returns the sum of logarithms of a data sequence, which is Sum( Log(data[i]) ).
    static double
    sumOfPowerDeviations(DoubleArrayList data, int k, double c)
    Returns Sum( (data[i]-c)^k ); optimized for common parameters like c == 0.0 and/or k == -2 .. 4.
    static double
    sumOfPowerDeviations(DoubleArrayList data, int k, double c, int from, int to)
    Returns Sum( (data[i]-c)^k ) for all i = from .. to; optimized for common parameters like c == 0.0 and/or k == -2 .. 5.
    static double
    sumOfPowers(DoubleArrayList data, int k)
    Returns the sum of powers of a data sequence, which is Sum( data[i]^k ).
    static double
    sumOfSquaredDeviations(int size, double variance)
    Returns the sum of squared mean deviations of a data sequence.
    static double
    sumOfSquares(DoubleArrayList data)
    Returns the sum of squares of a data sequence.
    static double
    trimmedMean(DoubleArrayList sortedData, double mean, int left, int right)
    Returns the trimmed mean of a sorted data sequence.
    static double
    variance(double standardDeviation)
    Returns the variance from a standard deviation.
    static double
    variance(int size, double sum, double sumOfSquares)
    Returns the variance of a data sequence.
    static double
    weightedMean(DoubleArrayList data, DoubleArrayList weights)
    Returns the weighted mean of a data sequence.
    static double
    weightedRMS(double sumOfProducts, double sumOfSquaredProducts)
    Returns the weighted RMS (Root-Mean-Square) of a data sequence.
    static double
    winsorizedMean(DoubleArrayList sortedData, double mean, int left, int right)
    Returns the winsorized mean of a sorted data sequence.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Descriptive

      protected Descriptive()
      Makes this class non-instantiable, but still lets others inherit from it.
  • Method Details

    • autoCorrelation

      public static double autoCorrelation(DoubleArrayList data, int lag, double mean, double variance)
      Returns the auto-correlation of a data sequence.
    • checkRangeFromTo

      protected static void checkRangeFromTo(int from, int to, int theSize)
      Checks if the given range is within the contained array's bounds.
      Throws:
      IndexOutOfBoundsException - if to != from-1 && (from < 0 || from > to || to >= theSize).
    • correlation

      public static double correlation(DoubleArrayList data1, double standardDev1, DoubleArrayList data2, double standardDev2)
      Returns the correlation of two data sequences. That is covariance(data1,data2)/(standardDev1*standardDev2).
    • covariance

      public static double covariance(DoubleArrayList data1, DoubleArrayList data2)
      Returns the covariance of two data sequences, which is cov(x,y) = (1/(size()-1)) * Sum((x[i]-mean(x)) * (y[i]-mean(y))). See the math definition.
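      As a rough illustration of the covariance formula above and the correlation ratio built on it, the following sketch uses plain double[] arrays in place of DoubleArrayList so that it runs without the library. The class name CovarianceSketch is hypothetical, and the code is not the library's implementation.

```java
final class CovarianceSketch {
    /** Sample covariance: (1/(n-1)) * Sum((x[i]-mean(x)) * (y[i]-mean(y))). */
    static double covariance(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two sequences of equal size >= 2");
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum / (x.length - 1);
    }

    /** Correlation as documented: covariance(data1,data2) / (standardDev1*standardDev2). */
    static double correlation(double[] x, double sdX, double[] y, double sdY) {
        return covariance(x, y) / (sdX * sdY);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 8};
        System.out.println("covariance = " + covariance(x, y));
    }
}
```

      Because the covariance divides by size()-1, the standard deviations passed to correlation should be sample standard deviations if the result is to stay within [-1, 1].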
    • covariance2

      private static double covariance2(DoubleArrayList data1, DoubleArrayList data2)
    • durbinWatson

      public static double durbinWatson(DoubleArrayList data)
      Durbin-Watson computation.
    • frequencies

      public static void frequencies(DoubleArrayList sortedData, DoubleArrayList distinctValues, IntArrayList frequencies)
      Computes the frequency (number of occurrences, count) of each distinct value in the given sorted data. After this call returns, both distinctValues and frequencies have a new size (which is equal for both), which is the number of distinct values in the sorted data.

      Distinct values are filled into distinctValues, starting at index 0. The frequency of each distinct value is filled into frequencies, starting at index 0. As a result, the smallest distinct value (and its frequency) can be found at index 0, the second smallest distinct value (and its frequency) at index 1, ..., the largest distinct value (and its frequency) at index distinctValues.size()-1. Example:
      elements = (5,6,6,7,8,8) --> distinctValues = (5,6,7,8), frequencies = (1,2,1,2)

      Parameters:
      sortedData - the data; must be sorted ascending.
      distinctValues - a list to be filled with the distinct values; can have any size.
      frequencies - a list to be filled with the frequencies; can have any size; set this parameter to null to ignore it.
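      The run-length counting described above can be sketched as follows. This is an illustration only: it uses double[] and java.util lists in place of DoubleArrayList and IntArrayList, and the class name FrequenciesSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class FrequenciesSketch {
    /**
     * Fills distinctValues (and frequencies, unless it is null) from data
     * that is already sorted ascending. Both lists are cleared first.
     */
    static void frequencies(double[] sortedData, List<Double> distinctValues, List<Integer> frequencies) {
        distinctValues.clear();
        if (frequencies != null) {
            frequencies.clear();
        }
        int i = 0;
        while (i < sortedData.length) {
            double value = sortedData[i];
            int runStart = i;
            // Advance at least once, then past every further element equal to value.
            do {
                i++;
            } while (i < sortedData.length && sortedData[i] == value);
            distinctValues.add(value);
            if (frequencies != null) {
                frequencies.add(i - runStart);
            }
        }
    }

    public static void main(String[] args) {
        List<Double> distinct = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        frequencies(new double[] {5, 6, 6, 7, 8, 8}, distinct, counts);
        System.out.println(distinct + " " + counts);
    }
}
```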
    • geometricMean

      public static double geometricMean(int size, double sumOfLogarithms)
      Returns the geometric mean of a data sequence. Note that for a geometric mean to be meaningful, the minimum of the data sequence must be greater than zero.
      The geometric mean is given by pow( Product( data[i] ), 1/size) which is equivalent to Math.exp( Sum( Log(data[i]) ) / size).
    • geometricMean

      public static double geometricMean(DoubleArrayList data)
      Returns the geometric mean of a data sequence. Note that for a geometric mean to be meaningful, the minimum of the data sequence must be greater than zero.
      The geometric mean is given by pow( Product( data[i] ), 1/data.size()). This method tries to avoid overflows at the expense of an equivalent but somewhat slower definition: geo = Math.exp( Sum( Log(data[i]) ) / data.size()).
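      To make the overflow argument concrete, here is a small sketch of the logarithm-based definition, assuming a plain double[] instead of DoubleArrayList. The class name GeometricMeanSketch is hypothetical.

```java
final class GeometricMeanSketch {
    /** Sum( Log(data[i]) ) over the whole array. */
    static double sumOfLogarithms(double[] data) {
        double sum = 0.0;
        for (double x : data) {
            sum += Math.log(x);
        }
        return sum;
    }

    /** Math.exp( sumOfLogarithms / size ), the two-argument form. */
    static double geometricMean(int size, double sumOfLogarithms) {
        return Math.exp(sumOfLogarithms / size);
    }

    /** Convenience form working directly on the data. */
    static double geometricMean(double[] data) {
        return geometricMean(data.length, sumOfLogarithms(data));
    }

    public static void main(String[] args) {
        double[] big = new double[400];
        java.util.Arrays.fill(big, 1e300);
        // The plain product of these values overflows a double,
        // while the sum of logarithms stays small.
        System.out.println("geometric mean = " + geometricMean(big));
    }
}
```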
    • harmonicMean

      public static double harmonicMean(int size, double sumOfInversions)
      Returns the harmonic mean of a data sequence.
      Parameters:
      size - the number of elements in the data sequence.
      sumOfInversions - Sum( 1.0 / data[i]).
    • incrementalUpdate

      public static void incrementalUpdate(DoubleArrayList data, int from, int to, double[] inOut)
      Incrementally maintains and updates minimum, maximum, sum and sum of squares of a data sequence. Assume we have already recorded some data sequence elements and know their minimum, maximum, sum and sum of squares. Assume further, we are to record some more elements and to derive updated values of minimum, maximum, sum and sum of squares.

      This method computes those updated values without needing to know the already recorded elements. This is interesting for interactive online monitoring and/or applications that cannot keep the entire huge data sequence in memory.


      Definition of sumOfSquares: sumOfSquares(n) = Sum ( data[i] * data[i] ).

      Parameters:
      data - the additional elements to be incorporated into min, max, etc.
      from - the index of the first element within data to consider.
      to - the index of the last element within data to consider. The method incorporates elements data[from], ..., data[to].
      inOut - the old values in the following format:
      • inOut[0] is the old minimum.
      • inOut[1] is the old maximum.
      • inOut[2] is the old sum.
      • inOut[3] is the old sum of squares.
      If no data sequence elements have so far been recorded, set the values as follows:
      • inOut[0] = Double.POSITIVE_INFINITY as the old minimum.
      • inOut[1] = Double.NEGATIVE_INFINITY as the old maximum.
      • inOut[2] = 0.0 as the old sum.
      • inOut[3] = 0.0 as the old sum of squares.
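      A minimal sketch of this update scheme, assuming a plain double[] in place of DoubleArrayList and following the inOut layout listed above. The class name IncrementalUpdateSketch and the helper emptyState are hypothetical.

```java
final class IncrementalUpdateSketch {
    /** Initial inOut for "nothing recorded yet": {min, max, sum, sumOfSquares}. */
    static double[] emptyState() {
        return new double[] {Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, 0.0, 0.0};
    }

    /** Folds data[from..to] (both inclusive) into inOut. */
    static void incrementalUpdate(double[] data, int from, int to, double[] inOut) {
        double min = inOut[0];
        double max = inOut[1];
        double sum = inOut[2];
        double sumOfSquares = inOut[3];
        for (int i = from; i <= to; i++) {
            double x = data[i];
            sum += x;
            sumOfSquares += x * x;
            if (x < min) min = x;
            if (x > max) max = x;
        }
        inOut[0] = min;
        inOut[1] = max;
        inOut[2] = sum;
        inOut[3] = sumOfSquares;
    }

    public static void main(String[] args) {
        double[] state = emptyState();
        double[] firstChunk = {3, 1, 4};
        double[] secondChunk = {1, 5, 9};
        incrementalUpdate(firstChunk, 0, firstChunk.length - 1, state);
        incrementalUpdate(secondChunk, 0, secondChunk.length - 1, state);
        System.out.println(java.util.Arrays.toString(state));
    }
}
```

      Only the four running values survive between calls, so each chunk can be discarded as soon as it has been folded in.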
    • incrementalUpdateSumsOfPowers

      public static void incrementalUpdateSumsOfPowers(DoubleArrayList data, int from, int to, int fromSumIndex, int toSumIndex, double[] sumOfPowers)
      Incrementally maintains and updates various sums of powers of the form Sum( data[i]^k ). Assume we have already recorded some data sequence elements data[i] and know the values of Sum( data[i]^fromSumIndex ), Sum( data[i]^(fromSumIndex+1) ), ..., Sum( data[i]^toSumIndex ). Assume further, we are to record some more elements and to derive updated values of these sums.

      This method computes those updated values without needing to know the already recorded elements. This is interesting for interactive online monitoring and/or applications that cannot keep the entire huge data sequence in memory. For example, the incremental computation of moments is based upon such sums of powers:

      The moment of k-th order with constant c of a data sequence is given by Sum( (data[i]-c)^k ) / data.size(). It can be computed incrementally by using the equivalent formula

      moment(k,c) = m(k,c) / data.size() where
      m(k,c) = Sum( (-1)^i * b(k,i) * c^i * sumOfPowers(k-i) ) for i = 0 .. k and
      b(k,i) = binomial(k,i) and
      sumOfPowers(k) = Sum( data[i]^k ).

      Parameters:
      data - the additional elements to be incorporated into the sums of powers.
      from - the index of the first element within data to consider.
      to - the index of the last element within data to consider. The method incorporates elements data[from], ..., data[to].
      sumOfPowers - the old values of the sums in the following format:
      • sumOfPowers[0] is the old Sum( data[i]^fromSumIndex ).
      • sumOfPowers[1] is the old Sum( data[i]^(fromSumIndex+1) ).
      • ...
      • sumOfPowers[toSumIndex-fromSumIndex] is the old Sum( data[i]^toSumIndex ).
      If no data sequence elements have so far been recorded, set all old values of the sums to 0.0.
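      The binomial expansion behind the incremental moment can be sketched as below. The code assumes plain double[] arrays instead of DoubleArrayList, uses Math.pow rather than any optimized power routine, and the class name MomentFromSumsSketch and the helper binomial are hypothetical.

```java
final class MomentFromSumsSketch {
    /** Adds data[from..to] into sumOfPowers, where sumOfPowers[j] holds Sum( data[i]^(fromSumIndex+j) ). */
    static void incrementalUpdateSumsOfPowers(double[] data, int from, int to,
                                              int fromSumIndex, int toSumIndex, double[] sumOfPowers) {
        for (int i = from; i <= to; i++) {
            for (int p = fromSumIndex; p <= toSumIndex; p++) {
                sumOfPowers[p - fromSumIndex] += Math.pow(data[i], p);
            }
        }
    }

    /** Binomial coefficient b(k,i), built up multiplicatively. */
    static double binomial(int k, int i) {
        double b = 1.0;
        for (int j = 1; j <= i; j++) {
            b = b * (k - i + j) / j;
        }
        return b;
    }

    /** moment(k,c) = Sum over i of (-1)^i * b(k,i) * c^i * sumOfPowers[k-i], divided by size. */
    static double moment(int k, double c, int size, double[] sumOfPowers) {
        double m = 0.0;
        double sign = 1.0; // (-1)^i
        double cPow = 1.0; // c^i
        for (int i = 0; i <= k; i++) {
            m += sign * binomial(k, i) * cPow * sumOfPowers[k - i];
            sign = -sign;
            cPow *= c;
        }
        return m / size;
    }

    public static void main(String[] args) {
        double[] sums = new double[3]; // powers 0, 1, 2, all starting at 0.0
        incrementalUpdateSumsOfPowers(new double[] {1, 2}, 0, 1, 0, 2, sums);
        incrementalUpdateSumsOfPowers(new double[] {3, 4}, 0, 1, 0, 2, sums);
        System.out.println("second central moment = " + moment(2, 2.5, 4, sums));
    }
}
```

      Note that moment expects sumOfPowers to start at power 0, so the sums are accumulated here with fromSumIndex = 0 and toSumIndex = k.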
    • incrementalWeightedUpdate

      public static void incrementalWeightedUpdate(DoubleArrayList data, DoubleArrayList weights, int from, int to, double[] inOut)
      Incrementally maintains and updates sum and sum of squares of a weighted data sequence. Assume we have already recorded some data sequence elements and know their sum and sum of squares. Assume further, we are to record some more elements and to derive updated values of sum and sum of squares.

      This method computes those updated values without needing to know the already recorded elements. This is interesting for interactive online monitoring and/or applications that cannot keep the entire huge data sequence in memory.


      Definition of sum: sum = Sum ( data[i] * weights[i] ).
      Definition of sumOfSquares: sumOfSquares = Sum ( data[i] * data[i] * weights[i]).

      Parameters:
      data - the additional elements to be incorporated into sum and sum of squares.
      weights - the weight of each element within data.
      from - the index of the first element within data (and weights) to consider.
      to - the index of the last element within data (and weights) to consider. The method incorporates elements data[from], ..., data[to].
      inOut - the old values in the following format:
      • inOut[0] is the old sum.
      • inOut[1] is the old sum of squares.
      If no data sequence elements have so far been recorded, set the values as follows:
      • inOut[0] = 0.0 as the old sum.
      • inOut[1] = 0.0 as the old sum of squares.
    • kurtosis

      public static double kurtosis(double moment4, double standardDeviation)
      Returns the kurtosis (aka excess) of a data sequence.
      Parameters:
      moment4 - the fourth central moment, which is moment(data,4,mean).
      standardDeviation - the standard deviation.
    • kurtosis

      public static double kurtosis(DoubleArrayList data, double mean, double standardDeviation)
      Returns the kurtosis (aka excess) of a data sequence, which is -3 + moment(data,4,mean) / standardDeviation^4.
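      A small sketch of the kurtosis formula above, assuming a plain double[] instead of DoubleArrayList and a straightforward Math.pow-based moment helper. The class name KurtosisSketch is hypothetical.

```java
final class KurtosisSketch {
    /** Sum( (data[i]-c)^k ) / data.length. */
    static double moment(double[] data, int k, double c) {
        double sum = 0.0;
        for (double x : data) {
            sum += Math.pow(x - c, k);
        }
        return sum / data.length;
    }

    /** -3 + moment4 / standardDeviation^4. */
    static double kurtosis(double moment4, double standardDeviation) {
        double variance = standardDeviation * standardDeviation;
        return -3 + moment4 / (variance * variance);
    }

    /** Convenience form: derives the fourth central moment from the data. */
    static double kurtosis(double[] data, double mean, double standardDeviation) {
        return kurtosis(moment(data, 4, mean), standardDeviation);
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5};
        double mean = 3.0;
        double sd = Math.sqrt(moment(data, 2, mean)); // population standard deviation
        System.out.println("kurtosis = " + kurtosis(data, mean, sd));
    }
}
```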
    • lag1

      public static double lag1(DoubleArrayList data, double mean)
      Returns the lag-1 autocorrelation of a dataset. Note that this method has semantics different from autoCorrelation(..., 1).
    • max

      public static double max(DoubleArrayList data)
      Returns the largest member of a data sequence.
    • mean

      public static double mean(DoubleArrayList data)
      Returns the arithmetic mean of a data sequence. That is Sum( data[i] ) / data.size().
    • meanDeviation

      public static double meanDeviation(DoubleArrayList data, double mean)
      Returns the mean deviation of a dataset. That is Sum( Math.abs(data[i]-mean) ) / data.size().
    • median

      public static double median(DoubleArrayList sortedData)
      Returns the median of a sorted data sequence.
      Parameters:
      sortedData - the data sequence; must be sorted ascending.
    • min

      public static double min(DoubleArrayList data)
      Returns the smallest member of a data sequence.
    • moment

      public static double moment(int k, double c, int size, double[] sumOfPowers)
      Returns the moment of k-th order with constant c of a data sequence, which is Sum( (data[i]-c)^k ) / data.size().
      Parameters:
      size - the number of elements of the data sequence.
      sumOfPowers - sumOfPowers[m] == Sum( data[i]^m ) for m = 0,1,..,k as returned by method incrementalUpdateSumsOfPowers(DoubleArrayList,int,int,int,int,double[]). In particular there must hold sumOfPowers.length == k+1.
    • moment

      public static double moment(DoubleArrayList data, int k, double c)
      Returns the moment of k-th order with constant c of a data sequence, which is Sum( (data[i]-c)^k ) / data.size().
    • pooledMean

      public static double pooledMean(int size1, double mean1, int size2, double mean2)
      Returns the pooled mean of two data sequences. That is (size1 * mean1 + size2 * mean2) / (size1 + size2).
      Parameters:
      size1 - the number of elements in data sequence 1.
      mean1 - the mean of data sequence 1.
      size2 - the number of elements in data sequence 2.
      mean2 - the mean of data sequence 2.
    • pooledVariance

      public static double pooledVariance(int size1, double variance1, int size2, double variance2)
      Returns the pooled variance of two data sequences. That is (size1 * variance1 + size2 * variance2) / (size1 + size2).
      Parameters:
      size1 - the number of elements in data sequence 1.
      variance1 - the variance of data sequence 1.
      size2 - the number of elements in data sequence 2.
      variance2 - the variance of data sequence 2.
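      The two pooling formulas above are size-weighted averages. A direct transcription, with the hypothetical class name PoolingSketch, might look like this:

```java
final class PoolingSketch {
    /** (size1 * mean1 + size2 * mean2) / (size1 + size2). */
    static double pooledMean(int size1, double mean1, int size2, double mean2) {
        return (size1 * mean1 + size2 * mean2) / (size1 + size2);
    }

    /** (size1 * variance1 + size2 * variance2) / (size1 + size2). */
    static double pooledVariance(int size1, double variance1, int size2, double variance2) {
        return (size1 * variance1 + size2 * variance2) / (size1 + size2);
    }

    public static void main(String[] args) {
        System.out.println("pooled mean = " + pooledMean(2, 1.0, 6, 5.0));
        System.out.println("pooled variance = " + pooledVariance(2, 4.0, 6, 8.0));
    }
}
```

      The pooled mean equals the mean of the concatenated sequences. The pooled variance, as defined here, averages the two variances and does not account for any difference between the two means.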
    • product

      public static double product(int size, double sumOfLogarithms)
      Returns the product, which is Prod( data[i] ). In other words: data[0]*data[1]*...*data[data.size()-1]. This method uses the equivalent definition: prod = pow( exp( Sum( Log(x[i]) ) / size ), size ).
    • product

      public static double product(DoubleArrayList data)
      Returns the product of a data sequence, which is Prod( data[i] ). In other words: data[0]*data[1]*...*data[data.size()-1]. Note that you may easily get numeric overflows.
    • quantile

      public static double quantile(DoubleArrayList sortedData, double phi)
      Returns the phi-quantile; that is, an element elem for which holds that phi percent of data elements are less than elem. The quantile need not necessarily be contained in the data sequence; it can be a linear interpolation.
      Parameters:
      sortedData - the data sequence; must be sorted ascending.
      phi - the percentage; must satisfy 0 <= phi <= 1.
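      The text above does not spell out which interpolation convention is used. The sketch below assumes one common choice, interpolating linearly at the fractional index phi * (n - 1), on a plain double[] in place of DoubleArrayList. The class name QuantileSketch is hypothetical, and results may differ from the library at the margins.

```java
final class QuantileSketch {
    /** Linear interpolation at fractional index phi * (n - 1); data must be sorted ascending. */
    static double quantile(double[] sortedData, double phi) {
        int n = sortedData.length;
        if (n == 0) {
            throw new IllegalArgumentException("empty data");
        }
        if (phi < 0.0 || phi > 1.0) {
            throw new IllegalArgumentException("phi must satisfy 0 <= phi <= 1");
        }
        double position = phi * (n - 1);
        int lower = (int) position; // index of the element at or below the position
        if (lower == n - 1) {
            return sortedData[lower]; // phi == 1 (or a single element): nothing to interpolate
        }
        double fraction = position - lower;
        return sortedData[lower] + fraction * (sortedData[lower + 1] - sortedData[lower]);
    }

    public static void main(String[] args) {
        double[] sorted = {1, 2, 3, 4, 5};
        System.out.println("median via quantile = " + quantile(sorted, 0.5));
    }
}
```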
    • quantileInverse

      public static double quantileInverse(DoubleArrayList sortedList, double element)
      Returns how many percent of the elements contained in the receiver are <= element. Does linear interpolation if the element is not contained but lies in between two contained elements.
      Parameters:
      sortedList - the list to be searched (must be sorted ascending).
      element - the element to search for.
      Returns:
      the percentage phi of elements <= element (0.0 <= phi <= 1.0).
    • quantiles

      public static DoubleArrayList quantiles(DoubleArrayList sortedData, DoubleArrayList percentages)
      Returns the quantiles of the specified percentages. The quantiles need not necessarily be contained in the data sequence; they can be linear interpolations.
      Parameters:
      sortedData - the data sequence; must be sorted ascending.
      percentages - the percentages for which quantiles are to be computed. Each percentage must be in the interval [0.0,1.0].
      Returns:
      the quantiles.
    • rankInterpolated

      public static double rankInterpolated(DoubleArrayList sortedList, double element)
      Returns the linearly interpolated number of elements in a list less than or equal to a given element. The rank is the number of elements <= element. Ranks are of the form {0, 1, 2,..., sortedList.size()}. If no element is <= element, then the rank is zero. If the element lies in between two contained elements, then linear interpolation is used and a non-integer value is returned.
      Parameters:
      sortedList - the list to be searched (must be sorted ascending).
      element - the element to search for.
      Returns:
      the rank of the element.
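      One way to realize the interpolated rank described above is a binary search followed by interpolation between the two neighbours. The sketch below assumes a plain double[] in place of DoubleArrayList and uses java.util.Arrays.binarySearch. The class name RankSketch is hypothetical, and defining quantileInverse as rank divided by size is an assumption made for illustration.

```java
import java.util.Arrays;

final class RankSketch {
    /** Interpolated count of elements <= element; sorted must be ascending. */
    static double rankInterpolated(double[] sorted, double element) {
        int index = Arrays.binarySearch(sorted, element);
        if (index >= 0) {
            // Found: step right past any further equal elements so that all duplicates count.
            int to = index + 1;
            while (to < sorted.length && sorted[to] == element) {
                to++;
            }
            return to;
        }
        int insertionPoint = -index - 1; // number of elements smaller than element
        if (insertionPoint == 0 || insertionPoint == sorted.length) {
            return insertionPoint; // outside the data range: no interpolation possible
        }
        double lower = sorted[insertionPoint - 1];
        double upper = sorted[insertionPoint];
        return insertionPoint + (element - lower) / (upper - lower);
    }

    /** Assumed definition: fraction of elements <= element, in [0.0, 1.0]. */
    static double quantileInverse(double[] sorted, double element) {
        return rankInterpolated(sorted, element) / sorted.length;
    }

    public static void main(String[] args) {
        double[] sorted = {1, 2, 2, 3, 5};
        System.out.println("rank of 4 = " + rankInterpolated(sorted, 4));
    }
}
```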
    • rms

      public static double rms(int size, double sumOfSquares)
      Returns the RMS (Root-Mean-Square) of a data sequence. That is Math.sqrt(Sum( data[i]*data[i] ) / data.size()). The RMS of a data sequence is the square root of the mean of the squares of the elements in the data sequence. It is a measure of the average "size" of the elements of a data sequence.
      Parameters:
      size - the number of elements in the data sequence.
      sumOfSquares - sumOfSquares(data) == Sum( data[i]*data[i] ) of the data sequence.
    • sampleKurtosis

      public static double sampleKurtosis(int size, double moment4, double sampleVariance)
      Returns the sample kurtosis (aka excess) of a data sequence. Ref: R.R. Sokal, F.J. Rohlf, Biometry: the principles and practice of statistics in biological research (W.H. Freeman and Company, New York, 1998, 3rd edition) p. 114-115.
      Parameters:
      size - the number of elements of the data sequence.
      moment4 - the fourth central moment, which is moment(data,4,mean).
      sampleVariance - the sample variance.
    • sampleKurtosis

      public static double sampleKurtosis(DoubleArrayList data, double mean, double sampleVariance)
      Returns the sample kurtosis (aka excess) of a data sequence.
    • sampleKurtosisStandardError

      public static double sampleKurtosisStandardError(int size)
      Returns the standard error of the sample kurtosis. Ref: R.R. Sokal, F.J. Rohlf, Biometry: the principles and practice of statistics in biological research (W.H. Freeman and Company, New York, 1998, 3rd edition) p. 138.
      Parameters:
      size - the number of elements of the data sequence.
    • sampleSkew

      public static double sampleSkew(int size, double moment3, double sampleVariance)
      Returns the sample skew of a data sequence. Ref: R.R. Sokal, F.J. Rohlf, Biometry: the principles and practice of statistics in biological research (W.H. Freeman and Company, New York, 1998, 3rd edition) p. 114-115.
      Parameters:
      size - the number of elements of the data sequence.
      moment3 - the third central moment, which is moment(data,3,mean).
      sampleVariance - the sample variance.
    • sampleSkew

      public static double sampleSkew(DoubleArrayList data, double mean, double sampleVariance)
      Returns the sample skew of a data sequence.
    • sampleSkewStandardError

      public static double sampleSkewStandardError(int size)
      Returns the standard error of the sample skew. Ref: R.R. Sokal, F.J. Rohlf, Biometry: the principles and practice of statistics in biological research (W.H. Freeman and Company, New York, 1998, 3rd edition) p. 138.
      Parameters:
      size - the number of elements of the data sequence.
    • sampleStandardDeviation

      public static double sampleStandardDeviation(int size, double sampleVariance)
      Returns the sample standard deviation. Ref: R.R. Sokal, F.J. Rohlf, Biometry: the principles and practice of statistics in biological research (W.H. Freeman and Company, New York, 1998, 3rd edition) p. 53.
      Parameters:
      size - the number of elements of the data sequence.
      sampleVariance - the sample variance.
    • sampleVariance

      public static double sampleVariance(int size, double sum, double sumOfSquares)
      Returns the sample variance of a data sequence. That is (sumOfSquares - mean*sum) / (size - 1) with mean = sum/size.
      Parameters:
      size - the number of elements of the data sequence.
      sum - == Sum( data[i] ).
      sumOfSquares - == Sum( data[i]*data[i] ).
    • sampleVariance

      public static double sampleVariance(DoubleArrayList data, double mean)
      Returns the sample variance of a data sequence. That is Sum ( (data[i]-mean)^2 ) / (data.size()-1).
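      The two sampleVariance definitions (from running sums, and from the data plus a known mean) are algebraically equivalent. The sketch below shows both on a plain double[] in place of DoubleArrayList. The class name SampleVarianceSketch is hypothetical.

```java
final class SampleVarianceSketch {
    /** (sumOfSquares - mean*sum) / (size - 1) with mean = sum/size. */
    static double sampleVariance(int size, double sum, double sumOfSquares) {
        double mean = sum / size;
        return (sumOfSquares - mean * sum) / (size - 1);
    }

    /** Sum( (data[i]-mean)^2 ) / (data.length - 1). */
    static double sampleVariance(double[] data, double mean) {
        double sumOfSquaredDeviations = 0.0;
        for (double x : data) {
            double d = x - mean;
            sumOfSquaredDeviations += d * d;
        }
        return sumOfSquaredDeviations / (data.length - 1);
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        double sum = 0.0;
        double sumOfSquares = 0.0;
        for (double x : data) {
            sum += x;
            sumOfSquares += x * x;
        }
        System.out.println("from sums: " + sampleVariance(data.length, sum, sumOfSquares));
        System.out.println("from data: " + sampleVariance(data, sum / data.length));
    }
}
```

      The sums-based form needs only three running values, but it subtracts two potentially large numbers, so the data-based form is usually the safer choice numerically when the data are available.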
    • sampleWeightedVariance

      public static double sampleWeightedVariance(double sumOfWeights, double sumOfProducts, double sumOfSquaredProducts)
      Returns the sample weighted variance of a data sequence. That is (sumOfSquaredProducts - sumOfProducts * sumOfProducts / sumOfWeights) / (sumOfWeights - 1).
      Parameters:
      sumOfWeights - == Sum( weights[i] ).
      sumOfProducts - == Sum( data[i] * weights[i] ).
      sumOfSquaredProducts - == Sum( data[i] * data[i] * weights[i] ).
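      To show how the three sums feed the weighted variance formula above, here is a sketch on plain double[] arrays in place of DoubleArrayList. The class name WeightedVarianceSketch and the helper accumulate are hypothetical. Unlike incrementalWeightedUpdate, this helper also tracks the sum of weights.

```java
final class WeightedVarianceSketch {
    /** Adds data[from..to] into sums = {sumOfWeights, sumOfProducts, sumOfSquaredProducts}. */
    static void accumulate(double[] data, double[] weights, int from, int to, double[] sums) {
        for (int i = from; i <= to; i++) {
            sums[0] += weights[i];
            sums[1] += data[i] * weights[i];
            sums[2] += data[i] * data[i] * weights[i];
        }
    }

    /** (sumOfSquaredProducts - sumOfProducts^2 / sumOfWeights) / (sumOfWeights - 1). */
    static double sampleWeightedVariance(double sumOfWeights, double sumOfProducts,
                                         double sumOfSquaredProducts) {
        return (sumOfSquaredProducts - sumOfProducts * sumOfProducts / sumOfWeights)
                / (sumOfWeights - 1);
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3};
        double[] weights = {1, 1, 2};
        double[] sums = new double[3];
        accumulate(data, weights, 0, data.length - 1, sums);
        System.out.println("weighted variance = " + sampleWeightedVariance(sums[0], sums[1], sums[2]));
    }
}
```

      With integer weights the result matches the unweighted sample variance of the sequence in which each element is repeated according to its weight, here (1, 2, 3, 3).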
    • skew

      public static double skew(double moment3, double standardDeviation)
      Returns the skew of a data sequence.
      Parameters:
      moment3 - the third central moment, which is moment(data,3,mean).
      standardDeviation - the standard deviation.
    • skew

      public static double skew(DoubleArrayList data, double mean, double standardDeviation)
      Returns the skew of a data sequence, which is moment(data,3,mean) / standardDeviation^3.
    • split

      public static DoubleArrayList[] split(DoubleArrayList sortedList, DoubleArrayList splitters)
      Splits (partitions) a list into sublists such that each sublist contains the elements within a given range. splitters=(a,b,c,...,y,z) defines the ranges [-inf,a), [a,b), [b,c), ..., [y,z), [z,inf].

      Examples:

        data = (1,2,3,4,5,8,8,8,10,11).
        splitters=(2,8) yields 3 bins: (1), (2,3,4,5), (8,8,8,10,11).
        splitters=() yields 1 bin: (1,2,3,4,5,8,8,8,10,11).
        splitters=(-5) yields 2 bins: (), (1,2,3,4,5,8,8,8,10,11).
        splitters=(100) yields 2 bins: (1,2,3,4,5,8,8,8,10,11), ().
      Parameters:
      sortedList - the list to be partitioned (must be sorted ascending).
      splitters - the points at which the list shall be partitioned (must be sorted ascending).
      Returns:
      the sublists (an array with length == splitters.size() + 1). Each sublist is returned sorted ascending.
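      The half-open ranges above can be reproduced with a single pass over the sorted data. The sketch below is illustrative only: it uses double[] inputs and nested java.util lists instead of DoubleArrayList, and the class name SplitSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {
    /** Partitions sortedList into splitters.length + 1 bins; both inputs must be sorted ascending. */
    static List<List<Double>> split(double[] sortedList, double[] splitters) {
        List<List<Double>> bins = new ArrayList<>();
        int pos = 0;
        for (double splitter : splitters) {
            // Each bin takes the elements strictly below its splitter: [previous, splitter).
            List<Double> bin = new ArrayList<>();
            while (pos < sortedList.length && sortedList[pos] < splitter) {
                bin.add(sortedList[pos++]);
            }
            bins.add(bin);
        }
        // The last bin takes everything at or above the final splitter.
        List<Double> last = new ArrayList<>();
        while (pos < sortedList.length) {
            last.add(sortedList[pos++]);
        }
        bins.add(last);
        return bins;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 8, 8, 8, 10, 11};
        System.out.println(split(data, new double[] {2, 8}));
    }
}
```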
    • standardDeviation

      public static double standardDeviation(double variance)
      Returns the standard deviation from a variance.
    • standardError

      public static double standardError(int size, double variance)
      Returns the standard error of a data sequence. That is Math.sqrt(variance/size).
      Parameters:
      size - the number of elements in the data sequence.
      variance - the variance of the data sequence.
    • standardize

      public static void standardize(DoubleArrayList data, double mean, double standardDeviation)
      Modifies a data sequence to be standardized. Changes each element data[i] as follows: data[i] = (data[i]-mean)/standardDeviation.
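      A minimal sketch of the in-place standardization described above, assuming a plain double[] in place of DoubleArrayList. The class name StandardizeSketch is hypothetical.

```java
final class StandardizeSketch {
    /** Replaces each data[i] with (data[i] - mean) / standardDeviation, in place. */
    static void standardize(double[] data, double mean, double standardDeviation) {
        for (int i = 0; i < data.length; i++) {
            data[i] = (data[i] - mean) / standardDeviation;
        }
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 6};
        standardize(data, 4.0, 2.0);
        System.out.println(java.util.Arrays.toString(data));
    }
}
```

      Because the array is modified in place, callers that still need the raw values should copy them first.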
    • sum

      public static double sum(DoubleArrayList data)
      Returns the sum of a data sequence. That is Sum( data[i] ).
    • sumOfInversions

      public static double sumOfInversions(DoubleArrayList data, int from, int to)
      Returns the sum of inversions of a data sequence, which is Sum( 1.0 / data[i]).
      Parameters:
      data - the data sequence.
      from - the index of the first data element (inclusive).
      to - the index of the last data element (inclusive).
    • sumOfLogarithms

      public static double sumOfLogarithms(DoubleArrayList data, int from, int to)
      Returns the sum of logarithms of a data sequence, which is Sum( Log(data[i]) ).
      Parameters:
      data - the data sequence.
      from - the index of the first data element (inclusive).
      to - the index of the last data element (inclusive).
    • sumOfPowerDeviations

      public static double sumOfPowerDeviations(DoubleArrayList data, int k, double c)
      Returns Sum( (data[i]-c)^k ); optimized for common parameters like c == 0.0 and/or k == -2 .. 4.
    • sumOfPowerDeviations

      public static double sumOfPowerDeviations(DoubleArrayList data, int k, double c, int from, int to)
      Returns Sum( (data[i]-c)^k ) for all i = from .. to; optimized for common parameters like c == 0.0 and/or k == -2 .. 5.
    • sumOfPowers

      public static double sumOfPowers(DoubleArrayList data, int k)
      Returns the sum of powers of a data sequence, which is Sum( data[i]^k ).
    • sumOfSquaredDeviations

      public static double sumOfSquaredDeviations(int size, double variance)
      Returns the sum of squared mean deviations of a data sequence. That is variance * (size-1) == Sum( (data[i] - mean)^2 ).
      Parameters:
      size - the number of elements of the data sequence.
      variance - the variance of the data sequence.
    • sumOfSquares

      public static double sumOfSquares(DoubleArrayList data)
      Returns the sum of squares of a data sequence. That is Sum ( data[i]*data[i] ).
    • trimmedMean

      public static double trimmedMean(DoubleArrayList sortedData, double mean, int left, int right)
      Returns the trimmed mean of a sorted data sequence.
      Parameters:
      sortedData - the data sequence; must be sorted ascending.
      mean - the mean of the (full) sorted data sequence.
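      The parameters left and right are not described above. The sketch below assumes they are the number of smallest and largest elements to drop, and it reuses the supplied full-sequence mean so that only the trimmed elements need to be visited. It works on a plain double[] in place of DoubleArrayList, and the class name TrimmedMeanSketch is hypothetical.

```java
final class TrimmedMeanSketch {
    /**
     * Mean of sortedData after dropping the `left` smallest and `right` largest
     * elements (assumed meaning of the two counts). `mean` is the mean of the full data.
     */
    static double trimmedMean(double[] sortedData, double mean, int left, int right) {
        int n = sortedData.length;
        if (n == 0) {
            throw new IllegalArgumentException("empty data");
        }
        if (left < 0 || right < 0 || left + right >= n) {
            throw new IllegalArgumentException("invalid trim counts");
        }
        double total = mean * n; // sum of the full sequence
        for (int i = 0; i < left; i++) {
            total -= sortedData[i];
        }
        for (int i = 0; i < right; i++) {
            total -= sortedData[n - 1 - i];
        }
        return total / (n - left - right);
    }

    public static void main(String[] args) {
        double[] sorted = {1, 2, 3, 4, 100};
        System.out.println("trimmed mean = " + trimmedMean(sorted, 22.0, 1, 1));
    }
}
```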
    • variance

      public static double variance(double standardDeviation)
      Returns the variance from a standard deviation.
    • variance

      public static double variance(int size, double sum, double sumOfSquares)
      Returns the variance of a data sequence. That is (sumOfSquares - mean*sum) / size with mean = sum/size.
      Parameters:
      size - the number of elements of the data sequence.
      sum - == Sum( data[i] ).
      sumOfSquares - == Sum( data[i]*data[i] ).
    • weightedMean

      public static double weightedMean(DoubleArrayList data, DoubleArrayList weights)
      Returns the weighted mean of a data sequence. That is Sum (data[i] * weights[i]) / Sum ( weights[i] ).
    • weightedRMS

      public static double weightedRMS(double sumOfProducts, double sumOfSquaredProducts)
      Returns the weighted RMS (Root-Mean-Square) of a data sequence. That is Sum( data[i] * data[i] * weights[i]) / Sum( data[i] * weights[i] ), or in other words sumOfProducts / sumOfSquaredProducts.
      Parameters:
      sumOfProducts - == Sum( data[i] * weights[i] ).
      sumOfSquaredProducts - == Sum( data[i] * data[i] * weights[i] ).
    • winsorizedMean

      public static double winsorizedMean(DoubleArrayList sortedData, double mean, int left, int right)
      Returns the winsorized mean of a sorted data sequence.
      Parameters:
      sortedData - the data sequence; must be sorted ascending.
      mean - the mean of the (full) sorted data sequence.
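      As with the trimmed mean, left and right are not described above. The sketch below assumes the usual meaning of winsorizing: the left smallest elements are replaced by the next larger value and the right largest elements by the next smaller value before averaging. It works on a plain double[] in place of DoubleArrayList, and the class name WinsorizedMeanSketch is hypothetical.

```java
final class WinsorizedMeanSketch {
    /**
     * Mean of sortedData after clamping the `left` smallest elements up to
     * sortedData[left] and the `right` largest down to sortedData[n-1-right]
     * (assumed meaning of the two counts). `mean` is the mean of the full data.
     */
    static double winsorizedMean(double[] sortedData, double mean, int left, int right) {
        int n = sortedData.length;
        if (n == 0) {
            throw new IllegalArgumentException("empty data");
        }
        if (left < 0 || right < 0 || left + right >= n) {
            throw new IllegalArgumentException("invalid counts");
        }
        double lowFence = sortedData[left];
        double highFence = sortedData[n - 1 - right];
        double total = mean * n; // sum of the full sequence
        for (int i = 0; i < left; i++) {
            total += lowFence - sortedData[i]; // swap the element for the low fence
        }
        for (int i = 0; i < right; i++) {
            total += highFence - sortedData[n - 1 - i]; // swap the element for the high fence
        }
        return total / n;
    }

    public static void main(String[] args) {
        double[] sorted = {1, 2, 3, 4, 100};
        System.out.println("winsorized mean = " + winsorizedMean(sorted, 22.0, 1, 1));
    }
}
```

      Unlike trimming, winsorizing keeps the divisor at the full sequence size, since the extreme elements are replaced rather than removed.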