java.lang.Object
org.apache.commons.statistics.inference.TTest

public final class TTest extends Object
Implements Student's t-test statistics.

Tests can be:

  • One-sample or two-sample
  • One-sided or two-sided
  • Paired or unpaired (for two-sample tests)
  • Homoscedastic (equal variance assumption) or heteroscedastic (for two-sample tests)

Input to tests can be either double[] arrays or the mean, variance, and size of the sample.
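
The summary-statistic overloads take a mean, variance, and size instead of raw values. As a minimal sketch, assuming the variance is the unbiased sample variance (divisor n - 1), a sample could be summarised as follows. SampleSummary is a hypothetical helper for illustration and is not part of this class.

```java
/** Hypothetical helper (not part of the library): summarises a sample as mean, variance, size. */
final class SampleSummary {
    final double mean;
    final double variance;
    final long size;

    private SampleSummary(double mean, double variance, long size) {
        this.mean = mean;
        this.variance = variance;
        this.size = size;
    }

    /** Computes the mean and the unbiased (n - 1) variance with a simple two-pass algorithm. */
    static SampleSummary of(double[] x) {
        if (x.length < 2) {
            throw new IllegalArgumentException("At least 2 values are required");
        }
        double sum = 0;
        for (double value : x) {
            sum += value;
        }
        double mean = sum / x.length;
        double sumSq = 0;
        for (double value : x) {
            double d = value - mean;
            sumSq += d * d;
        }
        return new SampleSummary(mean, sumSq / (x.length - 1), x.length);
    }
}
```

The three fields correspond to the m, v, and n arguments of statistic(double, double, long) and test(double, double, long).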

Since:
1.1
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static final class
    TTest.Result
    Result for the t-test.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private final AlternativeHypothesis
    alternative
    Alternative hypothesis.
    private static final TTest
    DEFAULT
    Default instance.
    private final boolean
    equalVariances
    Assume the two samples have the same population variance.
    private final double
    mu
    The true value of the mean (or difference in means for a two-sample test).
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    private
    TTest(AlternativeHypothesis alternative, boolean equalVariances, double mu)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    private static long
    checkSampleSize(long n)
    Check sample data size.
    private static double
    computeDf(double v1, long n1, double v2, long n2)
    Computes approximate degrees of freedom for two-sample t-test without the assumption of equal sample sizes or sub-population variances.
    private static double
    computeHomoscedasticT(double mu, double m1, double v1, long n1, double m2, double v2, long n2)
    Computes t statistic for two-sample t-test under the hypothesis of equal sub-population variances.
    private double
    computeP(double t, double degreesOfFreedom)
    Computes p-value for the specified t statistic.
    private static double
    computeT(double mu, double m1, double v1, long n1, double m2, double v2, long n2)
    Computes t statistic for two-sample t-test without the assumption of equal sample sizes or sub-population variances.
    private static double
    computeT(double m, double v, long n)
    Computes t statistic for one-sample t-test.
    double
    pairedStatistic(double[] x, double[] y)
    Computes a paired two-sample t-statistic on related samples comparing the mean difference between the samples to mu.
    TTest.Result
    pairedTest(double[] x, double[] y)
    Performs a paired two-sample t-test on related samples comparing the mean difference between the samples to mu.
    double
    statistic(double[] x)
    Computes a one-sample t statistic comparing the mean of the sample to mu.
    double
    statistic(double[] x, double[] y)
    Computes a two-sample t statistic on independent samples comparing the difference in means of the samples to mu.
    double
    statistic(double m, double v, long n)
    Computes a one-sample t statistic comparing the mean of the dataset to mu.
    double
    statistic(double m1, double v1, long n1, double m2, double v2, long n2)
    Computes a two-sample t statistic on independent samples comparing the difference in means of the datasets to mu.
    TTest.Result
    test(double[] sample)
    Performs a one-sample t-test comparing the mean of the sample to mu.
    TTest.Result
    test(double[] x, double[] y)
    Performs a two-sample t-test on independent samples comparing the difference in means of the samples to mu.
    TTest.Result
    test(double m, double v, long n)
    Performs a one-sample t-test comparing the mean of the dataset to mu.
    TTest.Result
    test(double m1, double v1, long n1, double m2, double v2, long n2)
    Performs a two-sample t-test on independent samples comparing the difference in means of the datasets to mu.
    TTest
    with(AlternativeHypothesis v)
    Return an instance with the configured alternative hypothesis.
    TTest
    with(DataDispersion v)
    Return an instance with the configured assumption on the data dispersion.
    static TTest
    withDefaults()
    Return an instance using the default options.
    TTest
    withMu(double v)
    Return an instance with the configured mu.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • DEFAULT

      private static final TTest DEFAULT
      Default instance.
    • alternative

      private final AlternativeHypothesis alternative
      Alternative hypothesis.
    • equalVariances

      private final boolean equalVariances
      Assume the two samples have the same population variance.
    • mu

      private final double mu
      The true value of the mean (or difference in means for a two-sample test).
  • Constructor Details

    • TTest

      private TTest(AlternativeHypothesis alternative, boolean equalVariances, double mu)
      Parameters:
      alternative - Alternative hypothesis.
      equalVariances - Assume the two samples have the same population variance.
      mu - True value of the mean (or difference in means for a two-sample test).
  • Method Details

    • withDefaults

      public static TTest withDefaults()
      Return an instance using the default options.
      Returns:
      default instance
    • with

      public TTest with(AlternativeHypothesis v)
      Return an instance with the configured alternative hypothesis.
      Parameters:
      v - Value.
      Returns:
      an instance
    • with

      public TTest with(DataDispersion v)
      Return an instance with the configured assumption on the data dispersion.

      Applies to the two-sample independent t-test. The statistic can compare the means without the assumption of equal sub-population variances (heteroscedastic); otherwise the means are compared under the assumption of equal sub-population variances (homoscedastic).

      Parameters:
      v - Value.
      Returns:
      an instance
    • withMu

      public TTest withMu(double v)
      Return an instance with the configured mu.

      For the one-sample test this is the expected mean.

      For the two-sample test this is the expected difference between the means.

      Parameters:
      v - Value.
      Returns:
      an instance
      Throws:
      IllegalArgumentException - if the value is not finite
    • statistic

      public double statistic(double m, double v, long n)
      Computes a one-sample t statistic comparing the mean of the dataset to mu.

      The returned t-statistic is:

      \[ t = \frac{m - \mu}{ \sqrt{ \frac{v}{n} } } \]

      Parameters:
      m - Sample mean.
      v - Sample variance.
      n - Sample size.
      Returns:
      t statistic
      Throws:
      IllegalArgumentException - if the number of samples is < 2; or the variance is negative
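
      The formula above can be sketched directly in plain Java. This is an illustration of the arithmetic only, not the library's implementation; the class name OneSampleT and the explicit mu argument are assumptions made for the example (in TTest, mu is configured through withMu).

```java
/** Illustrative sketch of the one-sample t statistic; not the library implementation. */
final class OneSampleT {
    private OneSampleT() {}

    /**
     * @param mu expected mean under the null hypothesis
     * @param m  sample mean
     * @param v  sample variance
     * @param n  sample size
     */
    static double t(double mu, double m, double v, long n) {
        if (n < 2) {
            throw new IllegalArgumentException("Sample size must be at least 2: " + n);
        }
        if (v < 0) {
            throw new IllegalArgumentException("Variance must not be negative: " + v);
        }
        // t = (m - mu) / sqrt(v / n)
        return (m - mu) / Math.sqrt(v / n);
    }
}
```

      For example, a sample of size 16 with mean 5 and variance 4 tested against mu = 3 has a standard error of sqrt(4 / 16) = 0.5, so the statistic is (5 - 3) / 0.5 = 4.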
    • statistic

      public double statistic(double[] x)
      Computes a one-sample t statistic comparing the mean of the sample to mu.
      Parameters:
      x - Sample values.
      Returns:
      t statistic
      Throws:
      IllegalArgumentException - if the number of samples is < 2
    • pairedStatistic

      public double pairedStatistic(double[] x, double[] y)
      Computes a paired two-sample t-statistic on related samples comparing the mean difference between the samples to mu.

      The returned t-statistic is functionally equivalent to the one-sample t-statistic computed by statistic(double[]) on a sample array consisting of the (signed) differences between corresponding entries in x and y.

      Parameters:
      x - First sample values.
      y - Second sample values.
      Returns:
      t statistic
      Throws:
      IllegalArgumentException - if the number of samples is < 2; or the sizes of the samples are not equal
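
      The equivalence with a one-sample statistic on the differences can be sketched as follows. This is a minimal illustration in plain Java, assuming the unbiased (n - 1) variance of the differences; PairedT and its explicit mu argument are hypothetical names and do not describe how the library computes the value internally.

```java
/** Illustrative sketch: paired t statistic as a one-sample statistic on x[i] - y[i]. */
final class PairedT {
    private PairedT() {}

    static double t(double mu, double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Sample sizes differ: " + x.length + " != " + y.length);
        }
        int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("Sample size must be at least 2: " + n);
        }
        // Signed differences between corresponding entries.
        double[] d = new double[n];
        double sum = 0;
        for (int i = 0; i < n; i++) {
            d[i] = x[i] - y[i];
            sum += d[i];
        }
        double mean = sum / n;
        // Unbiased variance of the differences.
        double sumSq = 0;
        for (double di : d) {
            double dev = di - mean;
            sumSq += dev * dev;
        }
        double variance = sumSq / (n - 1);
        // One-sample t statistic on the differences.
        return (mean - mu) / Math.sqrt(variance / n);
    }
}
```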
    • statistic

      public double statistic(double m1, double v1, long n1, double m2, double v2, long n2)
      Computes a two-sample t statistic on independent samples comparing the difference in means of the datasets to mu.

      Use the DataDispersion to control the computation of the variance.

      The heteroscedastic t-statistic is:

      \[ t = \frac{m_1 - m_2 - \mu}{ \sqrt{ \frac{v_1}{n_1} + \frac{v_2}{n_2} } } \]

      The homoscedastic t-statistic is:

      \[ t = \frac{m_1 - m_2 - \mu}{ \sqrt{ v (\frac{1}{n_1} + \frac{1}{n_2}) } } \]

      where \( v \) is the pooled variance estimate:

      \[ v = \frac{(n_1-1)v_1 + (n_2-1)v_2}{n_1 + n_2 - 2} \]

      Parameters:
      m1 - First sample mean.
      v1 - First sample variance.
      n1 - First sample size.
      m2 - Second sample mean.
      v2 - Second sample variance.
      n2 - Second sample size.
      Returns:
      t statistic
      Throws:
      IllegalArgumentException - if the number of samples in either dataset is < 2; or either variance is negative
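
      The two formulas can be compared side by side with a short sketch. The class and method names (TwoSampleT, heteroscedastic, homoscedastic) are assumptions made for this example, and argument validation is omitted for brevity; it illustrates the arithmetic rather than the library's code.

```java
/** Illustrative sketch of the two-sample t statistics; not the library implementation. */
final class TwoSampleT {
    private TwoSampleT() {}

    /** Heteroscedastic statistic: each sample contributes its own variance. */
    static double heteroscedastic(double mu, double m1, double v1, long n1,
                                  double m2, double v2, long n2) {
        return (m1 - m2 - mu) / Math.sqrt(v1 / n1 + v2 / n2);
    }

    /** Homoscedastic statistic: uses the pooled variance estimate. */
    static double homoscedastic(double mu, double m1, double v1, long n1,
                                double m2, double v2, long n2) {
        double pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2);
        return (m1 - m2 - mu) / Math.sqrt(pooled * (1.0 / n1 + 1.0 / n2));
    }
}
```

      When the two samples have the same size and the same variance, both expressions reduce to the same value; they differ as the sizes or variances diverge.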
    • statistic

      public double statistic(double[] x, double[] y)
      Computes a two-sample t statistic on independent samples comparing the difference in means of the samples to mu.

      Use the DataDispersion to control the computation of the variance.

      Parameters:
      x - First sample values.
      y - Second sample values.
      Returns:
      t statistic
      Throws:
      IllegalArgumentException - if the number of samples in either dataset is < 2
    • test

      public TTest.Result test(double m, double v, long n)
      Performs a one-sample t-test comparing the mean of the dataset to mu.

      Degrees of freedom are \( \nu = n - 1 \).

      Parameters:
      m - Sample mean.
      v - Sample variance.
      n - Sample size.
      Returns:
      test result
      Throws:
      IllegalArgumentException - if the number of samples is < 2; or the variance is negative
    • test

      public TTest.Result test(double[] sample)
      Performs a one-sample t-test comparing the mean of the sample to mu.

      Degrees of freedom are \( \nu = n - 1 \).

      Parameters:
      sample - Sample values.
      Returns:
      the test result
      Throws:
      IllegalArgumentException - if the number of samples is < 2
    • pairedTest

      public TTest.Result pairedTest(double[] x, double[] y)
      Performs a paired two-sample t-test on related samples comparing the mean difference between the samples to mu.

      The test is functionally equivalent to the one-sample t-test performed by test(double[]) on a sample array consisting of the (signed) differences between corresponding entries in x and y.

      Parameters:
      x - First sample values.
      y - Second sample values.
      Returns:
      the test result
      Throws:
      IllegalArgumentException - if the number of samples is < 2; or the sizes of the samples are not equal
    • test

      public TTest.Result test(double m1, double v1, long n1, double m2, double v2, long n2)
      Performs a two-sample t-test on independent samples comparing the difference in means of the datasets to mu.

      Use the DataDispersion to control the computation of the variance.

      The heteroscedastic degrees of freedom are estimated using the Welch-Satterthwaite approximation:

      \[ \nu = \frac{ (\frac{v_1}{n_1} + \frac{v_2}{n_2})^2 } { \frac{(v_1/n_1)^2}{n_1-1} + \frac{(v_2/n_2)^2}{n_2-1} } \]

      The homoscedastic degrees of freedom are \( \nu = n_1 + n_2 - 2 \).

      Parameters:
      m1 - First sample mean.
      v1 - First sample variance.
      n1 - First sample size.
      m2 - Second sample mean.
      v2 - Second sample variance.
      n2 - Second sample size.
      Returns:
      test result
      Throws:
      IllegalArgumentException - if the number of samples in either dataset is < 2; or either variance is negative
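
      The Welch-Satterthwaite approximation can be written out as a short sketch. WelchDf is a hypothetical name used for illustration only and validation is omitted; it mirrors the formula above and is not a copy of the library's private computeDf method.

```java
/** Illustrative sketch of the Welch-Satterthwaite degrees of freedom. */
final class WelchDf {
    private WelchDf() {}

    static double df(double v1, long n1, double v2, long n2) {
        double a = v1 / n1; // squared standard error contribution of sample 1
        double b = v2 / n2; // squared standard error contribution of sample 2
        double sum = a + b;
        return (sum * sum) / (a * a / (n1 - 1) + b * b / (n2 - 1));
    }
}
```

      The result is generally not an integer. For variances 4 and 2 with sizes 8 and 4 the approximation gives 8.4, compared with 10 under the homoscedastic assumption. When both samples have the same size and variance it equals n_1 + n_2 - 2.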
    • test

      public TTest.Result test(double[] x, double[] y)
      Performs a two-sample t-test on independent samples comparing the difference in means of the samples to mu.

      Use the DataDispersion to control the computation of the variance.

      Parameters:
      x - First sample values.
      y - Second sample values.
      Returns:
      the test result
      Throws:
      IllegalArgumentException - if the number of samples in either dataset is < 2
    • computeT

      private static double computeT(double m, double v, long n)
      Computes t statistic for one-sample t-test.
      Parameters:
      m - Sample mean.
      v - Sample variance.
      n - Sample size.
      Returns:
      t test statistic
    • computeT

      private static double computeT(double mu, double m1, double v1, long n1, double m2, double v2, long n2)
      Computes t statistic for two-sample t-test without the assumption of equal sample sizes or sub-population variances.
      Parameters:
      mu - Expected difference between means.
      m1 - First sample mean.
      v1 - First sample variance.
      n1 - First sample size.
      m2 - Second sample mean.
      v2 - Second sample variance.
      n2 - Second sample size.
      Returns:
      t test statistic
    • computeDf

      private static double computeDf(double v1, long n1, double v2, long n2)
      Computes approximate degrees of freedom for two-sample t-test without the assumption of equal sample sizes or sub-population variances.
      Parameters:
      v1 - First sample variance.
      n1 - First sample size.
      v2 - Second sample variance.
      n2 - Second sample size.
      Returns:
      approximate degrees of freedom
    • computeHomoscedasticT

      private static double computeHomoscedasticT(double mu, double m1, double v1, long n1, double m2, double v2, long n2)
      Computes t statistic for two-sample t-test under the hypothesis of equal sub-population variances.
      Parameters:
      mu - Expected difference between means.
      m1 - First sample mean.
      v1 - First sample variance.
      n1 - First sample size.
      m2 - Second sample mean.
      v2 - Second sample variance.
      n2 - Second sample size.
      Returns:
      t test statistic
    • computeP

      private double computeP(double t, double degreesOfFreedom)
      Computes p-value for the specified t statistic.
      Parameters:
      t - T statistic.
      degreesOfFreedom - Degrees of freedom.
      Returns:
      p-value for t-test
    • checkSampleSize

      private static long checkSampleSize(long n)
      Check sample data size.
      Parameters:
      n - Data size.
      Returns:
      the sample size
      Throws:
      IllegalArgumentException - if the number of samples is < 2