Class WilcoxonSignedRankTest

java.lang.Object
org.apache.commons.statistics.inference.WilcoxonSignedRankTest

public final class WilcoxonSignedRankTest extends Object
Implements the Wilcoxon signed-rank test.
Since:
1.1
  • Field Details

    • EXACT_LIMIT

      private static final int EXACT_LIMIT
      Limit on sample size for the exact p-value computation.
    • AUTO_LIMIT

      private static final int AUTO_LIMIT
      Limit on sample size for the exact p-value computation for the auto mode.
    • RANKING

      private static final RankingAlgorithm RANKING
      Ranking instance.
    • DEFAULT

      private static final WilcoxonSignedRankTest DEFAULT
      Default instance.
    • alternative

      private final AlternativeHypothesis alternative
      Alternative hypothesis.
    • pValueMethod

      private final PValueMethod pValueMethod
      Method to compute the p-value.
    • continuityCorrection

      private final boolean continuityCorrection
      Perform continuity correction.
    • mu

      private final double mu
      Expected location shift.
  • Constructor Details

    • WilcoxonSignedRankTest

      private WilcoxonSignedRankTest(AlternativeHypothesis alternative, PValueMethod method, boolean continuityCorrection, double mu)
      Parameters:
      alternative - Alternative hypothesis.
      method - P-value method.
      continuityCorrection - true to perform continuity correction.
      mu - Expected location shift.
  • Method Details

    • withDefaults

      public static WilcoxonSignedRankTest withDefaults()
      Returns:
      default instance
    • with

      Return an instance with the configured alternative hypothesis.
      Parameters:
      v - Value.
      Returns:
      an instance
    • with

      Return an instance with the configured p-value method.
      Parameters:
      v - Value.
      Returns:
      an instance
      Throws:
      IllegalArgumentException - if the value is not in the allowed options or is null
    • with

      Return an instance with the configured continuity correction.

      If enabled, adjust the Wilcoxon rank statistic by 0.5 towards the mean value when computing the z-statistic if a normal approximation is used to compute the p-value.

      Parameters:
      v - Value.
      Returns:
      an instance
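      The effect of the continuity correction can be sketched as follows. This is a minimal illustration, not the library's code: the class and method names are hypothetical, and it assumes the textbook mean n(n+1)/4 and variance n(n+1)(2n+1)/24 that apply when there are no zeros and no ties.

```java
final class ContinuityCorrectionSketch {
    /**
     * z-score of W+ under the normal approximation (no zeros, no ties assumed).
     * With the correction enabled the statistic is moved 0.5 towards the mean.
     */
    static double zScore(double wPlus, int n, boolean continuityCorrection) {
        double mean = n * (n + 1.0) / 4;
        double variance = n * (n + 1.0) * (2.0 * n + 1) / 24;
        double d = wPlus - mean;
        if (continuityCorrection) {
            // Shrink the distance to the mean by 0.5; signum(0) == 0 leaves d == 0 unchanged
            d -= Math.signum(d) * 0.5;
        }
        return d / Math.sqrt(variance);
    }

    public static void main(String[] args) {
        System.out.println(zScore(45, 10, false));
        System.out.println(zScore(45, 10, true));
    }
}
```

      The corrected z-score is always at least as close to zero as the uncorrected one, so the resulting p-value is never smaller.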
    • withMu

      public WilcoxonSignedRankTest withMu(double v)
      Return an instance with the configured expected difference mu.
      Parameters:
      v - Value.
      Returns:
      an instance
      Throws:
      IllegalArgumentException - if the value is not finite
    • statistic

      public double statistic(double[] z)
      Computes the Wilcoxon signed-rank statistic comparing the differences between sample values z = x - y to mu.

      This method handles matching samples z[i] == mu (no difference) by including them in the ranking of samples but excluding them from the test statistic (signed-rank zero procedure).

      Parameters:
      z - Signed differences between sample values.
      Returns:
      Wilcoxon positive-rank sum statistic (W+)
      Throws:
      IllegalArgumentException - if z is zero-length; contains NaN values; or all differences are equal to the expected difference
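      The signed-rank zero procedure described above can be sketched in plain Java. This is an illustrative sketch under stated assumptions, not the library implementation: the class and method names are hypothetical, ties receive average ranks, and input validation (empty input, NaN values, all differences equal to mu) is omitted.

```java
import java.util.Arrays;

final class SignedRankSketch {
    /** W+ for differences z against mu; zeros are ranked but never added to the sum. */
    static double wPlus(double[] z, double mu) {
        int n = z.length;
        double[] d = new double[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            d[i] = z[i] - mu;
            order[i] = i;
        }
        // Sort indices by absolute difference; zeros come first and occupy the lowest ranks
        Arrays.sort(order, (a, b) -> Double.compare(Math.abs(d[a]), Math.abs(d[b])));
        double w = 0;
        int i = 0;
        while (i < n) {
            double abs = Math.abs(d[order[i]]);
            int j = i;
            while (j < n && Math.abs(d[order[j]]) == abs) {
                j++;
            }
            // Sorted positions i..j-1 are tied; their 1-based ranks i+1..j average to (i + 1 + j) / 2
            double rank = (i + 1 + j) / 2.0;
            for (int k = i; k < j; k++) {
                if (d[order[k]] > 0) {
                    w += rank; // only strictly positive differences contribute
                }
            }
            i = j;
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println(wPlus(new double[] {1.5, -0.5, 0, 2.0, -1.0}, 0));
    }
}
```

      In the example the single zero takes rank 1, so the positive differences 1.5 and 2.0 receive ranks 4 and 5 rather than 3 and 4.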
    • statistic

      public double statistic(double[] x, double[] y)
      Computes the Wilcoxon signed-rank statistic comparing the differences between two related samples or repeated measurements on a single sample.

      This method handles matching samples x[i] - mu == y[i] (no difference) by including them in the ranking of samples but excluding them from the test statistic (signed-rank zero procedure).

      This method is functionally equivalent to creating an array of differences z = x - y and calling statistic(z); the implementation may use an optimised method to compute the differences and rank statistic if mu != 0.

      Parameters:
      x - First sample values.
      y - Second sample values.
      Returns:
      Wilcoxon positive-rank sum statistic (W+)
      Throws:
      IllegalArgumentException - if x or y are zero-length; are not the same length; contain NaN values; or x[i] - mu == y[i] for all data
    • test

      public WilcoxonSignedRankTest.Result test(double[] z)
      Performs a Wilcoxon signed-rank test comparing the differences between sample values z = x - y to mu.

      This method handles matching samples z[i] == mu (no difference) by including them in the ranking of samples but excluding them from the test statistic (signed-rank zero procedure).

      The test is defined by the AlternativeHypothesis.

      • 'two-sided': the distribution of the difference is not symmetric about mu.
      • 'greater': the distribution of the difference is stochastically greater than a distribution symmetric about mu.
      • 'less': the distribution of the difference is stochastically less than a distribution symmetric about mu.

      If the p-value method is auto, an exact p-value is computed if the samples contain fewer than 50 values; otherwise a normal approximation is used.

      Computation of the exact p-value is only valid if there are no matching samples z[i] == mu and no tied ranks in the data; otherwise the p-value is computed using the asymptotic Cureton approximation with a tie correction and an optional continuity correction.

      Note: Computation of the exact p-value requires a sample size <= 1023. Exact computation tabulates at most n(n+1)/2 values and runs in O(n^2 / 2) time. Maximum memory usage is approximately 4 MiB.

      Parameters:
      z - Differences between sample values.
      Returns:
      test result
      Throws:
      IllegalArgumentException - if z is zero-length; contains NaN values; or all differences are equal to the expected difference
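      The memory figure quoted in the note can be checked with a short calculation. The sketch below assumes 8 bytes per tabulated value (for example a double); the entry width is an assumption and is not stated in this documentation, and the class and method names are hypothetical.

```java
final class ExactTableSizeSketch {
    /** Upper bound on the number of tabulated values: n(n+1)/2. */
    static long tableEntries(int n) {
        return (long) n * (n + 1) / 2;
    }

    /** Table size in MiB for a given (assumed) entry width in bytes. */
    static double tableMiB(int n, int bytesPerEntry) {
        return (double) tableEntries(n) * bytesPerEntry / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.println(tableEntries(1023)); // 523776 entries at the documented size limit
        System.out.println(tableMiB(1023, 8));  // just under 4 MiB with 8-byte entries
    }
}
```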
    • test

      public WilcoxonSignedRankTest.Result test(double[] x, double[] y)
      Performs a Wilcoxon signed-rank test comparing the differences between two related samples or repeated measurements on a single sample.

      This method handles matching samples x[i] - mu == y[i] (no difference) by including them in the ranking of samples but excluding them from the test statistic (signed-rank zero procedure).

      This method is functionally equivalent to creating an array of differences z = x - y and calling test(z); the implementation may use an optimised method to compute the differences and rank statistic if mu != 0.

      Parameters:
      x - First sample values.
      y - Second sample values.
      Returns:
      test result
      Throws:
      IllegalArgumentException - if x or y are zero-length; are not the same length; contain NaN values; or x[i] - mu == y[i] for all data
    • computeStatistic

      private static double computeStatistic(double[] z, double mu)
      Computes the Wilcoxon signed-rank statistic comparing the differences between sample values z = x - y to mu.
      Parameters:
      z - Signed differences between sample values.
      mu - Expected difference.
      Returns:
      Wilcoxon positive-rank sum statistic (W+)
      Throws:
      IllegalArgumentException - if z is zero-length; contains NaN values; or all differences are equal to the expected difference
    • computeTest

      private WilcoxonSignedRankTest.Result computeTest(double[] z, double expectedMu)
      Performs a Wilcoxon signed-rank test comparing the differences between sample values z = x - y to mu.
      Parameters:
      z - Differences between sample values.
      expectedMu - Expected difference.
      Returns:
      test result
      Throws:
      IllegalArgumentException - if z is zero-length; contains NaN values; or all differences are zero
    • checkSamples

      private static void checkSamples(double[] x, double[] y)
      Ensures that the provided arrays fulfil the assumptions.
      Parameters:
      x - First sample.
      y - Second sample.
      Throws:
      IllegalArgumentException - if x or y are zero-length; or do not have the same length
    • calculateDifferences

      private static double[] calculateDifferences(double mu, double[] x, double[] y)
      Calculates x[i] - mu - y[i] for all i.
      Parameters:
      mu - Expected difference.
      x - First sample.
      y - Second sample.
      Returns:
      z = x - mu - y
    • calculateAbsoluteDifferences

      private static double[] calculateAbsoluteDifferences(double[] z)
      Calculates |z[i]| for all i.
      Parameters:
      z - Sample.
      Returns:
      |z|
    • calculateW

      private static double calculateW(double[] obs, double[] ranks)
      Calculate the Wilcoxon positive-rank sum statistic.
      Parameters:
      obs - Observed signed value.
      ranks - Ranks (including averages for ties).
      Returns:
      Wilcoxon positive-rank sum statistic (W+)
    • countZeros

      private static int countZeros(double[] z)
      Count the number of zeros in the data.
      Parameters:
      z - Input data.
      Returns:
      the zero count
      Throws:
      IllegalArgumentException - if the data is all zeros
    • calculateTieCorrection

      static double calculateTieCorrection(double[] ranks)
      Calculate the tie correction. Destructively modifies ranks (by sorting).
       c = sum(t^3 - t)
       

      where t is the size of each group of tied observations.

      Parameters:
      ranks - Ranks
      Returns:
      the tie correction
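      A minimal sketch of the formula c = sum(t^3 - t) follows. The class and method names are hypothetical. Unlike the documented method, this version sorts a copy so the caller's array is left unchanged.

```java
import java.util.Arrays;

final class TieCorrectionSketch {
    /** Sum of t^3 - t over each group of t equal ranks. */
    static double tieCorrection(double[] ranks) {
        double[] sorted = ranks.clone(); // copy: the input array is not modified here
        Arrays.sort(sorted);
        double c = 0;
        int i = 0;
        while (i < sorted.length) {
            int j = i + 1;
            while (j < sorted.length && sorted[j] == sorted[i]) {
                j++;
            }
            double t = j - i; // size of the current group of equal ranks
            c += t * t * t - t; // a group of size 1 contributes 0
            i = j;
        }
        return c;
    }

    public static void main(String[] args) {
        // One pair (2^3 - 2 = 6) and one triple (3^3 - 3 = 24)
        System.out.println(tieCorrection(new double[] {1.5, 1.5, 3, 5, 5, 5}));
    }
}
```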
    • selectMethod

      private static PValueMethod selectMethod(PValueMethod method, int n)
      Select the method to compute the p-value.
      Parameters:
      method - P-value method.
      n - Size of the data.
      Returns:
      p-value method.
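      The selection rule described for test(z) can be sketched as below. The enum, class, method names and the constant are hypothetical stand-ins; the threshold of 50 is taken from the prose for the auto mode, and the fallback for zeros or ties mirrors the statement that the exact p-value is only valid without them.

```java
final class MethodSelectionSketch {
    /** Hypothetical stand-in for the library's p-value method options. */
    enum Method { AUTO, EXACT, ASYMPTOTIC }

    /** Assumed auto-mode threshold: exact below this sample size. */
    static final int AUTO_LIMIT = 50;

    /** Resolve AUTO using the sample size; explicit choices are returned unchanged. */
    static Method select(Method method, int n) {
        if (method == Method.AUTO) {
            return n < AUTO_LIMIT ? Method.EXACT : Method.ASYMPTOTIC;
        }
        return method;
    }

    /** Apply the fallback: zeros or ties invalidate the exact computation. */
    static Method effective(Method method, int n, boolean hasZerosOrTies) {
        Method m = select(method, n);
        return m == Method.EXACT && hasZerosOrTies ? Method.ASYMPTOTIC : m;
    }

    public static void main(String[] args) {
        System.out.println(select(Method.AUTO, 49));
        System.out.println(select(Method.AUTO, 50));
        System.out.println(effective(Method.AUTO, 10, true));
    }
}
```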
    • calculateAsymptoticPValue

      private static double calculateAsymptoticPValue(double wPlus, int n, double z, double c, AlternativeHypothesis alternative, boolean continuityCorrection)
      Compute the asymptotic p-value using the Cureton normal approximation. This corrects for zeros in the signed-rank zero procedure and/or ties corrected using the average method.
      Parameters:
      wPlus - Wilcoxon signed rank value (W+).
      n - Number of subjects.
      z - Count of number of zeros
      c - Tie-correction
      alternative - Alternative hypothesis.
      continuityCorrection - true to use a continuity correction.
      Returns:
      asymptotic p-value (two-sided, greater, or less using the options)
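      The mean and variance used by the Cureton approximation can be sketched as follows. The formulas are the commonly cited form for the signed-rank zero procedure with average ranks for ties; whether the library arranges the arithmetic this way is an assumption, and all names are hypothetical. Converting the z-score to a p-value requires a standard normal distribution and is omitted.

```java
final class CuretonSketch {
    /** E[W+] with z zeros among n subjects: (n(n+1) - z(z+1)) / 4. */
    static double mean(int n, int z) {
        return (n * (n + 1.0) - z * (z + 1.0)) / 4;
    }

    /** Var[W+]: (n(n+1)(2n+1) - z(z+1)(2z+1)) / 24 - c / 48, where c = sum(t^3 - t). */
    static double variance(int n, int z, double c) {
        return (n * (n + 1.0) * (2.0 * n + 1) - z * (z + 1.0) * (2.0 * z + 1)) / 24 - c / 48;
    }

    /** Uncorrected z-score of W+; no continuity correction is applied here. */
    static double zScore(double wPlus, int n, int z, double c) {
        return (wPlus - mean(n, z)) / Math.sqrt(variance(n, z, c));
    }

    public static void main(String[] args) {
        System.out.println(mean(10, 2));
        System.out.println(variance(10, 2, 6));
        System.out.println(zScore(40, 10, 2, 6));
    }
}
```

      With no zeros and no ties (z = 0, c = 0) these reduce to the familiar n(n+1)/4 and n(n+1)(2n+1)/24.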
    • calculateExactPValue

      private static double calculateExactPValue(int w1, int n, AlternativeHypothesis alternative)
      Compute the exact p-value.

      This computation requires that no zeros or ties are found in the data. The input value n is limited to 1023.

      Parameters:
      w1 - Wilcoxon signed rank value (W+, or W-).
      n - Number of subjects.
      alternative - Alternative hypothesis.
      Returns:
      exact p-value (two-sided, greater, or less using the options)
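      For intuition, the exact p-value can be obtained for small n by enumerating every assignment of signs to the ranks 1..n. This brute-force sketch is not the tabulated method used by the library; it only illustrates the definition. The enum and all names are hypothetical, and the two-sided value is assumed to be twice the smaller tail, capped at 1.

```java
final class ExactPValueSketch {
    /** Hypothetical stand-in for the alternative hypothesis options. */
    enum Alternative { TWO_SIDED, GREATER_THAN, LESS_THAN }

    /** Exact p-value by enumerating all 2^n sign patterns; practical only for small n. */
    static double exactPValue(int wPlus, int n, Alternative alternative) {
        long total = 1L << n;
        long lower = 0; // patterns with W+ <= observed
        long upper = 0; // patterns with W+ >= observed
        for (long mask = 0; mask < total; mask++) {
            int w = 0;
            for (int rank = 1; rank <= n; rank++) {
                if ((mask & (1L << (rank - 1))) != 0) {
                    w += rank; // this rank carries a positive sign
                }
            }
            if (w <= wPlus) {
                lower++;
            }
            if (w >= wPlus) {
                upper++;
            }
        }
        double pLess = (double) lower / total;
        double pGreater = (double) upper / total;
        switch (alternative) {
            case GREATER_THAN:
                return pGreater;
            case LESS_THAN:
                return pLess;
            default:
                return Math.min(1, 2 * Math.min(pLess, pGreater));
        }
    }

    public static void main(String[] args) {
        // n = 3, all differences positive: W+ = 6, one pattern out of 8
        System.out.println(exactPValue(6, 3, Alternative.GREATER_THAN));
        System.out.println(exactPValue(6, 3, Alternative.TWO_SIDED));
    }
}
```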
    • cdf

      private static double cdf(int w1, int w2, int n)
      Compute the cumulative distribution function of the Wilcoxon signed rank W+ statistic. The W- statistic is passed for convenience to exploit symmetry in the distribution.
      Parameters:
      w1 - Wilcoxon W+ statistic
      w2 - Wilcoxon W- statistic
      n - Number of subjects.
      Returns:
      Pr(X <= k)
    • sf

      private static double sf(int w1, int w2, int n)
      Compute the survival function of the Wilcoxon signed rank W+ statistic. The W- statistic is passed for convenience to exploit symmetry in the distribution.
      Parameters:
      w1 - Wilcoxon W+ statistic
      w2 - Wilcoxon W- statistic
      n - Number of subjects.
      Returns:
      Pr(X > k)
    • computeCdf

      private static double computeCdf(int t, int n)
      Compute the cumulative distribution function for the distribution of the Wilcoxon signed rank statistic. This is a discrete distribution and is only valid when no zeros or ties are found in the data.

      This should be called with the lower of W+ or W- for computational efficiency. The input value n is limited to 1023.

      Uses recursion to compute the density for X <= t and sums the values. See: https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test#Computing_the_null_distribution

      Parameters:
      t - Smallest Wilcoxon signed rank value (W+, or W-).
      n - Number of subjects.
      Returns:
      Pr(T <= t)
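      The recursion referenced above counts the subsets of the ranks {1..n} that sum to each value k, using count_i(k) = count_{i-1}(k) + count_{i-1}(k - i). The sketch below is an iterative, in-place form of that recurrence truncated at t. It is an illustration rather than the library's code: the names are hypothetical, and counts are held as double, which loses precision for large n.

```java
final class SignedRankCdfSketch {
    /** Pr(T <= t) for the signed-rank statistic with n subjects (no zeros, no ties). */
    static double cdf(int t, int n) {
        int max = n * (n + 1) / 2;
        if (t < 0) {
            return 0;
        }
        if (t >= max) {
            return 1;
        }
        // count[k] = number of subsets of the ranks processed so far that sum to k
        double[] count = new double[t + 1];
        count[0] = 1; // the empty subset
        for (int i = 1; i <= n; i++) {
            // Descend so that rank i is added to each subset at most once
            for (int k = t; k >= i; k--) {
                count[k] += count[k - i];
            }
        }
        double sum = 0;
        for (double c : count) {
            sum += c;
        }
        return sum / Math.pow(2, n); // each of the 2^n sign patterns is equally likely
    }

    public static void main(String[] args) {
        // n = 3: subset sums 0..6 occur 1, 1, 1, 2, 1, 1, 1 times
        System.out.println(cdf(2, 3));
        System.out.println(cdf(3, 3));
    }
}
```

      Only sums up to t are tabulated, which is why calling with the smaller of W+ and W- reduces the work.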