/*
 * Copyright 2004-2005 The Apache Software Foundation.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.commons.math.stat.inference;

import org.apache.commons.math.MathException;
import org.apache.commons.math.stat.descriptive.StatisticalSummary;

/**
 * An interface for Student's t-tests.
 * <p>
 * Tests can be:<ul>
 * <li>One-sample or two-sample</li>
 * <li>One-sided or two-sided</li>
 * <li>Paired or unpaired (for two-sample tests)</li>
 * <li>Homoscedastic (equal variance assumption) or heteroscedastic
 * (for two-sample tests)</li>
 * <li>Fixed significance level (boolean-valued) or returning p-values.
 * </li></ul>
 * <p>
 * Test statistics are available for all tests.  Methods including "Test" in
 * their names perform tests; all other methods return t-statistics.  Among
 * the "Test" methods, <code>double-</code>valued methods return p-values;
 * <code>boolean-</code>valued methods perform fixed significance level tests.
 * Significance levels are always specified as numbers between 0 and 0.5
 * (e.g. tests at the 95% level use <code>alpha=0.05</code>).
 * <p>
 * Input to tests can be either <code>double[]</code> arrays or
 * {@link StatisticalSummary} instances.
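 * <p>
 * A minimal usage sketch, not part of the interface contract: it assumes a
 * caller-supplied <code>test</code> instance (how an implementation is
 * obtained is not specified here) and uses a hypothetical helper name,
 * <code>meansDiffer</code>:
 * <pre>
 * static boolean meansDiffer(TTest test, double[] sample1, double[] sample2)
 *     throws MathException {
 *     double t = test.t(sample1, sample2);      // t-statistic, no equal-variance assumption
 *     double p = test.tTest(sample1, sample2);  // two-sided p-value for the same comparison
 *     // Fixed-level test at the 95% level; true means "reject equal means".
 *     return test.tTest(sample1, sample2, 0.05);
 * }
 * </pre>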
 *
 * @version $Revision: 161625 $ $Date: 2005-04-16 22:12:15 -0700 (Sat, 16 Apr 2005) $
 */
public interface TTest {
    /**
     * Computes a paired, 2-sample t-statistic based on the data in the input
     * arrays.  The t-statistic returned is equivalent to what would be returned by
     * computing the one-sample t-statistic {@link #t(double, double[])}, with
     * <code>mu = 0</code> and the sample array consisting of the (signed)
     * differences between corresponding entries in <code>sample1</code> and
     * <code>sample2</code>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The input arrays must have the same length and their common length
     * must be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if the statistic cannot be computed due to a
     *         convergence or other numerical error.
     */
    public abstract double pairedT(double[] sample1, double[] sample2)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a paired, two-sample, two-tailed t-test
     * based on the data in the input arrays.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the mean of the paired
     * differences is 0 in favor of the two-sided alternative that the mean paired
     * difference is not equal to 0. For a one-sided test, divide the returned
     * value by 2.
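     * <p>
     * As a rough sketch of the one-sided case, assuming <code>test</code> is
     * some implementation of this interface, that differences are taken as
     * <code>sample1[i] - sample2[i]</code>, and that the alternative of interest
     * is "mean paired difference &gt; 0": halving is only appropriate when the
     * observed t-statistic points in the direction of the alternative.
     * <pre>
     * double t = test.pairedT(sample1, sample2);
     * double twoSided = test.pairedTTest(sample1, sample2);
     * // one-sided p-value for the alternative "mean difference &gt; 0"
     * double oneSided = (t &gt; 0) ? twoSided / 2 : 1 - twoSided / 2;
     * </pre>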
     * <p>
     * This test is equivalent to a one-sample t-test computed using
     * {@link #tTest(double, double[])} with <code>mu = 0</code> and the sample
     * array consisting of the signed differences between corresponding elements of
     * <code>sample1</code> and <code>sample2</code>.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The input array lengths must be the same and their common length must
     * be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double pairedTTest(double[] sample1, double[] sample2)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a paired t-test evaluating the null hypothesis that the
     * mean of the paired differences between <code>sample1</code> and
     * <code>sample2</code> is 0 in favor of the two-sided alternative that the
     * mean paired difference is not equal to 0, with significance level
     * <code>alpha</code>.
     * <p>
     * Returns <code>true</code> iff the null hypothesis can be rejected with
     * confidence <code>1 - alpha</code>.
     * To perform a one-sided test, pass <code>alpha * 2</code> as the
     * significance level.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The input array lengths must be the same and their common length
     * must be at least 2.
     * </li>
     * <li> <code> 0 &lt; alpha &lt; 0.5 </code>
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean pairedTTest(
        double[] sample1,
        double[] sample2,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Computes a <a href="http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm#formula">
     * t statistic </a> given observed values and a comparison constant.
     * <p>
     * This statistic can be used to perform a one-sample t-test for the mean.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array length must be at least 2.
     * </li></ul>
     *
     * @param mu comparison constant
     * @param observed array of values
     * @return t statistic
     * @throws IllegalArgumentException if input array length is less than 2
     */
    public abstract double t(double mu, double[] observed)
        throws IllegalArgumentException;
    /**
     * Computes a <a href="http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm#formula">
     * t statistic </a> to use in comparing the mean of the dataset described by
     * <code>sampleStats</code> to <code>mu</code>.
     * <p>
     * This statistic can be used to perform a one-sample t-test for the mean.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li><code>sampleStats.getN() &gt;= 2</code>.
     * </li></ul>
     *
     * @param mu comparison constant
     * @param sampleStats StatisticalSummary holding sample summary statistics
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double t(double mu, StatisticalSummary sampleStats)
        throws IllegalArgumentException;
    /**
     * Computes a 2-sample t statistic, under the hypothesis of equal
     * subpopulation variances.  To compute a t-statistic without the
     * equal variances hypothesis, use {@link #t(double[], double[])}.
     * <p>
     * This statistic can be used to perform a (homoscedastic) two-sample
     * t-test to compare sample means.
     * <p>
     * The t-statistic is
     * <p>
     * <code> t = (m1 - m2) / (sqrt(1/n1 + 1/n2) sqrt(var))</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code> n2</code></strong> is the size of the second sample;
     * <strong><code> m1</code></strong> is the mean of the first sample;
     * <strong><code> m2</code></strong> is the mean of the second sample
     * and <strong><code>var</code></strong> is the pooled variance estimate:
     * <p>
     * <code>var = ((n1 - 1)var1 + (n2 - 1)var2) / ((n1-1) + (n2-1))</code>
     * <p>
     * with <strong><code>var1</code></strong> the variance of the first sample and
     * <strong><code>var2</code></strong> the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double homoscedasticT(double[] sample1, double[] sample2)
        throws IllegalArgumentException;
    /**
     * Computes a 2-sample t statistic, without the hypothesis of equal
     * subpopulation variances.  To compute a t-statistic assuming equal
     * variances, use {@link #homoscedasticT(double[], double[])}.
     * <p>
     * This statistic can be used to perform a two-sample t-test to compare
     * sample means.
     * <p>
     * The t-statistic is
     * <p>
     * <code> t = (m1 - m2) / sqrt(var1/n1 + var2/n2)</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code> n2</code></strong> is the size of the second sample;
     * <strong><code> m1</code></strong> is the mean of the first sample;
     * <strong><code> m2</code></strong> is the mean of the second sample;
     * <strong><code> var1</code></strong> is the variance of the first sample;
     * <strong><code> var2</code></strong> is the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double t(double[] sample1, double[] sample2)
        throws IllegalArgumentException;
    /**
     * Computes a 2-sample t statistic, comparing the means of the datasets
     * described by two {@link StatisticalSummary} instances, without the
     * assumption of equal subpopulation variances.  Use
     * {@link #homoscedasticT(StatisticalSummary, StatisticalSummary)} to
     * compute a t-statistic under the equal variances assumption.
     * <p>
     * This statistic can be used to perform a two-sample t-test to compare
     * sample means.
     * <p>
     * The returned t-statistic is
     * <p>
     * <code> t = (m1 - m2) / sqrt(var1/n1 + var2/n2)</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code> n2</code></strong> is the size of the second sample;
     * <strong><code> m1</code></strong> is the mean of the first sample;
     * <strong><code> m2</code></strong> is the mean of the second sample;
     * <strong><code> var1</code></strong> is the variance of the first sample;
     * <strong><code> var2</code></strong> is the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double t(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException;
    /**
     * Computes a 2-sample t statistic, comparing the means of the datasets
     * described by two {@link StatisticalSummary} instances, under the
     * assumption of equal subpopulation variances.  To compute a t-statistic
     * without the equal variances assumption, use
     * {@link #t(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * This statistic can be used to perform a (homoscedastic) two-sample
     * t-test to compare sample means.
     * <p>
     * The t-statistic returned is
     * <p>
     * <code> t = (m1 - m2) / (sqrt(1/n1 + 1/n2) sqrt(var))</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code> n2</code></strong> is the size of the second sample;
     * <strong><code> m1</code></strong> is the mean of the first sample;
     * <strong><code> m2</code></strong> is the mean of the second sample
     * and <strong><code>var</code></strong> is the pooled variance estimate:
     * <p>
     * <code>var = ((n1 - 1)var1 + (n2 - 1)var2) / ((n1-1) + (n2-1))</code>
     * <p>
     * with <strong><code>var1</code></strong> the variance of the first sample and
     * <strong><code>var2</code></strong> the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double homoscedasticT(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a one-sample, two-tailed t-test
     * comparing the mean of the input array with the constant <code>mu</code>.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the mean equals
     * <code>mu</code> in favor of the two-sided alternative that the mean
     * is different from <code>mu</code>. For a one-sided test, divide the
     * returned value by 2.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array length must be at least 2.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sample array of sample data values
     * @return p-value
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(double mu, double[] sample)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that the mean of the population from
     * which <code>sample</code> is drawn equals <code>mu</code>.
     * <p>
     * Returns <code>true</code> iff the null hypothesis can be
     * rejected with confidence <code>1 - alpha</code>.
     * To perform a one-sided test, pass <code>alpha * 2</code> as the
     * significance level.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>sample mean = mu </code> at
     * the 95% level, use <br><code>tTest(mu, sample, 0.05) </code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code> sample mean &lt; mu </code>
     * at the 99% level, first verify that the measured sample mean is less
     * than <code>mu</code> and then use
     * <br><code>tTest(mu, sample, 0.02) </code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the one-sample
     * parametric t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/sg_glos.html#one-sample">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array length must be at least 2.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sample array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(double mu, double[] sample, double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a one-sample, two-tailed t-test
     * comparing the mean of the dataset described by <code>sampleStats</code>
     * with the constant <code>mu</code>.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the mean equals
     * <code>mu</code> in favor of the two-sided alternative that the mean
     * is different from <code>mu</code>. For a one-sided test, divide the
     * returned value by 2.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The sample must contain at least 2 observations.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sampleStats StatisticalSummary describing sample data
     * @return p-value
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(double mu, StatisticalSummary sampleStats)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that the mean of the
     * population from which the dataset described by <code>sampleStats</code> is
     * drawn equals <code>mu</code>.
     * <p>
     * Returns <code>true</code> iff the null hypothesis can be rejected with
     * confidence <code>1 - alpha</code>.
     * To perform a one-sided test, pass <code>alpha * 2</code> as the
     * significance level.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>sample mean = mu </code> at
     * the 95% level, use <br><code>tTest(mu, sampleStats, 0.05) </code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code> sample mean &lt; mu </code>
     * at the 99% level, first verify that the measured sample mean is less
     * than <code>mu</code> and then use
     * <br><code>tTest(mu, sampleStats, 0.02) </code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the one-sample
     * parametric t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/sg_glos.html#one-sample">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The sample must include at least 2 observations.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sampleStats StatisticalSummary describing sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(
        double mu,
        StatisticalSummary sampleStats,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the input arrays.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * The test does not assume that the underlying population variances are
     * equal and it uses approximated degrees of freedom computed from the
     * sample data to compute the p-value.  The t-statistic used is as defined in
     * {@link #t(double[], double[])} and the Welch-Satterthwaite approximation
     * to the degrees of freedom is used, as described
     * <a href="http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm">
     * here.</a>  To perform the test under the assumption of equal subpopulation
     * variances, use {@link #homoscedasticTTest(double[], double[])}.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(double[] sample1, double[] sample2)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the input arrays, under the assumption that
     * the two samples are drawn from subpopulations with equal variances.
     * To perform the test without the equal variances assumption, use
     * {@link #tTest(double[], double[])}.
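     * <p>
     * A small sketch of the choice between the two variants, assuming
     * <code>test</code> is some implementation of this interface:
     * <pre>
     * double pPooled = test.homoscedasticTTest(sample1, sample2); // equal variances assumed
     * double pWelch = test.tTest(sample1, sample2);               // no equal-variance assumption
     * </pre>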
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * A pooled variance estimate is used to compute the t-statistic.  See
     * {@link #homoscedasticT(double[], double[])}. The sum of the sample sizes
     * minus 2 is used as the degrees of freedom.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double homoscedasticTTest(
        double[] sample1,
        double[] sample2)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a
     * <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that <code>sample1</code>
     * and <code>sample2</code> are drawn from populations with the same mean,
     * with significance level <code>alpha</code>.  This test does not assume
     * that the subpopulation variances are equal.  To perform the test assuming
     * equal variances, use
     * {@link #homoscedasticTTest(double[], double[], double)}.
     * <p>
     * Returns <code>true</code> iff the null hypothesis that the means are
     * equal can be rejected with confidence <code>1 - alpha</code>.
     * To perform a one-sided test, pass <code>alpha * 2</code> as the
     * significance level.
     * <p>
     * See {@link #t(double[], double[])} for the formula used to compute the
     * t-statistic.  Degrees of freedom are approximated using the
     * <a href="http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm">
     * Welch-Satterthwaite approximation.</a>
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>mean 1 = mean 2 </code> at
     * the 95% level, use
     * <br><code>tTest(sample1, sample2, 0.05)</code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code> mean 1 &lt; mean 2 </code>
     * at the 99% level, first verify that the measured mean of <code>sample 1</code>
     * is less than the mean of <code>sample 2</code> and then use
     * <br><code>tTest(sample1, sample2, 0.02) </code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li>
     * <li> <code> 0 &lt; alpha &lt; 0.5 </code>
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(
        double[] sample1,
        double[] sample2,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a
     * <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that <code>sample1</code>
     * and <code>sample2</code> are drawn from populations with the same mean,
     * with significance level <code>alpha</code>, assuming that the
     * subpopulation variances are equal.  Use
     * {@link #tTest(double[], double[], double)} to perform the test without
     * the assumption of equal variances.
     * <p>
     * Returns <code>true</code> iff the null hypothesis that the means are
     * equal can be rejected with confidence <code>1 - alpha</code>.  To
     * perform a one-sided test, pass <code>alpha * 2</code> as the
     * significance level.
     * <p>
     * A pooled variance estimate is used to compute the t-statistic. See
     * {@link #homoscedasticT(double[], double[])} for the formula. The sum of
     * the sample sizes minus 2 is used as the degrees of freedom.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>mean 1 = mean 2 </code> at
     * the 95% level, use <br><code>homoscedasticTTest(sample1, sample2, 0.05)
     * </code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code> mean 1 &lt; mean 2 </code>
     * at the 99% level, first verify that the measured mean of
     * <code>sample 1</code> is less than the mean of <code>sample 2</code>
     * and then use
     * <br><code>homoscedasticTTest(sample1, sample2, 0.02) </code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li>
     * <li> <code> 0 &lt; alpha &lt; 0.5 </code>
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean homoscedasticTTest(
        double[] sample1,
        double[] sample2,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the datasets described by two StatisticalSummary
     * instances.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * The test does not assume that the underlying population variances are
     * equal and it uses approximated degrees of freedom computed from the
     * sample data to compute the p-value.  To perform the test assuming
     * equal variances, use
     * {@link #homoscedasticTTest(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the datasets described by two StatisticalSummary
     * instances, under the hypothesis of equal subpopulation variances. To
     * perform a test without the equal variances assumption, use
     * {@link #tTest(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * See {@link #homoscedasticT(StatisticalSummary, StatisticalSummary)} for
     * the formula used to compute the t-statistic. The sum of the sample sizes
     * minus 2 is used as the degrees of freedom.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double homoscedasticTTest(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a
     * <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that
     * <code>sampleStats1</code> and <code>sampleStats2</code> describe
     * datasets drawn from populations with the same mean, with significance
     * level <code>alpha</code>.  This test does not assume that the
     * subpopulation variances are equal.  To perform the test under the equal
     * variances assumption, use
     * {@link #homoscedasticTTest(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * Returns <code>true</code> iff the null hypothesis that the means are
     * equal can be rejected with confidence <code>1 - alpha</code>.
     * To perform a one-sided test, pass <code>alpha * 2</code> as the
     * significance level.
     * <p>
     * See {@link #t(StatisticalSummary, StatisticalSummary)} for the formula
     * used to compute the t-statistic.  Degrees of freedom are approximated
     * using the
     * <a href="http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm">
     * Welch-Satterthwaite approximation.</a>
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>mean 1 = mean 2 </code> at
     * the 95% level, use
     * <br><code>tTest(sampleStats1, sampleStats2, 0.05) </code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code> mean 1 &lt; mean 2 </code>
     * at the 99% level, first verify that the measured mean of
     * <code>sample 1</code> is less than the mean of <code>sample 2</code>
     * and then use
     * <br><code>tTest(sampleStats1, sampleStats2, 0.02) </code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li>
     * <li> <code> 0 &lt; alpha &lt; 0.5 </code>
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing sample data values
     * @param sampleStats2 StatisticalSummary describing sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2,
        double alpha)
        throws IllegalArgumentException, MathException;
}