gov.sandia.cognition.statistics.method
Class MannWhitneyUConfidence

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.statistics.method.MannWhitneyUConfidence
All Implemented Interfaces:
NullHypothesisEvaluator<Collection<? extends Number>>, CloneableSerializable, Serializable, Cloneable

@ConfidenceTestAssumptions(name="Mann-Whitney U-test",
                           alsoKnownAs={"Mann-Whitney-Wilcoxon","Wilcoxon rank-sum test","Wilcoxon-Mann-Whitney test","U-test"},
                           description="A nonparametric test to determine if two groups of data were drawn from the same underlying distribution.",
                           assumptions={"The groups were sampled independently.","The data are ordinal and we can determine which of two samples is greater than the other.","Although the two populations don\'t have to follow any particular distribution, the two distributions must have a similar shape."},
                           nullHypothesis="The data were drawn from the same distribution.",
                           dataPaired=false,
                           dataSameSize=false,
                           distribution=UnivariateGaussian.CDF.class,
                           reference=@PublicationReference(author="Wikipedia",title="Mann-Whitney U",type=WebPage,year=2009,url="http://en.wikipedia.org/wiki/Mann-Whitney_U"))
public class MannWhitneyUConfidence
extends AbstractCloneableSerializable
implements NullHypothesisEvaluator<Collection<? extends Number>>

Performs a Mann-Whitney U-test on the given data (usually simply called a "U-test", sometimes called a Wilcoxon-Mann-Whitney U-test, or Wilcoxon rank-sum test).

Since:
2.0
Author:
Kevin R. Dixon
See Also:
Serialized Form

Nested Class Summary
static class MannWhitneyUConfidence.Statistic
          Statistics from the Mann-Whitney U-test
 
Constructor Summary
MannWhitneyUConfidence()
          Creates a new instance of MannWhitneyUConfidence
 
Method Summary
 MannWhitneyUConfidence.Statistic evaluateNullHypothesis(Collection<? extends InputOutputPair<? extends Number,Boolean>> scoreClassPairs)
          Performs a U-test on the score-class pairs.
 MannWhitneyUConfidence.Statistic evaluateNullHypothesis(Collection<? extends Number> data1, Collection<? extends Number> data2)
          Computes the probability that two datasets were generated by the same distribution.
 
Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable
clone
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 

Constructor Detail

MannWhitneyUConfidence

public MannWhitneyUConfidence()
Creates a new instance of MannWhitneyUConfidence

Method Detail

evaluateNullHypothesis

public MannWhitneyUConfidence.Statistic evaluateNullHypothesis(Collection<? extends InputOutputPair<? extends Number,Boolean>> scoreClassPairs)
Performs a U-test on the score-class pairs. The first element in each pair is a score; the second is a flag that determines which group the score belongs to. For example, {(1.0, true), (0.9, false), ...} means that data1=1.0 and data2=0.9 and so forth. This is useful for testing whether a classifier's scores partition the data better than chance.

Parameters:
scoreClassPairs - Pairs of scores with the corresponding class "label" for the score
Returns:
Statistics from the Mann-Whitney U-test
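
A minimal sketch of how score-class pairs could be separated into two groups before a two-sample test is applied. It assumes that scores flagged true form the first group and scores flagged false form the second; ScoreLabel and ScoreClassSplitSketch are hypothetical names used only for illustration and are not part of this library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a score paired with a Boolean class flag;
// this is NOT the library's InputOutputPair type.
final class ScoreLabel {
    final double score;
    final boolean label;

    ScoreLabel(double score, boolean label) {
        this.score = score;
        this.label = label;
    }
}

class ScoreClassSplitSketch {

    // Returns two lists: index 0 holds the scores flagged true,
    // index 1 holds the scores flagged false (an assumed convention).
    static List<List<Double>> split(List<ScoreLabel> pairs) {
        List<Double> flaggedTrue = new ArrayList<>();
        List<Double> flaggedFalse = new ArrayList<>();
        for (ScoreLabel pair : pairs) {
            if (pair.label) {
                flaggedTrue.add(pair.score);
            } else {
                flaggedFalse.add(pair.score);
            }
        }
        return Arrays.asList(flaggedTrue, flaggedFalse);
    }

    public static void main(String[] args) {
        List<ScoreLabel> pairs = Arrays.asList(
            new ScoreLabel(1.0, true),
            new ScoreLabel(0.9, false),
            new ScoreLabel(0.7, true));
        List<List<Double>> groups = split(pairs);
        System.out.println("flagged true:  " + groups.get(0));
        System.out.println("flagged false: " + groups.get(1));
    }
}
```

The two resulting lists play the roles of data1 and data2 in the two-collection form of the test.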

evaluateNullHypothesis

public MannWhitneyUConfidence.Statistic evaluateNullHypothesis(Collection<? extends Number> data1,
                                                               Collection<? extends Number> data2)
Description copied from interface: NullHypothesisEvaluator
Computes the p-value for the null hypothesis that the two datasets were generated by the same distribution. A NullHypothesisProbability near 1 means the data give no evidence that the distributions differ, a NullHypothesisProbability near 0 means they are likely NOT the same, and a NullHypothesisProbability less than 0.05 is the conventional threshold for statistical significance. This is the "p-value" that social scientists like to use.

Specified by:
evaluateNullHypothesis in interface NullHypothesisEvaluator<Collection<? extends Number>>
Parameters:
data1 - First dataset to consider
data2 - Second dataset to consider
Returns:
The p-value for the null hypothesis that the two datasets were generated by the same source. A NullHypothesisProbability less than 0.05 is the conventional threshold at which social scientists conclude that the two distributions were generated by different sources.
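
For illustration, a self-contained sketch of a textbook Mann-Whitney U-test with a Gaussian (normal) approximation for the p-value. This is not this class's implementation: the class name MannWhitneySketch and its methods are hypothetical, tied values receive midranks, and the sketch assumes no tie correction of the variance and no continuity correction. The error function uses the Abramowitz and Stegun 7.1.26 approximation, so p-values are accurate to roughly 1e-7.

```java
import java.util.Arrays;

class MannWhitneySketch {

    // U statistic of sample x against sample y: rank sum of x in the pooled
    // data minus n1*(n1+1)/2. Tied values share the average (mid) rank.
    static double uStatistic(double[] x, double[] y) {
        int n1 = x.length;
        int n = n1 + y.length;
        // Each pooled entry is {value, group}; group 0 marks x, group 1 marks y.
        double[][] pooled = new double[n][];
        for (int a = 0; a < n1; a++) {
            pooled[a] = new double[] {x[a], 0};
        }
        for (int b = 0; b < y.length; b++) {
            pooled[n1 + b] = new double[] {y[b], 1};
        }
        Arrays.sort(pooled, (p, q) -> Double.compare(p[0], q[0]));

        double rankSumX = 0.0;
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && pooled[end + 1][0] == pooled[start][0]) {
                end++;
            }
            // Ranks are 1-based; a tied run start..end shares the mean rank.
            double midrank = (start + 1 + end + 1) / 2.0;
            for (int k = start; k <= end; k++) {
                if (pooled[k][1] == 0) {
                    rankSumX += midrank;
                }
            }
            start = end + 1;
        }
        return rankSumX - n1 * (n1 + 1) / 2.0;
    }

    // Two-sided p-value from the normal approximation to the distribution of U.
    static double twoSidedPValue(double[] x, double[] y) {
        double n1 = x.length;
        double n2 = y.length;
        double mean = n1 * n2 / 2.0;
        double sd = Math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0);
        double z = (uStatistic(x, y) - mean) / sd;
        return 2.0 * (1.0 - normalCdf(Math.abs(z)));
    }

    // Standard normal cumulative distribution function.
    static double normalCdf(double z) {
        return 0.5 * (1.0 + erf(z / Math.sqrt(2.0)));
    }

    // Abramowitz and Stegun formula 7.1.26 (absolute error below 1.5e-7).
    static double erf(double v) {
        double sign = v < 0 ? -1.0 : 1.0;
        double a = Math.abs(v);
        double t = 1.0 / (1.0 + 0.3275911 * a);
        double poly = t * (0.254829592 + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        return sign * (1.0 - poly * Math.exp(-a * a));
    }

    public static void main(String[] args) {
        double[] data1 = {1.0, 2.0, 3.0};
        double[] data2 = {4.0, 5.0, 6.0};
        System.out.println("U = " + uStatistic(data1, data2));
        System.out.println("p = " + twoSidedPValue(data1, data2));
    }
}
```

The normal approximation is usually considered reasonable only for moderately large samples; for very small groups such as those in main, an exact table of U would normally be preferred.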