gov.sandia.cognition.statistics.method
Class GaussianConfidence

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.statistics.method.GaussianConfidence
All Implemented Interfaces:
ConfidenceIntervalEvaluator<Collection<? extends Number>>, NullHypothesisEvaluator<Collection<? extends Number>>, CloneableSerializable, Serializable, Cloneable

@ConfidenceTestAssumptions(name="Gaussian Z-test",
                           alsoKnownAs="Z-test",
                           description="Determines if two populations have the same mean, if the populations are Gaussian and relatively large, at least 30 or so.",
                           assumptions={"The two groups are sampled independently of each other.","The two groups are sampled from a Gaussian distribution, or the underlying distributions are non-Gaussian but obey the weak law of large numbers.","The variances of the two groups are equal."},
                           nullHypothesis="The means of the groups are equal.",
                           dataPaired=false,
                           dataSameSize=false,
                           distribution=UnivariateGaussian.CDF.class,
                           reference=@PublicationReference(author="Wikipedia",title="Z-test",type=WebPage,year=2009,url="http://en.wikipedia.org/wiki/Z-test"))
public class GaussianConfidence
extends AbstractCloneableSerializable
implements NullHypothesisEvaluator<Collection<? extends Number>>, ConfidenceIntervalEvaluator<Collection<? extends Number>>

This test is sometimes called the "Z-test". It defines a range of values that the statistic can take, as well as the confidence that the statistic lies between the lower and upper bounds. The test is useful when the tested data were generated by a (univariate) Gaussian distribution.

Since:
2.0
Author:
Kevin R. Dixon
See Also:
Serialized Form

Nested Class Summary
static class GaussianConfidence.Statistic
          Confidence statistics for a Gaussian distribution
 
Field Summary
static GaussianConfidence INSTANCE
          This class has no state, so a shared static instance is provided.
 
Constructor Summary
GaussianConfidence()
          Creates a new instance of GaussianConfidence
 
Method Summary
 ConfidenceInterval computeConfidenceInterval(Collection<? extends Number> data, double confidence)
          Computes a confidence interval for a given dataset and confidence level
 ConfidenceInterval computeConfidenceInterval(double mean, double variance, int numSamples, double confidence)
          Computes the confidence interval given the mean and variance of the samples, the number of samples, and the desired confidence level
static ConfidenceInterval computeConfidenceInterval(UnivariateDistribution<?> dataDistribution, int numSamples, double confidence)
          Computes the Gaussian confidence interval given a distribution of data, the number of samples, and the desired confidence level
static GaussianConfidence.Statistic evaluateNullHypothesis(Collection<? extends Double> data1, double data2)
          Computes the probability that the input was drawn from the estimated UnivariateGaussian distribution.
 GaussianConfidence.Statistic evaluateNullHypothesis(Collection<? extends Number> data1, Collection<? extends Number> data2)
          Computes the probability that two datasets were generated by the same distribution.
 
Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable
clone
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 

Field Detail

INSTANCE

public static final GaussianConfidence INSTANCE
This class has no state, so a shared static instance is provided.

Constructor Detail

GaussianConfidence

public GaussianConfidence()
Creates a new instance of GaussianConfidence

Method Detail

evaluateNullHypothesis

public GaussianConfidence.Statistic evaluateNullHypothesis(Collection<? extends Number> data1,
                                                           Collection<? extends Number> data2)
Description copied from interface: NullHypothesisEvaluator
Computes the probability that two datasets were generated by the same distribution. A NullHypothesisProbability of 1 means that the distributions are likely the same, a value of 0 means that they are likely NOT the same, and a value below 0.05 is the conventional threshold for statistical significance. This is the "p-value" that social scientists like to use.

Specified by:
evaluateNullHypothesis in interface NullHypothesisEvaluator<Collection<? extends Number>>
Parameters:
data1 - First dataset to consider
data2 - Second dataset to consider
Returns:
Probability that the two datasets were generated by the same source. A NullHypothesisProbability below 0.05 is the standard point at which social scientists say two distributions were generated by different sources.
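
The textbook two-sample Z-test that this description corresponds to can be sketched in plain Java as follows. This is a minimal illustration, not the library's implementation: the class TwoSampleZTestSketch and all of its methods are hypothetical names, the unbiased sample variance and the unpooled standard error are assumptions, and the normal CDF uses the Abramowitz-Stegun erf approximation (absolute error of roughly 1.5e-7).

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch of a two-sample Z-test; not part of the library.
final class TwoSampleZTestSketch {

    /** Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double standardNormalCdf(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-x * x);
        return z >= 0.0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Arithmetic mean of the data. */
    static double mean(Collection<? extends Number> data) {
        double sum = 0.0;
        for (Number v : data) {
            sum += v.doubleValue();
        }
        return sum / data.size();
    }

    /** Unbiased sample variance (divides by n - 1); an assumption of this sketch. */
    static double variance(Collection<? extends Number> data, double mean) {
        double sumSquares = 0.0;
        for (Number v : data) {
            double d = v.doubleValue() - mean;
            sumSquares += d * d;
        }
        return sumSquares / (data.size() - 1);
    }

    /** z = (mean1 - mean2) / sqrt(var1/n1 + var2/n2). */
    static double zStatistic(Collection<? extends Number> data1,
                             Collection<? extends Number> data2) {
        double m1 = mean(data1);
        double m2 = mean(data2);
        double se = Math.sqrt(variance(data1, m1) / data1.size()
                + variance(data2, m2) / data2.size());
        return (m1 - m2) / se;
    }

    /** Two-sided p-value: probability of a |Z| at least as large as |z|. */
    static double twoSidedPValue(double z) {
        return 2.0 * (1.0 - standardNormalCdf(Math.abs(z)));
    }

    public static void main(String[] args) {
        Collection<Double> a = Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0);
        Collection<Double> b = Arrays.asList(2.0, 3.0, 4.0, 5.0, 6.0);
        double z = zStatistic(a, b);
        System.out.println("z = " + z + ", p = " + twoSidedPValue(z));
    }
}
```

With only five values per group the example is far below the "at least 30 or so" sample size that the class annotation recommends; it is kept small purely for readability.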

evaluateNullHypothesis

public static GaussianConfidence.Statistic evaluateNullHypothesis(Collection<? extends Double> data1,
                                                                  double data2)
Computes the probability that the input was drawn from the estimated UnivariateGaussian distribution. That is, what is the probability that the UnivariateGaussian could produce a MORE UNLIKELY sample than the given "input"? For example, the probability of drawing a sample more unlikely than the mean is 1.0, and the probability of drawing a sample more unlikely than infinity is 0.0.

Parameters:
data1 - Dataset to consider
data2 - Sample to test; the result is the probability that the estimated UnivariateGaussian would produce a sample more unlikely than "data2"
Returns:
Probability that the input was drawn from the estimated UnivariateGaussian distribution, that is, the probability that the UnivariateGaussian could produce a MORE UNLIKELY sample than the given input
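
The "more unlikely sample" idea can be illustrated with a two-sided tail probability under a Gaussian fitted to the data. The sketch below is a hedged reading of the description above, not the library's code: the class OneSampleTailSketch and its methods are hypothetical, the Gaussian is assumed to be fitted with the sample mean and the unbiased sample variance, and the normal CDF is the Abramowitz-Stegun erf approximation.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch of the one-sample tail probability; not part of the library.
final class OneSampleTailSketch {

    /** Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double standardNormalCdf(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-x * x);
        return z >= 0.0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /**
     * Probability that a Gaussian fitted to the data yields a sample farther
     * from its mean than the given value (two-sided tail).
     */
    static double tailProbability(Collection<? extends Number> data, double value) {
        int n = data.size();
        double sum = 0.0;
        for (Number v : data) {
            sum += v.doubleValue();
        }
        double mean = sum / n;

        double sumSquares = 0.0;
        for (Number v : data) {
            double d = v.doubleValue() - mean;
            sumSquares += d * d;
        }
        // Unbiased variance estimate; an assumption of this sketch.
        double standardDeviation = Math.sqrt(sumSquares / (n - 1));

        double z = Math.abs(value - mean) / standardDeviation;
        return 2.0 * (1.0 - standardNormalCdf(z));
    }

    public static void main(String[] args) {
        Collection<Double> data = Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0);
        System.out.println("at the mean:  " + tailProbability(data, 3.0));
        System.out.println("at infinity:  " + tailProbability(data, Double.POSITIVE_INFINITY));
    }
}
```

The two calls in main mirror the example in the description: a value equal to the mean gives a probability of essentially 1.0, and an infinite value gives 0.0.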

computeConfidenceInterval

public ConfidenceInterval computeConfidenceInterval(Collection<? extends Number> data,
                                                    double confidence)
Description copied from interface: ConfidenceIntervalEvaluator
Computes a confidence interval for a given dataset and confidence level

Specified by:
computeConfidenceInterval in interface ConfidenceIntervalEvaluator<Collection<? extends Number>>
Parameters:
data - Dataset to use to compute the ConfidenceInterval
confidence - Confidence level (1 minus the significance level) for the ConfidenceInterval, must be on the interval (0,1]
Returns:
ConfidenceInterval describing the range of values that contain the estimate for the given confidence level
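
A Gaussian confidence interval for the mean of a dataset can be sketched in self-contained Java as below. Everything here is an assumption made for illustration: the class GaussianIntervalSketch and its methods are hypothetical, the interval is taken to be mean plus or minus a normal quantile times the standard error, the variance is the unbiased estimate, and the sketch restricts confidence to the open interval (0,1) because a confidence of exactly 1 would give an unbounded interval.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch of a Gaussian confidence interval; not part of the library.
final class GaussianIntervalSketch {

    /** Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double standardNormalCdf(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-x * x);
        return z >= 0.0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Inverts the CDF by bisection; simple rather than fast. */
    static double standardNormalQuantile(double p) {
        double lo = -10.0;
        double hi = 10.0;
        for (int i = 0; i < 100; i++) {
            double mid = 0.5 * (lo + hi);
            if (standardNormalCdf(mid) < p) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    /** Returns {lowerBound, centralValue, upperBound} for the mean of the data. */
    static double[] interval(Collection<? extends Number> data, double confidence) {
        int n = data.size();
        if (n < 2 || !(confidence > 0.0 && confidence < 1.0)) {
            throw new IllegalArgumentException(
                "need at least 2 samples and 0 < confidence < 1");
        }
        double sum = 0.0;
        for (Number v : data) {
            sum += v.doubleValue();
        }
        double mean = sum / n;

        double sumSquares = 0.0;
        for (Number v : data) {
            double d = v.doubleValue() - mean;
            sumSquares += d * d;
        }
        double variance = sumSquares / (n - 1);

        // Standard error of the mean and the two-sided normal quantile.
        double standardError = Math.sqrt(variance / n);
        double z = standardNormalQuantile(0.5 * (1.0 + confidence));
        return new double[] {mean - z * standardError, mean, mean + z * standardError};
    }

    public static void main(String[] args) {
        double[] ci = interval(Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0), 0.95);
        System.out.printf("[%.4f, %.4f] around %.4f%n", ci[0], ci[2], ci[1]);
    }
}
```

Bisection is used for the quantile only to keep the sketch dependency-free; a statistics library with an inverse normal CDF would be the usual choice in practice.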

computeConfidenceInterval

public static ConfidenceInterval computeConfidenceInterval(UnivariateDistribution<?> dataDistribution,
                                                           int numSamples,
                                                           double confidence)
Computes the Gaussian confidence interval given a distribution of data, the number of samples, and the desired confidence level

Parameters:
dataDistribution - UnivariateGaussian describing the distribution of the underlying data
numSamples - Number of samples in the underlying data
confidence - Confidence value to assume for the ConfidenceInterval
Returns:
ConfidenceInterval capturing the range of the mean of the data at the desired level of confidence

computeConfidenceInterval

@PublicationReference(author="Wikipedia",
                      title="Standard error (statistics)",
                      type=WebPage,
                      year=2009,
                      url="http://en.wikipedia.org/wiki/Standard_error_(statistics)")
public ConfidenceInterval computeConfidenceInterval(double mean,
                                                    double variance,
                                                    int numSamples,
                                                    double confidence)
Description copied from interface: ConfidenceIntervalEvaluator
Computes the confidence interval given the mean and variance of the samples, the number of samples, and the desired confidence level

Specified by:
computeConfidenceInterval in interface ConfidenceIntervalEvaluator<Collection<? extends Number>>
Parameters:
mean - Mean of the distribution.
variance - Variance of the distribution.
numSamples - Number of samples in the underlying data
confidence - Confidence value to assume for the ConfidenceInterval
Returns:
ConfidenceInterval capturing the range of the mean of the data at the desired level of confidence
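
As a worked illustration of the standard-error reference attached to this method, and assuming the usual symmetric Gaussian interval (this page does not spell out the formula), let \bar{x} be the mean, s^2 the variance, n the number of samples, and c the confidence:

```latex
\[
\mathrm{SE} = \sqrt{\frac{s^2}{n}}, \qquad
\bar{x} \pm z_{(1+c)/2}\,\mathrm{SE}
\]
% Example: \bar{x} = 10, s^2 = 4, n = 100, c = 0.95
\[
\mathrm{SE} = \sqrt{4/100} = 0.2, \qquad
z_{0.975} \approx 1.96, \qquad
10 \pm 1.96 \times 0.2 = [9.608,\ 10.392]
\]
```

Here z_{(1+c)/2} denotes the standard normal quantile at probability (1+c)/2, so a 95% confidence level uses the 97.5% quantile.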