gov.sandia.cognition.statistics.method Class BernoulliConfidence

```
java.lang.Object
  gov.sandia.cognition.util.AbstractCloneableSerializable
    gov.sandia.cognition.statistics.method.BernoulliConfidence
```
All Implemented Interfaces:
ConfidenceIntervalEvaluator<Collection<Boolean>>, CloneableSerializable, Serializable, Cloneable

```
public class BernoulliConfidence
extends AbstractCloneableSerializable
implements ConfidenceIntervalEvaluator<Collection<Boolean>>
```

Computes the Bernoulli confidence interval. In other words, computes a confidence interval on the Bernoulli parameter based on the given data and the desired level of confidence. This answers the question, "What is the true range of classification rates given a collection of correct/incorrect guesses at a given level of confidence?" For example, if my classifier gets { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }, then the true classification rate of my classifier at 50% confidence satisfies Pr{ 0.5335 <= p <= 0.9665 } >= 0.5.
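The documented example values can be reproduced with a small standalone sketch. This page does not spell out the formula, so the sketch assumes a Chebyshev-style bound (consistent with the class's margin-of-error reference and with the documented numbers); the class and method names below are hypothetical stand-ins, not the library's actual implementation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class BernoulliIntervalSketch {

    // Hypothetical re-implementation for illustration; the library's actual
    // computeConfidenceInterval(Collection<Boolean>, double) may differ.
    static double[] interval(Collection<Boolean> data, double confidence) {
        int n = data.size();
        long correct = data.stream().filter(b -> b).count();
        double p = (double) correct / n;                  // sample success rate
        double k = Math.sqrt(1.0 / (1.0 - confidence));  // Chebyshev multiplier
        double halfWidth = k * Math.sqrt(p * (1.0 - p) / n);
        return new double[] {
            Math.max(0.0, p - halfWidth),   // clamp to the valid range [0,1]
            Math.min(1.0, p + halfWidth)
        };
    }

    public static void main(String[] args) {
        // { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }
        List<Boolean> guesses = Arrays.asList(
            true, false, true, true, true, false, true, true);
        double[] ci = interval(guesses, 0.5);
        System.out.printf("Pr{ %.4f <= p <= %.4f } >= 0.5%n", ci[0], ci[1]);
        // matches the documented example: Pr{ 0.5335 <= p <= 0.9665 } >= 0.5
    }
}
```

With 6 of 8 correct, p = 0.75 and the 50%-confidence half-width is sqrt(2) * sqrt(0.75 * 0.25 / 8) ≈ 0.2165, which yields the [0.5335, 0.9665] interval quoted above.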

Since:
2.0
Author:
Kevin R. Dixon
Serialized Form

Field Summary
`static BernoulliConfidence` `INSTANCE`
This class has no members, so here's a static instance.

Constructor Summary
`BernoulliConfidence()`
Creates a new instance of BernoulliConfidence

Method Summary
` ConfidenceInterval` `computeConfidenceInterval(Collection<Boolean> data, double confidence)`
Computes the ConfidenceInterval for the Bernoulli parameter based on the given data and the desired level of confidence.
` ConfidenceInterval` `computeConfidenceInterval(double mean, double variance, int numSamples, double confidence)`
Computes the confidence interval given the mean and variance of the samples, the number of samples, and the desired level of confidence.
`static ConfidenceInterval` `computeConfidenceInterval(double bernoulliParameter, int numSamples, double confidence)`
Computes the ConfidenceInterval for the Bernoulli parameter based on the given estimated parameter, number of samples, and the desired level of confidence.
`static int` `computeSampleSize(double accuracy, double confidence)`
Computes the number of samples needed to estimate the Bernoulli parameter "p" (mean) within "accuracy" with probability at least "confidence".

Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable
`clone`

Methods inherited from class java.lang.Object
`equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

INSTANCE

`public static final BernoulliConfidence INSTANCE`
This class has no members, so here's a static instance.

Constructor Detail

BernoulliConfidence

`public BernoulliConfidence()`
Creates a new instance of BernoulliConfidence

Method Detail

computeConfidenceInterval

```
public ConfidenceInterval computeConfidenceInterval(Collection<Boolean> data,
                                                    double confidence)
```
Computes the ConfidenceInterval for the Bernoulli parameter based on the given data and the desired level of confidence. This answers the question, "What is the true range of classification rates given a collection of correct/incorrect guesses at a given level of confidence?" For example, if my classifier gets { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }, then the true classification rate of my classifier at 50% confidence satisfies Pr{ 0.5335 <= p <= 0.9665 } >= 0.5.

Specified by:
`computeConfidenceInterval` in interface `ConfidenceIntervalEvaluator<Collection<Boolean>>`
Parameters:
`data` - Correct/Wrong data
`confidence` - Confidence level to place on the confidence interval, must be (0,1]
Returns:
Range of values for the accuracy of the classifier at the desired confidence

computeConfidenceInterval

```
@PublicationReference(author="Wikipedia",
                      title="",
                      type=WebPage,
                      year=2009,
                      url="http://en.wikipedia.org/wiki/Margin_of_error")
public static ConfidenceInterval computeConfidenceInterval(double bernoulliParameter,
                                                           int numSamples,
                                                           double confidence)
```
Computes the ConfidenceInterval for the Bernoulli parameter based on the given estimated parameter, number of samples, and the desired level of confidence. This answers the question, "What is the true range of classification rates given a collection of correct/incorrect guesses at a given level of confidence?" For example, if my classifier gets { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }, then the true classification rate of my classifier at 50% confidence satisfies Pr{ 0.5335 <= p <= 0.9665 } >= 0.5.

Parameters:
`bernoulliParameter` - Estimated Bernoulli parameter, classifier success rate, must be [0,1]
`numSamples` - Number of samples used in the determination
`confidence` - Confidence level to place on the confidence interval, must be (0,1]
Returns:
Range of values for the accuracy of the classifier at the desired confidence

computeConfidenceInterval

```
public ConfidenceInterval computeConfidenceInterval(double mean,
                                                    double variance,
                                                    int numSamples,
                                                    double confidence)
```
Description copied from interface: `ConfidenceIntervalEvaluator`
Computes the confidence interval given the mean and variance of the samples, the number of samples, and the desired level of confidence.

Specified by:
`computeConfidenceInterval` in interface `ConfidenceIntervalEvaluator<Collection<Boolean>>`
Parameters:
`mean` - Mean of the distribution.
`variance` - Variance of the distribution.
`numSamples` - Number of samples in the underlying data
`confidence` - Confidence value to assume for the ConfidenceInterval
Returns:
ConfidenceInterval capturing the range of the mean of the data at the desired level of confidence

computeSampleSize

```
@PublicationReference(author="Wikipedia",
                      title="",
                      type=WebPage,
                      year=2009,
                      url="http://en.wikipedia.org/wiki/Margin_of_error")
public static int computeSampleSize(double accuracy,
                                    double confidence)
```
Computes the number of samples needed to estimate the Bernoulli parameter "p" (mean) within "accuracy" with probability at least "confidence". This answers the question, "How many people do I need to survey to estimate, within a desired accuracy and at a set confidence, how many people would vote for Budweiser as the King of Beers?" For example, to estimate the parameter within 0.01 at confidence=0.95, we need up to 50000 samples.

Parameters:
`accuracy` - Desired accuracy to estimate, on the interval (0,1]
`confidence` - Desired confidence, on the interval (0,1]
Returns:
Maximum number of samples needed to achieve the accuracy with the level of confidence
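The documented "accuracy 0.01 at confidence 0.95 needs up to 50000 samples" figure can be reproduced under the same assumed Chebyshev-style bound (again an assumption, since this page does not state the formula; the class name below is a hypothetical stand-in):

```java
public class BernoulliSampleSizeSketch {

    // Hypothetical reconstruction of computeSampleSize for illustration:
    // the worst-case Bernoulli variance p(1-p) <= 1/4 combined with
    // Chebyshev's inequality gives n >= 1 / (4 * (1 - confidence) * accuracy^2).
    static int sampleSize(double accuracy, double confidence) {
        return (int) Math.ceil(
            0.25 / ((1.0 - confidence) * accuracy * accuracy));
    }

    public static void main(String[] args) {
        // reproduces the documented example: accuracy 0.01, confidence 0.95
        System.out.println(sampleSize(0.01, 0.95));  // prints 50000
    }
}
```

Because the bound uses the worst-case variance at p = 0.5, this is a conservative "maximum number of samples needed", matching the wording of the return value above.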