gov.sandia.cognition.statistics.method
Class ReceiverOperatingCharacteristic.Statistic

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.statistics.method.AbstractConfidenceStatistic
          extended by gov.sandia.cognition.statistics.method.MannWhitneyUConfidence.Statistic
              extended by gov.sandia.cognition.statistics.method.ReceiverOperatingCharacteristic.Statistic
All Implemented Interfaces:
ConfidenceStatistic, CloneableSerializable, Serializable, Cloneable
Enclosing class:
ReceiverOperatingCharacteristic

public static class ReceiverOperatingCharacteristic.Statistic
extends MannWhitneyUConfidence.Statistic

Contains summary statistics derived from a ROC curve: the area under the curve, d', and the optimal threshold.

See Also:
Serialized Form

Field Summary
 
Fields inherited from class gov.sandia.cognition.statistics.method.AbstractConfidenceStatistic
nullHypothesisProbability
 
Constructor Summary
protected ReceiverOperatingCharacteristic.Statistic(ReceiverOperatingCharacteristic roc)
          Creates a new instance of Statistic
 
Method Summary
static double computeAreaUnderCurve(ReceiverOperatingCharacteristic roc)
          Computes the "pessimistic" area under the ROC curve using the top-left rectangle method for numerical integration.
static double computeAreaUnderCurveTopLeft(Collection<ReceiverOperatingCharacteristic.DataPoint> points)
          Computes the Area Under Curve for an x-axis sorted Collection of ROC points using the top-left rectangle method for numerical integration.
static double computeAreaUnderCurveTrapezoid(Collection<ReceiverOperatingCharacteristic.DataPoint> points)
          Computes the Area Under Curve for an x-axis sorted Collection of ROC points using the trapezoidal rule for numerical integration.
static double computeDPrime(ReceiverOperatingCharacteristic.DataPoint data)
          Computes the value of d-prime given a datapoint
static ReceiverOperatingCharacteristic.DataPoint computeOptimalThreshold(ReceiverOperatingCharacteristic roc)
          Determines the DataPoint, and associated threshold, that maximizes the value of Area=TruePositiveRate+TrueNegativeRate, usually the upper-left "knee" on the ROC curve.
static ReceiverOperatingCharacteristic.DataPoint computeOptimalThreshold(ReceiverOperatingCharacteristic roc, double truePositiveWeight, double trueNegativeWeight)
          Determines the DataPoint, and associated threshold, that maximizes the weighted combination of TruePositiveRate and TrueNegativeRate, using truePositiveWeight and trueNegativeWeight as the respective weights. This is usually the upper-left "knee" on the ROC curve.
 double getAreaUnderCurve()
          Getter for areaUnderCurve
 double getDPrime()
          Getter for dPrime
 ReceiverOperatingCharacteristic.DataPoint getOptimalThreshold()
          Getter for optimalThreshold
protected  void setAreaUnderCurve(double areaUnderCurve)
          Setter for areaUnderCurve
protected  void setDPrime(double dPrime)
          Setter for dPrime
protected  void setOptimalThreshold(ReceiverOperatingCharacteristic.DataPoint optimalThreshold)
          Setter for optimalThreshold
 
Methods inherited from class gov.sandia.cognition.statistics.method.MannWhitneyUConfidence.Statistic
computeNullHypothesisProbability, computeU, computeZ, getN1, getN2, getTestStatistic, getU, getZ, setN1, setN2, setU, setZ
 
Methods inherited from class gov.sandia.cognition.statistics.method.AbstractConfidenceStatistic
getNullHypothesisProbability, setNullHypothesisProbability, toString
 
Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable
clone
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 

Constructor Detail

ReceiverOperatingCharacteristic.Statistic

protected ReceiverOperatingCharacteristic.Statistic(ReceiverOperatingCharacteristic roc)
Creates a new instance of Statistic

Parameters:
roc - ROC Curve from which to pull statistics
Method Detail

computeAreaUnderCurve

public static double computeAreaUnderCurve(ReceiverOperatingCharacteristic roc)
Computes the "pessimistic" area under the ROC curve using the top-left rectangle method for numerical integration.

Parameters:
roc - ROC Curve to compute the area under
Returns:
Area underneath the ROC curve, on the interval [0,1]. A value of 0.5 means that the classifier is doing no better than chance; larger values are better.

computeAreaUnderCurveTopLeft

@PublicationReference(author="Wikipedia",
                      title="Rectangle method",
                      type=WebPage,
                      year=2011,
                      url="http://en.wikipedia.org/wiki/Rectangle_method")
public static double computeAreaUnderCurveTopLeft(Collection<ReceiverOperatingCharacteristic.DataPoint> points)
Computes the Area Under Curve for an x-axis sorted Collection of ROC points using the top-left rectangle method for numerical integration.

Parameters:
points - x-axis sorted collection of ROC points
Returns:
Area underneath the ROC curve, on the interval [0,1]. A value of 0.5 means that the classifier is doing no better than chance; larger values are better.
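
As an illustration of the top-left rectangle method described above, here is a minimal sketch. It is not the library's implementation: the class name TopLeftAucSketch, the method name areaTopLeft, and the use of plain arrays in place of DataPoint objects are all hypothetical. It also assumes that the supplied points already include the curve endpoints and are sorted by ascending false-positive rate.

```java
/** Illustrative sketch only; names and array-based input are assumptions. */
final class TopLeftAucSketch {

    /**
     * Left-endpoint rectangle rule over ROC points sorted by ascending
     * false-positive rate. fpr[i] and tpr[i] describe the i-th point.
     */
    static double areaTopLeft(double[] fpr, double[] tpr) {
        if (fpr.length != tpr.length) {
            throw new IllegalArgumentException("fpr and tpr must have equal length");
        }
        double area = 0.0;
        for (int i = 1; i < fpr.length; i++) {
            // Each rectangle spans [fpr[i-1], fpr[i]] and takes the height of
            // its left point. For a non-decreasing curve this is the lower of
            // the two heights, hence the "pessimistic" estimate.
            area += tpr[i - 1] * (fpr[i] - fpr[i - 1]);
        }
        return area;
    }

    public static void main(String[] args) {
        double[] fpr = {0.0, 0.5, 1.0};
        double[] tpr = {0.0, 1.0, 1.0};
        System.out.println(areaTopLeft(fpr, tpr)); // prints 0.5
    }
}
```

For these three points the rectangle between x=0 and x=0.5 has height 0, so only the second rectangle contributes and the estimate is 0.5.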

computeAreaUnderCurveTrapezoid

@PublicationReference(author="Wikipedia",
                      title="Trapezoidal rule",
                      type=WebPage,
                      year=2011,
                      url="http://en.wikipedia.org/wiki/Trapezoidal_rule")
public static double computeAreaUnderCurveTrapezoid(Collection<ReceiverOperatingCharacteristic.DataPoint> points)
Computes the Area Under Curve for an x-axis sorted Collection of ROC points using the trapezoidal rule for numerical integration.

Parameters:
points - x-axis sorted collection of ROC points
Returns:
Area underneath the ROC curve, on the interval [0,1]. A value of 0.5 means that the classifier is doing no better than chance; larger values are better.
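
The trapezoidal rule named in the publication reference for this method can be sketched as follows. This is a hedged illustration, not the library's code: the class name TrapezoidAucSketch, the method name areaTrapezoid, and the array-based input are hypothetical, and the sketch assumes the points already include the curve endpoints and are sorted by ascending false-positive rate.

```java
/** Illustrative sketch only; names and array-based input are assumptions. */
final class TrapezoidAucSketch {

    /**
     * Trapezoidal rule over ROC points sorted by ascending false-positive
     * rate. fpr[i] and tpr[i] describe the i-th point.
     */
    static double areaTrapezoid(double[] fpr, double[] tpr) {
        if (fpr.length != tpr.length) {
            throw new IllegalArgumentException("fpr and tpr must have equal length");
        }
        double area = 0.0;
        for (int i = 1; i < fpr.length; i++) {
            // Each trapezoid spans [fpr[i-1], fpr[i]] and averages the
            // heights of its two end points.
            double width = fpr[i] - fpr[i - 1];
            area += 0.5 * (tpr[i - 1] + tpr[i]) * width;
        }
        return area;
    }

    public static void main(String[] args) {
        double[] fpr = {0.0, 0.5, 1.0};
        double[] tpr = {0.0, 1.0, 1.0};
        System.out.println(areaTrapezoid(fpr, tpr)); // prints 0.75
    }
}
```

Because the trapezoid connects neighboring points with a straight line, it never yields less area than a left-endpoint rectangle on a non-decreasing curve.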

computeOptimalThreshold

public static ReceiverOperatingCharacteristic.DataPoint computeOptimalThreshold(ReceiverOperatingCharacteristic roc)
Determines the DataPoint, and associated threshold, that maximizes the value of Area=TruePositiveRate+TrueNegativeRate, usually the upper-left "knee" on the ROC curve.

Parameters:
roc - ROC Curve to consider
Returns:
DataPoint, with corresponding threshold, that maximizes the value of Area=TruePositiveRate+TrueNegativeRate (where TrueNegativeRate=1-FalsePositiveRate), usually the upper-left "knee" on the ROC curve.

computeOptimalThreshold

public static ReceiverOperatingCharacteristic.DataPoint computeOptimalThreshold(ReceiverOperatingCharacteristic roc,
                                                                                double truePositiveWeight,
                                                                                double trueNegativeWeight)
Determines the DataPoint, and associated threshold, that maximizes the weighted combination of TruePositiveRate and TrueNegativeRate, using truePositiveWeight and trueNegativeWeight as the respective weights. This is usually the upper-left "knee" on the ROC curve.

Parameters:
roc - ROC Curve to consider
truePositiveWeight - Amount to weight the TruePositiveRate
trueNegativeWeight - Amount to weight the TrueNegativeRate
Returns:
DataPoint, with corresponding threshold, that maximizes the weighted combination of TruePositiveRate and TrueNegativeRate, usually the upper-left "knee" on the ROC curve.
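
One way to picture the weighted selection is the sketch below. It assumes the weights enter as a weighted sum, truePositiveWeight*TruePositiveRate + trueNegativeWeight*TrueNegativeRate, which this page does not state explicitly. The class name OptimalThresholdSketch, the method name bestIndex, and the array-based input are hypothetical stand-ins for the library's DataPoint handling.

```java
/** Illustrative sketch only; the weighted-sum objective is an assumption. */
final class OptimalThresholdSketch {

    /**
     * Returns the index of the ROC point that maximizes
     * wTP * TPR + wTN * TNR, where TNR = 1 - FPR.
     * Returns -1 if no points are supplied.
     */
    static int bestIndex(double[] fpr, double[] tpr, double wTP, double wTN) {
        if (fpr.length != tpr.length) {
            throw new IllegalArgumentException("fpr and tpr must have equal length");
        }
        int best = -1;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < fpr.length; i++) {
            double trueNegativeRate = 1.0 - fpr[i];
            double value = wTP * tpr[i] + wTN * trueNegativeRate;
            if (value > bestValue) {
                bestValue = value;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] fpr = {0.0, 0.1, 0.4, 1.0};
        double[] tpr = {0.0, 0.7, 0.9, 1.0};
        System.out.println(bestIndex(fpr, tpr, 1.0, 1.0)); // prints 1
        System.out.println(bestIndex(fpr, tpr, 3.0, 1.0)); // prints 2
    }
}
```

With equal weights the point (0.1, 0.7) wins; weighting true positives three times as heavily moves the choice to (0.4, 0.9), trading more false positives for a higher hit rate.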

computeDPrime

public static double computeDPrime(ReceiverOperatingCharacteristic.DataPoint data)
Computes the value of d-prime given a datapoint

Parameters:
data - Datapoint from which to estimate d'
Returns:
Estimated distance between the two classes to be split. Larger values of d' indicate that the classes are easier to split, d'=0 means that the classes overlap completely (chance performance), and negative values mean that the classifier is doing worse than chance. d' comes from signal detection theory and is used mostly in psychology.
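
In signal detection theory, d' is conventionally defined as z(hit rate) - z(false-alarm rate), where z is the inverse of the standard normal CDF. The sketch below follows that textbook definition; whether this method evaluates d' in exactly this way is an assumption. The class name DPrimeSketch and its methods are hypothetical, and the inverse CDF is obtained by bisection on an approximate CDF (Abramowitz and Stegun formula 7.1.26 for erf), which is accurate to roughly six decimal places.

```java
/** Illustrative sketch only; follows the textbook definition of d'. */
final class DPrimeSketch {

    /** Standard normal CDF using the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double normalCdf(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-x * x);
        return z >= 0.0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Inverse standard normal CDF by bisection; p must lie strictly in (0, 1). */
    static double inverseNormalCdf(double p) {
        if (!(p > 0.0 && p < 1.0)) {
            throw new IllegalArgumentException("p must be strictly between 0 and 1");
        }
        double lo = -10.0;
        double hi = 10.0;
        for (int i = 0; i < 100; i++) {
            double mid = 0.5 * (lo + hi);
            if (normalCdf(mid) < p) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    /** d' = z(true-positive rate) - z(false-positive rate). */
    static double dPrime(double truePositiveRate, double falsePositiveRate) {
        return inverseNormalCdf(truePositiveRate) - inverseNormalCdf(falsePositiveRate);
    }

    public static void main(String[] args) {
        // Rates one standard deviation either side of the midpoint give d' close to 2.
        System.out.println(dPrime(0.8413447, 0.1586553));
    }
}
```

Note that rates of exactly 0 or 1 have no finite z-score, so this sketch rejects them rather than guessing at a correction.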

getDPrime

public double getDPrime()
Getter for dPrime

Returns:
Estimated distance between the two classes to be split. Larger values of d' indicate that the classes are easier to split, d'=0 means that the classes overlap completely (chance performance), and negative values mean that the classifier is doing worse than chance. d' comes from signal detection theory and is used mostly in psychology.

setDPrime

protected void setDPrime(double dPrime)
Setter for dPrime

Parameters:
dPrime - Estimated distance between the two classes to be split. Larger values of d' indicate that the classes are easier to split, d'=0 means that the classes overlap completely (chance performance), and negative values mean that the classifier is doing worse than chance. d' comes from signal detection theory and is used mostly in psychology.

getAreaUnderCurve

public double getAreaUnderCurve()
Getter for areaUnderCurve

Returns:
Area underneath the ROC curve, on the interval [0,1]. A value of 0.5 means that the classifier is doing no better than chance; larger values are better.

setAreaUnderCurve

protected void setAreaUnderCurve(double areaUnderCurve)
Setter for areaUnderCurve

Parameters:
areaUnderCurve - Area underneath the ROC curve, on the interval [0,1]. A value of 0.5 means that the classifier is doing no better than chance; larger values are better.

getOptimalThreshold

public ReceiverOperatingCharacteristic.DataPoint getOptimalThreshold()
Getter for optimalThreshold

Returns:
DataPoint, with corresponding threshold, that maximizes the value of Area=TruePositiveRate+TrueNegativeRate (where TrueNegativeRate=1-FalsePositiveRate), usually the upper-left "knee" on the ROC curve.

setOptimalThreshold

protected void setOptimalThreshold(ReceiverOperatingCharacteristic.DataPoint optimalThreshold)
Setter for optimalThreshold

Parameters:
optimalThreshold - DataPoint, with corresponding threshold, that maximizes the value of Area=TruePositiveRate+TrueNegativeRate (where TrueNegativeRate=1-FalsePositiveRate), usually the upper-left "knee" on the ROC curve.