gov.sandia.cognition.learning.algorithm.regression
Class LinearRegression

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.learning.algorithm.regression.LinearRegression
All Implemented Interfaces:
BatchLearner<Collection<? extends InputOutputPair<? extends Vectorizable,Double>>,LinearDiscriminantWithBias>, SupervisedBatchLearner<Vectorizable,Double,LinearDiscriminantWithBias>, CloneableSerializable, Serializable, Cloneable

@CodeReview(reviewer="Kevin R. Dixon",
            date="2008-09-02",
            changesNeeded=false,
            comments={"Made minor changes to javadoc","Looks fine."})
@PublicationReferences(references={
    @PublicationReference(author="Wikipedia",
                          title="Linear regression",
                          type=WebPage,
                          year=2008,
                          url="http://en.wikipedia.org/wiki/Linear_regression"),
    @PublicationReference(author="Wikipedia",
                          title="Tikhonov regularization",
                          type=WebPage,
                          year=2011,
                          url="http://en.wikipedia.org/wiki/Tikhonov_regularization",
                          notes="Despite what Wikipedia says, this is always called Ridge Regression")})
public class LinearRegression
extends AbstractCloneableSerializable
implements SupervisedBatchLearner<Vectorizable,Double,LinearDiscriminantWithBias>

Computes the least-squares regression for a LinearCombinationFunction given a dataset. A LinearCombinationFunction is a weighted linear combination of (potentially) nonlinear basis functions. The learned function has the form y(x) = b + w'x, where "b" is a scalar bias and "w" is a weight vector. The nested class LinearRegression.Statistic returns the goodness-of-fit statistics for a set of target-estimate pairs, including a p-value for the null-hypothesis significance test.
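
For orientation, the fitting problem described above can be written as a single objective. This is a sketch under stated assumptions, not a statement of the exact implementation: the symbols c_i (per-pair weights, equal to 1 for unweighted pairs) and \lambda (the regularization term) are notation introduced here, and leaving the bias b out of the penalty is an assumption that this page does not confirm.

```latex
\min_{b,\,\mathbf{w}} \;
  \sum_{i=1}^{n} c_i \left( y_i - b - \mathbf{w}^{\top}\mathbf{x}_i \right)^{2}
  \;+\; \lambda \,\lVert \mathbf{w} \rVert_2^{2},
\qquad \lambda \ge 0
```

With \lambda = 0 this reduces to ordinary (weighted) least squares, which matches the documented meaning of a zero regularization value.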

Since:
2.0
Author:
Kevin R. Dixon
See Also:
Serialized Form

Nested Class Summary
static class LinearRegression.Statistic
          Computes regression statistics using a chi-square measure of the statistical significance of the learned approximator
 
Field Summary
static double DEFAULT_PSEUDO_INVERSE_TOLERANCE
          Tolerance for the pseudo inverse in the learn method, 1.0E-10.
static double DEFAULT_REGULARIZATION
          Default regularization, 0.0.
 
Constructor Summary
LinearRegression()
          Creates a new instance of LinearRegression
LinearRegression(double regularization, boolean usePseudoInverse)
          Creates a new instance of LinearRegression
 
Method Summary
 LinearRegression clone()
          This makes public the clone method on the Object class and removes the exception that it throws.
 double getRegularization()
          Getter for regularization
 boolean getUsePseudoInverse()
          Getter for usePseudoInverse
 LinearDiscriminantWithBias learn(Collection<? extends InputOutputPair<? extends Vectorizable,Double>> data)
          Computes the linear regression for the given Collection of InputOutputPairs.
 void setRegularization(double regularization)
          Setter for regularization
 void setUsePseudoInverse(boolean usePseudoInverse)
          Setter for usePseudoInverse
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_PSEUDO_INVERSE_TOLERANCE

public static final double DEFAULT_PSEUDO_INVERSE_TOLERANCE
Tolerance for the pseudo inverse in the learn method, 1.0E-10.

See Also:
Constant Field Values

DEFAULT_REGULARIZATION

public static final double DEFAULT_REGULARIZATION
Default regularization, 0.0.

See Also:
Constant Field Values
Constructor Detail

LinearRegression

public LinearRegression()
Creates a new instance of LinearRegression


LinearRegression

public LinearRegression(double regularization,
                        boolean usePseudoInverse)
Creates a new instance of LinearRegression

Parameters:
regularization - L2 ridge regularization term. Must be nonnegative; a value of zero is equivalent to unregularized regression.
usePseudoInverse - Flag to use a pseudoinverse. True uses the expensive but more accurate pseudoinverse routine; false uses a very fast but numerically less stable LU solver. Default value is "true".
Method Detail

clone

public LinearRegression clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to automatically create a clone of the exact type of object that the clone is called on and to copy all primitives but to keep all references, which means it is a shallow copy. Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, a copy that targets the most common use case for copying the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method. Note: The contract of this method is that you must use super.clone() as the basis for your implementation.

Specified by:
clone in interface CloneableSerializable
Overrides:
clone in class AbstractCloneableSerializable
Returns:
A clone of this object.
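
The clone contract described above (start from super.clone(), then deepen only the fields that need it) can be illustrated with plain java.lang.Cloneable. This is a minimal sketch: the class CloneSketch and its fields are hypothetical names invented for the example, and it does not use AbstractCloneableSerializable or ObjectUtil.

```java
import java.util.Arrays;

/** Hypothetical class for illustration only; not part of the library. */
public class CloneSketch implements Cloneable {

    /** Primitive field: the shallow copy made by super.clone() is enough. */
    private double regularization;

    /** Reference field: must be copied explicitly to avoid sharing the array. */
    private double[] cachedWeights;

    public CloneSketch(double regularization, double[] cachedWeights) {
        this.regularization = regularization;
        this.cachedWeights = cachedWeights;
    }

    @Override
    public CloneSketch clone() {
        try {
            // Always start from super.clone(): it creates an object of the
            // exact runtime type and copies every field shallowly.
            CloneSketch copy = (CloneSketch) super.clone();
            // Deepen only the fields that must not be shared.
            copy.cachedWeights =
                (this.cachedWeights == null) ? null : this.cachedWeights.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Unreachable because this class implements Cloneable.
            throw new AssertionError("Cloneable class failed to clone", e);
        }
    }

    public double getRegularization() {
        return this.regularization;
    }

    public double[] getCachedWeights() {
        return this.cachedWeights;
    }

    public static void main(String[] args) {
        CloneSketch original = new CloneSketch(0.5, new double[] {1.0, 2.0});
        CloneSketch copy = original.clone();
        // The arrays hold equal values but are distinct objects.
        System.out.println(Arrays.equals(original.getCachedWeights(), copy.getCachedWeights()));
        System.out.println(original.getCachedWeights() != copy.getCachedWeights());
    }
}
```

Catching CloneNotSupportedException inside clone() is what lets the public method drop the checked exception, which mirrors the behavior this description attributes to the base class.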

learn

public LinearDiscriminantWithBias learn(Collection<? extends InputOutputPair<? extends Vectorizable,Double>> data)
Computes the linear regression for the given Collection of InputOutputPairs. The input of each pair is the independent variable, and the output of each pair is the dependent variable (the variable to predict). The pairs can have an associated weight to bias the regression equation.

Specified by:
learn in interface BatchLearner<Collection<? extends InputOutputPair<? extends Vectorizable,Double>>,LinearDiscriminantWithBias>
Parameters:
data - Collection of InputOutputPairs for the variables. Can be WeightedInputOutputPairs.
Returns:
LinearDiscriminantWithBias that minimizes the RMS error of the outputs.
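
To make the computation concrete, here is a self-contained sketch of weighted, ridge-regularized least squares in plain Java. It is not the library's implementation: the class RidgeRegressionSketch, its method names, the array-based inputs, the choice to leave the bias unpenalized, and the 1e-12 singularity threshold are all assumptions made for the example. It solves the normal equations by Gaussian elimination, which is closer in spirit to a direct solver than to a pseudoinverse.

```java
import java.util.Arrays;

/** Hypothetical illustration; does not use or reproduce the library code. */
public class RidgeRegressionSketch {

    /**
     * Fits y ~ b + w'x by weighted least squares with an L2 penalty on w.
     * Returns {b, w_1, ..., w_d}. Assumption: the bias b is not penalized.
     */
    public static double[] fit(double[][] x, double[] y, double[] weights, double lambda) {
        if (lambda < 0.0) {
            throw new IllegalArgumentException("regularization must be nonnegative");
        }
        int d = x[0].length;
        int m = d + 1; // index 0 is the bias term
        double[][] a = new double[m][m + 1]; // augmented matrix [Z'CZ | Z'Cy]
        for (int i = 0; i < x.length; i++) {
            double[] z = new double[m];
            z[0] = 1.0; // constant feature for the bias
            System.arraycopy(x[i], 0, z, 1, d);
            for (int r = 0; r < m; r++) {
                for (int c = 0; c < m; c++) {
                    a[r][c] += weights[i] * z[r] * z[c];
                }
                a[r][m] += weights[i] * z[r] * y[i];
            }
        }
        for (int r = 1; r < m; r++) {
            a[r][r] += lambda; // ridge term on the weight entries only
        }
        return solve(a);
    }

    /** Gaussian elimination with partial pivoting on an augmented matrix. */
    private static double[] solve(double[][] a) {
        int m = a.length;
        for (int col = 0; col < m; col++) {
            int pivot = col;
            for (int r = col + 1; r < m; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) { // arbitrary threshold for this sketch
                throw new ArithmeticException("singular system; try lambda > 0");
            }
            for (int r = col + 1; r < m; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c <= m; c++) {
                    a[r][c] -= f * a[col][c];
                }
            }
        }
        double[] sol = new double[m];
        for (int r = m - 1; r >= 0; r--) { // back substitution
            double s = a[r][m];
            for (int c = r + 1; c < m; c++) {
                s -= a[r][c] * sol[c];
            }
            sol[r] = s / a[r][r];
        }
        return sol;
    }

    public static void main(String[] args) {
        double[][] x = {{0.0}, {1.0}, {2.0}, {3.0}};
        double[] y = {1.0, 3.0, 5.0, 7.0}; // generated from y = 1 + 2x
        double[] w = {1.0, 1.0, 1.0, 1.0}; // unweighted pairs
        // With lambda = 0 the result is close to b = 1, w = 2.
        System.out.println(Arrays.toString(fit(x, y, w, 0.0)));
    }
}
```

Raising lambda above zero shrinks the fitted weight toward zero and also makes the system solvable when input dimensions are collinear, which is the usual motivation for a ridge term.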

getUsePseudoInverse

public boolean getUsePseudoInverse()
Getter for usePseudoInverse

Returns:
Flag to use a pseudoinverse. True uses the expensive but more accurate pseudoinverse routine; false uses a very fast but numerically less stable LU solver. Default value is "true".

setUsePseudoInverse

public void setUsePseudoInverse(boolean usePseudoInverse)
Setter for usePseudoInverse

Parameters:
usePseudoInverse - Flag to use a pseudoinverse. True uses the expensive but more accurate pseudoinverse routine; false uses a very fast but numerically less stable LU solver. Default value is "true".

getRegularization

public double getRegularization()
Getter for regularization

Returns:
L2 ridge regularization term. Must be nonnegative; a value of zero is equivalent to unregularized regression.

setRegularization

public void setRegularization(double regularization)
Setter for regularization

Parameters:
regularization - L2 ridge regularization term. Must be nonnegative; a value of zero is equivalent to unregularized regression.