gov.sandia.cognition.learning.algorithm.regression
Class KernelWeightedRobustRegression<InputType,OutputType>

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
          extended by gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm<ResultType>
              extended by gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,ResultType>
                  extended by gov.sandia.cognition.learning.algorithm.AbstractAnytimeSupervisedBatchLearner<InputType,OutputType,Evaluator<? super InputType,? extends OutputType>>
                      extended by gov.sandia.cognition.learning.algorithm.regression.KernelWeightedRobustRegression<InputType,OutputType>
Type Parameters:
InputType - Input class for the Evaluator and inputs on the InputOutputPairs dataset
OutputType - Output class for the Evaluator and outputs on the InputOutputPairs dataset. Furthermore, the Kernel must be able to evaluate OutputTypes.
All Implemented Interfaces:
AnytimeAlgorithm<Evaluator<? super InputType,? extends OutputType>>, IterativeAlgorithm, StoppableAlgorithm, AnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,Evaluator<? super InputType,? extends OutputType>>, BatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,Evaluator<? super InputType,? extends OutputType>>, SupervisedBatchLearner<InputType,OutputType,Evaluator<? super InputType,? extends OutputType>>, CloneableSerializable, Serializable, Cloneable

public class KernelWeightedRobustRegression<InputType,OutputType>
extends AbstractAnytimeSupervisedBatchLearner<InputType,OutputType,Evaluator<? super InputType,? extends OutputType>>

KernelWeightedRobustRegression takes a supervised learning algorithm that operates on a weighted collection of InputOutputPairs. At each iteration, it modifies the weight of each sample based on the dataset output and the corresponding estimate from the Evaluator produced by the supervised learning algorithm. This weight is attached to the dataset sample and the supervised learning algorithm is run again. This process repeats until the weights converge.

This algorithm is a direct generalization of LOESS-based (LOWESS-based) Robust Regression using a general learner and kernel. A typical use case pairs a regression algorithm (LinearRegression or DecoupledVectorLinearRegression) with a RadialBasisKernel. This results in a regression algorithm that learns to "ignore" outliers and fit the remaining data. (Think of fitting a height-versus-age curve when an 8-foot-tall Yao Ming has made it into your training set, skewing your results with that massive outlier.)

KernelWeightedRobustRegression (KWRR) differs from LocallyWeightedLearning (LWL) in that KWRR creates a global function approximator that holds for all inputs. Thus, learning time for KWRR is relatively high up front, but evaluation time is relatively low. LWL, on the other hand, creates a local function approximator in response to each evaluation and never creates a global function approximator. As such, LWL has (almost) no up-front learning time, but each evaluation is relatively expensive.

KWRR is more appropriate when you know the general structure of your data but the data is riddled with outliers. LWL is more appropriate when you don't know or understand the general trend of your data AND you can afford somewhat costly evaluations.
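
The reweighting loop described above can be sketched in plain Java without the library. This is a minimal illustration, not the Cognitive Foundry implementation: the class RobustLineFitSketch, its method names, the Gaussian form of the kernel, and the stopping rule (total absolute change in the weights compared against the tolerance) are all assumptions made for the example. A weighted straight-line fit stands in for the iteration learner.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names here are hypothetical, not library API. */
final class RobustLineFitSketch {

    /** Weighted least-squares fit of y = slope * x + intercept; returns {slope, intercept}. */
    static double[] weightedLineFit(double[] x, double[] y, double[] w) {
        double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
        for (int i = 0; i < x.length; i++) {
            sw += w[i];
            swx += w[i] * x[i];
            swy += w[i] * y[i];
            swxx += w[i] * x[i] * x[i];
            swxy += w[i] * x[i] * y[i];
        }
        double slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx);
        double intercept = (swy - slope * swx) / sw;
        return new double[] {slope, intercept};
    }

    /** Assumed Gaussian-shaped kernel: 1 for a perfect estimate, near 0 for a large error. */
    static double kernelWeight(double target, double estimate, double bandwidth) {
        double error = target - estimate;
        return Math.exp(-(error * error) / (2.0 * bandwidth * bandwidth));
    }

    /** Refits until the weights stop changing (assumed criterion) or maxIterations is hit. */
    static double[] robustFit(double[] x, double[] y, double bandwidth,
                              int maxIterations, double tolerance) {
        double[] w = new double[x.length];
        Arrays.fill(w, 1.0);                      // start with uniform weights
        double[] line = weightedLineFit(x, y, w);
        for (int iter = 0; iter < maxIterations; iter++) {
            double change = 0.0;
            for (int i = 0; i < x.length; i++) {
                double estimate = line[0] * x[i] + line[1];
                double updated = kernelWeight(y[i], estimate, bandwidth);
                change += Math.abs(updated - w[i]);
                w[i] = updated;                   // re-weight the sample
            }
            line = weightedLineFit(x, y, w);      // run the inner learner again
            if (change <= tolerance) {
                break;                            // weights have converged
            }
        }
        return line;
    }

    public static void main(String[] args) {
        double[] x = new double[10];
        double[] y = new double[10];
        for (int i = 0; i < 10; i++) {
            x[i] = i;
            y[i] = 2.0 * i + 1.0;                 // points on the line y = 2x + 1
        }
        y[9] = 60.0;                              // one gross outlier
        double[] ones = new double[10];
        Arrays.fill(ones, 1.0);
        double[] plain = weightedLineFit(x, y, ones);
        double[] robust = robustFit(x, y, 5.0, 100, 1e-6);
        System.out.printf("unweighted fit: slope=%.4f intercept=%.4f%n", plain[0], plain[1]);
        System.out.printf("robust fit:     slope=%.4f intercept=%.4f%n", robust[0], robust[1]);
    }
}
```

In this toy data set the unweighted fit is pulled toward the outlier, while the reweighted fit recovers the underlying line because the outlier's weight collapses toward zero after the first pass.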

Since:
2.0
Author:
Kevin R. Dixon
See Also:
Serialized Form

Field Summary
static int DEFAULT_MAX_ITERATIONS
          Default maximum number of iterations before stopping
static double DEFAULT_TOLERANCE
          Default tolerance stopping criterion
 
Fields inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
data, keepGoing
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
maxIterations
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
DEFAULT_ITERATION, iteration
 
Constructor Summary
KernelWeightedRobustRegression(SupervisedBatchLearner<InputType,OutputType,?> iterationLearner, Kernel<? super OutputType> kernelWeightingFunction)
          Creates a new instance of KernelWeightedRobustRegression
KernelWeightedRobustRegression(SupervisedBatchLearner<InputType,OutputType,?> iterationLearner, Kernel<? super OutputType> kernelWeightingFunction, int maxIterations, double tolerance)
          Creates a new instance of KernelWeightedRobustRegression
 
Method Summary
protected  void cleanupAlgorithm()
          Called to clean up the learning algorithm's state after learning has finished.
 SupervisedBatchLearner<InputType,OutputType,?> getIterationLearner()
          Getter for iterationLearner
 Kernel<? super OutputType> getKernelWeightingFunction()
          Getter for kernelWeightingFunction
 Evaluator<? super InputType,? extends OutputType> getResult()
          Gets the current result of the algorithm.
 double getTolerance()
          Getter for tolerance
protected  boolean initializeAlgorithm()
          Called to initialize the learning algorithm's state based on the data that is stored in the data field.
 void setIterationLearner(SupervisedBatchLearner<InputType,OutputType,?> iterationLearner)
          Setter for iterationLearner
 void setKernelWeightingFunction(Kernel<? super OutputType> kernelWeightingFunction)
          Setter for kernelWeightingFunction
 void setLearned(Evaluator<InputType,OutputType> result)
          Setter for result
 void setTolerance(double tolerance)
          Setter for tolerance
protected  boolean step()
          Called to take a single step of the learning algorithm.
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
clone, getData, getKeepGoing, learn, setData, setKeepGoing, stop
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
getMaxIterations, isResultValid, setMaxIterations
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.learning.algorithm.BatchLearner
learn
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 
Methods inherited from interface gov.sandia.cognition.algorithm.AnytimeAlgorithm
getMaxIterations, setMaxIterations
 
Methods inherited from interface gov.sandia.cognition.algorithm.IterativeAlgorithm
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
 
Methods inherited from interface gov.sandia.cognition.algorithm.StoppableAlgorithm
isResultValid
 

Field Detail

DEFAULT_MAX_ITERATIONS

public static final int DEFAULT_MAX_ITERATIONS
Default maximum number of iterations before stopping

See Also:
Constant Field Values

DEFAULT_TOLERANCE

public static final double DEFAULT_TOLERANCE
Default tolerance stopping criterion

See Also:
Constant Field Values
Constructor Detail

KernelWeightedRobustRegression

public KernelWeightedRobustRegression(SupervisedBatchLearner<InputType,OutputType,?> iterationLearner,
                                      Kernel<? super OutputType> kernelWeightingFunction)
Creates a new instance of KernelWeightedRobustRegression

Parameters:
iterationLearner - Internal learning algorithm that computes optimal solutions given the current weightedData. The iterationLearner should operate on WeightedInputOutputPairs (we have a hard time enforcing this, as many learning algorithms operate both on InputOutputPairs and WeightedInputOutputPairs and their prototype is "? extends InputOutputPair")
kernelWeightingFunction - Kernel function that provides the weighting for the estimate error. Generally, the Kernel should weight accurate estimates higher than inaccurate estimates.

KernelWeightedRobustRegression

public KernelWeightedRobustRegression(SupervisedBatchLearner<InputType,OutputType,?> iterationLearner,
                                      Kernel<? super OutputType> kernelWeightingFunction,
                                      int maxIterations,
                                      double tolerance)
Creates a new instance of KernelWeightedRobustRegression

Parameters:
iterationLearner - Internal learning algorithm that computes optimal solutions given the current weightedData. The iterationLearner should operate on WeightedInputOutputPairs (we have a hard time enforcing this, as many learning algorithms operate both on InputOutputPairs and WeightedInputOutputPairs and their prototype is "? extends InputOutputPair")
kernelWeightingFunction - Kernel function that provides the weighting for the estimate error. Generally, the Kernel should weight accurate estimates higher than inaccurate estimates.
maxIterations - Maximum number of iterations before stopping
tolerance - Tolerance before stopping the algorithm
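
To make the kernelWeightingFunction guidance concrete, the following sketch shows a weighting function that rates accurate estimates higher than inaccurate ones. The class KernelWeightSketch and the Gaussian formula are assumptions chosen for illustration; the exact formula used by the library's RadialBasisKernel may differ. The bandwidth parameter (also hypothetical here) controls how large an error must be before a sample is effectively ignored.

```java
/** Illustrative sketch only; the class and formula are assumptions, not library API. */
final class KernelWeightSketch {

    /** Returns 1.0 when the estimate matches the target and decays toward 0 as the error grows. */
    static double gaussianWeight(double target, double estimate, double bandwidth) {
        double error = target - estimate;
        return Math.exp(-(error * error) / (2.0 * bandwidth * bandwidth));
    }

    public static void main(String[] args) {
        double[] errors = {0.0, 1.0, 5.0, 20.0};
        for (double bandwidth : new double[] {1.0, 5.0}) {
            for (double error : errors) {
                // The target is fixed at 0, so the estimate equals the error.
                System.out.printf("bandwidth=%.1f error=%.1f weight=%.6f%n",
                        bandwidth, error, gaussianWeight(0.0, error, bandwidth));
            }
        }
    }
}
```

A wider bandwidth tolerates larger errors before down-weighting a sample, so it should be chosen relative to the noise level expected in the non-outlier data.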
Method Detail

initializeAlgorithm

protected boolean initializeAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to initialize the learning algorithm's state based on the data that is stored in the data field. The return value indicates if the algorithm can be run or not based on the initialization.

Specified by:
initializeAlgorithm in class AbstractAnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,Evaluator<? super InputType,? extends OutputType>>
Returns:
True if the learning algorithm can be run and false if it cannot.

step

protected boolean step()
Description copied from class: AbstractAnytimeBatchLearner
Called to take a single step of the learning algorithm.

Specified by:
step in class AbstractAnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,Evaluator<? super InputType,? extends OutputType>>
Returns:
True if another step can be taken and false if the algorithm should halt.

cleanupAlgorithm

protected void cleanupAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to clean up the learning algorithm's state after learning has finished.

Specified by:
cleanupAlgorithm in class AbstractAnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,Evaluator<? super InputType,? extends OutputType>>
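
The three protected methods above suggest a driver loop of roughly the following shape. This is a sketch inferred from the method descriptions, not the actual code of AbstractAnytimeBatchLearner: the classes AnytimeLoopSketch and CountdownSketch and the run method are hypothetical, and details such as listener notification and the stop flag are omitted.

```java
/** Illustrative sketch only; mirrors the documented contract, not the library source. */
abstract class AnytimeLoopSketch {
    protected int iteration = 0;
    protected final int maxIterations;

    AnytimeLoopSketch(int maxIterations) {
        this.maxIterations = maxIterations;
    }

    /** Returns true if the algorithm can be run. */
    protected abstract boolean initializeAlgorithm();

    /** Returns true if another step can be taken. */
    protected abstract boolean step();

    /** Releases any state after learning has finished. */
    protected abstract void cleanupAlgorithm();

    /** Returns the number of steps taken, or -1 if initialization refused to run. */
    final int run() {
        iteration = 0;
        if (!initializeAlgorithm()) {
            return -1;
        }
        boolean keepGoing = true;
        while (keepGoing && iteration < maxIterations) {
            iteration++;
            keepGoing = step();
        }
        cleanupAlgorithm();
        return iteration;
    }
}

/** Toy algorithm that "converges" once its counter reaches zero. */
final class CountdownSketch extends AnytimeLoopSketch {
    private int remaining;
    boolean cleaned = false;

    CountdownSketch(int start, int maxIterations) {
        super(maxIterations);
        this.remaining = start;
    }

    @Override
    protected boolean initializeAlgorithm() {
        return remaining > 0;
    }

    @Override
    protected boolean step() {
        remaining--;
        return remaining > 0;
    }

    @Override
    protected void cleanupAlgorithm() {
        cleaned = true;
    }
}
```

Under this reading, learning ends either when step reports that no further progress is possible or when the iteration cap is reached, whichever comes first.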

getKernelWeightingFunction

public Kernel<? super OutputType> getKernelWeightingFunction()
Getter for kernelWeightingFunction

Returns:
Kernel function that provides the weighting for the estimate error. Generally, the Kernel should weight accurate estimates higher than inaccurate estimates.

setKernelWeightingFunction

public void setKernelWeightingFunction(Kernel<? super OutputType> kernelWeightingFunction)
Setter for kernelWeightingFunction

Parameters:
kernelWeightingFunction - Kernel function that provides the weighting for the estimate error. Generally, the Kernel should weight accurate estimates higher than inaccurate estimates.

getTolerance

public double getTolerance()
Getter for tolerance

Returns:
Tolerance before stopping the algorithm

setTolerance

public void setTolerance(double tolerance)
Setter for tolerance

Parameters:
tolerance - Tolerance before stopping the algorithm

setLearned

public void setLearned(Evaluator<InputType,OutputType> result)
Setter for result

Parameters:
result - Evaluator that is being optimized

getResult

public Evaluator<? super InputType,? extends OutputType> getResult()
Description copied from interface: AnytimeAlgorithm
Gets the current result of the algorithm.

Returns:
Current result of the algorithm.

getIterationLearner

public SupervisedBatchLearner<InputType,OutputType,?> getIterationLearner()
Getter for iterationLearner

Returns:
Internal learning algorithm that computes optimal solutions given the current weightedData. The iterationLearner should operate on WeightedInputOutputPairs (we have a hard time enforcing this, as many learning algorithms operate both on InputOutputPairs and WeightedInputOutputPairs)

setIterationLearner

public void setIterationLearner(SupervisedBatchLearner<InputType,OutputType,?> iterationLearner)
Setter for iterationLearner

Parameters:
iterationLearner - Internal learning algorithm that computes optimal solutions given the current weightedData. The iterationLearner should operate on WeightedInputOutputPairs (we have a hard time enforcing this, as many learning algorithms operate both on InputOutputPairs and WeightedInputOutputPairs)