gov.sandia.cognition.learning.algorithm.ensemble
Class CategoryBalancedIVotingLearner<InputType,CategoryType>

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
          extended by gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm<ResultType>
              extended by gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,ResultType>
                  extended by gov.sandia.cognition.learning.algorithm.AbstractAnytimeSupervisedBatchLearner<InputType,CategoryType,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>
                      extended by gov.sandia.cognition.learning.algorithm.ensemble.IVotingCategorizerLearner<InputType,CategoryType>
                          extended by gov.sandia.cognition.learning.algorithm.ensemble.CategoryBalancedIVotingLearner<InputType,CategoryType>
Type Parameters:
InputType - The type of the input for the categorizer to learn. This is the type passed to the internal batch learner to learn each ensemble member.
CategoryType - The type of the category that is the output for the categorizer to learn. It is also passed to the internal batch learner to learn each ensemble member. It must have valid equals and hashCode methods.
All Implemented Interfaces:
AnytimeAlgorithm<WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, IterativeAlgorithm, StoppableAlgorithm, AnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, BatchLearner<Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, BatchLearnerContainer<BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>>>, SupervisedBatchLearner<InputType,CategoryType,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, CloneableSerializable, Randomized, Serializable, Cloneable

public class CategoryBalancedIVotingLearner<InputType,CategoryType>
extends IVotingCategorizerLearner<InputType,CategoryType>

An extension of IVoting for skewed (class-imbalanced) data. It ensures that each sample used to train an ensemble member contains an equal number of examples from each category.
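
The balancing idea described above can be illustrated with a minimal, standalone sketch. It is not this class's implementation: the class name BalancedBagSketch, the method sampleBalanced, and the perCategory parameter are all hypothetical, and drawing with replacement is an assumption. The sketch only shows how a bag can be built so that every category contributes the same number of examples, even when the data is skewed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper for illustration; not part of the library API. */
final class BalancedBagSketch {

    /**
     * Draws perCategory indices, with replacement, from each category that
     * appears in labels, so every category contributes equally to the bag.
     */
    static <C> List<Integer> sampleBalanced(
            List<C> labels, int perCategory, Random random) {
        // Group example indices by category. This relies on the category
        // type having consistent equals and hashCode methods.
        Map<C, List<Integer>> byCategory = new LinkedHashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            byCategory.computeIfAbsent(labels.get(i), k -> new ArrayList<>())
                      .add(i);
        }

        // Assumption: sample with replacement, so a rare category can still
        // supply perCategory examples.
        List<Integer> bag = new ArrayList<>();
        for (List<Integer> indices : byCategory.values()) {
            for (int n = 0; n < perCategory; n++) {
                bag.add(indices.get(random.nextInt(indices.size())));
            }
        }
        return bag;
    }

    public static void main(String[] args) {
        // Skewed data: eight "neg" examples and two "pos" examples.
        List<String> labels = Arrays.asList(
            "neg", "neg", "neg", "neg", "neg", "neg", "neg", "neg",
            "pos", "pos");
        List<Integer> bag = sampleBalanced(labels, 3, new Random(42));
        System.out.println(bag.size()); // prints 6: three per category
    }
}
```

Because the rare category is sampled with replacement, its examples are likely to appear more than once in a bag; that is the price of balancing without discarding data.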

Since:
3.3.0
Author:
Justin Basilico
See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class gov.sandia.cognition.learning.algorithm.ensemble.IVotingCategorizerLearner
IVotingCategorizerLearner.OutOfBagErrorStoppingCriteria<InputType,CategoryType>
 
Field Summary
 
Fields inherited from class gov.sandia.cognition.learning.algorithm.ensemble.IVotingCategorizerLearner
counterFactory, currentBag, currentCorrectIndices, currentEnsembleCorrect, currentIncorrectIndices, currentMember, currentMemberEstimates, dataFullEstimates, dataInBag, dataList, dataOutOfBagEstimates, DEFAULT_MAX_ITERATIONS, DEFAULT_PERCENT_TO_SAMPLE, DEFAULT_PROPORTION_INCORRECT_IN_SAMPLE, DEFAULT_VOTE_OUT_OF_BAG_ONLY, ensemble, learner, numCorrectToSample, numIncorrectToSample, percentToSample, proportionIncorrectInSample, random, sampleSize, voteOutOfBagOnly
 
Fields inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
data, keepGoing
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
maxIterations
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
DEFAULT_ITERATION, iteration
 
Constructor Summary
CategoryBalancedIVotingLearner()
          Creates a new CategoryBalancedIVotingLearner.
CategoryBalancedIVotingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner, int maxIterations, double percentToSample, double proportionIncorrectInSample, boolean voteOutOfBagOnly, Factory<? extends DataDistribution<CategoryType>> counterFactory, Random random)
          Creates a new CategoryBalancedIVotingLearner.
CategoryBalancedIVotingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner, int maxIterations, double percentToSample, Random random)
          Creates a new CategoryBalancedIVotingLearner.
 
Method Summary
protected  void createBag(ArrayList<Integer> correctIndices, ArrayList<Integer> incorrectIndices)
          Create the next sample (bag) of examples to learn the next ensemble member from.
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.ensemble.IVotingCategorizerLearner
cleanupAlgorithm, getCounterFactory, getCurrentEnsembleCorrect, getDataFullEstimates, getDataOutOfBagEstimates, getLearner, getPercentToSample, getProportionIncorrectInSample, getRandom, getResult, initializeAlgorithm, isVoteOutOfBagOnly, sampleIndicesWithReplacementInto, setCounterFactory, setLearner, setPercentToSample, setProportionIncorrectInSample, setRandom, setVoteOutOfBagOnly, step
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
clone, getData, getKeepGoing, learn, setData, setKeepGoing, stop
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
getMaxIterations, isResultValid, setMaxIterations
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.learning.algorithm.BatchLearner
learn
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 
Methods inherited from interface gov.sandia.cognition.algorithm.AnytimeAlgorithm
getMaxIterations, setMaxIterations
 
Methods inherited from interface gov.sandia.cognition.algorithm.IterativeAlgorithm
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
 
Methods inherited from interface gov.sandia.cognition.algorithm.StoppableAlgorithm
isResultValid
 

Constructor Detail

CategoryBalancedIVotingLearner

public CategoryBalancedIVotingLearner()
Creates a new CategoryBalancedIVotingLearner.


CategoryBalancedIVotingLearner

public CategoryBalancedIVotingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner,
                                      int maxIterations,
                                      double percentToSample,
                                      Random random)
Creates a new CategoryBalancedIVotingLearner.

Parameters:
learner - The learner to use to create the categorizer on each iteration.
maxIterations - The maximum number of iterations to run for, which is also the number of learners to create.
percentToSample - The percentage of the total size of the data to sample on each iteration. Must be positive.
random - The random number generator to use.

CategoryBalancedIVotingLearner

public CategoryBalancedIVotingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner,
                                      int maxIterations,
                                      double percentToSample,
                                      double proportionIncorrectInSample,
                                      boolean voteOutOfBagOnly,
                                      Factory<? extends DataDistribution<CategoryType>> counterFactory,
                                      Random random)
Creates a new CategoryBalancedIVotingLearner.

Parameters:
learner - The learner to use to create the categorizer on each iteration.
maxIterations - The maximum number of iterations to run for, which is also the number of learners to create.
percentToSample - The percentage of the total size of the data to sample on each iteration. Must be positive.
proportionIncorrectInSample - The proportion of incorrect examples to put in each sample. Must be between 0.0 and 1.0 (inclusive).
voteOutOfBagOnly - Whether only out-of-bag votes are used to determine accuracy (true) or in-bag votes are counted as well (false).
counterFactory - The factory for counting votes.
random - The random number generator to use.
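
To make the interaction between percentToSample and proportionIncorrectInSample concrete, the following standalone sketch computes a sample size and its correct/incorrect split. It is an illustration under stated assumptions, not the library's code: the class SampleSizeSketch and method split are hypothetical, percentToSample is assumed to be a fraction of the data size (0.1 meaning 10%), and the rounding rule is a guess. The returned values loosely correspond to the inherited fields sampleSize, numCorrectToSample, and numIncorrectToSample.

```java
/** Hypothetical helper for illustration; not part of the library API. */
final class SampleSizeSketch {

    /**
     * Returns {sampleSize, numCorrect, numIncorrect} for a data set of
     * dataSize examples.
     */
    static int[] split(int dataSize, double percentToSample,
                       double proportionIncorrectInSample) {
        if (percentToSample <= 0.0) {
            throw new IllegalArgumentException(
                "percentToSample must be positive");
        }
        if (proportionIncorrectInSample < 0.0
                || proportionIncorrectInSample > 1.0) {
            throw new IllegalArgumentException(
                "proportionIncorrectInSample must be in [0.0, 1.0]");
        }

        // Assumption: round to the nearest integer, keep at least one example.
        int sampleSize =
            Math.max(1, (int) Math.round(percentToSample * dataSize));
        int numIncorrect =
            (int) Math.round(proportionIncorrectInSample * sampleSize);
        int numCorrect = sampleSize - numIncorrect;
        return new int[] {sampleSize, numCorrect, numIncorrect};
    }

    public static void main(String[] args) {
        int[] sizes = split(1000, 0.1, 0.5);
        // prints "100 50 50"
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);
    }
}
```
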
Method Detail

createBag

protected void createBag(ArrayList<Integer> correctIndices,
                         ArrayList<Integer> incorrectIndices)
Description copied from class: IVotingCategorizerLearner
Create the next sample (bag) of examples to learn the next ensemble member from.

Overrides:
createBag in class IVotingCategorizerLearner<InputType,CategoryType>
Parameters:
correctIndices - The list of indices the ensemble is currently getting correct.
incorrectIndices - The list of indices the ensemble is currently getting incorrect.
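
As a rough illustration of what building a bag from the two index lists can look like, here is a standalone sketch. It does not reproduce this class's override, which additionally balances categories; the class CreateBagSketch, its method signatures, and the fallback used when one list is empty are all assumptions made for the example. The helper name only echoes the inherited sampleIndicesWithReplacementInto, whose real signature is not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for illustration; not part of the library API. */
final class CreateBagSketch {

    /** Appends count indices drawn with replacement from 'from' to 'into'. */
    static void sampleWithReplacementInto(
            List<Integer> from, int count, List<Integer> into, Random random) {
        if (from.isEmpty()) {
            return; // nothing to draw from
        }
        for (int i = 0; i < count; i++) {
            into.add(from.get(random.nextInt(from.size())));
        }
    }

    /**
     * Builds a bag holding numIncorrect draws from incorrectIndices and
     * numCorrect draws from correctIndices.
     */
    static List<Integer> createBag(
            List<Integer> correctIndices, List<Integer> incorrectIndices,
            int numCorrect, int numIncorrect, Random random) {
        int total = numCorrect + numIncorrect;
        List<Integer> bag = new ArrayList<>(total);
        // Assumption: if one list is empty, fill the bag from the other.
        if (incorrectIndices.isEmpty()) {
            sampleWithReplacementInto(correctIndices, total, bag, random);
        } else if (correctIndices.isEmpty()) {
            sampleWithReplacementInto(incorrectIndices, total, bag, random);
        } else {
            sampleWithReplacementInto(
                incorrectIndices, numIncorrect, bag, random);
            sampleWithReplacementInto(
                correctIndices, numCorrect, bag, random);
        }
        return bag;
    }

    public static void main(String[] args) {
        List<Integer> correct = Arrays.asList(0, 1, 2);
        List<Integer> incorrect = Arrays.asList(3, 4);
        List<Integer> bag = createBag(correct, incorrect, 2, 3, new Random(7));
        System.out.println(bag.size()); // prints 5
    }
}
```

Oversampling the examples the ensemble currently gets wrong focuses each new member on the hard cases, which is the general motivation for passing the two lists separately.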