gov.sandia.cognition.learning.algorithm.ensemble
Class CategoryBalancedBaggingLearner<InputType,CategoryType>

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
          extended by gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm<ResultType>
              extended by gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,OutputType>>,ResultType>
                  extended by gov.sandia.cognition.learning.algorithm.AbstractAnytimeSupervisedBatchLearner<InputType,OutputType,EnsembleType>
                      extended by gov.sandia.cognition.learning.algorithm.ensemble.AbstractBaggingLearner<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>
                          extended by gov.sandia.cognition.learning.algorithm.ensemble.BaggingCategorizerLearner<InputType,CategoryType>
                              extended by gov.sandia.cognition.learning.algorithm.ensemble.CategoryBalancedBaggingLearner<InputType,CategoryType>
Type Parameters:
InputType - The input type for supervised learning. Passed on to the internal learning algorithm. Also the input type for the learned ensemble.
CategoryType - The output type for supervised learning. Passed on to the internal learning algorithm. Also the output type of the learned ensemble.
All Implemented Interfaces:
AnytimeAlgorithm<WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, IterativeAlgorithm, StoppableAlgorithm, AnytimeBatchLearner<Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, BatchLearner<Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, BatchLearnerContainer<BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>>>, SupervisedBatchLearner<InputType,CategoryType,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>, CloneableSerializable, Randomized, Serializable, Cloneable

public class CategoryBalancedBaggingLearner<InputType,CategoryType>
extends BaggingCategorizerLearner<InputType,CategoryType>

An extension of the basic bagging learner that attempts to sample bags that have equal numbers of examples from every category.

Since:
3.3.0
Author:
Justin Basilico
See Also:
Serialized Form

Field Summary
protected  ArrayList<CategoryType> categoryList
          The list of categories.
protected  HashMap<CategoryType,ArrayList<Integer>> dataPerCategory
          The mapping of categories to indices of examples belonging to the category.
 
Fields inherited from class gov.sandia.cognition.learning.algorithm.ensemble.AbstractBaggingLearner
bag, dataInBag, dataList, DEFAULT_MAX_ITERATIONS, DEFAULT_PERCENT_TO_SAMPLE, ensemble, learner, percentToSample, random
 
Fields inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
data, keepGoing
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
maxIterations
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
DEFAULT_ITERATION, iteration
 
Constructor Summary
CategoryBalancedBaggingLearner()
          Creates a new instance of CategoryBalancedBaggingLearner.
CategoryBalancedBaggingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner)
          Creates a new instance of CategoryBalancedBaggingLearner that uses the given learner.
CategoryBalancedBaggingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner, int maxIterations, double percentToSample, Random random)
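          Creates a new instance of CategoryBalancedBaggingLearner with the given learner, maximum number of iterations, percentage to sample, and random number generator.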
 
Method Summary
protected  void cleanupAlgorithm()
          Called to clean up the learning algorithm's state after learning has finished.
protected  void fillBag(int sampleCount)
          Fills the internal bag field by sampling the given number of samples.
protected  boolean initializeAlgorithm()
          Called to initialize the learning algorithm's state based on the data that is stored in the data field.
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.ensemble.BaggingCategorizerLearner
addEnsembleMember, createInitialEnsemble
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.ensemble.AbstractBaggingLearner
getBag, getDataInBag, getDataList, getEnsemble, getLearner, getPercentToSample, getRandom, getResult, setBag, setDataInBag, setDataList, setEnsemble, setLearner, setPercentToSample, setRandom, step
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
clone, getData, getKeepGoing, learn, setData, setKeepGoing, stop
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
getMaxIterations, isResultValid, setMaxIterations
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.learning.algorithm.BatchLearner
learn
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 
Methods inherited from interface gov.sandia.cognition.algorithm.AnytimeAlgorithm
getMaxIterations, setMaxIterations
 
Methods inherited from interface gov.sandia.cognition.algorithm.IterativeAlgorithm
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
 
Methods inherited from interface gov.sandia.cognition.algorithm.StoppableAlgorithm
isResultValid
 

Field Detail

categoryList

protected ArrayList<CategoryType> categoryList
The list of categories.


dataPerCategory

protected HashMap<CategoryType,ArrayList<Integer>> dataPerCategory
The mapping of categories to indices of examples belonging to the category.
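
To illustrate the shape of these two fields, here is a minimal standalone sketch of how a category-to-indices map might be built from a list of labels. It does not use the library; the class name CategoryIndexSketch and the method indexByCategory are hypothetical, and the actual population logic in initializeAlgorithm is not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper, not part of the library: groups example indices by category.
class CategoryIndexSketch {

    // Returns a map from each category to the indices of the examples labeled with it.
    static <C> HashMap<C, ArrayList<Integer>> indexByCategory(List<C> labels) {
        HashMap<C, ArrayList<Integer>> dataPerCategory = new HashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            // Create the index list the first time a category is seen.
            dataPerCategory
                .computeIfAbsent(labels.get(i), k -> new ArrayList<>())
                .add(i);
        }
        return dataPerCategory;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("spam", "ham", "ham", "spam", "ham");
        HashMap<String, ArrayList<Integer>> index = indexByCategory(labels);

        // A category list analogous to the categoryList field.
        ArrayList<String> categoryList = new ArrayList<>(index.keySet());

        System.out.println(categoryList.size() + " categories"); // prints "2 categories"
        System.out.println("spam -> " + index.get("spam"));      // prints "spam -> [0, 3]"
        System.out.println("ham -> " + index.get("ham"));        // prints "ham -> [1, 2, 4]"
    }
}
```

Storing indices rather than the examples themselves keeps the map small and lets a sampler refer back to positions in the original data list.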

Constructor Detail

CategoryBalancedBaggingLearner

public CategoryBalancedBaggingLearner()
Creates a new instance of CategoryBalancedBaggingLearner.


CategoryBalancedBaggingLearner

public CategoryBalancedBaggingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner)
Creates a new instance of CategoryBalancedBaggingLearner that uses the given learner.

Parameters:
learner - The learner to use to create the categorizer on each iteration.

CategoryBalancedBaggingLearner

public CategoryBalancedBaggingLearner(BatchLearner<? super Collection<? extends InputOutputPair<? extends InputType,CategoryType>>,? extends Evaluator<? super InputType,? extends CategoryType>> learner,
                                      int maxIterations,
                                      double percentToSample,
                                      Random random)
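Creates a new instance of CategoryBalancedBaggingLearner with the given learner, maximum number of iterations, percentage to sample, and random number generator.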

Parameters:
learner - The learner to use to create the categorizer on each iteration.
maxIterations - The maximum number of iterations to run for, which is also the number of categorizers to create for the ensemble.
percentToSample - The percentage of the total size of the data to sample on each iteration. Must be positive.
random - The random number generator to use.
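
As a rough illustration of how percentToSample could translate into a per-iteration bag size, the sketch below assumes the value is a fraction of the data size (so 1.0 means a bag as large as the data) and assumes rounding to the nearest integer with a floor of one sample. Both are assumptions; this page does not state the library's exact rule. The class SampleCountSketch and its method are hypothetical.

```java
// Hypothetical helper, not part of the library.
class SampleCountSketch {

    // Assumption: percentToSample is a fraction of dataSize, rounded to the
    // nearest integer, with at least one example per bag.
    static int sampleCount(int dataSize, double percentToSample) {
        if (percentToSample <= 0.0) {
            throw new IllegalArgumentException("percentToSample must be positive");
        }
        return Math.max(1, (int) Math.round(percentToSample * dataSize));
    }

    public static void main(String[] args) {
        System.out.println(sampleCount(200, 0.5)); // prints 100
        System.out.println(sampleCount(10, 1.0));  // prints 10
        System.out.println(sampleCount(3, 0.1));   // prints 1
    }
}
```

Under these assumptions, a run with maxIterations of 100 on 200 examples and percentToSample of 0.5 would draw 100 bags of 100 examples each.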

Method Detail

initializeAlgorithm

protected boolean initializeAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to initialize the learning algorithm's state based on the data that is stored in the data field. The return value indicates if the algorithm can be run or not based on the initialization.

Overrides:
initializeAlgorithm in class AbstractBaggingLearner<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>
Returns:
True if the learning algorithm can be run and false if it cannot.

fillBag

protected void fillBag(int sampleCount)
Description copied from class: AbstractBaggingLearner
Fills the internal bag field by sampling the given number of samples.

Overrides:
fillBag in class AbstractBaggingLearner<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>
Parameters:
sampleCount - The number to sample.
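
The class description says bags are sampled to have equal numbers of examples from every category. One way such a balanced fill could work is sketched below: cycle through the categories in order and draw a random example index, with replacement, from the current category. This is an assumption for illustration, not the library's confirmed implementation, which may choose categories differently. The class BalancedBagSketch and its static fillBag are hypothetical and return the bag instead of writing to a field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch, not part of the library.
class BalancedBagSketch {

    // Draws sampleCount example indices so that per-category counts differ by at most one.
    // Assumes categoryList is non-empty and every category has at least one index.
    static <C> List<Integer> fillBag(
            List<C> categoryList,
            Map<C, ? extends List<Integer>> dataPerCategory,
            int sampleCount,
            Random random) {
        List<Integer> bag = new ArrayList<>(sampleCount);
        for (int i = 0; i < sampleCount; i++) {
            // Round-robin over categories keeps the bag balanced.
            C category = categoryList.get(i % categoryList.size());
            List<Integer> indices = dataPerCategory.get(category);
            // Sample with replacement from this category's examples.
            bag.add(indices.get(random.nextInt(indices.size())));
        }
        return bag;
    }

    public static void main(String[] args) {
        List<String> categoryList = Arrays.asList("rare", "common");
        Map<String, List<Integer>> dataPerCategory = new HashMap<>();
        dataPerCategory.put("rare", Arrays.asList(0));
        dataPerCategory.put("common", Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9));

        List<Integer> bag = fillBag(categoryList, dataPerCategory, 10, new Random(42));
        long rare = bag.stream().filter(i -> i == 0).count();
        System.out.println("rare examples in bag: " + rare + " of " + bag.size());
        // prints "rare examples in bag: 5 of 10"
    }
}
```

In the example, the rare category holds 1 of 10 examples in the data but half of the bag, which is the effect balancing is meant to achieve on skewed data.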

cleanupAlgorithm

protected void cleanupAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to clean up the learning algorithm's state after learning has finished.

Overrides:
cleanupAlgorithm in class AbstractBaggingLearner<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>,WeightedVotingCategorizerEnsemble<InputType,CategoryType,Evaluator<? super InputType,? extends CategoryType>>>