gov.sandia.cognition.learning.algorithm.genetic
Class GeneticAlgorithm<CostParametersType,GenomeType>

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
          extended by gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm<ResultType>
              extended by gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner<CostParametersType,GenomeType>
                  extended by gov.sandia.cognition.learning.algorithm.genetic.GeneticAlgorithm<CostParametersType,GenomeType>
Type Parameters:
GenomeType - Type of genome used to represent a single element in the genetic population. For example, a Vector.
CostParametersType - Type of parameters that the cost function takes. For example, Collection<InputOutputPairs>.
All Implemented Interfaces:
AnytimeAlgorithm<GenomeType>, IterativeAlgorithm, MeasurablePerformanceAlgorithm, StoppableAlgorithm, AnytimeBatchLearner<CostParametersType,GenomeType>, BatchCostMinimizationLearner<CostParametersType,GenomeType>, BatchLearner<CostParametersType,GenomeType>, CloneableSerializable, Serializable, Cloneable
Direct Known Subclasses:
ParallelizedGeneticAlgorithm

@CodeReviews(reviews={@CodeReview(reviewer="Kevin R. Dixon",date="2008-07-23",changesNeeded=false,comments={"I don\'t much like the constructors for this class, but it\'s probably not worth changing at this point.","Made some cosmetic changes to the code.","Added previous code review as CodeReview annotation","Otherwise, looks fine."}),@CodeReview(reviewer="Justin Basilico",date="2006-10-05",changesNeeded=false,comments={"Cleaned up the code a little.","Made the constructor initialize the variables."})})
public class GeneticAlgorithm<CostParametersType,GenomeType>
extends AbstractAnytimeBatchLearner<CostParametersType,GenomeType>
implements BatchCostMinimizationLearner<CostParametersType,GenomeType>, MeasurablePerformanceAlgorithm

The GeneticAlgorithm class implements a generic genetic algorithm that uses a given cost function to minimize and a given reproduction function for generating the population.

A GA maintains a population of potential solutions to the minimization problem, each with an associated cost-function score. The members of the population are called "genomes". The genomes are allowed to "reproduce" based on their cost-function scores to produce the next "generation" of genomes. We always carry over the all-time lowest-cost genome to the next generation. Notice that GAs have gone slightly overboard with the evolutionary metaphor? Me too.

GAs can be a very effective approach to finding the minimum-cost parameter set of a function for a cost function. However, GAs are generally regarded as a method of last resort, being superior to only Simulated Annealing (please read my opinion/polemic/diatribe on Simulated Annealing).

Here's a recap of that comment: Instead of GAs, try Powell's method, which is a powerful minimization technique that only relies on function evaluations. Generally better than Powell's method is Conjugate Gradient with automated gradient approximation, which only relies on function evaluations and automatically estimates the gradient for you.

If you can store the (approximated) Jacobian in memory, then the best technique is usually BFGS with automated gradient approximation (sometimes Levenberg-Marquardt Estimation is as good as BFGS, but usually not).

Since:
1.0
Author:
Jonathan McClain, Justin Basilico, Kevin R. Dixon
See Also:
SimulatedAnnealer, Serialized Form

Field Summary
static int DEFAULT_MAX_ITERATIONS
          The default maximum number of iterations, 1000.
 
Fields inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
data, keepGoing
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
maxIterations
 
Fields inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
DEFAULT_ITERATION, iteration
 
Constructor Summary
GeneticAlgorithm(Collection<GenomeType> initialPopulation, Reproducer<GenomeType> reproducer, CostFunction<? super GenomeType,? super CostParametersType> cost)
          Creates a new instance of GeneticAlgorithm.
GeneticAlgorithm(Collection<GenomeType> initialPopulation, Reproducer<GenomeType> reproducer, CostFunction<? super GenomeType,? super CostParametersType> cost, int maxIterations)
          Creates a new instance of GeneticAlgorithm.
GeneticAlgorithm(Collection<GenomeType> initialPopulation, Reproducer<GenomeType> reproducer, CostFunction<? super GenomeType,? super CostParametersType> cost, int maxIterations, int maxIterationsWithoutImprovement)
          Creates a new instance of GeneticAlgorithm.
 
Method Summary
protected  void cleanupAlgorithm()
          Called to clean up the learning algorithm's state after learning has finished.
protected  ArrayList<EvaluatedGenome<GenomeType>> evaluatePopulation(Collection<GenomeType> population)
          Converts a population of genomes into evaluated genomes.
 EvaluatedGenome<GenomeType> getBestSoFar()
          Gets the best genome found so far.
 CostFunction<? super GenomeType,? super CostParametersType> getCostFunction()
          Gets the cost function that the learner is minimizing.
 Collection<GenomeType> getInitialPopulation()
          Getter for initialPopulation.
 int getIterationsWithoutImprovement()
          Gets the current number of iterations without improvement.
 int getMaxIterationsWithoutImprovement()
          Gets the maximum number of iterations to go without improvement before stopping.
 NamedValue<Double> getPerformance()
          Gets the performance, which is the cost of the best genome.
 Collection<EvaluatedGenome<GenomeType>> getPopulation()
          Gets the population of genomes.
 Reproducer<GenomeType> getReproducer()
          Gets the reproducer.
 GenomeType getResult()
          Gets the current result of the algorithm.
protected  boolean initializeAlgorithm()
          Called to initialize the learning algorithm's state based on the data that is stored in the data field.
protected  EvaluatedGenome<GenomeType> searchForBetter(EvaluatedGenome<GenomeType> bestSoFar, Collection<EvaluatedGenome<GenomeType>> population)
          Searches the provided population of genomes for one whose cost is lower than the provided best so far genome.
 void setBestSoFar(EvaluatedGenome<GenomeType> bestSoFar)
          Sets the best genome found so far.
 void setCostFunction(CostFunction<? super GenomeType,? super CostParametersType> cost)
          Sets the cost function.
 void setInitialPopulation(Collection<GenomeType> initialPopulation)
          Setter for initialPopulation.
 void setIterationsWithoutImprovement(int iterationsWithoutImprovement)
          Sets the current number of iterations without improvement.
 void setMaxIterationsWithoutImprovement(int maxIterationsWithoutImprovement)
          Sets the maximum number of iterations to go without improvement before stopping.
 void setPopulation(Collection<EvaluatedGenome<GenomeType>> population)
          Sets the population of genomes.
 void setReproducer(Reproducer<GenomeType> reproducer)
          Sets the reproducer.
protected  boolean step()
          Called to take a single step of the learning algorithm.
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner
clone, getData, getKeepGoing, learn, setData, setKeepGoing, stop
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractAnytimeAlgorithm
getMaxIterations, isResultValid, setMaxIterations
 
Methods inherited from class gov.sandia.cognition.algorithm.AbstractIterativeAlgorithm
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.learning.algorithm.BatchCostMinimizationLearner
learn
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 
Methods inherited from interface gov.sandia.cognition.algorithm.AnytimeAlgorithm
getMaxIterations, setMaxIterations
 
Methods inherited from interface gov.sandia.cognition.algorithm.IterativeAlgorithm
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
 
Methods inherited from interface gov.sandia.cognition.algorithm.StoppableAlgorithm
isResultValid
 

Field Detail

DEFAULT_MAX_ITERATIONS

public static final int DEFAULT_MAX_ITERATIONS
The default maximum number of iterations, 1000.

See Also:
Constant Field Values
Constructor Detail

GeneticAlgorithm

public GeneticAlgorithm(Collection<GenomeType> initialPopulation,
                        Reproducer<GenomeType> reproducer,
                        CostFunction<? super GenomeType,? super CostParametersType> cost)
Creates a new instance of GeneticAlgorithm.

Parameters:
cost - The cost function for genomes.
initialPopulation - The initial population to start the algorithm
reproducer - The reproduction method to use.

GeneticAlgorithm

public GeneticAlgorithm(Collection<GenomeType> initialPopulation,
                        Reproducer<GenomeType> reproducer,
                        CostFunction<? super GenomeType,? super CostParametersType> cost,
                        int maxIterations)
Creates a new instance of GeneticAlgorithm.

Parameters:
cost - The cost function for genomes.
initialPopulation - The initial population to start the algorithm
reproducer - The reproduction method to use.
maxIterations - The maximum number of iterations to run.

GeneticAlgorithm

public GeneticAlgorithm(Collection<GenomeType> initialPopulation,
                        Reproducer<GenomeType> reproducer,
                        CostFunction<? super GenomeType,? super CostParametersType> cost,
                        int maxIterations,
                        int maxIterationsWithoutImprovement)
Creates a new instance of GeneticAlgorithm.

Parameters:
cost - The cost function for genomes.
initialPopulation - The initial population to start the algorithm
reproducer - The reproduction method to use.
maxIterations - The maximum number of iterations to run.
maxIterationsWithoutImprovement - The maximum number of iterations to go without improvement before stopping.
Method Detail

initializeAlgorithm

protected boolean initializeAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to initialize the learning algorithm's state based on the data that is stored in the data field. The return value indicates if the algorithm can be run or not based on the initialization.

Specified by:
initializeAlgorithm in class AbstractAnytimeBatchLearner<CostParametersType,GenomeType>
Returns:
True if the learning algorithm can be run and false if it cannot.

step

protected boolean step()
Description copied from class: AbstractAnytimeBatchLearner
Called to take a single step of the learning algorithm.

Specified by:
step in class AbstractAnytimeBatchLearner<CostParametersType,GenomeType>
Returns:
True if another step can be taken and false it the algorithm should halt.

cleanupAlgorithm

protected void cleanupAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to clean up the learning algorithm's state after learning has finished.

Specified by:
cleanupAlgorithm in class AbstractAnytimeBatchLearner<CostParametersType,GenomeType>

getResult

public GenomeType getResult()
Description copied from interface: AnytimeAlgorithm
Gets the current result of the algorithm.

Specified by:
getResult in interface AnytimeAlgorithm<GenomeType>
Returns:
Current result of the algorithm.

searchForBetter

protected EvaluatedGenome<GenomeType> searchForBetter(EvaluatedGenome<GenomeType> bestSoFar,
                                                      Collection<EvaluatedGenome<GenomeType>> population)
Searches the provided population of genomes for one whose cost is lower than the provided best so far genome.

Parameters:
bestSoFar - The genome to compare to.
population - The population to search.
Returns:
The best genome that was found. Returns the original if no better were found.

evaluatePopulation

protected ArrayList<EvaluatedGenome<GenomeType>> evaluatePopulation(Collection<GenomeType> population)
Converts a population of genomes into evaluated genomes.

Parameters:
population - The population of genomes to evaluate.
Returns:
A population of evaluated genomes.

getCostFunction

public CostFunction<? super GenomeType,? super CostParametersType> getCostFunction()
Description copied from interface: BatchCostMinimizationLearner
Gets the cost function that the learner is minimizing.

Specified by:
getCostFunction in interface BatchCostMinimizationLearner<CostParametersType,GenomeType>
Returns:
The CostFunction that the learner's algorithm is minimizing.

getReproducer

public Reproducer<GenomeType> getReproducer()
Gets the reproducer.

Returns:
The reproducer.

getBestSoFar

public EvaluatedGenome<GenomeType> getBestSoFar()
Gets the best genome found so far.

Returns:
The best genome found so far.

getMaxIterationsWithoutImprovement

public int getMaxIterationsWithoutImprovement()
Gets the maximum number of iterations to go without improvement before stopping.

Returns:
The current maximum.

getIterationsWithoutImprovement

public int getIterationsWithoutImprovement()
Gets the current number of iterations without improvement.

Returns:
The current iteration.

getPopulation

public Collection<EvaluatedGenome<GenomeType>> getPopulation()
Gets the population of genomes.

Returns:
The population of genomes.

setCostFunction

public void setCostFunction(CostFunction<? super GenomeType,? super CostParametersType> cost)
Sets the cost function.

Parameters:
cost - The new cost function.

setReproducer

public void setReproducer(Reproducer<GenomeType> reproducer)
Sets the reproducer.

Parameters:
reproducer - The new reproducer.

setBestSoFar

public void setBestSoFar(EvaluatedGenome<GenomeType> bestSoFar)
Sets the best genome found so far.

Parameters:
bestSoFar - The new best genome.

setMaxIterationsWithoutImprovement

public void setMaxIterationsWithoutImprovement(int maxIterationsWithoutImprovement)
Sets the maximum number of iterations to go without improvement before stopping.

Parameters:
maxIterationsWithoutImprovement - The new maximum.

setIterationsWithoutImprovement

public void setIterationsWithoutImprovement(int iterationsWithoutImprovement)
Sets the current number of iterations without improvement.

Parameters:
iterationsWithoutImprovement - The new iteration.

setPopulation

public void setPopulation(Collection<EvaluatedGenome<GenomeType>> population)
Sets the population of genomes.

Parameters:
population - The new population.

getInitialPopulation

public Collection<GenomeType> getInitialPopulation()
Getter for initialPopulation.

Returns:
The initial population of genomes.

setInitialPopulation

public void setInitialPopulation(Collection<GenomeType> initialPopulation)
Setter for initialPopulation.

Parameters:
initialPopulation - The initial population of genomes.

getPerformance

public NamedValue<Double> getPerformance()
Gets the performance, which is the cost of the best genome.

Specified by:
getPerformance in interface MeasurablePerformanceAlgorithm
Returns:
The performance of the algorithm.