gov.sandia.cognition.learning.data
Class RandomDataPartitioner<DataType>

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.util.AbstractRandomized
          extended by gov.sandia.cognition.learning.data.RandomDataPartitioner<DataType>
Type Parameters:
DataType - The type of data to partition.
All Implemented Interfaces:
DataPartitioner<DataType>, RandomizedDataPartitioner<DataType>, CloneableSerializable, Randomized, Serializable, Cloneable

public class RandomDataPartitioner<DataType>
extends AbstractRandomized
implements RandomizedDataPartitioner<DataType>

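RandomDataPartitioner is a randomized data partitioner: it takes a collection of data and randomly splits it into a training set and a testing set. The share of data placed in the training set is fixed by the training percentage, a value greater than 0.0 and less than 1.0 (for example, 0.5 for the 50% default).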
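
The idea of a fixed-percentage random split can be sketched with java.util alone. This is an illustration, not the library's implementation: the class RandomSplitSketch, its Split holder (a stand-in for PartitionedDataset), and the rule that the training size is the floor of n * trainingPercent are all assumptions made for the example. Range validation of trainingPercent is left out here for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: not the Cognitive Foundry implementation.
class RandomSplitSketch {

    /** Holds the two parts of a split (hypothetical stand-in for PartitionedDataset). */
    static final class Split<T> {
        final List<T> training;
        final List<T> testing;

        Split(List<T> training, List<T> testing) {
            this.training = training;
            this.testing = testing;
        }
    }

    /**
     * Shuffles a copy of the data and cuts it at the training percentage.
     * Assumption: the training size is floor(n * trainingPercent).
     */
    static <T> Split<T> split(Collection<? extends T> data, double trainingPercent, Random random) {
        List<T> shuffled = new ArrayList<T>(data);
        Collections.shuffle(shuffled, random);
        int trainingCount = (int) (shuffled.size() * trainingPercent);
        List<T> training = new ArrayList<T>(shuffled.subList(0, trainingCount));
        List<T> testing = new ArrayList<T>(shuffled.subList(trainingCount, shuffled.size()));
        return new Split<T>(training, testing);
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        Split<Integer> result = split(data, 0.5, new Random(42L));
        System.out.println("training size: " + result.training.size());
        System.out.println("testing size:  " + result.testing.size());
    }
}
```

Every input element ends up in exactly one of the two lists, and the input collection itself is left unmodified because the shuffle works on a copy.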

Since:
2.0
Author:
Justin Basilico
See Also:
Serialized Form

Field Summary
static double DEFAULT_TRAINING_PERCENT
          The default percentage of training data is 50%.
protected  double trainingPercent
          The percentage of training data.
 
Fields inherited from class gov.sandia.cognition.util.AbstractRandomized
random
 
Constructor Summary
RandomDataPartitioner()
          Creates a new instance of RandomDataPartitioner.
RandomDataPartitioner(double trainingPercent, Random random)
          Creates a new instance of RandomDataPartitioner with the given training percentage and random number generator.
 
Method Summary
protected static void checkTrainingPercent(double trainingPercent)
          Checks that the training percent is greater than 0.0 and less than 1.0.
 PartitionedDataset<DataType> createPartition(Collection<? extends DataType> data)
          Randomly partitions the given data into a training and testing set.
static <DataType> PartitionedDataset<DataType> createPartition(Collection<? extends DataType> data, double trainingPercent, Random random)
          Randomly partitions the given data into a training and testing set.
 double getTrainingPercent()
          Gets the percentage of data to put in the training partition.
 void setTrainingPercent(double trainingPercent)
          Sets the percentage of data to put in the training partition.
 
Methods inherited from class gov.sandia.cognition.util.AbstractRandomized
clone, getRandom, setRandom
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.util.Randomized
getRandom, setRandom
 

Field Detail

DEFAULT_TRAINING_PERCENT

public static final double DEFAULT_TRAINING_PERCENT
The default percentage of training data is 50%.

See Also:
Constant Field Values

trainingPercent

protected double trainingPercent
The percentage of training data.

Constructor Detail

RandomDataPartitioner

public RandomDataPartitioner()
Creates a new instance of RandomDataPartitioner.


RandomDataPartitioner

public RandomDataPartitioner(double trainingPercent,
                             Random random)
Creates a new instance of RandomDataPartitioner with the given training percentage and random number generator.

Parameters:
trainingPercent - The percentage of training data.
random - The Random object to use.

Method Detail

createPartition

public PartitionedDataset<DataType> createPartition(Collection<? extends DataType> data)
Randomly partitions the given data into a training and testing set.

Specified by:
createPartition in interface DataPartitioner<DataType>
Parameters:
data - The data to partition.
Returns:
The data partitioned according to the training percentage.

createPartition

public static <DataType> PartitionedDataset<DataType> createPartition(Collection<? extends DataType> data,
                                                                      double trainingPercent,
                                                                      Random random)
Randomly partitions the given data into a training and testing set.

Type Parameters:
DataType - The type of data to partition.
Parameters:
data - The data to partition.
trainingPercent - The percentage of data to put in the training partition. Must be greater than 0.0 and less than 1.0.
random - The random number generator to use.
Returns:
The data partitioned according to the training percentage.
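
Because the caller supplies the random number generator, passing a Random built from a fixed seed is the usual way to make a randomized step repeatable. The sketch below shows that property with java.util only. It does not call the library, and the class and method names (SeededShuffleSketch, shuffledCopy) are hypothetical. Whether a given partitioner is exactly repeatable also depends on its internal algorithm, which is not described on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration using only java.util; it does not call the library.
class SeededShuffleSketch {

    /** Returns a shuffled copy of data, driven entirely by the supplied Random. */
    static <T> List<T> shuffledCopy(List<T> data, Random random) {
        List<T> copy = new ArrayList<T>(data);
        Collections.shuffle(copy, random);
        return copy;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d", "e", "f");
        // Two generators built from the same seed produce the same sequence of draws.
        List<String> first = shuffledCopy(data, new Random(123L));
        List<String> second = shuffledCopy(data, new Random(123L));
        System.out.println("same seed, same order: " + first.equals(second));
    }
}
```

Reusing one Random instance across calls, rather than creating a fresh one from the same seed each time, yields a different order on each call, since the generator's state advances.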

getTrainingPercent

public double getTrainingPercent()
Gets the percentage of data to put in the training partition.

Returns:
The percentage of data to put in the training partition.

setTrainingPercent

public void setTrainingPercent(double trainingPercent)
Sets the percentage of data to put in the training partition. Must be greater than 0.0 and less than 1.0.

Parameters:
trainingPercent - The percentage of data to put in the training partition.

checkTrainingPercent

protected static final void checkTrainingPercent(double trainingPercent)
Checks that the training percent is greater than 0.0 and less than 1.0.

Parameters:
trainingPercent - The percentage of data to put in the training partition.
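
A check of this kind might look like the following minimal sketch. The class name TrainingPercentCheckSketch is hypothetical, and throwing IllegalArgumentException is an assumption, since this page does not say how the library reports an invalid value.

```java
// Hypothetical illustration; the exception type is an assumption.
class TrainingPercentCheckSketch {

    /** Rejects values outside the open interval (0.0, 1.0), including NaN. */
    static void checkTrainingPercent(double trainingPercent) {
        // Written as a negated conjunction so that NaN also fails the check.
        if (!(trainingPercent > 0.0 && trainingPercent < 1.0)) {
            throw new IllegalArgumentException(
                "trainingPercent must be greater than 0.0 and less than 1.0, got "
                    + trainingPercent);
        }
    }

    public static void main(String[] args) {
        checkTrainingPercent(0.5); // accepted: strictly inside the interval
        try {
            checkTrainingPercent(1.0); // rejected: the upper bound is exclusive
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Both bounds are exclusive, so 0.0 and 1.0 are rejected along with values outside the interval.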