gov.sandia.cognition.statistics.distribution
Class ChineseRestaurantProcess

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.statistics.distribution.ChineseRestaurantProcess
All Implemented Interfaces:
Vectorizable, ClosedFormComputableDiscreteDistribution<Vector>, ClosedFormComputableDistribution<Vector>, ClosedFormDistribution<Vector>, ComputableDistribution<Vector>, DiscreteDistribution<Vector>, Distribution<Vector>, DistributionWithMean<Vector>, CloneableSerializable, Serializable, Cloneable
Direct Known Subclasses:
ChineseRestaurantProcess.PMF

@PublicationReferences(references={@PublicationReference(author="Michael I. Jordan",title="Dirichlet Processes, Chinese Restaurant Processes and All That",year=2005,type=Conference,publication="NIPS Tutorial",url="http://www.cs.berkeley.edu/~jordan/nips-tutorial05.ps"),@PublicationReference(author="Wikipedia",title="http://en.wikipedia.org/wiki/Chinese_restaurant_process",year=2010,type=WebPage,url="http://en.wikipedia.org/wiki/Chinese_restaurant_process",notes="Very poor, unclear description.")})
public class ChineseRestaurantProcess
extends AbstractCloneableSerializable
implements ClosedFormComputableDiscreteDistribution<Vector>

A Chinese Restaurant Process is a discrete stochastic process that partitions data points into clusters. It is usually described by imagining a restaurant with an infinite number of tables. The first customer sits at an empty table. Each subsequent customer either joins an existing table, with probability proportional to the number of customers already seated there, or starts a new table with some nonzero probability. This results in a Dirichlet distribution with a variable number of parameters; the number of occupied tables grows approximately as O(log(n)), where n is the number of customers to assign to tables.
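
The seating scheme described above can be sketched in plain Java without the library. This is a minimal illustration under stated assumptions, not this class's implementation: it assumes the usual formulation in which an existing table with n_k customers is chosen with weight n_k and a new table with weight alpha. The class and method names (CrpSeatingSketch, simulate, expectedTables) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative simulation of the seating scheme; not the library's code. */
final class CrpSeatingSketch {

    /** Seats customers one at a time and returns the occupancy of each table. */
    static List<Integer> simulate(int numCustomers, double alpha, Random random) {
        if (numCustomers <= 0 || alpha <= 0.0) {
            throw new IllegalArgumentException(
                "numCustomers and alpha must be greater than zero");
        }
        List<Integer> tables = new ArrayList<>();
        for (int seated = 0; seated < numCustomers; seated++) {
            // Assumed weights: n_k for existing table k, alpha for a new table.
            double u = random.nextDouble() * (seated + alpha);
            int chosen = tables.size(); // falls through to "open a new table"
            double cumulative = 0.0;
            for (int k = 0; k < tables.size(); k++) {
                cumulative += tables.get(k);
                if (u < cumulative) {
                    chosen = k;
                    break;
                }
            }
            if (chosen == tables.size()) {
                tables.add(1);
            } else {
                tables.set(chosen, tables.get(chosen) + 1);
            }
        }
        return tables;
    }

    /** Expected number of occupied tables: sum of alpha / (alpha + i), i = 0..n-1. */
    static double expectedTables(int numCustomers, double alpha) {
        double expected = 0.0;
        for (int i = 0; i < numCustomers; i++) {
            expected += alpha / (alpha + i);
        }
        return expected;
    }

    public static void main(String[] args) {
        List<Integer> tables = simulate(1000, 1.0, new Random(42L));
        System.out.println("occupied tables: " + tables.size());
        System.out.println("expected tables: " + expectedTables(1000, 1.0));
    }
}
```

Under the assumed weights, the expected table count is a partial harmonic-style sum, which is where the approximately logarithmic growth in n comes from.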

Since:
3.0
Author:
Kevin R. Dixon
See Also:
Serialized Form

Nested Class Summary
static class ChineseRestaurantProcess.PMF
          PMF of the Chinese Restaurant Process
 
Field Summary
protected  double alpha
          CRP concentration parameter, must be greater than zero.
static double DEFAULT_ALPHA
          Default concentration parameter, 1.0.
static int DEFAULT_NUM_CUSTOMERS
          Default number of customers, 2.
protected  int numCustomers
          Total number of customers that we will arrange around tables, must be greater than zero.
 
Constructor Summary
ChineseRestaurantProcess()
          Creates a new instance of ChineseRestaurantProcess
ChineseRestaurantProcess(ChineseRestaurantProcess other)
          Copy constructor
ChineseRestaurantProcess(double alpha, int numCustomers)
          Creates a new instance of ChineseRestaurantProcess
 
Method Summary
 ChineseRestaurantProcess clone()
          This makes public the clone method on the Object class and removes the exception that it throws.
 void convertFromVector(Vector parameters)
          Converts the object from a Vector of parameters.
 Vector convertToVector()
          Converts the object to a vector.
 double getAlpha()
          Getter for alpha.
 Set<Vector> getDomain()
          Returns an object that allows iteration through the domain (x-axis, independent variable) of the Distribution.
 int getDomainSize()
          Gets the size of the domain.
 Vector getMean()
          Gets the arithmetic mean, or "first moment" or "expectation", of the distribution.
 int getNumCustomers()
          Getter for numCustomers.
 ChineseRestaurantProcess.PMF getProbabilityFunction()
          Gets the distribution function associated with this Distribution, either the PDF or PMF.
 Vector sample(Random random)
          Draws a single random sample from the distribution.
 ArrayList<Vector> sample(Random random, int numSamples)
          Draws multiple random samples from the distribution.
static int sampleNextCustomer(Collection<Integer> tables, int numCustomers, double alpha, Random random)
          Determines where the next customer sits, given the number of customers already sitting at the various tables and the concentration parameter alpha.
 void setAlpha(double alpha)
          Setter for alpha.
 void setNumCustomers(int numCustomers)
          Setter for numCustomers.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_ALPHA

public static final double DEFAULT_ALPHA
Default concentration parameter, 1.0.

See Also:
Constant Field Values

DEFAULT_NUM_CUSTOMERS

public static final int DEFAULT_NUM_CUSTOMERS
Default number of customers, 2.

See Also:
Constant Field Values

alpha

protected double alpha
CRP concentration parameter, must be greater than zero.


numCustomers

protected int numCustomers
Total number of customers that we will arrange around tables, must be greater than zero.

Constructor Detail

ChineseRestaurantProcess

public ChineseRestaurantProcess()
Creates a new instance of ChineseRestaurantProcess


ChineseRestaurantProcess

public ChineseRestaurantProcess(double alpha,
                                int numCustomers)
Creates a new instance of ChineseRestaurantProcess

Parameters:
alpha - CRP concentration parameter, must be greater than zero.
numCustomers - Total number of customers that we will arrange around tables, must be greater than zero.

ChineseRestaurantProcess

public ChineseRestaurantProcess(ChineseRestaurantProcess other)
Copy constructor

Parameters:
other - CRP to copy

Method Detail

clone

public ChineseRestaurantProcess clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to automatically create a clone of the exact type of object that the clone is called on and to copy all primitives but to keep all references, which means it is a shallow copy. Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, to target the most common use case for creating a copy of the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method. Note: The contract of this method is that you must use super.clone() as the basis for your implementation.

Specified by:
clone in interface Vectorizable
Specified by:
clone in interface CloneableSerializable
Overrides:
clone in class AbstractCloneableSerializable
Returns:
A clone of this object.
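
The "smart copy" contract described above can be illustrated with a small stand-alone class. This sketch uses java.lang.Cloneable directly as a stand-in for AbstractCloneableSerializable, and the class TableCounts with its fields is hypothetical; it only shows the pattern of starting from super.clone() and deepening the fields that need it.

```java
import java.util.ArrayList;

/** Hypothetical class standing in for an extension of AbstractCloneableSerializable. */
class TableCounts implements Cloneable {
    private double alpha = 1.0;
    private ArrayList<Integer> counts = new ArrayList<>();

    void addTable(int count) { counts.add(count); }
    int numTables() { return counts.size(); }
    double getAlpha() { return alpha; }

    @Override
    public TableCounts clone() {
        try {
            // super.clone() copies primitives and keeps references (shallow copy).
            TableCounts copy = (TableCounts) super.clone();
            // Deepen only the mutable reference field so the copies are independent.
            copy.counts = new ArrayList<>(this.counts);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError("Cloneable is implemented", e);
        }
    }

    public static void main(String[] args) {
        TableCounts original = new TableCounts();
        original.addTable(2);
        TableCounts copy = original.clone();
        copy.addTable(1);
        System.out.println(original.numTables() + " vs " + copy.numTables());
    }
}
```

Without the extra line that copies counts, both objects would share one list, and adding a table to the copy would also change the original.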

getMean

public Vector getMean()
Description copied from interface: DistributionWithMean
Gets the arithmetic mean, or "first moment" or "expectation", of the distribution.

Specified by:
getMean in interface DistributionWithMean<Vector>
Returns:
Mean of the distribution.

sample

public Vector sample(Random random)
Description copied from interface: Distribution
Draws a single random sample from the distribution.

Specified by:
sample in interface Distribution<Vector>
Parameters:
random - Random-number generator to use in order to generate random numbers.
Returns:
Sample drawn according to this distribution.

sampleNextCustomer

public static int sampleNextCustomer(Collection<Integer> tables,
                                     int numCustomers,
                                     double alpha,
                                     Random random)
Determines where the next customer sits, given the number of customers already sitting at the various tables and the concentration parameter alpha.

Parameters:
tables - Number of customers sitting at the various tables.
numCustomers - Number of customers already sitting, should equal the sum of "tables".
alpha - Concentration parameter.
random - Random number generator.
Returns:
Index of the table where the next customer sits, according to the Chinese Restaurant Process.
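
As a rough illustration of how the parameters relate, the sketch below computes the seating probabilities for the next customer under the usual CRP rule: table k is chosen with probability tables[k] / (numCustomers + alpha), and a new table with probability alpha / (numCustomers + alpha). This is an assumption about the rule, not a copy of this method's code. The class name and the convention that the last entry represents a new table are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Illustrative only; names are hypothetical and not part of the library. */
final class NextCustomerProbabilities {

    /**
     * Entry k (k < tables.size()) is the probability of joining existing table k;
     * the final entry is the probability of opening a new table.
     */
    static double[] compute(Collection<Integer> tables, int numCustomers, double alpha) {
        if (alpha <= 0.0) {
            throw new IllegalArgumentException("alpha must be greater than zero");
        }
        int sum = 0;
        for (int count : tables) {
            sum += count;
        }
        if (sum != numCustomers) {
            throw new IllegalArgumentException("numCustomers must equal the sum of tables");
        }
        double[] probabilities = new double[tables.size() + 1];
        double denominator = numCustomers + alpha;
        int k = 0;
        for (int count : tables) {
            probabilities[k++] = count / denominator;
        }
        probabilities[k] = alpha / denominator; // new-table probability
        return probabilities;
    }

    public static void main(String[] args) {
        List<Integer> tables = Arrays.asList(3, 1);
        System.out.println(Arrays.toString(compute(tables, 4, 1.0)));
    }
}
```

With tables of sizes 3 and 1 and alpha = 1.0, the three probabilities are 3/5, 1/5, and 1/5, so a larger alpha shifts mass toward opening new tables.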

sample

public ArrayList<Vector> sample(Random random,
                                int numSamples)
Description copied from interface: Distribution
Draws multiple random samples from the distribution. It is generally more efficient to use this multiple-sample method than multiple calls of the single-sample method. (But not always.)

Specified by:
sample in interface Distribution<Vector>
Parameters:
random - Random-number generator to use in order to generate random numbers.
numSamples - Number of samples to draw from the distribution.
Returns:
Samples drawn according to this distribution.

getAlpha

public double getAlpha()
Getter for alpha.

Returns:
CRP concentration parameter, must be greater than zero.

setAlpha

public void setAlpha(double alpha)
Setter for alpha.

Parameters:
alpha - CRP concentration parameter, must be greater than zero.

getNumCustomers

public int getNumCustomers()
Getter for numCustomers.

Returns:
Total number of customers that we will arrange around tables, must be greater than zero.

setNumCustomers

public void setNumCustomers(int numCustomers)
Setter for numCustomers.

Parameters:
numCustomers - Total number of customers that we will arrange around tables, must be greater than zero.

getDomain

public Set<Vector> getDomain()
Description copied from interface: DiscreteDistribution
Returns an object that allows iteration through the domain (x-axis, independent variable) of the Distribution.

Specified by:
getDomain in interface DiscreteDistribution<Vector>
Returns:
Collection that enumerates each value that the domain can take.

getDomainSize

public int getDomainSize()
Description copied from interface: DiscreteDistribution
Gets the size of the domain.

Specified by:
getDomainSize in interface DiscreteDistribution<Vector>
Returns:
The size of the domain.

getProbabilityFunction

public ChineseRestaurantProcess.PMF getProbabilityFunction()
Description copied from interface: ComputableDistribution
Gets the distribution function associated with this Distribution, either the PDF or PMF.

Specified by:
getProbabilityFunction in interface ComputableDistribution<Vector>
Specified by:
getProbabilityFunction in interface DiscreteDistribution<Vector>
Returns:
Distribution function associated with this Distribution.

convertToVector

public Vector convertToVector()
Description copied from interface: Vectorizable
Converts the object to a vector.

Specified by:
convertToVector in interface Vectorizable
Returns:
The Vector form of the object.

convertFromVector

public void convertFromVector(Vector parameters)
Description copied from interface: Vectorizable
Converts the object from a Vector of parameters.

Specified by:
convertFromVector in interface Vectorizable
Parameters:
parameters - The parameters to incorporate.