gov.sandia.cognition.learning.algorithm.pca
Class ThinSingularValueDecomposition

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.learning.algorithm.pca.AbstractPrincipalComponentsAnalysis
          extended by gov.sandia.cognition.learning.algorithm.pca.ThinSingularValueDecomposition
All Implemented Interfaces:
BatchLearner<Collection<Vector>,PrincipalComponentsAnalysisFunction>, PrincipalComponentsAnalysis, CloneableSerializable, Serializable, Cloneable

@CodeReview(reviewer="Kevin R. Dixon",
            date="2008-07-23",
            changesNeeded=false,
            comments={"Minor changes to javadoc.","Looks fine."})
public class ThinSingularValueDecomposition
extends AbstractPrincipalComponentsAnalysis

Computes the "thin" singular value decomposition of a dataset. That is, we find the top "numComponents" left singular vectors of the data matrix by using the EigenvectorPowerIteration algorithm to find successive components. This method converges quickly, produces accurate estimates of the eigenvectors, and is efficient in both computation and memory. In my experience, this approach has been uniformly superior to the GeneralizedHebbianAlgorithm approach to computing singular vectors (in terms of accuracy, memory, and computation time).
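
The description above names the technique but not its mechanics. As a rough illustration, the following self-contained sketch finds successive components by power iteration followed by deflation. It is not the library's implementation: the class name ThinSvdSketch, the method topComponents, the tolerance, the iteration cap, the fixed random seed, and the mean-centering step are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch only; not the Cognitive Foundry implementation. */
final class ThinSvdSketch {
    private static final double TOLERANCE = 1e-12;  // assumed stopping threshold
    private static final int MAX_ITERATIONS = 1000; // assumed iteration cap

    /** Returns numComponents unit-length directions, one per row of the result. */
    static double[][] topComponents(double[][] data, int numComponents) {
        if (data.length == 0 || numComponents <= 0) {
            throw new IllegalArgumentException("need data and numComponents > 0");
        }
        int d = data[0].length;

        // Assumption: center the data on its mean, as is usual for PCA.
        double[] mean = new double[d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) mean[j] += row[j] / data.length;
        }
        double[][] x = new double[data.length][d];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < d; j++) x[i][j] = data[i][j] - mean[j];
        }

        Random rng = new Random(42L); // fixed seed so the sketch is repeatable
        double[][] components = new double[numComponents][];
        for (int c = 0; c < numComponents; c++) {
            // Random unit-length starting vector.
            double[] v = new double[d];
            for (int j = 0; j < d; j++) v[j] = rng.nextGaussian();
            scale(v, 1.0 / Math.sqrt(dot(v, v)));

            for (int iter = 0; iter < MAX_ITERATIONS; iter++) {
                // w = X^T (X v), accumulated row by row without forming X^T X.
                double[] w = new double[d];
                for (double[] row : x) {
                    double s = dot(row, v);
                    for (int j = 0; j < d; j++) w[j] += s * row[j];
                }
                double norm = Math.sqrt(dot(w, w));
                if (norm == 0.0) break; // no variance left in the deflated data
                scale(w, 1.0 / norm);
                double change = 1.0 - Math.abs(dot(w, v));
                v = w;
                if (change < TOLERANCE) break; // direction has stopped moving
            }
            components[c] = v;

            // Deflation: remove the found direction from every row.
            for (double[] row : x) {
                double s = dot(row, v);
                for (int j = 0; j < d; j++) row[j] -= s * v[j];
            }
        }
        return components;
    }

    private static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) sum += a[j] * b[j];
        return sum;
    }

    private static void scale(double[] a, double factor) {
        for (int j = 0; j < a.length; j++) a[j] *= factor;
    }

    public static void main(String[] args) {
        double[][] data = {
            {3, 4}, {-3, -4}, {6, 8}, {-6, -8}, {-0.4, 0.3}, {0.4, -0.3}
        };
        for (double[] component : topComponents(data, 2)) {
            System.out.println(Arrays.toString(component));
        }
    }
}
```

With the samples stored as rows, the directions found here are eigenvectors of X^T X; they correspond to left singular vectors when the samples are instead taken as the columns of the data matrix. Only the data and one working vector per component are held in memory, which is where the memory savings of this kind of approach come from. The sign of each returned direction is arbitrary.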

Since:
2.0
Author:
Kevin R. Dixon
See Also:
EigenvectorPowerIteration, Serialized Form

Constructor Summary
ThinSingularValueDecomposition(int numComponents)
          Creates a new instance of ThinSingularValueDecomposition
ThinSingularValueDecomposition(int numComponents, PrincipalComponentsAnalysisFunction learned)
          Creates a new instance of ThinSingularValueDecomposition
 
Method Summary
 PrincipalComponentsAnalysisFunction learn(Collection<Vector> data)
          Creates a PrincipalComponentsAnalysisFunction based on the number of components and the given data.
static PrincipalComponentsAnalysisFunction learn(Collection<Vector> data, int numComponents)
          Creates a PrincipalComponentsAnalysisFunction based on the number of components and the given data.
 
Methods inherited from class gov.sandia.cognition.learning.algorithm.pca.AbstractPrincipalComponentsAnalysis
clone, getNumComponents, getResult, setNumComponents, setResult
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ThinSingularValueDecomposition

public ThinSingularValueDecomposition(int numComponents)
Creates a new instance of ThinSingularValueDecomposition

Parameters:
numComponents - Number of components to extract from the data, must be greater than zero

ThinSingularValueDecomposition

public ThinSingularValueDecomposition(int numComponents,
                                      PrincipalComponentsAnalysisFunction learned)
Creates a new instance of ThinSingularValueDecomposition

Parameters:
numComponents - Number of components to extract from the data, must be greater than zero
learned - Vector function that maps the input space onto a numComponents-dimensional Vector representing the directions of maximal variance (information gain). The i-th row in the matrix approximates the i-th column of the "U" matrix of the Singular Value Decomposition.

Method Detail

learn

public PrincipalComponentsAnalysisFunction learn(Collection<Vector> data)
Creates a PrincipalComponentsAnalysisFunction based on the number of components and the given data. This will return the top "numComponents" left singular vectors of the data.

Parameters:
data - Dataset for which to compute the PCA, with each Vector of equal dimension
Returns:
Vector function that maps the input space onto a numComponents-dimensional Vector representing the directions of maximal variance (information gain). The i-th row in the matrix approximates the i-th column of the "U" matrix of the Singular Value Decomposition.

learn

public static PrincipalComponentsAnalysisFunction learn(Collection<Vector> data,
                                                        int numComponents)
Creates a PrincipalComponentsAnalysisFunction based on the number of components and the given data. This will return the top "numComponents" left singular vectors of the data.

Parameters:
data - Dataset for which to compute the PCA, with each Vector of equal dimension
numComponents - Number of components to extract from the data, must be greater than zero
Returns:
Vector function that maps the input space onto a numComponents-dimensional Vector representing the directions of maximal variance (information gain). The i-th row in the matrix approximates the i-th column of the "U" matrix of the Singular Value Decomposition.
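
To make the returned mapping concrete, the sketch below shows one plausible reading of it: each row of a component matrix is dotted with the input to give one output coordinate. The class PcaProjectionSketch and its project method are hypothetical names for this example, and subtracting a stored data mean before projecting is an assumption, not something this page states about PrincipalComponentsAnalysisFunction.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the PrincipalComponentsAnalysisFunction class. */
final class PcaProjectionSketch {

    /**
     * Maps an input onto components.length coordinates.
     * Row i of components is the i-th direction of maximal variance.
     */
    static double[] project(double[][] components, double[] mean, double[] input) {
        double[] output = new double[components.length];
        for (int i = 0; i < components.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < input.length; j++) {
                // Assumption: the input is centered on the data mean first.
                sum += components[i][j] * (input[j] - mean[j]);
            }
            output[i] = sum;
        }
        return output;
    }

    public static void main(String[] args) {
        double[][] components = {{0.6, 0.8}}; // one unit-length direction
        double[] mean = {1.0, 1.0};
        double[] input = {4.0, 5.0};
        System.out.println(Arrays.toString(project(components, mean, input)));
    }
}
```

The output has as many entries as there are component rows, which matches the numComponents-dimensional result described above.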