gov.sandia.cognition.text.term.relation
Class TermVectorSimilarityNetworkCreator

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.text.term.relation.TermVectorSimilarityNetworkCreator
All Implemented Interfaces:
CloneableSerializable, Serializable, Cloneable

public class TermVectorSimilarityNetworkCreator
extends AbstractCloneableSerializable

Creates term similarity networks by comparing vectors representing the terms.

Since:
3.0
Author:
Justin Basilico
See Also:
Serialized Form

Field Summary
static double DEFAULT_EFFECTIVE_ZERO
          The default effective zero value is 0.0.
protected  double effectiveZero
          The value to treat as zero.
protected  MatrixFactory<? extends Matrix> matrixFactory
          The matrix factory to create the matrix that backs the similarity network.
protected  SimilarityFunction<? super Vector,? super Vector> similarityFunction
          The similarity function between term vectors used to determine the similarity between two terms.
 
Constructor Summary
TermVectorSimilarityNetworkCreator()
          Creates a new TermVectorSimilarityNetworkCreator.
TermVectorSimilarityNetworkCreator(SimilarityFunction<? super Vector,? super Vector> similarityFunction)
          Creates a new TermVectorSimilarityNetworkCreator with the given similarity function.
TermVectorSimilarityNetworkCreator(SimilarityFunction<? super Vector,? super Vector> similarityFunction, double effectiveZero)
          Creates a new TermVectorSimilarityNetworkCreator with the given similarity function and effective zero.
TermVectorSimilarityNetworkCreator(SimilarityFunction<? super Vector,? super Vector> similarityFunction, double effectiveZero, MatrixFactory<? extends Matrix> matrixFactory)
          Creates a new TermVectorSimilarityNetworkCreator with the given similarity function, effective zero, and matrix factory.
 
Method Summary
 MatrixBasedTermSimilarityNetwork create(Collection<? extends Vectorizable> documents, TermIndex termIndex)
          Creates a new similarity network between the terms in the given documents.
 double getEffectiveZero()
          Gets the value to treat as zero.
 MatrixFactory<? extends Matrix> getMatrixFactory()
          Gets the matrix factory to create the matrix that backs the similarity network.
 SimilarityFunction<? super Vector,? super Vector> getSimilarityFunction()
          Gets the similarity function between term vectors used to determine the similarity between two terms.
 void setEffectiveZero(double effectiveZero)
          Sets the value to treat as zero.
 void setMatrixFactory(MatrixFactory<? extends Matrix> matrixFactory)
          Sets the matrix factory to create the matrix that backs the similarity network.
 void setSimilarityFunction(SimilarityFunction<? super Vector,? super Vector> similarityFunction)
          Sets the similarity function between term vectors used to determine the similarity between two terms.
 
Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable
clone
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_EFFECTIVE_ZERO

public static final double DEFAULT_EFFECTIVE_ZERO
The default effective zero value is 0.0.

See Also:
Constant Field Values

similarityFunction

protected SimilarityFunction<? super Vector,? super Vector> similarityFunction
The similarity function between term vectors used to determine the similarity between two terms.


effectiveZero

protected double effectiveZero
The value to treat as zero. Used to increase the sparseness of a similarity network.


matrixFactory

protected MatrixFactory<? extends Matrix> matrixFactory
The matrix factory to create the matrix that backs the similarity network.

Constructor Detail

TermVectorSimilarityNetworkCreator

public TermVectorSimilarityNetworkCreator()
Creates a new TermVectorSimilarityNetworkCreator.


TermVectorSimilarityNetworkCreator

public TermVectorSimilarityNetworkCreator(SimilarityFunction<? super Vector,? super Vector> similarityFunction)
Creates a new TermVectorSimilarityNetworkCreator with the given similarity function.

Parameters:
similarityFunction - The similarity function between term vectors used to determine the term similarity.

TermVectorSimilarityNetworkCreator

public TermVectorSimilarityNetworkCreator(SimilarityFunction<? super Vector,? super Vector> similarityFunction,
                                          double effectiveZero)
Creates a new TermVectorSimilarityNetworkCreator with the given similarity function and effective zero.

Parameters:
similarityFunction - The similarity function between term vectors used to determine the term similarity.
effectiveZero - The effective value to treat as zero. Used to increase the sparseness of a similarity network.

TermVectorSimilarityNetworkCreator

public TermVectorSimilarityNetworkCreator(SimilarityFunction<? super Vector,? super Vector> similarityFunction,
                                          double effectiveZero,
                                          MatrixFactory<? extends Matrix> matrixFactory)
Creates a new TermVectorSimilarityNetworkCreator with the given similarity function, effective zero, and matrix factory.

Parameters:
similarityFunction - The similarity function between term vectors used to determine the term similarity.
effectiveZero - The effective value to treat as zero. Used to increase the sparseness of a similarity network.
matrixFactory - The matrix factory used to create the similarity matrix.

Method Detail

create

public MatrixBasedTermSimilarityNetwork create(Collection<? extends Vectorizable> documents,
                                               TermIndex termIndex)
Creates a new similarity network between the terms in the given documents. First, the document vectors are assembled into a document-by-term matrix (one row per document, one column per term). Then the similarity function in this object is applied to each pair of column vectors, one column per term, to populate a term-by-term matrix. The resulting matrix will be symmetric.

Parameters:
documents - The term vector of each document, from which the similarity network is calculated.
termIndex - The index of terms that was used to create the term vectors for each document.
Returns:
A new similarity network for the terms in the given index calculated using the given vectors.
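
The steps described for create can be illustrated with a small standalone sketch. It does not use the library's Matrix, Vector, or SimilarityFunction types: plain double arrays stand in for them, cosine similarity is an assumed choice of similarity function, and the names TermSimilaritySketch, cosine, and termSimilarities are hypothetical. It is meant only to show the shape of the computation, not the actual implementation.

```java
import java.util.Arrays;
import java.util.List;

class TermSimilaritySketch {

    /** Cosine similarity of two equal-length vectors; 0.0 when either has zero norm. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Builds a symmetric term-by-term similarity matrix from per-document term vectors. */
    static double[][] termSimilarities(List<double[]> documents, int termCount) {
        int documentCount = documents.size();

        // Step 1: stack the document vectors into a document-by-term matrix.
        double[][] documentsByTerms = new double[documentCount][];
        for (int d = 0; d < documentCount; d++) {
            documentsByTerms[d] = Arrays.copyOf(documents.get(d), termCount);
        }

        // Step 2: read each column as the vector that represents one term.
        double[][] termVectors = new double[termCount][documentCount];
        for (int d = 0; d < documentCount; d++) {
            for (int t = 0; t < termCount; t++) {
                termVectors[t][d] = documentsByTerms[d][t];
            }
        }

        // Step 3: compare each pair of terms once and mirror the result,
        // which makes the term-by-term matrix symmetric.
        double[][] similarities = new double[termCount][termCount];
        for (int i = 0; i < termCount; i++) {
            for (int j = i; j < termCount; j++) {
                double s = cosine(termVectors[i], termVectors[j]);
                similarities[i][j] = s;
                similarities[j][i] = s;
            }
        }
        return similarities;
    }

    public static void main(String[] args) {
        // Two documents over a three-term index (illustrative data only).
        List<double[]> documents = Arrays.asList(
            new double[] {1.0, 0.0, 1.0},
            new double[] {1.0, 1.0, 0.0});
        for (double[] row : termSimilarities(documents, 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In this sketch the two terms that never occur in the same document end up with a similarity of zero, which is the kind of entry a sparse backing matrix can leave unstored.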

getSimilarityFunction

public SimilarityFunction<? super Vector,? super Vector> getSimilarityFunction()
Gets the similarity function between term vectors used to determine the similarity between two terms.

Returns:
The similarity function.

setSimilarityFunction

public void setSimilarityFunction(SimilarityFunction<? super Vector,? super Vector> similarityFunction)
Sets the similarity function between term vectors used to determine the similarity between two terms.

Parameters:
similarityFunction - The similarity function.

getEffectiveZero

public double getEffectiveZero()
Gets the value to treat as zero. Used to increase the sparseness of a similarity network.

Returns:
The threshold; values whose absolute value is below it are treated as zero.

setEffectiveZero

public void setEffectiveZero(double effectiveZero)
Sets the value to treat as zero. Used to increase the sparseness of a similarity network.

Parameters:
effectiveZero - The threshold; values whose absolute value is below it are treated as zero.
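
How an effective zero increases sparseness can be seen in a minimal standalone sketch. The class EffectiveZeroSketch and its methods are hypothetical and independent of the library. The strict "below" comparison follows the parameter description above, but whether the library treats a value exactly equal to the threshold as zero is an assumption here.

```java
class EffectiveZeroSketch {

    /** Returns a copy of the matrix in which entries with |value| < effectiveZero become 0.0. */
    static double[][] applyEffectiveZero(double[][] similarities, double effectiveZero) {
        double[][] result = new double[similarities.length][];
        for (int i = 0; i < similarities.length; i++) {
            result[i] = similarities[i].clone();
            for (int j = 0; j < result[i].length; j++) {
                // Assumption: strictly below the threshold counts as zero.
                if (Math.abs(result[i][j]) < effectiveZero) {
                    result[i][j] = 0.0;
                }
            }
        }
        return result;
    }

    /** Counts the entries a sparse matrix would actually have to store. */
    static int countNonZero(double[][] matrix) {
        int count = 0;
        for (double[] row : matrix) {
            for (double value : row) {
                if (value != 0.0) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Illustrative similarity values, not produced by the library.
        double[][] similarities = {
            {1.0, 0.05, -0.02},
            {0.05, 1.0, 0.4},
            {-0.02, 0.4, 1.0}
        };
        double[][] thresholded = applyEffectiveZero(similarities, 0.1);
        System.out.println("non-zero before: " + countNonZero(similarities));
        System.out.println("non-zero after:  " + countNonZero(thresholded));
    }
}
```

With the default effective zero of 0.0 nothing is dropped, since no absolute value is below zero. A larger threshold trades small similarities for a sparser network.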

getMatrixFactory

public MatrixFactory<? extends Matrix> getMatrixFactory()
Gets the matrix factory to create the matrix that backs the similarity network.

Returns:
The matrix factory.

setMatrixFactory

public void setMatrixFactory(MatrixFactory<? extends Matrix> matrixFactory)
Sets the matrix factory to create the matrix that backs the similarity network.

Parameters:
matrixFactory - The matrix factory.