gov.sandia.cognition.text.term.vector.weighter.local
Class TermFrequencyLocalTermWeighter
java.lang.Object
gov.sandia.cognition.util.AbstractCloneableSerializable
gov.sandia.cognition.math.matrix.DefaultVectorFactoryContainer
gov.sandia.cognition.text.term.vector.weighter.local.AbstractLocalTermWeighter
gov.sandia.cognition.text.term.vector.weighter.local.TermFrequencyLocalTermWeighter
- All Implemented Interfaces:
- VectorFactoryContainer, LocalTermWeighter, CloneableSerializable, Serializable, Cloneable
@PublicationReference(author="Wikipedia",
title="tf-idf",
type=WebPage,
url="http://en.wikipedia.org/wiki/tf-idf",
year=2009)
public class TermFrequencyLocalTermWeighter
- extends AbstractLocalTermWeighter
Local weighting for term frequency. The input is assumed to be a vector of
the number of times each term appears in the document. If n_(i,j) is the number of
times term i appears in document j, the term frequency for term i in document
j is:
tf_(i,j) = n_(i,j) / (sum_k n_(k,j))
- Since:
- 3.0
- Author:
- Justin Basilico
- See Also:
- Serialized Form
TermFrequencyLocalTermWeighter
public TermFrequencyLocalTermWeighter()
- Creates a new
TermFrequencyLocalTermWeighter
.
TermFrequencyLocalTermWeighter
public TermFrequencyLocalTermWeighter(VectorFactory<? extends Vector> vectorFactory)
- Creates a new
TermFrequencyLocalTermWeighter
with the given vector factory.
- Parameters:
vectorFactory
- The vector factory to use.
computeLocalWeights
public Vector computeLocalWeights(Vector counts)
- Description copied from interface:
LocalTermWeighter
- Computes the new local weights for a given document.
- Parameters:
counts
- The vector of term counts for the document to compute local weights for.
- Returns:
- The local weight vector for the document.
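The term frequency formula in the class description can be illustrated with a minimal sketch in plain Java. This is not the library's implementation: the class TermFrequencySketch and the method termFrequencies are hypothetical names, a double[] stands in for Vector, and returning an all-zero count vector unchanged is an assumption about how an empty document might be handled.

```java
import java.util.Arrays;

public class TermFrequencySketch {

    /**
     * Divides each term count by the sum of all counts in the document,
     * i.e. tf_(i,j) = n_(i,j) / (sum_k n_(k,j)).
     * Assumption: if the counts sum to zero, a copy is returned unchanged.
     */
    public static double[] termFrequencies(double[] counts) {
        double total = 0.0;
        for (double count : counts) {
            total += count;
        }

        // Work on a copy so the caller's counts are not modified.
        double[] result = Arrays.copyOf(counts, counts.length);
        if (total != 0.0) {
            for (int i = 0; i < result.length; i++) {
                result[i] /= total;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A document in which term 0 appears twice and terms 1 and 2 once each.
        double[] tf = termFrequencies(new double[] {2.0, 1.0, 1.0});
        System.out.println(Arrays.toString(tf)); // prints [0.5, 0.25, 0.25]
    }
}
```

With the library itself, the equivalent step would presumably be a call to computeLocalWeights(counts) on a TermFrequencyLocalTermWeighter, where counts is a Vector of per-term counts for one document.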