gov.sandia.cognition.text.term.vector.weighter
Class CommonTermWeighterFactory

java.lang.Object
  extended by gov.sandia.cognition.text.term.vector.weighter.CommonTermWeighterFactory

public class CommonTermWeighterFactory
extends Object

A factory for well-known term weighting schemes.

Since:
3.0
Author:
Justin Basilico

Method Summary
static CompositeLocalGlobalTermWeighter createLogDominanceWeighter()
          Creates a log-dominance weighting scheme.
static CompositeLocalGlobalTermWeighter createLogEntropyWeighter()
          Creates a log-entropy weighting scheme.
static CompositeLocalGlobalTermWeighter createTFIDFWeighter()
          Creates a term-frequency inverse-document-frequency (TF-IDF) weighting scheme without any normalization.
static CompositeLocalGlobalTermWeighter createTFIDFWeighterWithUnitNormalization()
          Creates a term-frequency inverse-document-frequency (TF-IDF) weighting scheme with unit vector normalization (2-norm).
static CompositeLocalGlobalTermWeighter createTFWeighter()
          Creates a term-frequency (TF) weighting scheme.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

createTFWeighter

public static CompositeLocalGlobalTermWeighter createTFWeighter()
Creates a term-frequency (TF) weighting scheme. No global weight or normalizer is used.

Returns:
A new TF weighter.

createTFIDFWeighter

@PublicationReference(author="Wikipedia",
                      title="tf-idf",
                      type=WebPage,
                      url="http://en.wikipedia.org/wiki/tf-idf",
                      year=2009)
public static CompositeLocalGlobalTermWeighter createTFIDFWeighter()
Creates a term-frequency inverse-document-frequency (TF-IDF) weighting scheme without any normalization.

Returns:
A new TF-IDF weighter.
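
As an illustration of what an unnormalized TF-IDF weight means, the following is a minimal, self-contained sketch in plain Java. It does not use this library: the class TfIdfSketch and its methods are hypothetical names, and the IDF variant shown (natural log of corpus size over document frequency) is an assumption, since this page does not state which formula the returned weighter applies.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: NOT the library's implementation. All names are hypothetical.
class TfIdfSketch {

    /** Raw term frequency: how many times the term occurs in the document. */
    static int termFrequency(List<String> document, String term) {
        int count = 0;
        for (String token : document) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    /** Assumed IDF variant: natural log of (corpus size / document frequency). */
    static double inverseDocumentFrequency(List<List<String>> corpus, String term) {
        int documentFrequency = 0;
        for (List<String> document : corpus) {
            if (document.contains(term)) {
                documentFrequency++;
            }
        }
        // A term that appears in no document gets weight 0 to avoid dividing by zero.
        if (documentFrequency == 0) {
            return 0.0;
        }
        return Math.log((double) corpus.size() / documentFrequency);
    }

    /** Unnormalized TF-IDF: local weight (TF) times global weight (IDF). */
    static double tfIdf(List<List<String>> corpus, List<String> document, String term) {
        return termFrequency(document, term) * inverseDocumentFrequency(corpus, term);
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
            Arrays.asList("the", "cat", "sat"),
            Arrays.asList("the", "dog", "sat", "the"),
            Arrays.asList("the", "bird"));
        // "the" occurs in every document, so its IDF (and TF-IDF) is 0.
        System.out.println(tfIdf(corpus, corpus.get(1), "the"));
        // "dog" occurs in one of three documents, so its weight is 1 * ln(3).
        System.out.println(tfIdf(corpus, corpus.get(1), "dog"));
    }
}
```

Because no normalization is applied, longer documents tend to produce larger weights under this scheme.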

createTFIDFWeighterWithUnitNormalization

public static CompositeLocalGlobalTermWeighter createTFIDFWeighterWithUnitNormalization()
Creates a term-frequency inverse-document-frequency (TF-IDF) weighting scheme with unit vector normalization (2-norm).

Returns:
A new TF-IDF weighter with unit (2-norm) normalization.
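
Unit vector normalization (2-norm) rescales a weight vector so that its Euclidean length is 1. The sketch below shows that step alone in plain Java; UnitNormSketch and normalize are hypothetical names used for illustration, not part of this library, and returning a zero vector unchanged is a choice made for this sketch.

```java
import java.util.Arrays;

// Illustrative only: NOT the library's implementation. All names are hypothetical.
class UnitNormSketch {

    /** Returns a copy of the weights divided by their Euclidean (2-norm) length. */
    static double[] normalize(double[] weights) {
        double sumOfSquares = 0.0;
        for (double w : weights) {
            sumOfSquares += w * w;
        }
        double norm = Math.sqrt(sumOfSquares);
        // Sketch choice: a zero vector has no direction, so return it unchanged.
        if (norm == 0.0) {
            return weights.clone();
        }
        double[] result = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            result[i] = weights[i] / norm;
        }
        return result;
    }

    public static void main(String[] args) {
        // The vector (3, 4) has length 5, so this prints [0.6, 0.8].
        System.out.println(Arrays.toString(normalize(new double[] {3.0, 4.0})));
    }
}
```

After normalization, documents of different lengths become directly comparable, for example by a dot product that then equals cosine similarity.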

createLogEntropyWeighter

@PublicationReference(author="Susan T. Dumais",
                      title="Improving the retrieval of information from external sources",
                      year=1991,
                      type=Journal,
                      publication="Behavior Research Methods, Instruments, and Computers",
                      pages={229,236},
                      url="http://www.google.com/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fwww.psychonomic.org%2Fsearch%2Fview.cgi%3Fid%3D5145&ei=o7joSdGEHY-itgPLre3tAQ&usg=AFQjCNEvm6PZEL6_Hk3XThI6DQ-gGx9EnQ&sig2=-gjFzNroJQirwGtwjaJvgQ")
public static CompositeLocalGlobalTermWeighter createLogEntropyWeighter()
Creates a log-entropy weighting scheme.

Returns:
A new log-entropy weighter.
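
For orientation, here is a minimal sketch of the log-entropy formulation as it is commonly stated: a local weight of log(1 + tf) multiplied by a global weight of 1 + sum_j(p_j log p_j) / log(n), where p_j is the fraction of a term's occurrences that fall in document j and n is the number of documents. This is an assumption about the textbook form, not a description of what the returned weighter computes; LogEntropySketch and its methods are hypothetical names.

```java
// Illustrative only: NOT the library's implementation. All names are hypothetical.
class LogEntropySketch {

    /** Local weight: dampens raw term frequency with a logarithm. */
    static double localWeight(double termFrequency) {
        return Math.log(1.0 + termFrequency);
    }

    /**
     * Global weight for one term, given its occurrence count in each document.
     * Close to 1 when the term is concentrated in one document,
     * close to 0 when it is spread evenly across all documents.
     */
    static double globalWeight(double[] countsPerDocument) {
        int documentCount = countsPerDocument.length;
        double total = 0.0;
        for (double count : countsPerDocument) {
            total += count;
        }
        // Sketch choice: with fewer than two documents or no occurrences,
        // the entropy term is undefined, so apply no down-weighting.
        if (documentCount < 2 || total == 0.0) {
            return 1.0;
        }
        double sum = 0.0;
        for (double count : countsPerDocument) {
            if (count > 0.0) {
                double p = count / total;
                sum += p * Math.log(p);
            }
        }
        return 1.0 + sum / Math.log(documentCount);
    }

    /** Combined weight of a term in one document. */
    static double weight(double termFrequency, double[] countsPerDocument) {
        return localWeight(termFrequency) * globalWeight(countsPerDocument);
    }

    public static void main(String[] args) {
        // Term concentrated in a single document: global weight is 1.
        System.out.println(globalWeight(new double[] {5.0, 0.0, 0.0, 0.0}));
        // Term spread evenly: global weight is approximately 0.
        System.out.println(globalWeight(new double[] {2.0, 2.0, 2.0, 2.0}));
    }
}
```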

createLogDominanceWeighter

public static CompositeLocalGlobalTermWeighter createLogDominanceWeighter()
Creates a log-dominance weighting scheme.

Returns:
A new log-dominance weighter.