gov.sandia.cognition.text.term.vector.weighter.global
Class AbstractEntropyBasedGlobalTermWeighter

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.text.term.vector.AbstractVectorSpaceModel
          extended by gov.sandia.cognition.text.term.vector.weighter.global.AbstractGlobalTermWeighter
              extended by gov.sandia.cognition.text.term.vector.weighter.global.AbstractFrequencyBasedGlobalTermWeighter
                  extended by gov.sandia.cognition.text.term.vector.weighter.global.AbstractEntropyBasedGlobalTermWeighter
All Implemented Interfaces:
VectorFactoryContainer, VectorSpaceModel, GlobalTermWeighter, CloneableSerializable, Serializable, Cloneable
Direct Known Subclasses:
DominanceGlobalTermWeighter, EntropyGlobalTermWeighter

public abstract class AbstractEntropyBasedGlobalTermWeighter
extends AbstractFrequencyBasedGlobalTermWeighter

An abstract implementation of a global term weighting scheme that keeps track of the sum of the entropy term (f_ij * log(f_ij)) over all documents. Entropy-based global term weighting methods use this running sum as a speed-up, so that their weights can be computed incrementally.
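The incremental bookkeeping described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the library's implementation: EntropySumSketch and its methods are hypothetical names, documents are plain double[] count arrays rather than Vector instances, and f_ij is assumed to be the raw count of term i in document j.

```java
import java.util.Arrays;

/** Hypothetical stand-in (not a library class): keeps sum_j f_ij * log(f_ij) per term. */
class EntropySumSketch {
    // One running sum per term index i; grows as larger documents arrive.
    private double[] termEntropiesSum = new double[0];
    private int documentCount = 0;

    /** Adds one document's term counts to the running sums. */
    void add(double[] counts) {
        ensureDimensionality(counts.length);
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0.0) {
                // Zero counts are skipped, so they contribute nothing to the sum.
                termEntropiesSum[i] += counts[i] * Math.log(counts[i]);
            }
        }
        documentCount++;
    }

    /**
     * Subtracts a document's contribution. Returns false when no documents
     * are stored. This sketch does not verify that the document was added.
     */
    boolean remove(double[] counts) {
        if (documentCount == 0) {
            return false;
        }
        ensureDimensionality(counts.length);
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0.0) {
                termEntropiesSum[i] -= counts[i] * Math.log(counts[i]);
            }
        }
        documentCount--;
        return true;
    }

    /** Returns a defensive copy of the running sums. */
    double[] getTermEntropiesSum() {
        return Arrays.copyOf(termEntropiesSum, termEntropiesSum.length);
    }

    int getDocumentCount() {
        return documentCount;
    }

    // Pads the sums with zeros when a document has more terms than seen so far.
    private void ensureDimensionality(int dimensionality) {
        if (dimensionality > termEntropiesSum.length) {
            termEntropiesSum = Arrays.copyOf(termEntropiesSum, dimensionality);
        }
    }
}
```

Because each document only adds or subtracts its own f_ij * log(f_ij) terms, neither operation needs to revisit the rest of the collection.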

Since:
3.0
Author:
Justin Basilico
See Also:
Serialized Form

Field Summary
protected  Vector termEntropiesSum
          A vector containing the sum of the entropy term (f_ij * log(f_ij)) over each document in the collection for each term.
 
Fields inherited from class gov.sandia.cognition.text.term.vector.weighter.global.AbstractFrequencyBasedGlobalTermWeighter
documentCount, termDocumentFrequencies, termGlobalFrequencies
 
Fields inherited from class gov.sandia.cognition.text.term.vector.weighter.global.AbstractGlobalTermWeighter
vectorFactory
 
Constructor Summary
AbstractEntropyBasedGlobalTermWeighter()
          Creates a new AbstractEntropyBasedGlobalTermWeighter.
AbstractEntropyBasedGlobalTermWeighter(VectorFactory<? extends Vector> vectorFactory)
          Creates a new AbstractEntropyBasedGlobalTermWeighter.
 
Method Summary
 void add(Vector counts)
          Adds a document to the model.
 AbstractEntropyBasedGlobalTermWeighter clone()
          This makes public the clone method on the Object class and removes the exception that it throws.
 Vector getTermEntropiesSum()
          Gets the vector containing the sum of the term entropies.
protected  void growVectors(int newDimensionality)
          Called when the dimensionality of the term vector grows.
protected  void initializeVectors(int dimensionality)
          Initializes internal vectors to the given dimensionality.
 boolean remove(Vector counts)
          Removes the document from the model.
protected  void setTermEntropiesSum(Vector termEntropiesSum)
          Sets the vector containing the sum of the term entropies.
 
Methods inherited from class gov.sandia.cognition.text.term.vector.weighter.global.AbstractFrequencyBasedGlobalTermWeighter
getDocumentCount, getTermDocumentFrequencies, getTermGlobalFrequencies, setDocumentCount, setTermDocumentFrequencies, setTermGlobalFrequencies
 
Methods inherited from class gov.sandia.cognition.text.term.vector.weighter.global.AbstractGlobalTermWeighter
getVectorFactory, setVectorFactory
 
Methods inherited from class gov.sandia.cognition.text.term.vector.AbstractVectorSpaceModel
add, addAll, remove, removeAll
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.text.term.vector.weighter.global.GlobalTermWeighter
getDimensionality, getGlobalWeights
 
Methods inherited from interface gov.sandia.cognition.text.term.vector.VectorSpaceModel
add, addAll, remove, removeAll
 

Field Detail

termEntropiesSum

protected Vector termEntropiesSum
A vector containing the sum of the entropy term (f_ij * log(f_ij)) over each document in the collection for each term.
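As a worked identity, the following shows why this sum is enough to obtain a per-term entropy without revisiting individual documents. It assumes the common definition p_ij = f_ij / gf_i, where gf_i = sum_j f_ij is the global frequency of term i (the role the inherited termGlobalFrequencies field appears to play). The exact formulas used by the subclasses are not stated on this page.

```latex
\begin{aligned}
\sum_j p_{ij} \log p_{ij}
  &= \sum_j \frac{f_{ij}}{gf_i} \left( \log f_{ij} - \log gf_i \right) \\
  &= \frac{1}{gf_i} \sum_j f_{ij} \log f_{ij} \;-\; \log gf_i
\end{aligned}
```

The second line uses sum_j f_ij = gf_i, so only the running sum of f_ij * log(f_ij) and the global frequency are needed for each term.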

Constructor Detail

AbstractEntropyBasedGlobalTermWeighter

public AbstractEntropyBasedGlobalTermWeighter()
Creates a new AbstractEntropyBasedGlobalTermWeighter.


AbstractEntropyBasedGlobalTermWeighter

public AbstractEntropyBasedGlobalTermWeighter(VectorFactory<? extends Vector> vectorFactory)
Creates a new AbstractEntropyBasedGlobalTermWeighter.

Parameters:
vectorFactory - The vector factory to use.
Method Detail

clone

public AbstractEntropyBasedGlobalTermWeighter clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to automatically create a clone of the exact type of object that the clone is called on and to copy all primitives but to keep all references, which means it is a shallow copy. Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, a copy that targets the most common use case for copying the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method. Note: The contract of this method is that you must use super.clone() as the basis for your implementation.

Specified by:
clone in interface CloneableSerializable
Overrides:
clone in class AbstractFrequencyBasedGlobalTermWeighter
Returns:
A clone of this object.
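The "smart copy" contract described for clone can be illustrated with standard Java only. This is a minimal sketch: SmartCopySketch is a hypothetical class, it uses a double[] in place of a Vector, and it does not show how this library copies its own fields.

```java
/** Hypothetical example of a clone() built on super.clone() plus one deep-copied field. */
class SmartCopySketch implements Cloneable {
    private int documentCount;          // primitive: copied by super.clone()
    private double[] termEntropiesSum;  // reference: shared unless copied explicitly

    SmartCopySketch(int documentCount, double[] termEntropiesSum) {
        this.documentCount = documentCount;
        this.termEntropiesSum = termEntropiesSum;
    }

    @Override
    public SmartCopySketch clone() {
        try {
            // Start from the shallow copy, as the contract requires.
            SmartCopySketch copy = (SmartCopySketch) super.clone();
            // Deep-copy only the mutable field that must not be shared.
            copy.termEntropiesSum =
                (this.termEntropiesSum == null) ? null : this.termEntropiesSum.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError("Cloneable is implemented", e);
        }
    }

    double getSum(int index) {
        return termEntropiesSum[index];
    }

    void setSum(int index, double value) {
        termEntropiesSum[index] = value;
    }

    int getDocumentCount() {
        return documentCount;
    }
}
```

With the deep copy in place, changing a sum on the clone leaves the original untouched. Without it, both objects would share one array.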

add

public void add(Vector counts)
Description copied from interface: VectorSpaceModel
Adds a document to the model.

Specified by:
add in interface VectorSpaceModel
Overrides:
add in class AbstractFrequencyBasedGlobalTermWeighter
Parameters:
counts - The document to add.

remove

public boolean remove(Vector counts)
Description copied from interface: VectorSpaceModel
Removes the document from the model.

Specified by:
remove in interface VectorSpaceModel
Overrides:
remove in class AbstractFrequencyBasedGlobalTermWeighter
Parameters:
counts - The document to remove.
Returns:
True if this object changed as a result of the removal.

initializeVectors

protected void initializeVectors(int dimensionality)
Description copied from class: AbstractFrequencyBasedGlobalTermWeighter
Initializes internal vectors to the given dimensionality.

Overrides:
initializeVectors in class AbstractFrequencyBasedGlobalTermWeighter
Parameters:
dimensionality - The dimensionality to initialize to.

growVectors

protected void growVectors(int newDimensionality)
Description copied from class: AbstractFrequencyBasedGlobalTermWeighter
Called when the dimensionality of the term vector grows.

Overrides:
growVectors in class AbstractFrequencyBasedGlobalTermWeighter
Parameters:
newDimensionality - The new dimensionality.
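One plausible way to handle growth is to pad the existing sums with zeros, on the assumption that a term not yet seen has contributed nothing. The sketch below uses a plain double[] and a hypothetical helper class GrowSketch. It is not taken from the library's source.

```java
import java.util.Arrays;

/** Hypothetical helper (not a library class) showing zero-padded growth of a sums array. */
class GrowSketch {
    /** Returns a copy of sums extended to newDimensionality, padded with 0.0. */
    static double[] grow(double[] sums, int newDimensionality) {
        if (newDimensionality < sums.length) {
            throw new IllegalArgumentException("dimensionality cannot shrink");
        }
        // Arrays.copyOf keeps existing values and fills the new tail with 0.0.
        return Arrays.copyOf(sums, newDimensionality);
    }
}
```

Existing entries keep their values and indices, so sums accumulated before the growth remain valid.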

getTermEntropiesSum

public Vector getTermEntropiesSum()
Gets the vector containing the sum of the term entropies.

Returns:
The term entropies sum.

setTermEntropiesSum

protected void setTermEntropiesSum(Vector termEntropiesSum)
Sets the vector containing the sum of the term entropies.

Parameters:
termEntropiesSum - The term entropies sum.