gov.sandia.cognition.text.term.filter
Class NGramFilter

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.text.term.filter.NGramFilter
All Implemented Interfaces:
TermFilter, CloneableSerializable, Serializable, Cloneable

public class NGramFilter
extends AbstractCloneableSerializable
implements TermFilter

A term filter that combines consecutive terms into n-grams, where n is the configured size.

Since:
3.0
Author:
Justin Basilico
See Also:
Serialized Form

Field Summary
static int DEFAULT_SIZE
          The default size is 2, which produces bigrams.
protected  int size
          The size of the n-gram.
 
Constructor Summary
NGramFilter()
          Creates a new NGramFilter with the default size.
NGramFilter(int size)
          Creates a new NGramFilter with the given size.
 
Method Summary
 NGramFilter clone()
          This makes public the clone method on the Object class and removes the exception that it throws.
 Collection<TermOccurrence> filterTerms(Iterable<? extends TermOccurrence> terms)
          Filters the given list of terms into a new list of terms based on some internal criteria for what constitutes a term.
 int getSize()
          Gets the size of the n-gram created by the filter.
 void setSize(int size)
          Sets the size of the n-gram created by the filter.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_SIZE

public static final int DEFAULT_SIZE
The default size is 2, which produces bigrams.

See Also:
Constant Field Values

size

protected int size
The size of the n-gram. Also known as the value of n.

Constructor Detail

NGramFilter

public NGramFilter()
Creates a new NGramFilter with the default size.


NGramFilter

public NGramFilter(int size)
Creates a new NGramFilter with the given size.

Parameters:
size - The size of the n-grams to create. Must be greater than 1.

Method Detail

clone

public NGramFilter clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to automatically create a clone of the exact type of object that the clone is called on and to copy all primitives but to keep all references, which means it is a shallow copy. Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, a copy that targets the most common use case for copying the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method. Note: The contract of this method is that you must use super.clone() as the basis for your implementation.

Specified by:
clone in interface CloneableSerializable
Overrides:
clone in class AbstractCloneableSerializable
Returns:
A clone of this object.
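
The "smart copy" pattern described above can be sketched in plain Java. This is an illustration only: CloneSketch and its fields are hypothetical names, and the sketch implements java.lang.Cloneable directly rather than extending AbstractCloneableSerializable, so it shows the general technique, not this library's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical class for illustration; not part of the library. */
class CloneSketch implements Cloneable
{
    /** Primitive field: super.clone() copies it by value. */
    int size = 2;

    /** Reference field: super.clone() copies only the reference. */
    List<String> buffer = new ArrayList<>();

    @Override
    public CloneSketch clone()
    {
        try
        {
            // Start from super.clone() so the copy has the exact runtime type.
            CloneSketch copy = (CloneSketch) super.clone();

            // Deepen only the field that must not be shared with the original.
            copy.buffer = new ArrayList<>(this.buffer);
            return copy;
        }
        catch (CloneNotSupportedException e)
        {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

Only the mutable list needs explicit handling; the int field is already copied by the shallow clone.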

filterTerms

public Collection<TermOccurrence> filterTerms(Iterable<? extends TermOccurrence> terms)
Description copied from interface: TermFilter
Filters the given list of terms into a new list of terms based on some internal criteria for what constitutes a term.

Specified by:
filterTerms in interface TermFilter
Parameters:
terms - The terms to filter.
Returns:
The new list of terms.
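
As a rough illustration of what forming n-grams from a term sequence means, here is a minimal sketch over plain strings. It does not use TermOccurrence or any other library type, and NGramSketch and ngrams are hypothetical names. This page does not say how the filter treats the start and end of the sequence; the sketch assumes that only full windows of the given size are emitted, which may differ from the library's behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone sketch; not part of the library API. */
class NGramSketch
{
    /**
     * Joins every run of {@code size} consecutive terms into one n-gram.
     * Assumption: only full windows are emitted, with no boundary padding.
     */
    static List<String> ngrams(List<String> terms, int size)
    {
        List<String> result = new ArrayList<>();
        for (int start = 0; start + size <= terms.size(); start++)
        {
            // subList is a view of the window [start, start + size).
            result.add(String.join(" ", terms.subList(start, start + size)));
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<String> terms = Arrays.asList("the", "quick", "brown", "fox");
        // Prints [the quick, quick brown, brown fox]
        System.out.println(ngrams(terms, 2));
    }
}
```

With a size of 3 the same input would yield two trigrams, and an input shorter than the size yields an empty list under this sketch's assumption.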

getSize

public int getSize()
Gets the size of the n-gram created by the filter. Also known as the value of n.

Returns:
The size of the n-gram created by the filter.

setSize

public void setSize(int size)
Sets the size of the n-gram created by the filter. Also known as the value of n.

Parameters:
size - The size of the n-gram created by the filter. Must be greater than 1.
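
The "greater than 1" constraint on size could be enforced along the following lines. This page does not say what happens when an invalid size is passed, so throwing IllegalArgumentException is an assumption of this sketch, and SizeSketch is a hypothetical class rather than the library's NGramFilter.

```java
/** Hypothetical holder for illustration; not the library's NGramFilter. */
class SizeSketch
{
    /** The value of n; starts at 2 (a bigram). */
    private int size = 2;

    int getSize()
    {
        return this.size;
    }

    void setSize(int size)
    {
        // Assumption: reject sizes of 1 or less with an unchecked exception.
        if (size <= 1)
        {
            throw new IllegalArgumentException(
                "size must be greater than 1, got " + size);
        }
        this.size = size;
    }
}
```

Validating in the setter keeps an invalid value from ever being stored, so code that reads the size does not need to re-check it.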