gov.sandia.cognition.text.term.filter.stem
Class PorterEnglishStemmingFilter

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.text.term.filter.AbstractSingleTermFilter
          extended by gov.sandia.cognition.text.term.filter.stem.PorterEnglishStemmingFilter
All Implemented Interfaces:
SingleTermFilter, TermFilter, CloneableSerializable, Serializable, Cloneable

@PublicationReferences(references={@PublicationReference(author="Martin Porter",title="The Porter Stemming Algorithm",year=2006,type=WebPage,url="http://tartarus.org/~martin/PorterStemmer/"),@PublicationReference(author="Martin F. Porter",title="An algorithm for suffix stripping",year=1980,publication="Program 14(3)",pages={130,137},type=Journal),@PublicationReference(author="Wikipedia",title="Stemming",year=2010,type=WebPage,url="http://en.wikipedia.org/wiki/Stemming")})
public class PorterEnglishStemmingFilter
extends AbstractSingleTermFilter

A term filter that applies the Porter stemming algorithm, a rule-based algorithm for stemming English words. This class wraps Martin Porter's own Java implementation of the stemmer and exposes it as a TermFilter.

Since:
3.0
Author:
Justin Basilico
See Also:
Serialized Form

Constructor Summary
PorterEnglishStemmingFilter()
          Creates a new PorterEnglishStemmingFilter.
 
Method Summary
 TermOccurrence filterTerm(TermOccurrence occurrence)
          Takes a single term occurrence and filters that occurrence into a new occurrence or returns null, indicating that the filter rejects that term.
static String stem(String word)
          Stems the given String according to the Porter stemming algorithm for English words.
 
Methods inherited from class gov.sandia.cognition.text.term.filter.AbstractSingleTermFilter
filterTerms
 
Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable
clone
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gov.sandia.cognition.util.CloneableSerializable
clone
 

Constructor Detail

PorterEnglishStemmingFilter

public PorterEnglishStemmingFilter()
Creates a new PorterEnglishStemmingFilter.

Method Detail

filterTerm

public TermOccurrence filterTerm(TermOccurrence occurrence)
Description copied from interface: SingleTermFilter
Takes a single term occurrence and filters that occurrence into a new occurrence or returns null, indicating that the filter rejects that term.

Parameters:
occurrence - The term occurrence to filter.
Returns:
A term occurrence (may be a new instance or the same as the given one) of the term to replace the given one or null to indicate that the filter has rejected the given term.
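
The null-return contract described above can be illustrated without the library. The following is a minimal sketch, not Cognitive Foundry code: Occurrence, SimpleFilter, and applyAll are hypothetical stand-ins invented for this example, and the lower-casing filter is arbitrary. It only shows how a caller might treat a null result as a rejection and keep every non-null result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FilterContractSketch {
    /** Hypothetical stand-in for a term occurrence: the term text and its start offset. */
    static final class Occurrence {
        final String text;
        final int start;

        Occurrence(String text, int start) {
            this.text = text;
            this.start = start;
        }
    }

    /** Hypothetical stand-in for a single-term filter; null means "rejected". */
    interface SimpleFilter {
        Occurrence filterTerm(Occurrence occurrence);
    }

    /** Applies the filter to each occurrence and drops the ones it rejects. */
    static List<Occurrence> applyAll(SimpleFilter filter, List<Occurrence> input) {
        List<Occurrence> result = new ArrayList<>();
        for (Occurrence occurrence : input) {
            Occurrence filtered = filter.filterTerm(occurrence);
            if (filtered != null) {
                result.add(filtered);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arbitrary example filter: rejects empty terms, lower-cases the rest.
        SimpleFilter lowerCase = o -> o.text.isEmpty()
            ? null
            : new Occurrence(o.text.toLowerCase(Locale.ROOT), o.start);

        List<Occurrence> input = Arrays.asList(
            new Occurrence("Running", 0),
            new Occurrence("", 8),
            new Occurrence("Dogs", 9));

        for (Occurrence o : applyAll(lowerCase, input)) {
            System.out.println(o.start + ": " + o.text);
        }
    }
}
```

In this sketch the empty term is dropped and the other two occurrences keep their original offsets, which mirrors the "new instance or null" wording of the contract.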

stem

public static String stem(String word)
Stems the given String according to the Porter stemming algorithm for English words.

Parameters:
word - The word to stem.
Returns:
The stemmed version of the given word.
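
To give a sense of what "rule-based" means here, the sketch below implements only Step 1a of the algorithm as published by Porter (the plural-suffix rules SSES -> SS, IES -> I, SS -> SS, S -> empty). It is an illustration, not the code this class wraps: the class name PorterStep1aSketch and the method step1a are hypothetical, and the full algorithm applies several further steps, so the result of stem(String) may differ from what this single step produces.

```java
public class PorterStep1aSketch {
    /**
     * Applies Step 1a of the Porter algorithm to a lower-case word.
     * Rules are tried in order; the first matching suffix wins.
     */
    static String step1a(String word) {
        if (word.endsWith("sses")) {
            // SSES -> SS
            return word.substring(0, word.length() - 2);
        }
        if (word.endsWith("ies")) {
            // IES -> I
            return word.substring(0, word.length() - 2);
        }
        if (word.endsWith("ss")) {
            // SS -> SS (unchanged)
            return word;
        }
        if (word.endsWith("s")) {
            // S -> (removed)
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    public static void main(String[] args) {
        // The example words are the ones Porter's paper uses for this step.
        for (String word : new String[] {"caresses", "ponies", "caress", "cats"}) {
            System.out.println(word + " -> " + step1a(word));
        }
        // Prints:
        // caresses -> caress
        // ponies -> poni
        // caress -> caress
        // cats -> cat
    }
}
```

Note that "ponies" becomes "poni" rather than "pony": a stem is a normalized key for matching related words and need not be a dictionary word.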