gov.sandia.cognition.text.term.filter
Class DefaultStopList

java.lang.Object
  extended by gov.sandia.cognition.util.AbstractCloneableSerializable
      extended by gov.sandia.cognition.text.term.filter.DefaultStopList
All Implemented Interfaces:
StopList, CloneableSerializable, Serializable, Cloneable

public class DefaultStopList
extends AbstractCloneableSerializable
implements StopList

A default, case-insensitive stop-list.

Since:
3.0
Author:
Justin Basilico
See Also:
Serialized Form

Field Summary
protected  Set<String> words
          The set of words in the stop list, all in lower-case.
 
Constructor Summary
DefaultStopList()
          Creates a new, empty DefaultStopList.
DefaultStopList(Iterable<String> words)
          Creates a new DefaultStopList containing the given words.
 
Method Summary
 void add(String word)
          Adds a word to the stop list.
 void addAll(Iterable<String> words)
          Adds all of the given words to the stop list.
 DefaultStopList clone()
          This makes public the clone method on the Object class and removes the exception that it throws.
 boolean contains(String word)
          Returns true if the given word is in the stop list.
 boolean contains(Term term)
          Returns true if the given term is in the stop list.
 boolean contains(Termable term)
          Determines if the given term is contained in this stop list.
 Set<String> getWords()
          Gets the set of words in the stop list.
static DefaultStopList loadFromText(BufferedReader reader)
          Loads a stop list by reading in from the given reader and treating each line as a word.
static DefaultStopList loadFromText(File file)
          Loads a stop list by reading in a given file and treating each line as a word.
static DefaultStopList loadFromText(URI uri)
          Loads a stop list by reading from the given URI and treating each line as a word.
static DefaultStopList loadFromText(URL url)
          Loads a stop list by reading from the given URL and treating each line as a word.
static DefaultStopList loadFromText(URLConnection connection)
          Loads a stop list by reading from the given connection and treating each line as a word.
 void saveAsText(File file)
          Saves the stop list to the given file.
 void saveAsText(PrintStream out)
          Saves the stop list to the given stream.
protected  void setWords(Set<String> words)
          Sets the set of words in the stop list.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

words

protected Set<String> words
The set of words in the stop list, all in lower-case.

Constructor Detail

DefaultStopList

public DefaultStopList()
Creates a new, empty DefaultStopList.


DefaultStopList

public DefaultStopList(Iterable<String> words)
Creates a new DefaultStopList containing the given words.

Parameters:
words - The words to add to the stop list.

Method Detail

clone

public DefaultStopList clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to create a clone of the exact type of the object that clone is called on, copying all primitives but keeping all references, which means it is a shallow copy. Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, one that targets the most common use case for creating a copy of the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method. Note: The contract of this method is that you must use super.clone() as the basis for your implementation.

Specified by:
clone in interface CloneableSerializable
Overrides:
clone in class AbstractCloneableSerializable
Returns:
A clone of this object.
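
The "smart copy" pattern described above can be illustrated with plain Java. This is a minimal sketch, not the library's implementation: the class name SmartCopySketch and its methods are hypothetical, and this page does not state whether DefaultStopList copies its word set in this way. The sketch starts from super.clone(), which yields a shallow copy, and then replaces the one mutable field with its own copy.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in class; not part of gov.sandia.cognition.
final class SmartCopySketch implements Cloneable {
    // Mutable field that a shallow copy would share between original and clone.
    private Set<String> words = new LinkedHashSet<>();

    void add(String word) {
        words.add(word);
    }

    boolean contains(String word) {
        return words.contains(word);
    }

    @Override
    public SmartCopySketch clone() {
        try {
            // Start from super.clone(): same runtime type, references kept.
            SmartCopySketch copy = (SmartCopySketch) super.clone();
            // Deepen only the field that must not be shared.
            copy.words = new LinkedHashSet<>(this.words);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError("Cloneable is implemented", e);
        }
    }
}
```

With this arrangement, adding a word to the clone leaves the original unchanged, which a purely shallow copy would not guarantee.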

add

public void add(String word)
Adds a word to the stop list.

Parameters:
word - The word to add to the stop list.

addAll

public void addAll(Iterable<String> words)
Adds all of the given words to the stop list.

Parameters:
words - The words to add.

contains

public boolean contains(Termable term)
Description copied from interface: StopList
Determines if the given term is contained in this stop list.

Specified by:
contains in interface StopList
Parameters:
term - The term.
Returns:
True if the term is in the list and false otherwise.

contains

public boolean contains(Term term)
Returns true if the given term is in the stop list.

Parameters:
term - A term.
Returns:
True if the term is contained in the stop list. Otherwise, false.

contains

public boolean contains(String word)
Returns true if the given word is in the stop list.

Parameters:
word - A word.
Returns:
True if the word is contained in the stop list. Otherwise, false.
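
The case-insensitive lookup can be sketched as follows. This is an illustration under stated assumptions rather than the class's actual code: the page only says that words are stored in lower-case, so the class name CaseInsensitiveStopListSketch and the choice of Locale.ROOT for lower-casing are assumptions made here.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in class; not part of gov.sandia.cognition.
final class CaseInsensitiveStopListSketch {
    // Words are kept in lower-case, mirroring the documented field.
    private final Set<String> words = new LinkedHashSet<>();

    // Lower-case on the way in (the locale choice is an assumption).
    void add(String word) {
        words.add(word.toLowerCase(Locale.ROOT));
    }

    void addAll(Iterable<String> newWords) {
        for (String word : newWords) {
            add(word);
        }
    }

    // Lower-case the query the same way so the lookup ignores case.
    boolean contains(String word) {
        return words.contains(word.toLowerCase(Locale.ROOT));
    }

    // Read-only view of the stored, lower-cased words.
    Set<String> getWords() {
        return Collections.unmodifiableSet(words);
    }
}
```

Normalizing both the stored word and the query keeps the lookup a single hash-set probe instead of a case-insensitive scan.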

getWords

public Set<String> getWords()
Gets the set of words in the stop list.

Returns:
The set of words in the stop list.

setWords

protected void setWords(Set<String> words)
Sets the set of words in the stop list.

Parameters:
words - The set of words in the stop list.

saveAsText

public void saveAsText(File file)
                throws IOException
Saves the stop list to the given file. Each word is written on a separate line.

Parameters:
file - The file to save the stop list to.
Throws:
IOException - If there is an IO error.

saveAsText

public void saveAsText(PrintStream out)
                throws IOException
Saves the stop list to the given stream. Each word is written on a separate line. The stream is not closed at the end.

Parameters:
out - The stream to write the stop words to.
Throws:
IOException - If there is an IO error.
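
The one-word-per-line format can be sketched with a small helper. This is a hedged illustration, not the library's code: the class SaveAsTextSketch and its static method are hypothetical, and the line separator used by the real method is not stated on this page (the sketch uses whatever PrintStream.println emits). As documented above, the stream is left open for the caller.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; not part of gov.sandia.cognition.
final class SaveAsTextSketch {
    // Writes each word on its own line and leaves the stream open.
    static void saveAsText(Set<String> words, PrintStream out) {
        for (String word : words) {
            out.println(word);
        }
        // Flush so buffered output reaches the target; closing is the caller's job.
        out.flush();
    }

    public static void main(String[] args) {
        Set<String> words = new LinkedHashSet<>(Arrays.asList("a", "an", "the"));
        // System.out remains usable afterwards because it is never closed here.
        saveAsText(words, System.out);
    }
}
```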

loadFromText

public static DefaultStopList loadFromText(File file)
                                    throws IOException
Loads a stop list by reading in a given file and treating each line as a word.

Parameters:
file - The file to read in.
Returns:
A new stop list containing a stop word for each line in the file.
Throws:
IOException - If there is an IO error.

loadFromText

public static DefaultStopList loadFromText(URI uri)
                                    throws IOException
Loads a stop list by reading from the given URI and treating each line as a word.

Parameters:
uri - The URI of the file to read in.
Returns:
A new stop list containing a stop word for each line in the file.
Throws:
IOException - If there is an IO error.

loadFromText

public static DefaultStopList loadFromText(URL url)
                                    throws IOException
Loads a stop list by reading from the given URL and treating each line as a word.

Parameters:
url - The URL of the file to read in.
Returns:
A new stop list containing a stop word for each line in the file.
Throws:
IOException - If there is an IO error.

loadFromText

public static DefaultStopList loadFromText(URLConnection connection)
                                    throws IOException
Loads a stop list by reading from the given connection and treating each line as a word.

Parameters:
connection - The connection to the file to read in.
Returns:
A new stop list containing a stop word for each line in the file.
Throws:
IOException - If there is an IO error.

loadFromText

public static DefaultStopList loadFromText(BufferedReader reader)
                                    throws IOException
Loads a stop list by reading in from the given reader and treating each line as a word.

Parameters:
reader - The reader to read the stop words from. Does not close the reader.
Returns:
A new stop list containing a stop word for each line in the reader.
Throws:
IOException - If there is an IO error.
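
Reading one stop word per line can be sketched as below. This is an illustration under stated assumptions, not the library's implementation: the class LoadFromTextSketch and its methods are hypothetical, and trimming whitespace and skipping blank lines are assumptions, since this page does not say how such lines are handled. As with the documented reader overload, the sketch does not close the reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of gov.sandia.cognition.
final class LoadFromTextSketch {
    // Treats each line as one word; the reader is left open for the caller.
    static Set<String> loadFromText(BufferedReader reader) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // Assumption: trim and lower-case each line, and skip blank lines.
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Convenience wrapper for in-memory text; rethrows IO errors unchecked.
    static Set<String> loadFromString(String text) {
        try {
            return loadFromText(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The file, URI, URL, and connection overloads listed above could each open a stream and delegate to a reader-based method like this one, though the page does not confirm that this is how they are implemented.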