gov.sandia.cognition.text.document.extractor
Interface SingleDocumentExtractor

All Superinterfaces:
DocumentExtractor
All Known Implementing Classes:
AbstractSingleDocumentExtractor, TextDocumentExtractor

public interface SingleDocumentExtractor
extends DocumentExtractor

Interface for a DocumentExtractor that only extracts a single document from a file.

Since:
3.0
Author:
Justin Basilico

Method Summary
 Document extractDocument(File file)
          Attempts to extract a document from the given file.
 Document extractDocument(URI uri)
          Attempts to extract a document from the file at the given URI.
 Document extractDocument(URLConnection connection)
          Attempts to extract a document from the file reached through the given connection.
 
Methods inherited from interface gov.sandia.cognition.text.document.extractor.DocumentExtractor
canExtract, canExtract, canExtract, extractAll, extractAll, extractAll
 

Method Detail

extractDocument

Document extractDocument(File file)
                         throws DocumentExtractionException,
                                IOException
Attempts to extract a document from the given file.

Parameters:
file - The file to extract.
Returns:
The document extracted from the given file.
Throws:
DocumentExtractionException - If there is an error extracting data from the file.
IOException - If there is an IO error.

extractDocument

Document extractDocument(URI uri)
                         throws DocumentExtractionException,
                                IOException
Attempts to extract a document from the file at the given URI.

Parameters:
uri - The URI of the file to extract.
Returns:
The document extracted from the file at the given URI.
Throws:
DocumentExtractionException - If there is an error extracting data from the file.
IOException - If there is an IO error.

extractDocument

Document extractDocument(URLConnection connection)
                         throws DocumentExtractionException,
                                IOException
Attempts to extract a document from the file reached through the given connection.

Parameters:
connection - The connection to the file to extract.
Returns:
The document extracted from the file reached through the given connection.
Throws:
DocumentExtractionException - If there is an error extracting data from the file.
IOException - If there is an IO error.
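
Example:
The three overloads above share one contract, so an implementation could plausibly funnel the File and URI variants into the URLConnection variant. The following is a minimal, self-contained sketch of that idea and not the library's implementation. SketchDocument, SketchExtractionException, SketchSingleDocumentExtractor, and PlainTextSketchExtractor are hypothetical stand-ins for Document, DocumentExtractionException, and this interface, so the example compiles without the library on the classpath. The delegation chain and the choice to report invalid UTF-8 as an extraction error are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Small demo: writes a temporary file and extracts it through the File overload. */
final class SingleDocumentExtractorSketch {
    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("sketch", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), "hello".getBytes(StandardCharsets.UTF_8));

        SketchSingleDocumentExtractor extractor = new PlainTextSketchExtractor();
        SketchDocument document = extractor.extractDocument(file);
        System.out.println(document.body); // prints "hello"
    }
}

/** Hypothetical stand-in for the library's Document type. */
final class SketchDocument {
    final String source;
    final String body;

    SketchDocument(String source, String body) {
        this.source = source;
        this.body = body;
    }
}

/** Hypothetical stand-in for DocumentExtractionException. */
class SketchExtractionException extends Exception {
    SketchExtractionException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical mirror of the three extractDocument overloads. */
interface SketchSingleDocumentExtractor {
    SketchDocument extractDocument(File file)
        throws SketchExtractionException, IOException;

    SketchDocument extractDocument(URI uri)
        throws SketchExtractionException, IOException;

    SketchDocument extractDocument(URLConnection connection)
        throws SketchExtractionException, IOException;
}

/** Assumed design: File delegates to URI, and URI delegates to URLConnection. */
final class PlainTextSketchExtractor implements SketchSingleDocumentExtractor {
    @Override
    public SketchDocument extractDocument(File file)
        throws SketchExtractionException, IOException {
        return extractDocument(file.toURI());
    }

    @Override
    public SketchDocument extractDocument(URI uri)
        throws SketchExtractionException, IOException {
        return extractDocument(uri.toURL().openConnection());
    }

    @Override
    public SketchDocument extractDocument(URLConnection connection)
        throws SketchExtractionException, IOException {
        // Reading the stream can fail with an IOException (for example, a missing file).
        byte[] bytes;
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            bytes = buffer.toByteArray();
        }

        // Assumption: content that is not valid UTF-8 counts as an extraction error.
        try {
            String body = StandardCharsets.UTF_8.newDecoder()
                .decode(ByteBuffer.wrap(bytes))
                .toString();
            return new SketchDocument(connection.getURL().toString(), body);
        } catch (CharacterCodingException e) {
            throw new SketchExtractionException("Content is not valid UTF-8", e);
        }
    }
}
```

With this layout only the URLConnection overload contains real extraction logic, and a caller can tell the two failure modes apart: an IOException means the source could not be read, while the extraction exception means the bytes were read but could not be turned into a document.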