maito.datacollecting
Class DataCollectorImpl

java.lang.Object
  extended by maito.datacollecting.DataCollectorImpl
All Implemented Interfaces:
DataCollector, DataProcessor

public class DataCollectorImpl
extends java.lang.Object
implements DataCollector

Author:
Antti Laitinen

Constructor Summary
DataCollectorImpl(java.lang.String dataDir, java.lang.String configDir)
          Creates a new DataCollectorImpl instance.
 
Method Summary
 boolean addSource(java.lang.String name, java.lang.String type, java.net.URL location, java.lang.String format)
          Adds a single source to this DataCollector's data sources so that it can be updated in the future.
 java.lang.String[] getCurrentTasks()
          Returns a user-readable description of every task that is currently in progress.
 java.lang.String[] getErrors()
          Returns all errors that have occurred since the last data processing was started.
 DataSourceDescription[] getSources()
          Returns a description of each source that this DataCollector has.
 java.util.HashMap getSupportedTypes()
          Returns all source types that are supported by this DataCollector.
 boolean removeSources(DataSourceDescription[] sources, boolean removeData)
          Removes one or more data sources permanently.
 void setLogListener(LogListener listener)
          Sets a listener for all log messages sent by this DataProcessor.
 void updateSources(DataSourceDescription[] sources)
          Starts updating data sources.
 boolean workInProgress()
          Tells whether this DataProcessor is currently processing data.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DataCollectorImpl

public DataCollectorImpl(java.lang.String dataDir,
                         java.lang.String configDir)
                  throws java.lang.RuntimeException
Creates a new DataCollectorImpl instance. Reads all configuration files and prepares data sources for updating.

Parameters:
dataDir - The directory that contains configuration files for each data source.
configDir - The directory that contains configuration files used by the whole program (for example database configuration).
Throws:
java.lang.RuntimeException - Thrown if initialization fails.
Method Detail

getSupportedTypes

public java.util.HashMap getSupportedTypes()
Description copied from interface: DataCollector
Returns all source types that are supported by this DataCollector.

Specified by:
getSupportedTypes in interface DataCollector
Returns:
A HashMap that maps each supported transfer type to its data formats. Each key is a String naming a supported transfer type. Each value is a String[] array containing every data format supported for that transfer type. Example: "oaipmh" -> {"DCXML", "oai_citeseer"} (the key "oaipmh" maps to an array that contains "DCXML" and "oai_citeseer").
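
Because the returned HashMap is untyped, callers have to cast its values themselves. The following is a minimal sketch of one way to query such a map. The class name SupportedTypesExample and the helper isSupported are hypothetical, and the map built in main is only a stand-in for the value returned by getSupportedTypes(), filled with the example mapping given above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class SupportedTypesExample {

    /**
     * Hypothetical helper: returns true if the map lists the given data
     * format for the given transfer type.
     */
    static boolean isSupported(Map<?, ?> supportedTypes, String type, String format) {
        Object formats = supportedTypes.get(type);
        if (!(formats instanceof String[])) {
            return false; // unknown transfer type, or a value of an unexpected class
        }
        return Arrays.asList((String[]) formats).contains(format);
    }

    public static void main(String[] args) {
        // Stand-in for collector.getSupportedTypes(); mirrors the documented example.
        HashMap<String, String[]> supported = new HashMap<>();
        supported.put("oaipmh", new String[] {"DCXML", "oai_citeseer"});

        System.out.println(isSupported(supported, "oaipmh", "DCXML"));
        System.out.println(isSupported(supported, "file", "DCXML"));
    }
}
```

The instanceof check guards the cast, so a map with unexpected value types yields false rather than a ClassCastException.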

addSource

public boolean addSource(java.lang.String name,
                         java.lang.String type,
                         java.net.URL location,
                         java.lang.String format)
                  throws java.lang.RuntimeException
Description copied from interface: DataCollector
Adds a single source to this DataCollector's data sources so that it can be updated in the future.

Specified by:
addSource in interface DataCollector
Parameters:
name - A name/id for this source. This is the name that is visible to the user.
type - The type of this source. Must be one of the following:
  • "file"
  • "OAI-PMH"
location - The location where this source's data is retrieved from.
format - The data format. Must be one of the following:
  • "quick_format_name"
  • "quick_format_document"
  • "DCXML"
  • "oai_citeseer"
Returns:
A boolean value telling whether the given source was successfully added.
Throws:
java.lang.IllegalArgumentException - Thrown when the parameters are invalid and a new data source therefore cannot be created.
java.lang.RuntimeException
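
A caller may want to check the type and format arguments against the values listed above before calling addSource. The sketch below shows one possible pre-check. The class AddSourceArguments and the helper hasDocumentedValues are hypothetical and not part of maito.datacollecting. The helper does not check which formats are valid for which type, because this page does not say.

```java
import java.util.Arrays;
import java.util.List;

class AddSourceArguments {

    // Values copied from the parameter lists documented for addSource.
    static final List<String> TYPES = Arrays.asList("file", "OAI-PMH");
    static final List<String> FORMATS = Arrays.asList(
            "quick_format_name", "quick_format_document", "DCXML", "oai_citeseer");

    /** Hypothetical helper: true if both values appear in the documented lists. */
    static boolean hasDocumentedValues(String type, String format) {
        return TYPES.contains(type) && FORMATS.contains(format);
    }

    public static void main(String[] args) {
        String type = "OAI-PMH";
        String format = "DCXML";
        if (!hasDocumentedValues(type, format)) {
            throw new IllegalArgumentException(
                    "Unsupported type/format: " + type + "/" + format);
        }
        // At this point the real call would follow, for example:
        // boolean added = collector.addSource(name, type, location, format);
        System.out.println("arguments look valid");
    }
}
```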

getSources

public DataSourceDescription[] getSources()
Description copied from interface: DataCollector
Returns a description of each source that this DataCollector has.

Specified by:
getSources in interface DataCollector
Returns:
An array of DataSourceDescription objects. The array is empty if this DataCollector has no sources.

updateSources

public void updateSources(DataSourceDescription[] sources)
Description copied from interface: DataCollector
Starts updating data sources.

Specified by:
updateSources in interface DataCollector
Parameters:
sources - The data sources that are to be updated.

removeSources

public boolean removeSources(DataSourceDescription[] sources,
                             boolean removeData)
Description copied from interface: DataCollector
Removes one or more data sources permanently. The removed sources are no longer updated, and their data is removed from the database.

Specified by:
removeSources in interface DataCollector
Parameters:
sources - The sources that are to be removed.
removeData - If true, all raw data is also deleted from disk. If false, the raw data is left untouched.
Returns:
A boolean value telling whether the sources were successfully removed.

workInProgress

public boolean workInProgress()
Description copied from interface: DataProcessor
Tells whether this DataProcessor is currently processing data.

Specified by:
workInProgress in interface DataProcessor
Returns:
true if data is being processed, otherwise false.

getCurrentTasks

public java.lang.String[] getCurrentTasks()
Description copied from interface: DataProcessor
Returns a user-readable description of every task that is currently in progress.

Specified by:
getCurrentTasks in interface DataProcessor
Returns:
An array of String objects containing the description of each task. The array is empty if no tasks are in progress.

getErrors

public java.lang.String[] getErrors()
Description copied from interface: DataProcessor
Returns all errors that have occurred since the last data processing was started. The errors are in a user-readable form.

Specified by:
getErrors in interface DataProcessor
Returns:
An array of String objects, each describing one error. The array is empty if no errors have occurred.
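
The methods workInProgress, getCurrentTasks and getErrors can be combined to wait for processing to finish and then inspect the result. The following is a minimal sketch that assumes processing runs in the background after updateSources returns, which this page suggests ("Starts updating") but does not state. The nested Processor interface and the fake implementation are hypothetical stand-ins that mirror the three documented method signatures so the example runs on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

class WaitForProcessing {

    /** Hypothetical stand-in mirroring three documented DataProcessor methods. */
    interface Processor {
        boolean workInProgress();
        String[] getCurrentTasks();
        String[] getErrors();
    }

    /** Polls until no work is in progress, then returns the reported errors. */
    static String[] waitAndCollectErrors(Processor processor, long pollMillis) {
        while (processor.workInProgress()) {
            for (String task : processor.getCurrentTasks()) {
                System.out.println("in progress: " + task);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                break;                              // stop waiting early
            }
        }
        return processor.getErrors();
    }

    /** Demo-only fake that reports work for a fixed number of polls. */
    static Processor fake(int busyPolls, String... errors) {
        AtomicInteger remaining = new AtomicInteger(busyPolls);
        return new Processor() {
            public boolean workInProgress() { return remaining.getAndDecrement() > 0; }
            public String[] getCurrentTasks() { return new String[] {"updating source"}; }
            public String[] getErrors() { return errors; }
        };
    }

    public static void main(String[] args) {
        String[] errors = waitAndCollectErrors(fake(2, "example error"), 10);
        System.out.println(errors.length + " error(s) reported");
    }
}
```

Polling is only one option. Registering a listener through setLogListener may be a better fit when progress needs to be shown as it happens.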

setLogListener

public void setLogListener(LogListener listener)
Description copied from interface: DataProcessor
Sets a listener for all log messages sent by this DataProcessor.

Specified by:
setLogListener in interface DataProcessor
Parameters:
listener - The object that listens to this DataProcessor's log messages.