Interface Summary | |
IConstraints | Interface for objects which apply a set of additional constraints to a listStatements operation. |
IDBConnection | Encapsulate the specification of a jdbc connection, mostly used to simplify the calling pattern for ModelRDB factory methods. |
IDBID | Interface for database identifiers. |
IRDBDriver | Generic database interface used for implementing RDF Stores. |
Class Summary | |
ConstraintsGeneric | Implementation of the IConstraints interface used to specify search constraints on Jena statements. |
DBConnection | Encapsulate the specification of a jdbc connection, mostly used to simplify the calling pattern for ModelRDB factory methods. |
DBIDHash | Interface for database identifiers. |
DBIDInt | Interface for database identifiers. |
DriverGenericAttribute | Adaption of the base generic layout driver to use separate tables for specific predicates (called attributes). |
DriverGenericGeneric | Base database driver for implementing ModelRDB and StoreRDB. |
DriverGenericGenericProc | Adaption of the base generic layout driver to support databases with stored procedures that do the id allocation and duplication checking inline with the actual insert (for each of Resource, Namespace, Literal and Statement). |
DriverGenericHash | Adaption of the base generic layout driver to use unique hashes as the index terms instead of relying on database sequences. |
DriverGenericMMGeneric | Adaption of the base generic layout driver to support multiple models in one database. |
DriverGenericMMGenericProc | Adaption of the base generic layout driver to support databases with stored procedures that do the id allocation and duplication checking inline with the actual insert (for each of Resource, Namespace, Literal and Statement). |
DriverGenericMMHash | Adaption of the base generic layout driver to support multiple models in one database. |
DriverGenericProc | Adaption of the base generic layout driver to support databases with stored procedures that do the id allocation and duplication checking inline with the actual insert (for each of Resource, Namespace, Literal and Statement). |
DriverInterbaseHash | Adaption of the base generic layout driver to the limitations of InterBase SQL. |
DriverInterbaseMMHash | Adaption of the base generic layout driver to the limitations of InterBase SQL. |
DriverOracleMMGeneric | Customize the MMGeneric driver for use with Oracle. |
ModelRDB | This implementation of the Model interface uses a relational database to hold the model statements. |
PropertyImplRDB | A variation on the default Property implementation that adds a unique database ID field. |
ResourceImplRDB | A variation on the default Resource implementation that adds a unique database ID field. |
ResultSetIterator | Iterates over an SQL result set returning each row as an ArrayList of objects. |
ResultSetResourceIterator | Version of ResultSetIterator that extracts database rows as resources, assuming that the SQL returns rows of form [id, localname, namespaceid]. |
ResultSetStatementIterator | Version of ResultSetIterator that extracts database rows as statements assuming that the SQL returns rows of form: |
ResultSetStringIterator | Version of ResultSetIterator that extracts database rows as single strings. |
SQLCache | Stores a set of sql statements loaded from a resource file. |
StoreRDB | Generic store implementation for RDB backed RDF storage. |
TestDBConnection | |
TestGenericDriver | Unit tests for the generic database driver. |
TestJenaRegression | Run jena regression tests on an RDB-backed model. |
TestRDB | Overall test harness for running all the current rdb unit tests. |
TestStoreRDB | Test harness for StoreRDB and supporting classes. |
Exception Summary | |
RDFRDBException | Used to signal most errors with RDB access. |
A general relational database backend for persistent storage of jena models.
The jena/rdb module provides an implementation of the jena model
interface which
stores the RDF statement information in a relational database. The implementation can support
a variety of database table layouts and can customize the SQL code to cope with the vagaries
of different database implementations.
Database-backed RDF models are instances of the class jena.rdb.ModelRDB.
As well as implementing the full jena.model.Model interface, the static
methods on ModelRDB provide means to create, extend and reopen database instances.
First consider the situation where we have an available database but as yet it has no RDF models stored in it and we want to format it for holding RDF statements. In that case we would use:
    DBConnection dbcon = new DBConnection(DATABASE_URI, user, password);
    ModelRDB model = ModelRDB.create(dbcon, LAYOUT_STYLE, DATABASE_TYPE);

The DBConnection class provides different methods for specifying
the underlying database. In particular it can be specified, as in the example above,
as a jdbc uri (e.g. jdbc:interbase:\\localhost:\databases\test.gdb) along
with any required user name and password. Alternatively, the database connection can be
opened using the standard jdbc calls and the resulting jdbc Connection object can
be wrapped up as a DBConnection for passing on to ModelRDB.create.
The ModelRDB.create call takes two arguments in addition to the database connection
itself. Firstly, the LAYOUT_STYLE is a string defining the type of database
table structure to be used. Typical values for this include:
Generic | General layout, all statements are stored in a single table. Resources and literals are indexed using integer id's generated by database sequence generators. |
GenericProc | Variant on the generic layout that uses stored procedures for all model updates; this can have a 30-50% performance advantage in some cases. |
MMGeneric | Similar layout to "Generic" but can support more than one jena model in a single database. |
Hash | Similar layout to "Generic" but uses MD5 hashes to generate the id's for resources and literals - this avoids relying on the database generators and is more portable and very similar performance. |
MMHash | Similar layout to "Hash" but can support more than one jena model in a single database. |
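The idea behind the "Hash" layouts, deriving an id from the content itself rather than from a database sequence, can be sketched as follows. This is a minimal illustration only: the class HashIdSketch and the method md5Id are hypothetical names, and the encoding (UTF-8) and id format (32 hex characters) are assumptions, not a description of what the jena drivers actually store.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the jena rdb API: derives a stable
// identifier for a resource or literal from its text instead of asking
// the database for the next sequence value.
class HashIdSketch {

    // Returns the MD5 digest of the value as a 32-character lowercase hex string.
    static String md5Id(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            // Left-pad with zeros so that every id has the same length.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // Standard JDKs ship an MD5 provider; treat its absence as a setup error.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same input always yields the same id, on any database.
        System.out.println(md5Id("http://example.org/resource#foo"));
    }
}
```

Because the id depends only on the text, two clients inserting the same resource compute the same id without consulting the database, which is what makes this style of layout portable.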
The second argument DATABASE_TYPE is a string defining the type of the database. Whilst jdbc
offers good database independence, most SQL code remains database-dependent - for example sequence generators,
stored procedures and limitations on table indexes all vary across databases. The jena RDB modules cope with
this by allowing implementors to customise the SQL code to suit the database server to be used. If using
a portable layout such as "Generic" or "Hash" then the DATABASE_TYPE of "Generic"
may work; otherwise use a specific database name here. The distribution includes
configuration files for "interbase", "mysql" and "postgresql". Others can be created.
The call to ModelRDB.create will create the appropriate database tables and record
within the database a note of the layout chosen. This means that a previously created database can
be reopened using:

    DBConnection dbcon = new DBConnection(DATABASE_URI, user, password);
    ModelRDB model = ModelRDB.open(dbcon);

Note that no layout or database type information is needed this time - it is retrieved from the pre-formatted database.
Some database layouts only support one jena model per database. Other layouts can
support multiple models within a single database - these have slightly lower performance
but can be more convenient. Thus if dbConnection is a connection to an
already formatted database whose layout supports multiple models then the call:

    ModelRDB model = ModelRDB.createModel(dbConnection, modelName);

will create an additional model within the same database. The modelName can be used
to reopen the same model in the future using:

    ModelRDB model = ModelRDB.open(dbConnection, modelName);

and

    Iterator it = ModelRDB.listModels(dbConnection);

will list the names of all the models stored in the database.
The ModelRDB interface supports all the standard jena facilities for navigating
the model. This allows us to, for example, find all statements with a given pattern
of subject, property and object values. If we wish to perform partial matching
on object literal values (e.g. finding all statements whose literal object value starts
with "foo" or is an integer in the range [2,8), say) then we have to use the Selector
mechanism. Unfortunately, in this case all candidate statements with matching subject and property
values will be retrieved from the database and only then filtered by the supplied Selector.test() code.
The RDB package allows us to use the underlying database implementation by providing an alternative mechanism for listing statements - that of constraints. For example,
    IConstraints constraints = modelrdb.createConstraints();
    constraints.addSubjectConstraint(foo)
               .addPropertyConstraint(prop);
    Iterator statements = modelrdb.listStatements(constraints);

will return an iterator over all statements in the model with subject foo and
property prop. More interestingly the code:

    IConstraints constraints = modelrdb.createConstraints();
    constraints.addSubjectConstraint(foo)
               .addPropertyConstraint(prop)
               .addStringConstraint("NOT LIKE", "%bar%");
    Iterator statements = modelrdb.listStatements(constraints);

will list just that subset of the above statements whose object value is a literal string which does not contain the substring "bar". The first argument of the
addStringConstraint call can be any standard SQL string match operation.
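To make the benefit concrete, the following sketch shows one way a set of constraints could be turned into a single parameterised SQL query so that the database, rather than client code, does the filtering. It is an illustration under stated assumptions and not the jena implementation: the class ConstraintSqlSketch, the table name statements and the column names subject, predicate and object_text are all invented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch, not the jena rdb implementation: collects
// constraints and renders them as one parameterised SQL query.
// Table and column names are invented for illustration.
class ConstraintSqlSketch {
    private final List<String> clauses = new ArrayList<>();
    private final List<Object> params = new ArrayList<>();

    ConstraintSqlSketch addSubjectConstraint(String subjectUri) {
        clauses.add("subject = ?");
        params.add(subjectUri);
        return this; // returning this allows calls to be chained
    }

    ConstraintSqlSketch addPropertyConstraint(String propertyUri) {
        clauses.add("predicate = ?");
        params.add(propertyUri);
        return this;
    }

    // The operator is spliced into the SQL text, so it must come from
    // trusted code; only the pattern is passed as a bound parameter.
    ConstraintSqlSketch addStringConstraint(String operator, String pattern) {
        clauses.add("object_text " + operator + " ?");
        params.add(pattern);
        return this;
    }

    // Renders the query text; the values stay separate in parameters().
    String toSql() {
        String sql = "SELECT subject, predicate, object_text FROM statements";
        if (clauses.isEmpty()) {
            return sql;
        }
        StringJoiner where = new StringJoiner(" AND ");
        for (String clause : clauses) {
            where.add(clause);
        }
        return sql + " WHERE " + where;
    }

    List<Object> parameters() {
        return params;
    }

    public static void main(String[] args) {
        ConstraintSqlSketch c = new ConstraintSqlSketch()
                .addSubjectConstraint("http://example.org/foo")
                .addPropertyConstraint("http://example.org/prop")
                .addStringConstraint("NOT LIKE", "%bar%");
        System.out.println(c.toSql());
        System.out.println(c.parameters());
    }
}
```

Since the query text depends only on which constraints were added, it can be prepared once and re-executed with different parameter values.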
As well as string matching there is some experimental support for integer-valued literals.
When and if jena is extended to support true typed literals a fuller match constraint
mechanism might be possible. In the meantime, to support the common case of integer literals we
note any literal in the database which could be interpreted as an integer. In this way we
can support code such as:
    IConstraints constraints = modelrdb.createConstraints();
    constraints.addSubjectConstraint(foo)
               .addIntConstraint("<=", 42)
               .addIntConstraint(">", 4);
    Iterator statements = modelrdb.listStatements(constraints);

Note that in all these cases the constraints object can be reused, which
may avoid the overhead of generating and parsing the required SQL code (depending on the
nature of the jdbc driver in use).
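The integer support described above relies on recognising, at store time, which literals could be read as integers. A minimal sketch of that recognition step is given below. IntLiteralSketch, asInt and matches are hypothetical names, and the parsing rule used (Integer.parseInt on the trimmed text) is an assumption; the rule the drivers actually apply may differ.

```java
import java.util.OptionalInt;

// Hypothetical sketch, not the jena rdb implementation: decides whether
// a literal's text can be interpreted as an integer, so that an integer
// value could be recorded alongside the string form.
class IntLiteralSketch {

    // Returns the integer value of the literal, or empty if it is not an integer.
    static OptionalInt asInt(String literal) {
        try {
            return OptionalInt.of(Integer.parseInt(literal.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Mirrors the two constraints in the example above: value > 4 and value <= 42.
    static boolean matches(String literal) {
        OptionalInt value = asInt(literal);
        return value.isPresent() && value.getAsInt() > 4 && value.getAsInt() <= 42;
    }

    public static void main(String[] args) {
        for (String literal : new String[] {"4", "17", "42", "43", "seventeen"}) {
            System.out.println(literal + " -> " + matches(literal));
        }
    }
}
```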
When a model is created with a given layout style (say Layout) and
database type (say Dbtype) then the implementation attempts to find a class
called DriverDbtypeLayout of type IRDBDriver
which implements all the required storage operations. In this way additional layouts
and database types can be supported by extending the existing implementations. Most
such implementations would extend DriverGenericGeneric directly or perhaps
the slightly more specialized DriverGenericMMGeneric for multimodel layouts.
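A name-based lookup of this kind can be sketched with plain reflection. The sketch assumes the class name is formed by concatenating "Driver", the database type and the layout, as class names in this package such as DriverInterbaseHash and DriverOracleMMGeneric suggest. The helper names, the capitalisation step and the package prefix passed in are assumptions made for illustration; this is not the actual lookup code.

```java
import java.util.Optional;

// Hypothetical sketch, not the jena rdb implementation: builds a driver
// class name from a database type and a layout, then looks it up by name.
class DriverLookupSketch {

    // Upper-cases the first letter so that "interbase" becomes "Interbase".
    static String capitalise(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Assumed naming pattern: <package>.Driver<Dbtype><Layout>
    static String driverClassName(String packageName, String dbType, String layout) {
        return packageName + ".Driver" + capitalise(dbType) + capitalise(layout);
    }

    // Returns the class if it is on the classpath, or empty if it is not.
    static Optional<Class<?>> findDriver(String className) {
        try {
            Class<?> found = Class.forName(className);
            return Optional.<Class<?>>of(found);
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String name = driverClassName("jena.rdb", "interbase", "Hash");
        System.out.println(name + " present: " + findDriver(name).isPresent());
    }
}
```

With a lookup like this, supporting a new database type or layout only requires adding a suitably named class to the classpath; no central registry has to be edited.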
The existing implementations gain extra modularity by moving most of the raw SQL code out
into a separate file. Thus tailoring an implementation to a new SQL dialect often just
means generating a new SQL definition file, and the java driver class itself need only
point to that file (see DriverInterbaseGeneric for example).
The existing implementations store these SQL definition files in the classpath in a subdirectory
called etc. See the javadoc for the SQLCache class for information
on the format of these SQL definition files and see the included code for example usage.
Some databases don't properly map java strings onto database strings. In particular, a Postgresql database must be created with "ENCODING = unicode" to cope with multibyte strings. If you create literals with multibyte characters and store them in a default SQL_ASCII postgresql database then some strange behaviour can result - statements involving that literal may be hard to list or delete correctly.
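When it is unclear whether a database was created with a suitable encoding, one defensive option is to check literals for non-ASCII characters before storing them. The sketch below shows such a check. NonAsciiCheckSketch and hasNonAscii are hypothetical names, and where such a check would be called from is left open; nothing in the package is known to perform it.

```java
// Hypothetical sketch, not part of the jena rdb package: flags literals
// that contain characters outside 7-bit ASCII, which are the ones at
// risk in a database created with an ASCII-only encoding.
class NonAsciiCheckSketch {

    // True if any UTF-16 code unit in the literal lies above the ASCII range.
    static boolean hasNonAscii(String literal) {
        return literal.chars().anyMatch(c -> c > 0x7F);
    }

    public static void main(String[] args) {
        // \u00e9 is e with an acute accent, written as an escape so the
        // source file itself stays pure ASCII.
        String[] samples = {"plain text", "caf\u00e9"};
        for (String s : samples) {
            System.out.println(hasNonAscii(s));
        }
    }
}
```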
The classes Test* in this package are just used for unit testing. They are polluting the package namespace instead of being hidden away in a separate test package because some of them access protected methods which are not accessible outside of the package.