Implementing too quickly, without first understanding the RDF data model leads to frustration and disappointment. Yet studying the data model alone, is dry stuff and often leads to tortuous metaphysical conundrums. It is better to approach understanding both the data model and how to use it in parallel. Learn a bit of the data model and try it out. Then learn a bit more and try that out. Then the theory informs the practice and the practice the theory. The data model is quite simple, so this approach does not take long.
RDF has an XML syntax and many who are familiar with XML will think of RDF in terms of that syntax. This is mistake. RDF should be understood in terms of its data model. RDF data can be represented in XML, and this tutorial does address the RDF XML syntax. Understanding the syntax is secondary however, to understanding the data model.
An implementation of the Jena API, including the working source code for all the examples used in this tutorial can be downloaded from http://www.hpl.hp.com/semweb/.
The Resource Description Framework (RDF) is a standard (technically a W3C Recommendation) for describing resources. What is a resource? That is rather a deep question and the precise definition is still the subject of debate. For our purpuses we can think of it as anything we can identify. You are a resource, as is your home page, this tutorial, the number one and the great white whale in Moby Dick.
Our examples in this tutorial will be about people. They use an RDF representation of VCARDS. RDF is best thought of in the form of node and arc diagrams. A simple vcard might look like this in RDF:
The resource, John Smith, is shown in an elipse and is identified by It is identified by a Uniform Resource Identifier (URI)1, in this case "http://.../JohnSmith". If you try to access that resource using your browser, you are unlikely to be successful; April the first jokes not withstanding, you would be rather surprised if your browser were able to deliver John Smith to your desk top. If you are unfamiliar with URI's, think of them simply as rather strange looking names.
Resources have properties. In these examples we are interested in the sort of properties that would appear on John Smith's business card. Figure 1 shows only one property, John Smith's full name. A property is represented by an arc, labeled with the name of a property. The name of a property is also a URI, but as URI's are rather long and cumbersome, the diagram shows it in XML qname form. The part before the ':' is called a namespace prefix and represents a namespace. The part after the ':' is called a local name and represents a name in that namespace. Properties are usually represented in this qname form when written as RDF XML and it is a convenient shorthand for representing them in diagrams and in text. Strictly, however, properties are identified by a URI. The nsprefix:localname form is a shorthand for the URI of the namespace concatenated with the localname. There is no need for the URI of a property to resolve to anything when accessed by a browser.
Each property has a value. In this case the value is a literal, which for now we can think of as a strings of characters2. Literals are shown in rectangles.
Jena is a Java API which can be used to create and manipulate RDF graphs like this one. Jena has object classes to represent graphs, resources, properties and literals. The interfaces representing resources, properties and literals are called Resource, Property and Literal respectively. In Jena, a graph is called a model3 and is represented by the Model interface.
The code to create this graph, or model, is simple:
// some definitions static String personURI = "http://somewhere/JohnSmith"; static String fullName = "John Smith"; // create an empty graph Model model = new ModelMem(); // create the resource Resource johnSmith = model.createResource(personURI); // add the property johnSmith.addProperty(VCARD.FN, fullName);
It begins with some constant definitions and then creates an empty graph or model. ModelMem is a class which implements the Model interface and holds all its data in main memory. Jena contains other implementations of the Model interface, e.g one with stores its data in a Berkley DB database, and another which uses a relational database.
The John Smith resource is then created and a property added to it. The property is provided by a "constant" class VCARD which holds objects representing all the definitions in the VCARD schema. Jena provides constant classes for other well known schemas, such as RDF and RDF schema themselves, Dublin Core and DAML.
The code to create the resource and add the property, can be more compactly written in a cascading style:
Resource johnSmith = model.createResource(personURI) .addProperty(VCARD.FN, fullName);
The working code for this example can be found in the tutorial package of the Jena distribution as tutorial 1. As an exercise, take this code and modify it to create a simple VCARD for yourself.
Now lets add some more detail to the vcard, exploring some more features of RDF and Jena.
In the first example, the property value was a literal. RDF properties can also take other resources as their value. Using a common RDF technique, this example shows how to represent the different parts of John Smith's name:
Here we have added a new property, vcard:N to represent the structure of John Smith's name. There are several things of interest about this graph. Note that the vcard:N property takes a resource as its value. Note also that the ellipse representing the compound name has no URI. It is known as an blank Node.
The Jena code to construct this example, is again very simple. First some declarations and the creation of the empty model.
// some definitions String personURI = "http://somewhere/JohnSmith"; String givenName = "John"; String familyName = "Smith"; String fullName = givenName + " " + familyName; // create an empty graph Model model = new ModelMem(); // create the resource // and add the properties cascading style Resource johnSmith = model.createResource(personURI) .addProperty(VCARD.FN, fullName) .addProperty(VCARD.N, model.createResource() .addProperty(VCARD.Given, givenName) .addProperty(VCARD.Family, familyName));
The working code for this example can be found as tutorial 2 in the tutorial package of the Jena distribution.
Each arc in an RDF graph is called a statement. Each statement asserts a fact about a resource. A statement has three parts:
A statement is sometimes called a triple, because of its three parts.
An RDF graph is represented is a set of statements. Each call of addProperty in tutorial2 added a another statement to the graph. Note that a graph is set of statements; adding a duplicate of a statement has no effect. The Jena model interface defines a listStatements() method which returns an iterator4 over all all the statements in a graph. Each time the next method of the iterator is called it returns a Jena object of type Statement. The Statement interface provides accessor methods to the subject, predicate and object of a statement.
Now we will use that interface to extend tutorial2 to list all the statements created and print them out. The complete code for this can be found in tutorial 3.
// list the statements in the graph StmtIterator iter = model.listStatements(); // print out the predicate, subject and object of each statement while (iter.hasNext()) { Statement stmt = iter.next(); // get next statement Resource subject = stmt.getSubject(); // get the subject Property predicate = stmt.getPredicate(); // get the predicate RDFNode object = stmt.getObject(); // get the object System.out.print(subject.toString()); System.out.print(" " + predicate.toString() + " "); if (object instanceof Resource) { System.out.print(object.toString()); } else { // object is a literal System.out.print(" \"" + object.toString() + "\"); } System.out.println(" ."); }
Since the object of a statement can be either a resource or a literal, the getObject() method returns an object typed as RDFNode, which is a common superclass of both Resource and Literal. The underlying object is of the appropriate type, so the code uses instanceof to determine which and processes it accordingly.
When run, this program should produce output resembling:
http://somewhere/JohnSmith http://www.w3.org/2001/vcard-rdf/3.0#N anon:14df86:ecc3dee17b:-7fff . anon:14df86:ecc3dee17b:-7fff http://www.w3.org/2001/vcard-rdf/3.0#Family "Smith" . anon:14df86:ecc3dee17b:-7fff http://www.w3.org/2001/vcard-rdf/3.0#Given "John" . http://somewhere/JohnSmith http://www.w3.org/2001/vcard-rdf/3.0#FN "John Smith" .
Now you know why it is clearer to draw graphs. If you look carefully, you will see that each line consists of three fields representing the subject, predicate and object of each statement. There are four arcs in the graph, so there are four statements. The "anon:14df86:ecc3dee17b:-7fff" is an internal identifier generated by Jena. It is not a URI and should not be confused with one. It is simply an internal label used by the Jena implementation.
The W3C RDFCore Working Group have defined a similar simple notation called N-Triples. The name means "triple notation. We will see in the next section that Jena has an N-Triples writer built in.
Jena has methods for reading and writing RDF as XML. These can be used to save an RDF model to a file and later, read it back in again.
Tutorial 3 created a model and wrote it out in triple form. Tutorial 4 modifies tutorial 3 to write the
model in RDF XML form to the standard output stream. The code again, is very
simple. Simply create a PrintWriter
and call the
model.write
method.
// now write the model in XML form to a file model.write(new PrintWriter(System.out));
The output should look something like this:
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:vcard='http://www.w3.org/2001/vcard-rdf/3.0#' > <rdf:Description rdf:about='http://somewhere/JohnSmith'> <vcard:FN>John Smith</vcard:FN> <vcard:N rdf:resource='#A0'/> </rdf:Description> <rdf:Description rdf:about='#A0'> <vcard:Given>John</vcard:Given> <vcard:Family>Smith</vcard:Family> </rdf:Description> </rdf:RDF>
The RDF specifications specify how to represent RDF as XML. The RDF XML syntax is quite complex. The reader is referred to the primer being developed by the RDFCore WG for a more detailed introduction. However, lets take a quick look at how to interpret the above.
RDF is usually embedded in an <rdf:RDF> element. The element is optional if there are other ways of know that some XML is RDF, but it is usually present. The RDF element defines the two namespaces used in the document. There is then an <rdf:Description> element which describes the resource whose URI is "http://somewhere/JohnSmith". If the rdf:about attribute was missing, this element would represent a blank node.
The <vcard:FN> element describes a property of the resource. The property name is the "FN" in the vcard namespace. RDF converts this to a URI reference by concatenating the URI reference for the namespace prefix and "FN", the local name part of the name. This gives a URI reference of "http://www.w3.org/2001/vcard-rdf/3.0#FN". The value of the property is the literal "John Smith".
The <vcard:N> element is a resource. In this case the resource is represented by a relative URI reference. RDF converts this to an absolute URI reference by concatenating it with the base URI of the current document.
There is an error in this RDF XML; it does not exactly represent the graph we created. The blank node in the graph has been given a URI reference. It is no longer blank. The RDF/XML syntax is not capable of representing all RDF graphs; for example it cannot represent a blank node which is the object of two statements. The 'dumb' writer we used to write this RDF/XML makes no attempt to write correctly the subset of graphs which can be written correctly. It gives a URI to each blank node, making it no longer blank.
Jena has an extensible interface which allows new writers for different serialization languages for RDF to be easily plugged in. The above call invoked the standard 'dumb' writer. Jena also includes a more sophisticated RDF/XML writer which can be invoked by specifying another argument to the write() method call:
// now write the model in XML form to a file model.write(new PrintWriter(System.out), "RDF/XML-ABBREV");
This writer, the so called PrettyWriter, takes advantage of features of the RDF/XML abbreviated syntax to write a graph more compactly. It is also able to preserve blank nodes where that is possible. It is however, not suitable for writing very large graphs, as its performance is unlikely to be acceptable. To write large files and preserve blank nodes, write in N-Triples format:
// now write the model in XML form to a file model.write(new PrintWriter(System.out), "N-TRIPLE");
This will produce output similar to that of tutorial 3 which conforms to the N-Triples specification.
Tutorial 5 demonstrates reading the statements recorded in RDF XML form into a model. With this tutorial, we have provided a small database of vcards in RDF/XML form. The following code will read it in and write it out.
// create an empty model Model model = new ModelMem(); // use the class loader to find the input file InputStream in = Tutorial05.class .getClassLoader() .getResourceAsStream(inputFileName); if (in == null) { throw new IllegalArgumentException( "File: " + inputFileName + " not found"); } // read the RDF/XML file model.read(new InputStreamReader(in), ""); // write it to standard out model.write(new PrintWriter(System.out));
The second argument to the read() method call is the URI which will be used for resolving relative URI's. As there are no relative URI references in the test file, it is can be empty.
For now, it can always be to the empty string. When run, tutorial 5 will produce XML output which looks like:
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:vcard='http://www.w3.org/2001/vcard-rdf/3.0#' > <rdf:Description rdf:about='#A0'> <vcard:Family>Smith</vcard:Family> <vcard:Given>John</vcard:Given> </rdf:Description> <rdf:Description rdf:about='http://somewhere/JohnSmith/'> <vcard:FN>John Smith</vcard:FN> <vcard:N rdf:resource='#A0'/> </rdf:Description> <rdf:Description rdf:about='http://somewhere/SarahJones/'> <vcard:FN>Sarah Jones</vcard:FN> <vcard:N rdf:resource='#A1'/> </rdf:Description> <rdf:Description rdf:about='http://somewhere/MattJones/'> <vcard:FN>Matt Jones</vcard:FN> <vcard:N rdf:resource='#A2'/> </rdf:Description> <rdf:Description rdf:about='#A3'> <vcard:Family>Smith</vcard:Family> <vcard:Given>Rebecca</vcard:Given> </rdf:Description> <rdf:Description rdf:about='#A1'> <vcard:Family>Jones</vcard:Family> <vcard:Given>Sarah</vcard:Given> </rdf:Description> <rdf:Description rdf:about='#A2'> <vcard:Family>Jones</vcard:Family> <vcard:Given>Matthew</vcard:Given> </rdf:Description> <rdf:Description rdf:about='http://somewhere/RebeccaSmith/'> <vcard:FN>Becky Smith</vcard:FN> <vcard:N rdf:resource='#A3'/> </rdf:Description> </rdf:RDF>
Jena is a Java API for semantic web applications. The key RDF package for
the application developer is com.hp.hpl.mesa.rdf.jena.model
.
The API has been defined in terms of interfaces so that application code can
work with different implementations without change. This package contains
interfaces for representing models, resources, properties, literals,
statements and all the other key concepts of RDF. So that application code
remains independent of the implementation, it is best if it uses interfaces
wherever possible, not specific class implementations.
The com.hp.hpl.mesa.rdf.jena.tutorial
package contains the
working source code for all the examples used in this tutorial.
The com.hp.hpl.mesa.rdf.jena.mem
package contains an
implementation of Jena API which stores all model state in main memory.
Applications creating in-memory models will typically create instances of the
ModelMem
class defined in this package. This is a case where,
currently, the application must refer to a specific implementation class. A
more general mechanism will be introduced in the future.
The com.hp.hpl.mesa.rdf.jena.common
package contains
implementation classes which may be common to many implementations. For
example, it defines classes ResourceImpl
,
PropertyImpl
, LiteralImpl
which may be used
directly or subclassed by different implementations. Applications should
rarely, if ever, use these classes directly. For example, rather than
creating a new instance of ResourceImpl
, it is better to use the
createResource
method of whatever model is being used. That
way, if the model implementation has used an optimized implementation of
Resource
, then no conversions between the two types will be
necessary.
The Jena development team plan some refactoring of the code in the near future. The current package names reflect the fact that Jena was originally developed as part of a project called Mesa. Under the new naming scheme, the RDF packages will be in com.hp.hpl.jena.rdf.
So far, this tutorial has dealt mainly with creating, reading and writing RDF graphs. It is now time to deal with accessing information held in a graph.
Given the URI of a resource, the resource object can be retrieved from a
model using the Model.getResource(String uri)
method. This
method is defined to return a Resource object if one exists in the model, or
otherwise to create a new one. For example, to retrieve the Adam Smith
resource from the model read in from the file in tutorial 5:
// retrieve the John Smith vcard resource from the model Resource vcard = model.getResource(johnSmithURI);
The Resource interface defines a number of methods for accessing the
properties of a resource. The Resource.getProperty(Property p)
method accesses a property of the resource. This method does not follow the
usual Java accessor convention in that the type of the object returned is
Statement,
not the Property
that you might have
expected. Returning the whole statement allows the application to access the
value of the property using one of its accessor methods which return the
object of the statement. For example to retrieve the resource which is the
value of the vcard:N
property:
// retrieve the value of the N property Resource name = (Resource) vcard.getProperty(VCARD.N) .getObject();
In general, the object of a statement could be a resource or a literal, so the application code, knowing the value must be a resource, casts the returned object. One of the things that Jena tries to do is to provide type specific methods so the application does not have to cast and type checking can be done at compile time. The code fragment above, can be more conveniently written:
// retrieve the value of the FN property Resource name = vcard.getProperty(VCARD.N) .getResource();
Similarly, the literal value of a property can be retrieved:
// retrieve the given name property String fullName = vcard.getProperty(VCARD.FN) .getString();
In this example, the vcard resource has only one vcard:FN
and
one vcard:N
property. RDF permits a resource to repeat a
property; for example Adam might have more than one nickname. Lets give him
two:
// add two nick name properties to vcard vcard.addProperty(VCARD.NICKNAME, "Smithy") .addProperty(VCARD.NICKNAME, "Adman");
As noted before, Jena represents a RDF graph as set of
statements, so adding a statement with the subject, predicate and object as
one already in the graph will have no effect. Jena does not define which of
the two nicknames presennt in the graph will be returned. The result of
calling vcard.getProperty(VCARD.NICKNAME)
is indeterminate. Jena
will return one of the values, but there is no guarantee even that two
consecutive calls will return the same value.
If it is possible that a property may occur more than once, then the
Resource.listProperties(Property p) method can be used to return an iterator
which will list them all. This method returns an iterator which returns
objects of type Statement
. We can list the nicknames like
this:
// set up the output System.out.println("The nicknames of \"" + fullName + "\" are:"); // list the nicknames StmtIterator iter = vcard.listProperties(VCARD.NICKNAME); while (iter.hasNext()) { System.out.println(" " + iter.next() .getObject() .toString()); }
This code can be found in tutorial 6, which produces the following output when run:
The nicknames of "John Smith" are: Smithy Adman
All the properties of a resource can be listed by using the
listStatements()
method without an argument.
The previous section dealt with the case of navigating a model from a resource with a known URI. This section deals with searching a model. The core Jena API supports only a limited query primitive. The more powerful query facilities of RDQL are described elsewhere in this tutorial.
The Model.listStatements()
method, which lists all the
statements in a model, is perhaps the crudest way of querying a model. Its
use is not recommended on very large graphs.
Model.listSubjects()
is similar, but returns an iterator over
all resources that have properties, i.e. are the subject of some
statement.
Model.listSubjectsWithProperty(Property p, RDFNode
o)
will return an iterator over all the resources which
have property
p
with value o
. We might expect to
be able to use the rdf:type property to retrieve all the vcard resources by
searching for their type property:
// retrieve all resource of type Vcard. ResIterator iter = model.listSubjectsWithProperty(RDF.type, VCARD.Vcard);
Unfortunately, however, the vcard schema we are using does not define a
type for vcards! However, if we assume that only resources of type vcard
will have vcard:FN
property, and that in our data, all such
resources have such a property, then we can find all the vcards like this:
// list vcards ResIterator iter = model.listSubjectsWithProperty(VCARD.FN); while (iter.hasNext()) { Resource r = iter.next(); ... }
All these query methods are simply syntactic sugar over a primitive query
method model.listStatements(Selector s)
. This method returns an
iterator over all the statements in the model 'selected' by s
.
The selector interface is designed to be extensible, but for now, there is
only implementation of it, the class SelectorImpl
from the
package com.hp.hpl.mesa.rdf.jena.common
. Using
SelectorImpl
is one of the rare occasions in Jena when it is
necessary to use a specific class rather than an interface. The
SelectorImp
l constructor takes three arguments:
Selector selector = new SelectorImpl(subject, predicate, object)
This selector will select all statements with a subject that matches
subject
, a predicate that matches predicate
and an
object that matches object
. if a null
is supplied
in any of the positions, it matches anything, otherwise resources, including
properties since they are a subset of resources, match if their URI
references are equal and Literals are equal if all their components are
equal. Thus:
Selector selector = new SelectorImpl(null, null, null);
will select all the statements in a graph.
Selector selector = new SelectorImpl(null, VCARD.FN, null);
will select all the statements with VCARD.FN as their predicate, whatever the subject or object. YThe following code, which can be found in full in tutorial 7 lists the full names on all the vcards in the database.
// select all the resources with a VCARD.FN property ResIterator iter = model.listSubjectsWithProperty(VCARD.FN); if (iter.hasNext()) { System.out.println("The database contains vcards for:"); while (iter.hasNext()) { System.out.println(" " + iter.next() .getProperty(VCARD.FN) .getString()); } } else { System.out.println("No vcards were found in the database"); }
This should produce output similar to the following:
The database contains vcards for: Sarah Jones John Smith Matt Jones Becky Smith
Your next exercise is to modify this code to use SelectorImpl
instead of listSubjectsWithProperty
.
Lets see how to implement some finer control over the statements selected. SelectorImpl can be subclassed and its selects method modified to perform further filtering:
// select all the resources with a VCARD.FN property // whose value ends with "Smith" StmtIterator iter = model.listStatements( new SelectorImpl(null, VCARD.FN, (RDFNode) null) { public boolean selects(Statement s) { try { return s.getString() .endsWith("Smith"); } catch (RDFException e) { throw new RDFError(e); } } });
This sample code uses a neat Java technique of overridding a method
definition inline when creating an instance of the class. Here the
selects(...)
method checks to ensure that the full name ends
with "Smith". It is important to note that filtering based on the subject,
predicate and object arguments takes place before the
selects(...)
method is called, so the extra test will only be
applied to matching statements.
The full code can be found in tutorial 8 and produces output like this:
The database contains vcards for: John Smith Becky Smith
You might think that:
// do all filtering in the selects method StmtIterator iter = model.listStatements( new SelectorImpl(null, null, (RDFNode) null) { public boolean selects(Statement s) { try { return (subject == null || s.getSubject().equals(subject)) && (predicate == null || s.getPredicate().equals(predicate)) && (object == null || s.getObject().equals(object)) } catch (RDFException e) { throw new RDFError(e); } } });
is equivalent to:
StmtIterator iter = model.listStatements(new SelectorImpl(subject, predicate, object)
Whilst functionaly they may be equivalent, the first form will list all the statements in the graph and test each one individually, whilst the later allows indexes maintained by the implementation to improve performance. Try it on a large graph and see for yourself, but make a cup of coffee first.
Jena provides three operations for manupulating graphs as a whole. These are the common set operations of union, intersection and difference.
The union of two graphs is the union of the sets of statements which represent each graph. This is one the key operations that the design of RDF supports. It enables data from disparate data sources to be merged. Consider the following two graphs:
and
When these are merged, the two http://...JohnSmith nodes are merged into
one and the duplicate vcard:FN
arc is dropped to produce:
Lets look at the code to do this (the full code is in tutorial 9) and see what happens.
// read the RDF/XML files model1.read(new InputStreamReader(in1), ""); model2.read(new InputStreamReader(in2), ""); // merge the graphs Model model = model1.union(model2); // print the graph as RDF/XML model.write(new PrintWriter(System.out), "RDF/XML-ABBREV");
The output produced by the pretty writer looks like this:
<?xml version='1.0'?> <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:RDFNsId1='http://www.w3.org/2001/vcard-rdf/3.0#'> <rdf:Description rdf:about='http://somewhere/JohnSmith/'> <RDFNsId1:N RDFNsId1:Given='John' RDFNsId1:Family='Smith'/> <RDFNsId1:FN>John Smith</RDFNsId1:FN> <RDFNsId1:EMAIL rdf:value='John@somewhere.com' rdf:type='http://www.w3.org/2001/vcard-rdf/3.0#internet'/> </rdf:Description> </rdf:RDF>
Even if you are unfamiliar with the details of the RDF/XML syntax, it should be reasonably clear that the graphs have merged as expected. The intersection and difference of the graphs can be computed in a similar manner.<./p>
Jena's current exceptions policy has proved to be unpopular and will be changed in the near future. The original design took a strict approach and required a checked exception wherever an application should check for a possible error. Because Jena was designed to be highly flexible and capable of being extended to support, for example, different storage managers, it was possible that a storage manager could give an unexpected error at any time. Thus nearly all Jena methods ended up declaring that it throws an RDFException. Experience has shown that this is not a popular approach, that these exceptions when caught are usually ignored and no benefit is gained from making them checked.
Thus Jena's exception policy will be reviewed in the near future. Most, if not all exceptions, will become unchecked. In the meantime, though folks, I'm afraid you are still going to have to catch and ignore them.
RDF defines a special kind of resources for representing collections of things. These resources are called containers. The members of a container can be either literals or resources. There are three kinds of container:
A container is represented by a resource. That resource will have an rdf:type property whose value should be one of rdf:Bag, rdf:Alt or rdf:Seq, or a subclass of one of these, depending on the type of the container. The first member of the container is the value of the container's rdf:_1 property; the second member of the container is the value of the container's rdf:_2 property and so on. The rdf:_nnn properties are known as the ordinal properties.
For example, the graph for a simple bag containing the vcards of the Smith's might look like this:
Whilst the members of the bag are represented by the properties rdf:_1, rdf:_2 etc the ordering of the properties is not significant. We could switch the values of the rdf:_1 and rdf:_2 properties and the resulting graph would represent the same information.
Alt's are intended to represent alternatives. For example, lets say a resource represented a software product. It might have a property to indicate where it might be obtained from. The value of that property might be an Alt collection containing various sites from which it could be downloaded. Alt's are unordered except that the rdf:_1 property has special significance. It represents the default choice.
Whilst containers can be handled using the basic machinery of resources and properties, Jena has explicit interfaces and implementation classes to handle them. It is not a good idea to have an object manipulating a container, and at the same time to modify the state of that container using the lower level methods.
Lets modify tutorial 8 to create this bag:
// create a bag Bag smiths = model.createBag(); // select all the resources with a VCARD.FN property // whose value ends with "Smith" StmtIterator iter = model.listStatements( new SelectorImpl(null, VCARD.FN, (RDFNode) null) { public boolean selects(Statement s) { try { return s.getString() .endsWith("Smith"); } catch (RDFException e) { throw new RDFError(e); } } }); // add the Smith's to the bag while (iter.hasNext()) { smiths.add(iter.next().getSubject()); }
If we write out this graph, it contains something like the following:
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:vcard='http://www.w3.org/2001/vcard-rdf/3.0#' > ... <rdf:Description rdf:about='#A3'> <rdf:type rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag'/> <rdf:_1 rdf:resource='http://somewhere/JohnSmith/'/> <rdf:_2 rdf:resource='http://somewhere/RebeccaSmith/'/> </rdf:Description> </rdf:RDF>
which represents the Bag resource.
The container interface provides an iterator to list the contents of a container:
// print out the members of the bag NodeIterator iter2 = smiths.iterator(); if (iter2.hasNext()) { System.out.println("The bag contains:"); while (iter2.hasNext()) { System.out.println(" " + (Resource) iter2.next()) .getProperty(VCARD.FN) .getString()); } } else { System.out.println("The bag is empty"); }
which produces the following output:
The bag contains: John Smith Becky Smith
Executable example code can be found in tutorial 10.
The Jena classes offer methods for manipulating containers including adding new members, inserting new members into the middle of a container and removing existing members. The Jena container classes currently ensure that the the list of ordinal properties used starts at rdf:_1 and is contiguous. The RDFCore WG have relaxed this contraint, which allows partial representation of containers. This therefore is an area of Jena may be changed in the future.
RDF literals are not just simple strings. Literals may have a language tag to indicate the language of the literal. The literal "chat" with an English language tag is considered different to the literal "chat" with a French language tag. This rather strange behaviour is an artefact of the original RDF/XML syntax.
Further there are really two sorts of Literals. In one, the string component is just that, an ordinary string. In the other the string component is expected to be a well balanced fragment of XML. When an RDF graph is written as RDF/XML a special construction using a parseType='Literal' attribute is used to represent it.
In Jena, these attributes of a literal may be set when the literal is constructed, e.g. in tutorial 11:
// create the resource Resource r = model.createResource(); // add the property r.addProperty(RDFS.label, model.createLiteral("chat", "en")) .addProperty(RDFS.label, model.createLiteral("chat", "fr")) .addProperty(RDFS.label, model.createLiteral("<em>chat</em>", true)); // write out the graph model.write(new PrintWriter(System.out));
produces
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#' > <rdf:Description rdf:about='#A0'> <rdfs:label xml:lang='en'>chat</rdfs:label> <rdfs:label xml:lang='fr'>chat</rdfs:label> <rdfs:label xml:lang='en' rdf:parseType='Literal'><em>chat</em></rdfs:label> </rdf:Description> </rdf:RDF>
For two literals to be considered equal, they must either both be XML literals or both be simple literals. In addition, either both must have no language tag, or if language tags are present they must be equal. For simple literals the strings must be equal. XML literals have two notions of equality. The simple notion is that the conditions previously mentioned are true and the strings are also equal. The other notion is that they can be equal if the cannonicalization of their strings is equal.
Jena's interfaces also support typed literals, however, at present these
are only convenience methods. The typed value is converted to a string and a
string literal is stored in the graph. For example, try (noting that for
simple literals, we can omit the model.createLiteral
(...)
call):
// create the resource Resource r = model.createResource(); // add the property r.addProperty(RDFS.label, "11") .addProperty(RDFS.label, 11); // write out the graph model.write(new PrintWriter(System.out), "N-TRIPLE");
The output produced is:
_:A... <http://www.w3.org/2000/01/rdf-schema#label> "11" .
Since both literals are really just the string "11", then only one statement is added.
The RDFCore WG is defining new mechanisms for supporting datatypes in RDF. Jena does not yet support these, but we expect to in the near future. This then, is another area where changes in the interfaces and behaviour are to be expected.
Blank Node | Represents a resource, but does not indicate a URI for the resource. Blank nodes act like existentially qualified variables in first order logic. |
Dublin Core | A standard for metadata about web resources. Further information can be found at the Dublin Core web site. |
Literal | A string of characters which can be the value of a property. |
Object | The part of a triple which is the value of the statement |
Predicate | The property part of a triple. |
Property | A property is an attribute of a resource. For example DC.title is a property, as is RDF.type. |
Resource | Some entity. It could be a web resource such as web page, or it could be a concrete physical thing such as a tree or a car. It could be an abstract idea such as chess or football. Resources are named by URI's. |
Statement | An arc in an RDF graph, normally interpretted as a fact. |
Subject | The resource which is the source of an arc in an RDF graph |
Triple | A structure containing a subject, a predicate and an object. Another term for a statement. |