Processing of structured documents, Session 4 (17.-18.2.)

Study the file xpath.xml (available as an XML file and as an HTML file), and draw the tree structure consisting of the nodes specified in the XPath data model. Then write XPath expressions that retrieve the elements listed below. The context node for every location expression is the element whose "id" attribute has the value "start". For each expression, state which elements are retrieved. You can test your expressions using an XSLT transformation template (an example for the expression 'The attributes of the current element' is available).
1. The current context element.
2. Each child of the context element.
3. Each ancestor of the context element.
4. Each child named 'this'.
5. The third child named 'this'.
6. All ancestors named 'that'.
7. The closest ancestor named 'that'.
8. The most remote ancestor named 'that'.
9. The immediately preceding sibling element.
10. The immediately following sibling element.
11. The closest preceding sibling named 'that'.
12. The closest following sibling named 'that'.
13. The seventh sibling of the context element.
(This exercise: courtesy of Ken Holman.)
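
If no XSLT processor is at hand, the example expression 'The attributes of the current element' can also be tried out roughly as follows. This is only a sketch: the SAMPLE document is a hypothetical stand-in (the real xpath.xml is not reproduced here), and Python's xml.etree.ElementTree supports only a subset of XPath (for example, no ancestor or sibling axes), so the other expressions in the list still need an XSLT processor or a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for xpath.xml; element and attribute names
# other than id="start" are invented for illustration.
SAMPLE = """<root>
  <that>
    <this id="start" kind="demo">
      <this/>
      <other/>
    </this>
  </that>
</root>"""


def find_context(xml_text):
    """Return the first descendant of the root whose id attribute is 'start'."""
    root = ET.fromstring(xml_text)
    # ElementTree's XPath subset supports the [@attr='value'] predicate.
    return root.find(".//*[@id='start']")


context = find_context(SAMPLE)
# Counterpart of the XPath expression '@*' evaluated at the context element.
attributes = dict(context.attrib)
print(attributes)
```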

Assume you have an XML document containing information about books, as follows:

<?xml version="1.0"?>

<books>
    <book category="reference" ID="1">
        <author>Nigel Rees</author>
        <title>Sayings of the Century</title>
    </book>

    <book category="fiction" ID="2">
        <author>Evelyn Waugh</author>
        <title>Sword of Honour</title>
    </book>

    <book category="fiction" ID="3">
        <author>Herman Melville</author>
        <title>Moby Dick</title>
    </book>
</books>

Additionally, you have another XML document containing the prices of the books:

<?xml version="1.0"?>

<priceList>
    <bPrice bookId="1">8.95</bPrice>
    <bPrice bookId="2">12.99</bPrice>    
    <bPrice bookId="3">8.99</bPrice>
</priceList>

Write SAX content handlers that output the author, title, and price of each fiction book (category="fiction") in the books document. For instance:

Evelyn Waugh, Sword of Honour: 12.99
Herman Melville, Moby Dick: 8.99

Tip: You can parse the price list document first and store the prices in an array. When parsing reaches the end of the document, control returns to the caller of "parse", and you can then parse the books document.
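
The two-pass idea in the tip might look roughly like the following sketch. It uses Python's standard xml.sax module rather than any particular parser used on the course, and a dictionary keyed by bookId in place of the array. The class names PriceHandler and FictionHandler and the function report are hypothetical, and the XML declarations are left out of the sample strings for brevity.

```python
import xml.sax


class PriceHandler(xml.sax.ContentHandler):
    """First pass: collect bookId -> price text from the price list."""

    def __init__(self):
        super().__init__()
        self.prices = {}
        self._book_id = None  # set only while inside a bPrice element
        self._text = []

    def startElement(self, name, attrs):
        if name == "bPrice":
            self._book_id = attrs.get("bookId")
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._book_id is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "bPrice":
            self.prices[self._book_id] = "".join(self._text).strip()
            self._book_id = None


class FictionHandler(xml.sax.ContentHandler):
    """Second pass: build one output line per fiction book."""

    def __init__(self, prices):
        super().__init__()
        self.prices = prices
        self.lines = []
        self._book_id = None  # set only while inside a fiction book
        self._fields = {}
        self._current = None  # "author" or "title" while inside one
        self._text = []

    def startElement(self, name, attrs):
        if name == "book" and attrs.get("category") == "fiction":
            self._book_id = attrs.get("ID")
            self._fields = {}
        elif self._book_id is not None and name in ("author", "title"):
            self._current = name
            self._text = []

    def characters(self, content):
        if self._current is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == self._current:
            self._fields[name] = "".join(self._text).strip()
            self._current = None
        elif name == "book" and self._book_id is not None:
            price = self.prices.get(self._book_id, "?")
            author = self._fields.get("author")
            title = self._fields.get("title")
            self.lines.append(f"{author}, {title}: {price}")
            self._book_id = None


PRICES = """<priceList>
    <bPrice bookId="1">8.95</bPrice>
    <bPrice bookId="2">12.99</bPrice>
    <bPrice bookId="3">8.99</bPrice>
</priceList>"""

BOOKS = """<books>
    <book category="reference" ID="1">
        <author>Nigel Rees</author><title>Sayings of the Century</title>
    </book>
    <book category="fiction" ID="2">
        <author>Evelyn Waugh</author><title>Sword of Honour</title>
    </book>
    <book category="fiction" ID="3">
        <author>Herman Melville</author><title>Moby Dick</title>
    </book>
</books>"""


def report(price_xml, books_xml):
    """Parse the price list first, then the books, and return the output lines."""
    price_handler = PriceHandler()
    xml.sax.parseString(price_xml.encode("utf-8"), price_handler)
    # Control is back here once the first parse has finished.
    fiction_handler = FictionHandler(price_handler.prices)
    xml.sax.parseString(books_xml.encode("utf-8"), fiction_handler)
    return fiction_handler.lines


print("\n".join(report(PRICES, BOOKS)))
```

The first handler keeps no reference to the books document at all; the only thing shared between the two passes is the price table handed to the second handler.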

Project, Part 4:

Modify Part 3 (last week's project task):
- Assume that the price information is not included in the order that comes from Company B, but that the prices are stored in a separate XML document (as in task 2 above).
- Modify your SAX handlers accordingly.

Helena Ahonen-Myka

Last modified: Thu Feb 13 08:50:42 EET 2003