Subqueries

Reading material

Elmasri&Navathe: luku 8.3.1-8.3.3 (sivut 261-267)

Subqueies are queries that are embedded in another query. In SQL subqueries may be used in the WHERE-part of the query and since  SQL-92 also in the FROM-part of the query.

Subqueries are very useful when used in the WHERE-part of a query. A subquery produces a table as its result (a set of rows). SQL provides condition structures to compare single values to a set of values. These structures include the IN and NOT IN operators and the use of SOME, ANY and ALL qualifiers together with the traditional comparison operators. Subqueries may be used to define the set side of these comparisons. In this case the subquery must retrieve only a single column. Our next example fetches the names of customers that have not ordered anything in January 2002. It uses a subquery that retrieves the customer identifiers for customers that have ordered something in January 2002. To  pick up a customer into the result it is required that the customer's identifier does not belong to the ones obtained by the subquery. We use Oracles's default date format in this query.

select name
from customer
where 
  customerId not in
    (select customer
     from ordered
     where whenmade between '1-JAN-2002' and '31-JAN-2002')
order by name

When a column value is compared with a constant set of values it is practical to require only equality or inequalily with some value within the set. When the set of values is not constant but achieved using a subquery, also other types of comparisons become useful. Next example shows how to find out the most expensive products. 

select productID, objectName, ModelId, price
from product, objectName
where  
   product.productId=objectName.objectId and
   objectName.language='english' and
   price >=all (select price from product)

In this query we used the subquery to fetch all the prices that are in use. The price of the product to be selected was required to be greater than or equal to all of them. The most expensive product meets this condition. For less  expensive products there is at least one product that is more expensive than they and the condition is not true..

IN and  NOT IN operators may also be used to compare a sequence of multiple column values to the multi-column rows retrieved by a subquery.   The sequence is specified by including the column definitions (or expression)  in brackets and the sequece must contain as many items as there are result columns in the subquery. For example, products not ordered at all in January 2002 may be retrieved with the query:

select productID, modelID 
from product
where (productID,modelID) not in  
  (select productID,ModelID 
   from ItemOrdered
   where OrderID in
     (select OrderID 
      from Ordered 
      where WhenMade between '1-JAN-2002' and '31-JAN-2002')
   ) 
order by productID, ModelID;

Here the red subquery produces rows with two columns and it is required that the value of column pair (productID, modelID) is not included in these rows. This query also shows that subqueries may be embedded within other subqueries.

 

Connected subqueries

The subqueries discussed so far have been independent of the query they are embedded in. It would be possible to execute the subquery first and then to use its result like a set of constant values in the outer query. Independent subqueries are however not the only type of subqueries. Subqueries whose result depends on the current row of the outer query may also be defined.  Next example query retrieves the names of delivery boys that have not made any deliveries to their own delivery area. The subquery is evaluated separately for each delivery boy and it gives the areas this boy has made deliveries to. 

select name 
from deliveryboy
where 
   deliveryArea not in
    (select areaCode
     from customer,ordered,delivery
     where
        customer.customerID=Ordered.Customer and
        delivery.orderID = Ordered.OrderID and
        delivery.deliveredBy=deliveryBoy.BoyID
    )

Because the result of a connected subquery varies according to the rows of the outer query, it is practical to test whether the result is empty . Predicates EXISTS and NOT EXISTS are available for this purpose. EXISTS is true, if the result has at least one row, and NOT EXISTS is true if the result is empty. It is not meaningful to test the existence of result for independent subqueries. Next example query finds out delivery boys that have  not had any deliveries in January 2002. 

select boyID, name
from deliveryBoy
where not exists 
  (select *  
   from delivery
   where deliveredBy=deliveryBoy.BoyID and
     whenDelivered between '01-JAN-2002' and '31-JAN-2002')

The above query may be expressed also without using connected subqueries as follows:

select boyID, name
from deliveryBoy
where boyID not in 
  (select deliveredBy
   from delivery
   where 
     whenDelivered between '01-JAN-2002' and '31-JAN-2002')

It's up to you which you like best.

Subqueries may also be used in the FROM-part of the query. This usage will be discussed in connection to summary queries.