Join Operations
Join Operations
Typically, we want only combinations of the Cartesian product that satisfy certain conditions
and so we would normally use a Join operation instead of the Cartesian product
operation. The Join operation, which combines two relations to form a new relation, is one
of the essential operations in the relational algebra. Join is a derivative of Cartesian product,
equivalent to performing a Selection operation, using the join predicate as the selection
formula, over the Cartesian product of the two operand relations. Join is one of the
most difficult operations to implement efficiently in an RDBMS and is one of the reasons
why relational systems have intrinsic performance problems. We examine strategies for
implementing the Join operation in Section 21.4.3.
There are various forms of Join operation, each with subtle differences, some more useful
than others:
n Theta join
n Equijoin (a particular type of Theta join)
n Natural join
n Outer join
n Semijoin.
Theta join (q-join)
R !F S The Theta join operation defines a relation that contains tuples satisfying
the predicate F from the Cartesian product of R and S. The predicate F is
of the form R.ai q S.bi where q may be one of the comparison operators
(<, ≤, >, ≥, =, ≠).
We can rewrite the Theta join in terms of the basic Selection and Cartesian product
operations:
R 1F S = σF (R × S)
As with Cartesian product, the degree of a Theta join is the sum of the degrees of the
operand relations R and S. In the case where the predicate F contains only equality (=), the
term Equijoin is used instead. Consider again the query of Example 4.6.
Example 4.7 Equijoin operation
List the names and comments of all clients who have viewed a property for rent.
In Example 4.6 we used the Cartesian product and Selection operations to obtain this list.
However, the same result is obtained using the Equijoin operation:
(ΠclientNo, fName, lName(Client)) 1 Client.clientNo = Viewing.clientNo (ΠclientNo, propertyNo, comment(Viewing))
or
Result ← TempClient 1 TempClient.clientNo = TempViewing.clientNo TempViewing
The result of these operations was shown in Figure 4.8.
Natural join
R ! S The Natural join is an Equijoin of the two relations R and S over all common
attributes x. One occurrence of each common attribute is eliminated from
the result.
The Natural join operation performs an Equijoin over all the attributes in the two relations
that have the same name. The degree of a Natural join is the sum of the degrees of the
relations R and S less the number of attributes in x.
Example 4.8 Natural join operation
List the names and comments of all clients who have viewed a property for rent.
In Example 4.7 we used the Equijoin to produce this list, but the resulting relation had
two occurrences of the join attribute clientNo. We can use the Natural join to remove one
occurrence of the clientNo attribute:
(ΠclientNo, fName, lName(Client)) 1(ΠclientNo, propertyNo, comment(Viewing))
or
Result ← TempClient 1 TempViewing
The result of this operation is shown in Figure 4.9.
Outer join
Often in joining two relations, a tuple in one relation does not have a matching tuple
in the other relation; in other words, there is no matching value in the join attributes.
We may want tuples from one of the relations to appear in the result even when there
are no matching values in the other relation. This may be accomplished using the Outer
join.
R % S The (left) Outer join is a join in which tuples from R that do not have matching
values in the common attributes of S are also included in the result
relation. Missing values in the second relation are set to null.
The Outer join is becoming more widely available in relational systems and is a specified
operator in the SQL standard (see Section 5.3.7). The advantage of an Outer join is that
information is preserved, that is, the Outer join preserves tuples that would have been lost
by other types of join.
Example 4.9 Left Outer join operation
Produce a status report on property viewings.
In this case, we want to produce a relation consisting of the properties that have been
viewed with comments and those that have not been viewed. This can be achieved using
the following Outer join:
(ÐpropertyNo, street, city(PropertyForRent)) 5 Viewing
The resulting relation is shown in Figure 4.10. Note that properties PL94, PG21, and PG16
have no viewings, but these tuples are still contained in the result with nulls for the
attributes from the Viewing relation.
Strictly speaking, Example 4.9 is a Left (natural) Outer join as it keeps every tuple in
the left-hand relation in the result. Similarly, there is a Right Outer join that keeps every
tuple in the right-hand relation in the result. There is also a Full Outer join that keeps all
tuples in both relations, padding tuples with nulls when no matching tuples are found.
Semijoin
R @F S The Semijoin operation defines a relation that contains the tuples of R that
participate in the join of R with S.
The Semijoin operation performs a join of the two relations and then projects over the
attributes of the first operand. One advantage of a Semijoin is that it decreases the number
of tuples that need to be handled to form the join. It is particularly useful for computing
joins in distributed systems (see Sections 22.4.2 and 23.6.2). We can rewrite the Semijoin
using the Projection and Join operations:
R 2F S = ΠA(R 1F S) A is the set of all attributes for R
This is actually a Semi-Theta join. There are variants for Semi-Equijoin and Semi-Natural
join.
Example 4.10 Semijoin operation
List complete details of all staff who work at the branch in Glasgow.
If we are interested in seeing only the attributes of the Staff relation, we can use the following
Semijoin operation, producing the relation shown in Figure 4.11.
Staff 2 Staff.branchNo = Branch branchNo.(σcity = ‘Glasgow’ (Branch))
Typically, we want only combinations of the Cartesian product that satisfy certain conditions
and so we would normally use a Join operation instead of the Cartesian product
operation. The Join operation, which combines two relations to form a new relation, is one
of the essential operations in the relational algebra. Join is a derivative of Cartesian product,
equivalent to performing a Selection operation, using the join predicate as the selection
formula, over the Cartesian product of the two operand relations. Join is one of the
most difficult operations to implement efficiently in an RDBMS and is one of the reasons
why relational systems have intrinsic performance problems. We examine strategies for
implementing the Join operation in Section 21.4.3.
There are various forms of Join operation, each with subtle differences, some more useful
than others:
n Theta join
n Equijoin (a particular type of Theta join)
n Natural join
n Outer join
n Semijoin.
Theta join (q-join)
R !F S The Theta join operation defines a relation that contains tuples satisfying
the predicate F from the Cartesian product of R and S. The predicate F is
of the form R.ai q S.bi where q may be one of the comparison operators
(<, ≤, >, ≥, =, ≠).
We can rewrite the Theta join in terms of the basic Selection and Cartesian product
operations:
R 1F S = σF (R × S)
As with Cartesian product, the degree of a Theta join is the sum of the degrees of the
operand relations R and S. In the case where the predicate F contains only equality (=), the
term Equijoin is used instead. Consider again the query of Example 4.6.
Example 4.7 Equijoin operation
List the names and comments of all clients who have viewed a property for rent.
In Example 4.6 we used the Cartesian product and Selection operations to obtain this list.
However, the same result is obtained using the Equijoin operation:
(ΠclientNo, fName, lName(Client)) 1 Client.clientNo = Viewing.clientNo (ΠclientNo, propertyNo, comment(Viewing))
or
Result ← TempClient 1 TempClient.clientNo = TempViewing.clientNo TempViewing
The result of these operations was shown in Figure 4.8.
Natural join
R ! S The Natural join is an Equijoin of the two relations R and S over all common
attributes x. One occurrence of each common attribute is eliminated from
the result.
The Natural join operation performs an Equijoin over all the attributes in the two relations
that have the same name. The degree of a Natural join is the sum of the degrees of the
relations R and S less the number of attributes in x.
Example 4.8 Natural join operation
List the names and comments of all clients who have viewed a property for rent.
In Example 4.7 we used the Equijoin to produce this list, but the resulting relation had
two occurrences of the join attribute clientNo. We can use the Natural join to remove one
occurrence of the clientNo attribute:
(ΠclientNo, fName, lName(Client)) 1(ΠclientNo, propertyNo, comment(Viewing))
or
Result ← TempClient 1 TempViewing
The result of this operation is shown in Figure 4.9.
Outer join
Often in joining two relations, a tuple in one relation does not have a matching tuple
in the other relation; in other words, there is no matching value in the join attributes.
We may want tuples from one of the relations to appear in the result even when there
are no matching values in the other relation. This may be accomplished using the Outer
join.
R % S The (left) Outer join is a join in which tuples from R that do not have matching
values in the common attributes of S are also included in the result
relation. Missing values in the second relation are set to null.
The Outer join is becoming more widely available in relational systems and is a specified
operator in the SQL standard (see Section 5.3.7). The advantage of an Outer join is that
information is preserved, that is, the Outer join preserves tuples that would have been lost
by other types of join.
Example 4.9 Left Outer join operation
Produce a status report on property viewings.
In this case, we want to produce a relation consisting of the properties that have been
viewed with comments and those that have not been viewed. This can be achieved using
the following Outer join:
(ÐpropertyNo, street, city(PropertyForRent)) 5 Viewing
The resulting relation is shown in Figure 4.10. Note that properties PL94, PG21, and PG16
have no viewings, but these tuples are still contained in the result with nulls for the
attributes from the Viewing relation.
Strictly speaking, Example 4.9 is a Left (natural) Outer join as it keeps every tuple in
the left-hand relation in the result. Similarly, there is a Right Outer join that keeps every
tuple in the right-hand relation in the result. There is also a Full Outer join that keeps all
tuples in both relations, padding tuples with nulls when no matching tuples are found.
Semijoin
R @F S The Semijoin operation defines a relation that contains the tuples of R that
participate in the join of R with S.
The Semijoin operation performs a join of the two relations and then projects over the
attributes of the first operand. One advantage of a Semijoin is that it decreases the number
of tuples that need to be handled to form the join. It is particularly useful for computing
joins in distributed systems (see Sections 22.4.2 and 23.6.2). We can rewrite the Semijoin
using the Projection and Join operations:
R 2F S = ΠA(R 1F S) A is the set of all attributes for R
This is actually a Semi-Theta join. There are variants for Semi-Equijoin and Semi-Natural
join.
Example 4.10 Semijoin operation
List complete details of all staff who work at the branch in Glasgow.
If we are interested in seeing only the attributes of the Staff relation, we can use the following
Semijoin operation, producing the relation shown in Figure 4.11.
Staff 2 Staff.branchNo = Branch branchNo.(σcity = ‘Glasgow’ (Branch))
Join Operations
Reviewed by Shopping Sale
on
22:07
Rating:
No comments: