relational algebra

54
Relational Algebra Lecture 4

Upload: osborn

Post on 08-Jan-2016

32 views

Category:

Documents


1 download

DESCRIPTION

Relational Algebra. Lecture 4. Relational Algebra. Relational Algebra Extended Relational-Algebra-Operations Modification of the Database Views Operations on Bags. Relational Schema and Relations. R is a set of attributes: R(A,B,C, …, D) – is a relational schema - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Relational Algebra

Relational Algebra

Lecture 4

Page 2: Relational Algebra

Relational Algebra

• Relational Algebra

• Extended Relational-Algebra-Operations

• Modification of the Database

• Views

• Operations on Bags

Page 3: Relational Algebra

Relational Schema and Relations

• R is a set of attributes: R(A,B,C, …, D) – is a relational schema

• <val(A1), val(A2), val(C), …val(D)> - tuple

• r – is a set of tuples defined on attributes of R is a relation with schema R. Denoted by r(R).

Page 4: Relational Algebra

What is “algebra”

• Mathematical model consisting of:– Operands --- Variables or values;

– Operators --- Symbols denoting procedures that construct new values from a given values

• Relational Algebra is algebra whose operands are relations and operators are designed to do the most commons things that we need to do with relations

Page 5: Relational Algebra

Relational Algebra

• Six basic operators

– Select – r)– Project – r)– Union – r U s

– set difference – r - s

– Cartesian product – r X s

– Rename – rename (r)

• The operators take two or more relations as inputs and give a new relation as a result.

Page 6: Relational Algebra

Select Operation – Example

• Relation r A B C D

1

5

12

23

7

7

3

10

A=B ^ D > 5 (r)A B C D

1

23

7

10

Page 7: Relational Algebra

Select Operation

• Notation: p(r)• p is called the selection predicate• Defined as:

p(r) = {t | t r and p(t)}Where p is a formula in propositional calculus consisting of terms connected by : (and), (or), (not)Each term is one of:

<attribute>op <attribute> or <constant> where op is one of: =, , >, . <. • Example of selection:

branch-name=“Perryridge”(account)

Page 8: Relational Algebra

Project Operation – Example

• Relation r:

A B C

10

20

30

40

1

1

1

2

A C

1

1

1

2

=

A C

1

1

2

A,C (r)That is, the projection ofa relation on a set of attributes is a set of tuples

Page 9: Relational Algebra

Project Operation

• Notation:

A1, A2, …, Ak (r)

where A1, A2 are attribute names and r is a relation.• The result is defined as the relation of k columns obtained

by erasing the columns that are not listed• Duplicate rows removed from result, since relations are

sets• E.g. to eliminate the branch-name attribute of account

account-number, balance (account)

Page 10: Relational Algebra

Union Operation – Example

• Relations r, s:

r s:

A B

1

2

1

A B

2

3

rs

A B

1

2

1

3

Page 11: Relational Algebra

Union Operation

• Notation: r s

• Defined as:

r s = {t | t r or t s}

• For r s to be valid.

1. r, s must have the same number of attributes

2. The attribute domains must be compatible (e.g., 2nd column of r deals with the same type of values as does the 2nd column of s)

to find all customers with either an account or a loan customer-name (depositor) customer-name (borrower)

Page 12: Relational Algebra

Set Difference Operation – Example

• Relations r, s:

r – s:

A B

1

2

1

A B

2

3

s

A B

1

1

r

Page 13: Relational Algebra

Set Difference Operation

• Notation r – s

• Defined as:

r – s = {t | t r and t s}

• Set differences must be taken between compatible relations.

– r and s must have the same number of attributes

– attribute domains of r and s must be compatible

Page 14: Relational Algebra

Cartesian-Product Operation-Example

Relations r, s:

r x s:

A B

1

2

A B

11112222

C D

1010201010102010

E

aabbaabb

C D

10102010

E

aabbr

s

Page 15: Relational Algebra

Cartesian-Product Operation

• Notation r x s

• Defined as:

r x s = {t q | t r and q s}

• Assume that attributes of r(R) and s(S) are disjoint. (That is, R S = ).

• If attributes of r(R) and s(S) are not disjoint, then renaming

must be used.

Page 16: Relational Algebra

Composition of Operations• Can build expressions using multiple operations• Example: A=C(r x s)• r x s

A=C(r x s)

A B

11112222

C D

1010201010102010

E

aabbaabb

A B C D E

122

102020

aab

Page 17: Relational Algebra

Rename Operation

• Allows us to name, and therefore to refer to, the results of relational-algebra expressions.

• Allows us to refer to a relation by more than one name.Example:

X (E)returns the expression E under the name XIf a relational-algebra expression E has arity n, then

(A1, A2, …, An) (E)returns the result of expression E under the name X, and with

the attributes renamed to A1, A2, …., An.

xx

Page 18: Relational Algebra

Banking Example

branch (branch-name, branch-city, assets)

customer (customer-name, customer-street, customer-only)

account (account-number, branch-name, balance)

loan (loan-number, branch-name, amount)

depositor (customer-name, account-number)

borrower (customer-name, loan-number)

Page 19: Relational Algebra

Example Queries

• Find all loans of over $1200

Find the loan number for each loan of an amount greater than $1200

amount > 1200 (loan)

loan-number (amount > 1200 (loan))

Page 20: Relational Algebra

Example Queries

• Find the names of all customers who have a loan, an account, or both, from the bank

Find the names of all customers who have a loan and an account at bank.

customer-name (borrower) customer-name (depositor)

customer-name (borrower) customer-name (depositor)

Page 21: Relational Algebra

Example Queries

• Find the names of all customers who have a loan at the Perryridge branch.

Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank.customer-name (branch-name = “Perryridge”

(borrower.loan-number = loan.loan-number(borrower x loan))) –

customer-name(depositor)

customer-name (branch-name=“Perryridge”

(borrower.loan-number = loan.loan-number(borrower x loan)))

Page 22: Relational Algebra

Example Queries

• Find the names of all customers who have a loan at the Perryridge branch.

Query 2

customer-name(loan.loan-number = borrower.loan-number(

(branch-name = “Perryridge”(loan)) x borrower))

Query 1

customer-name(branch-name = “Perryridge” ( borrower.loan-number = loan.loan-number(borrower x loan)))

Page 23: Relational Algebra

Example Queries

Find the largest account balance• Rename account relation as d• The query is:

balance(account) - account.balance

(account.balance < d.balance (account x d (account)))

Page 24: Relational Algebra

Formal Definition

• A basic expression in the relational algebra consists of either one of the following:– A relation in the database– A constant relation

• Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:– E1 E2

– E1 - E2

– E1 x E2

p (E1), P is a predicate on attributes in E1

s(E1), S is a list consisting of some of the attributes in E1

x (E1), x is the new name for the result of E1

Page 25: Relational Algebra

Additional Operations

We define additional operations that do not add any powerto the relational algebra, but that simplify common queries.

• Set intersection• Natural join• Division• Assignment

Page 26: Relational Algebra

Set-Intersection Operation

• Notation: r s

• Defined as:

• r s ={ t | t r and t s }

• Assume:

– r, s have the same arity

– attributes of r and s are compatible

• Note: r s = r - (r - s)

Page 27: Relational Algebra

Set-Intersection Operation - Example

• Relation r, s:

• r s

A B

121

A B

23

r s

A B

2

Page 28: Relational Algebra

Notation: r sNatural-Join Operation

• Let r and s be relations on schemas R and S respectively. Then, r s is a relation on schema R S obtained as follows:

– Consider each pair of tuples tr from r and ts from s.

– If tr and ts have the same value on each of the attributes in R S, add a tuple t to the result, where

• t has the same value as tr on r

• t has the same value as ts on s

• Example:

R = (A, B, C, D)

S = (E, B, D)

– Result schema = (A, B, C, D, E)

– r s is defined as: r.A, r.B, r.C, r.D, s.E (r.B = s.B r.D = s.D (r x s))

Page 29: Relational Algebra

Natural Join Operation – Example

• Relations r, s:

A B

12412

C D

aabab

B

13123

D

aaabb

E

r

A B

11112

C D

aaaab

E

s

r s

Page 30: Relational Algebra

Division Operation

• Suited to queries that include the phrase “for all”.• Let r and s be relations on schemas R and S respectively where

– R = (A1, …, Am, B1, …, Bn)– S = (B1, …, Bn)The result of r s is a relation on schema

R – S = (A1, …, Am)

r s = { t | t R-S(r) u s ( tu r ) }

r s Notation:

Page 31: Relational Algebra

Division Operation – Example

Relations r, s:

r s: A

B

1

2

A B

12311134612

r

s

Page 32: Relational Algebra

Another Division Example

A B

aaaaaaaa

C D

aabababb

E

11113111

Relations r, s:

r s:

D

ab

E

11

A B

aa

C

r

s

Page 33: Relational Algebra

Division Operation

• Definition in terms of the basic algebra operationLet r(R) and s(S) be relations, and let S R

r s = R-S (r) –R-S ( (R-S (r) x s) – R-S,S(r))

To see why R-S,S(r) simply reorders attributes of r

R-S(R-S (r) x s) – R-S,S(r)) gives those tuples t in

R-S (r) such that for some tuple u s, tu r.

Page 34: Relational Algebra

Assignment Operation• The assignment operation () provides a convenient

way to express complex queries. – Write query as a sequential program consisting of

• a series of assignments • followed by an expression whose value is

displayed as a result of the query.– Assignment must always be made to a temporary

relation variable.• Example: Write r s as

temp1 R-S (r)

temp2 R-S ((temp1 x s) – R-S,S (r))

result = temp1 – temp2

Page 35: Relational Algebra

Example Queries

• Find all customers who have an account from at least the “Downtown” and the Uptown” branches.

where CN denotes customer-name and BN denotes

branch-name.

Query 1

CN(BN=“Downtown”(depositor account))

CN(BN=“Uptown”(depositor account))

Query 2

customer-name, branch-name (depositor account)

temp(branch-name) ({(“Downtown”), (“Uptown”)})

Page 36: Relational Algebra

• Find all customers who have an account at all branches

located in Brooklyn city.

Example Queries

customer-name, branch-name (depositor account)

branch-name (branch-city = “Brooklyn” (branch))

Page 37: Relational Algebra

Extended Relational-Algebra-Operations

• Generalized Projection

• Outer Join

• Aggregate Functions

Page 38: Relational Algebra

Generalized Projection

• Extends the projection operation by allowing arithmetic functions to be used in the projection list.

F1, F2, …, Fn(E)

• E is any relational-algebra expression

• Each of F1, F2, …, Fn are are arithmetic expressions involving constants and attributes in the schema of E.

• Given relation credit-info(customer-name, limit, credit-balance), find how much more each person can spend:

customer-name, limit – credit-balance (credit-info)

Page 39: Relational Algebra

Aggregate Functions and Operations• Aggregation function takes a collection of values and returns a single

value as a result.avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values

• Aggregate operation in relational algebra

G1, G2, …, Gn g F1( A1), F2( A2),…, Fn( An) (E)– E is any relational-algebra expression– G1, G2 …, Gn is a list of attributes on which to group (can be empty)– Each Fi is an aggregate function– Each Ai is an attribute name

Page 40: Relational Algebra

Aggregate Operation – Example

• Relation r:

A B

C

7

7

3

10

g sum(c) (r)sum-C

27

Page 41: Relational Algebra

Aggregate Operation – Example

• Relation account grouped by branch-name:

branch-name g sum(balance) (account)

branch-name account-number balance

PerryridgePerryridgeBrightonBrightonRedwood

A-102A-201A-217A-215A-222

400900750750700

branch-name balance

PerryridgeBrightonRedwood

13001500700

Page 42: Relational Algebra

Aggregate Functions • Result of aggregation does not have a name

– Can use rename operation to give it a name

– For convenience, we permit renaming as part of aggregate operation

branch-name g sum(balance) as sum-balance (account)

Page 43: Relational Algebra

Outer Join – Example

• Relation loan

Relation borrowercustomer-name loan-number

JonesSmithHayes

L-170L-230L-155

300040001700

loan-number amount

L-170L-230L-260

branch-name

DowntownRedwoodPerryridge

Page 44: Relational Algebra

Left Outer Join• Join

loan Borrower

loan-number amount

L-170L-230

30004000

customer-name

JonesSmith

branch-name

DowntownRedwood

JonesSmithnull

loan-number amount

L-170L-230L-260

300040001700

customer-namebranch-name

DowntownRedwoodPerryridge

Left Outer Join

loan Borrower

Page 45: Relational Algebra

Right Outer Join, Full Outer Join• Right Outer Join

loan borrower

loan borrowerOuter Join

loan-number amount

L-170L-230L-155

30004000null

customer-name

JonesSmithHayes

branch-name

DowntownRedwoodnull

loan-number amount

L-170L-230L-260L-155

300040001700null

customer-name

JonesSmithnullHayes

branch-name

DowntownRedwoodPerryridgenull

Page 46: Relational Algebra

Null Values

• It is possible for tuples to have a null value, denoted by null, for some of their attributes

• null signifies an unknown value or that a value does not exist.

• The result of any arithmetic expression involving null is null.

• Aggregate functions simply ignore null values

• For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same

Page 47: Relational Algebra

Null Values

• Comparisons with null values return the special truth value unknown– If false was used instead of unknown, then not (A < 5)

would not be equivalent to A >= 5• Three-valued logic using the truth value unknown:

– OR: (unknown or true) = true, (unknown or false) = unknown (unknown or unknown) = unknown

– AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown

– NOT: (not unknown) = unknown• Result of select predicate is treated as false if it evaluates to

unknown

Page 48: Relational Algebra

Expression Trees

Leaves are operands --- either variables standing for relations or particular relations

Interior nodes are operators applied to their descendents customer-name, branch-name

depositor account

Page 49: Relational Algebra

Relational Algebra on Bags

• A bag is like a set but it allows elements to be repeated in a set.

• Example: {1, 2, 1, 3, 2, 5, 2} is a bag.

• Difference between a bag and a list is that order is not important in a bag.

• Example: {1, 2, 1, 3, 2, 5, 2} and

{1,1,2,3,2,2,5} is the same bag

Page 50: Relational Algebra

Need for Bags

• SQL allows relations with repeated tuples. Thus SQL is not a relational algebra but rather “bag” algebra

• In SQL one need to specifically ask to remove duplicates, otherwise replicated tuples will not be eliminated

• Operation projection is more efficient on bags than on sets

Page 51: Relational Algebra

Operations on Bags

• Select applies to each tuple and no duplicates are eliminated

• Project also applies to each tuple and duplicates are not eliminated. Example

A B C

1 2 3 1 2 52 3 7

Projection on A, B A B

1 21 22 3

Page 52: Relational Algebra

Other Bag Operations

• An element in the union appears the number of times it appears in both bags

• Example: {1, 2, 3, 1} UNION {1, 1, 2, 3, 4, 1} = {1, 1, 1, 1, 1, 2, 2, 3, 3, 4}• An element appears in the intersection of two bags is the minimum of the

number of times it appears in either.• Example (con’t): {1, 2, 3, 1} INTERSECTION {1, 1, 2, 3, 4, 1} = {1, 1, 2, 3}• An element appears in the difference of two bags A and B as it appears in

A minus the number of times it appears in B but never less that 0 times

Page 53: Relational Algebra

Bag Laws

• Not all laws for set operations are valid for bags:

• Commutative law for union does hold for bags:

R UNION S = S UNION R

• However S union S = S for sets and it is not equal to S if S is a bag

Page 54: Relational Algebra

Extension of Relational Algebra to Bags

• All operations that we studied for sets can be extended on bags.

• We have considered Union, Difference, Projection, Cartesian Product, Rename, Select