cmsc424: database design
DESCRIPTION
CMSC424: Database Design. Lecture 5. Review: Relational Algebra. Relational Algebra Operators Select ( ) Project ( ) Set Union (U) Set Difference (-) Cartesian Product ( ) Rename ( ) These are called fundamental operations. 3. Outer Joins ( ). - PowerPoint PPT PresentationTRANSCRIPT
CMSC424, Spring 2005
CMSC424: Database Design
Lecture 5
CMSC424, Spring 2005
Review: Relational Algebra
Relational Algebra Operators1. Select ()2. Project ()3. Set Union (U)4. Set Difference (-)5. Cartesian Product ()6. Rename ()
These are called fundamental operations
CMSC424, Spring 2005
Relational AlgebraRedundant Operators
4. Update ( )
2. Division ( )
1. Natural Join ( )
3. Outer Joins ( )
CMSC424, Spring 2005
Natural Join
Idea: match tuples on common attributes
A B C D
1223
αααβ
+--+
10102010
E B D
‘a’‘a’‘b’‘c’
ααββ
10201010
r s
A B C D E
12233
αααββ
+--++
1010201010
‘a’‘a’‘a’‘b’‘c’
=
Relation1 Relation2Notation:
CMSC424, Spring 2005
Division
Query: Find customers who have accounts in all branches in Brooklyn
r1 all branches in Brooklynr2 associate customers with branches they
have accounts in
Now what ? Use the division operator
Relation1 Relation2Notation:
Idea: expresses “for all” queries
CMSC424, Spring 2005
bname lno amt
DowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lno
JonesSmithHayes
L-170L-230L-155
borrower =
=bname lno amt cname
DowntownRedwood
L-170L-230
30004000
JonesSmith
Join result loses… any record of Perry any record of Hayes
Outer Joins
Motivation:
loan borrower =
CMSC424, Spring 2005
bname lno amt
DowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lno
JonesSmithHayes
L-170L-230L-155
borrower =
bname lno amt cname
DowntownRedwood
Perry
L-170L-230L-260
300040001700
JonesSmith
┴
• preserves all tuples in left relation1. Left Outer Join ( )
┴ = NULL
Outer Joins
loan borrower =
CMSC424, Spring 2005
bname lno amt cname
DowntownRedwood
┴
L-170L-230L-155
30004000
┴
JonesSmithHayes
bname lno amt
DowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lno
JonesSmithHayes
L-170L-230L-155
borrower =
• preserves all tuples in right relation2. Right Outer Join ( )
┴ = NULL
Outer Joins
loan borrower =
CMSC424, Spring 2005
bname lno amt
DowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lno
JonesSmithHayes
L-170L-230L-155
borrower =
• preserves all tuples in both relations3. Full Outer Join ( )
┴ = NULL
Outer Joins
bname lno amt cname
DowntownRedwood
Perry┴
L-170L-230L-260L-155
300040001700
┴
JonesSmith
┴Hayes
loan borrower =
CMSC424, Spring 2005
1. Deletion: r r – s e.g., account account – σbname=Perry (account)(deletes all Perry accounts)
2. Insertion: r r se.g., branch branch {(Waltham, Boston, 7M)}(inserts new branch with bname = Waltham, bcity = Boston, assets = 7M)
3. Update: r πe1,…,en (r)e.g., account πbname,acct_no,bal*1.05 (account)(adds 5% interest to account balances)
Update
Identifier QueryNotation:
Common Uses:
CMSC424, Spring 2005
Extended Relational Algebra
1. Generalized projection
2. Aggregates
CMSC424, Spring 2005
e1,…,en (Relation)
e1,…,en can include arithmetic expressions – not just attributes
cname limit balance
JonesTurner
50003000
20002500
credit =
π cname, limit - balance (credit) = cname limit-balance
JonesTurner
3000500
Generalized Projection
Notation:
Example
Then…
CMSC424, Spring 2005
e1,…,en (Relation)
e1,…,en can include arithmetic expressions – not just attributes
cname limit balance
JonesTurner
50003000
20002500
credit =
π cname, limit - balance as limitbalance (credit) = cname limitbalance
JonesTurner
3000500
Generalized Projection
Notation:
Example
Then…
CMSC424, Spring 2005
Aggregate Functions and Operations
Aggregation function takes a collection of values and returns a single value as a result.
avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values
CMSC424, Spring 2005
Aggregate Operation – Example
Relation r:A B
C
77310
g sum(c) as sumC (r)sum-C
27
CMSC424, Spring 2005
Aggregate Functions and OperationsGeneral form:
G1, G2, …, Gn g F1( A1), F2( A2),…, Fn( An) (E)E is any relational-algebra expressionG1, G2 …, Gn is a list of attributes on which to group (can be empty)
Each Fi is an aggregate function
Each Ai is an attribute name
CMSC424, Spring 2005
Aggregate Operation – Example
Relation account grouped by branch-name:
branch-name g sum(balance) (account)
branch-name account-number balance
PerryridgePerryridgeBrightonBrightonRedwood
A-102A-201A-217A-215A-222
400900750750700
branch-name balancePerryridgeBrightonRedwood
13001500700
CMSC424, Spring 2005
Other Theoretical Languages
Relational CalculusNon-proceduralTuple relational calculus
ExamplesSafety
Domain relational calculus
CMSC424, Spring 2005
Review: Query Languages
Theoretical Use Practical Use
Relational Algebra
Formal semantics of practical QL’s
Language Expressivity
Internal query representation for query optimizers
TRC Foundation for SQL
SQL-92 - Standard for Relational DB Query Languages
SQL-99 - Standard for Object-Relational DB Query Languages
OQL - Standard for Object-Oriented DB Query Languages
XQuery - Standard for XML-based DB Query Languages
CMSC424, Spring 2005
SQL - Introduction
Standard DML/DDL for relational DB’s
• DML = Data Manipulation Language (queries, updates)• DDL = Data Definition Language (create tables, indexes, …)
• View definition• Security (Authorization)• Integrity constraints• Transactions
• Early 70’s, IBM system R project (SEQUEL)• Later, become standard (Structured Query Language)
Also includes
History
CMSC424, Spring 2005
SQL: Basic StructureSELECT A1, ….., An
FROM r1, ….., rm
WHERE P
Equivalent to:
A1,A2,…,An (σP (r1 … rn ))
CMSC424, Spring 2005
A Simple SELECT-FROM-WHERE Query
Similar to
SELECT bnameFROM loanWHERE amt > 1000
Can instead write :SELECT DISTINCT bnameFROM loanWHERE amt > 1000
(removes duplicates from result)
We will discuss bag algebra a bit later
bname ( amt > 1000 (loan) )
bnameRedwoodPerryDowntownPerry
Why preserve duplicates?
Duplicates are retained(i.e., result not a set)
But not quite
CMSC424, Spring 2005
Another SELECT-FROM-WHERE Query
Similar to
SELECT cname, balanceFROM depositor, accountWHERE depositor.acct_no =
account.acct_no
Can also write
SELECT d.cname, a.balanceFROM depositor as d,
account as aWHERE d.acct_no = a.acct_no
(neccessary for self-joins)
cname balanceJohnsonSmithHayesTurner
JohnsonJones
Lindsay
500700400350900750700
Note:
cname, balance (depositor account )
Returns:
CMSC424, Spring 2005
The SELECT Clause
• Equivalent to (generalized) projection, despite name
e.g: SELECT *FROM loan
e.g: SELECT bname, acct_no, balance*1.05
FROM account
• Can use ‘*’ to get all attributes
• Can write SELECT DISTINCT to eliminate duplicates• Can write SELECT ALL to preserve duplicates (default)• Can include arithmetic expressions
CMSC424, Spring 2005
(or , depending on WHERE clause)
The FROM Clause
• Equivalent to cartesian product ()
e.g: FROM borrower, loan
• Computes borrowerloan
• Identifies borrower, loan columns in result, allowing one to write
WHERE borrower.lno = loan.lno
e.g: FROM borrower as b, loan as l
allows one to write
WHERE b.lno = l.lno
• Binds tuples in relations to variable names
CMSC424, Spring 2005
The WHERE Clause
• Equivalent to Selection, despite name
1. Simpleattribute relop attribute (or constant)
(relop: =, <>, <, >, <=, >=)
2. Complex (using AND, OR, NOT, BETWEEN)e.g: SELECT lno
FROM loanWHERE amt BETWEEN 90000 AND 100000
is the same as…
SELECT lnoFROM loanWHERE amt >= 90000 AND amt <= 100000
• WHERE predicate can be:
CMSC424, Spring 2005
Data Definition Language
Allows specification of relation schema as well as:Attribute domainsIntegrity constraintsSecurity and authorization informationCreation of Indexes…
CMSC424, Spring 2005
DDL
CREATE TABLE branch(branch-name char(15) not null, branch-city char(30), assets integer, primary key (branch-name), check (assets >= 0))
DROP TABLE branch
ALTER TABLE branch ADD zipcode integer