closure and lossless decomposition - simon fraser · pdf file• a relation schema r is in...

24
Closure and Lossless Decomposition

Upload: ngokhanh

Post on 13-Mar-2018

219 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

Closure and Lossless Decomposition

Page 2: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 2

Boyce-Codd Normal Form

• A relation schema R is in BCNF if for all functional dependencies in F+ of the form α → β at least one of the following holds– α → β is trivial (i.e., β ⊆ α)– α is a superkey for R

• bor_loan = (customer_id, loan_number, amount) is not in BCNF– loan_number → amount holds on bor_loan but

loan_number is not a superkey

Page 3: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 3

Decomposing into BCNF

• For schema R and a non-trivial dependency α → β causing a violation of BCNF, decompose R into– (α U β ): α is the key– (R - ( β - α ))

• bor_loan = (customer_id, loan_number, amount), loan_number → amount– (loan_number, amount ) – (customer_id, loan_number)

Page 4: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 4

Third Normal Form

• A relation schema R is in the third normal form (3NF) if for all α → β in F+ at least one of the following holds– α → β is trivial (i.e., β ∈ α)– α is a superkey for R– Each attribute A in β – α is contained in a candidate key

for R• Each attribute may be in a different candidate key

• If a relation is in BCNF it is in 3NF – In BCNF one of the first two conditions above must hold

• The third condition is a minimal relaxation of BCNF to ensure dependency preservation

Page 5: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 5

Comparison of BCNF and 3NF

• It is always possible to decompose a relation into a set of relations that are in 3NF such that the decomposition is losslessand the dependencies are preserved

• It is always possible to decompose a relation into a set of relations that are in BCNF such that the decomposition is lossless– It may not be possible to preserve all functional

dependencies

Page 6: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 6

Using BCNF and 3NF

• How can we generate lossless decompositions into BCNF and 3NF?

• How can we test if a decomposition is dependency-preserving?

• Some critical operations– If α → β, is α a superkey?– To preserve all functional dependencies F, what is the

minimal set of functional dependencies that we need to preserve?

Page 7: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 7

Closure of Attribute Sets

• Given a set of attributes α, the closure of α underF (denoted by α+) is the set of attributes that are functionally determined by α under F– Application: whether α is a super key?

• Algorithm to compute α+, the closure of α under Fresult := α;while (changes to result) do

for each β → γ in F dobegin

if β ⊆ result then result := result ∪ γend

Page 8: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 8

Example of Attribute Set Closure• R = (A, B, C, G, H, I)• F = {A → B

A → C CG → HCG → IB → H}

• (AG)+

1. result = AG2. result = ABCG (A → C

and A → B)3. result = ABCGH (CG → H

and CG ⊆ AGBC)4. result = ABCGHI(CG → I

and CG ⊆ AGBCH)

• Is AG a candidate key? – Is AG a super key?

• Does AG → R? • Yes, (AG)+ ⊇ R

– Is any subset of AG a superkey?

• Does A → R? Is (A)+ ⊇ R?• Does G → R? Is (G)+ ⊇ R?

Page 9: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 9

Uses of Attribute Closure

• Testing for superkey: check if α+ contains all attributes of R

• Testing functional dependencies– To check if a functional dependency α → β

holds (or, in other words, is in F+), just check if β ⊆ α+

• Computing closure of F– For each γ ⊆ R, find the closure γ+

– For each S ⊆ γ+, output a functional dependency γ → S

Page 10: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 10

Armstrong’s Axioms

• Finding F+

– (reflexivity) If β ⊆ α, then α → β– (augmentation) If α → β, then γ α → γ β– (transitivity) If α → β, and β → γ, then α → γ

• These rules are – Sound: generate only functional dependencies

that actually hold – Complete: generate all functional dependencies

that hold

Page 11: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 11

Procedure for Computing F+

F + = Frepeat

for each functional dependency f in F+

apply reflexivity and augmentation rules on fadd the resulting functional dependencies to F +

for each pair of functional dependencies f1and f2 in F +if f1 and f2 can be combined using transitivity

then add the resulting functional dependency to F +until F + does not change any further

Page 12: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 12

Redundancy among Dependencies

• A → C is redundant in: {A → B, B → C}• Parts of a functional dependency may be

redundant– On left hand side of a rule: {A → B, B → C, AC → D}

can be simplified to {A → B, B → C, A → D} – On right hand side of a rule: {A → B, B → C, A → CD}

can be simplified to {A → B, B → C, A → D} • A canonical cover of F is a “minimal” set of

functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies

Page 13: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 13

Extraneous Attributes

• Consider functional dependency α → β in F– Attribute A is extraneous in α if A ∈ α and F

logically implies (F – {α → β}) ∪ {(α – A) → β}.– Attribute A is extraneous in β if A ∈ β and the

set of functional dependencies (F – {α → β}) ∪{α →(β – A)} logically implies F

• Implication in the opposite direction is trivial in each of the cases above, since a “stronger” functional dependency always implies a “weaker” one

Page 14: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 14

Extraneous Attributes – Example

• Example: Given F = {A → C, AB → C }– B is extraneous in AB → C because {A → C, AB

→ C} logically implies A → C (i.e. the result of dropping B from AB → C)

• Example: Given F = {A → C, AB → CD}– C is extraneous in AB → CD since AB → C can

be inferred even after deleting C

Page 15: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 15

Testing Extraneous Attributes

• Consider functional dependency α → β in F• To test if attribute A ∈ α is extraneous in α

– Compute (α – A)+ using the dependencies in F – If (α – A)+ contains A, A is extraneous

• To test if attribute A ∈ β is extraneous in β– Compute α+ using only the dependencies in F’ =

(F – {α → β}) ∪ {α →(β – A)}, – if α+ contains A, A is extraneous

Page 16: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 16

Canonical Cover

• A canonical cover for F is a set of dependencies Fc such that – F and Fc are logically equivalent to each other

• F logically implies all dependencies in Fc

• Fc logically implies all dependencies in F– No functional dependency in Fc contains an

extraneous attribute, and– Each left side of functional dependency in Fc is

unique

Page 17: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 17

Canonical Cover Computationrepeat

use the union rule to replace anydependencies in F

α1 → β1 and α1 → β2 with α1 → β1 β2 find a functional dependency α → β with an

extraneous attribute either in α or in βif an extraneous attribute is found, then delete it from α → β

until F does not change• The union rule may become applicable after some

extraneous attributes have been deleted, so it has to be re-applied

Page 18: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 18

Example• R = (A, B, C)

F = {A → BC, B → C, A →B, AB → C}

• Combine A → BC and A → B into A → BC– Now, F = {A → BC, B → C,

AB → C}

• A is extraneous in AB → C– B → C is already present– Now, F = {A → BC, B → C}

• C is extraneous in A → BC – Check if A → C is logically

implied by A → B and the other dependencies

• Yes: using transitivity on A → B and B → C.

• Can use attribute closure of A in more complex cases

• The canonical cover is: {A → B, B → C}

Page 19: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 19

Lossless-join Decomposition

• If R is decomposed into R1 and R2, we require that for all possible relations r on schema R satisfies r = ∏R1 (r ) ∏R2 (r )

• A decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies is in F+

– R1 ∩ R2 → R1– R1 ∩ R2 → R2

Page 20: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 20

Example• R = (A, B, C), F = {A → B, B → C)

– Can be decomposed in two different ways

• R1 = (A, B), R2 = (B, C)– Lossless-join decomposition:

R1 ∩ R2 = {B} and B → BC– Dependency preserving

• R1 = (A, B), R2 = (A, C)– Lossless-join decomposition:

R1 ∩ R2 = {A} and A → AB– Not dependency preserving

(cannot check B → C without computing R1 R2)

Page 21: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 21

Dependency Preservation

• Let Fi be the set of dependencies F+ that includes only attributes in Ri– A decomposition is dependency preserving, if

(F1 ∪ F2 ∪ … ∪ Fn)+ = F+

– If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive

Page 22: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 22

Testing Dependency Preservation• To check if a dependency α → β is preserved in a

decomposition of R into R1, R2, …, Rnresult = αrepeat

for each Ri in the decompositiont = (result ∩ Ri)+ ∩ Ri , result = result ∪ t

until result does not change– If result contains all attributes in β, then the functional dependency

α → β is preserved• We apply the test on all dependencies in F to check if a

decomposition is dependency preserving• This procedure takes polynomial time

– Compute F+ and (F1 ∪ F2 ∪ … ∪ Fn)+ requires exponential time

Page 23: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 23

Example

• R = (A, B, C )F = {A → B, B → C}Key = {A}

• R is not in BCNF• Decomposition R1 = (A, B), R2 = (B, C)

– R1 and R2 are in BCNF– Lossless-join decomposition– Dependency preserving

Page 24: Closure and Lossless Decomposition - Simon Fraser · PDF file• A relation schema R is in BCNF if for all ... Closure and Lossless Decomposition 22 ... – Lossless-join decomposition

CMPT 354: Database I -- Closure and Lossless Decomposition 24

Summary and To-Do List

• Closure of attribute sets: concept, computation, and applications

• Canonical cover: concept and computation• Lossless join decomposition• Testing dependency preservation• Read Sections 7.4.2-7.4.5• Assignment 2