5. distributed database design

1

5. Distributed Database Design

Chapter 3

Distributed Database Design

2

Outline

Introduction Fragmentation （片段划分） Horizontal fragmentation （水平划分） Derived horizontal fragmentation （导出式水平划分） Vertical fragmentation （垂直划分）

Allocation （片段分配）

3

Outline



4

Distributed Database Design

The design of a distributed computer system involves making decisions on the placement of data and programs across the sites of a computer network.

In distributed DBMSs, such placement involves two things: Placement of the DDBMS software Placement of the applications that run on the database

The course concentrates on distribution of data. The distribution of DDBMS and applications are given a

priori.

5

Framework of Distribution

Behavior of access pattern

DynamicStatic

Data sharing

Data + program sharing

PartialInformation

CompleteInformation

Level of knowledgeon access pattern behavior

Level of sharing

No sharing

6

Design Strategies

Top-Down Mostly in designing systems from scratch

Mostly in designing homogeneous systems

Bottom-up When the databases already exist at a number of sites

Combining both

7

Top-Down Design Process

Requirement Analysis

View DesignConceptual Design

Distribution Design

Physical Design

Objectives

Access Information ES’sGCS

LCS’s

LIS’s

User Input

User Input User Input

View Integration

8

Distribution Design Issues

Fragmentation Why fragmentation at all?

How should we fragment?

How much should we fragment?

How to test correctness of decomposition?

Allocation

Necessary information required for fragmentation

and allocation

9

Why Fragment?

Can’t we just distribute relations? What is a reasonable unit of distribution? Relation

– Views are subsets of relations locality

– Extra communication cost

Fragments of relations (sub-relations)

10

Fragmentation

Unit of distribution = unit of data application accesses

Reduce irrelevant data access Facilitate intra-query concurrency over different fragments Can be used with other performance enhancing methods

(e.g., indexing and clustering) Applications have conflicting requirements, making disjoint

fragmentation a very hard problem Multiple fragment access requires join or union Semantic data control (integrity enforcement) could be very

costly

11

About fragmentation

How should we fragment? Vertical Fragments sub grouping of attributes

Horizontal Fragments sub grouping of tuples

Mixed/Hybrid Fragments combination of above two

How much to fragment? Too little too much of irrelevant data access

Too much too much processing cost

Need to find suitable level of fragmentation

12

Correctness Criteria Completeness no loss of data Decomposition of relation R into fragments R1, R2, …, Rn is

complete if and only if each data item in R can also be found in some Ri .

Reconstruction If relation R is decomposed into fragments R1, R2, …, Rn then there

should exist some relational operator such that R = 1 i n Ri

Disjointness If relation R is decomposed into fragments R1, R2, …, Rn and data

item di is in Rj, then di should not be in any other fragment Rk (k!=j).

13

Allocation Alternatives

Full Replication Each fragment resides at each site

Partial Replication Each fragment resides at some of the sites

Not-replicated (Partitioned) Each fragment resides at only one site

Q & A:

What are the advantages and disadvantages?

14

Allocation Alternatives

Full-Replication Partial -replication PartitioningQueryProcessing

Easy

DirectoryManagement

Easy ornon-existent

ConcurrencyControl

Moderate Difficult Easy

Reliability High High Low

RealityPossible

applicationRealistic

Possibleapplication

Same Difficulty

Same Difficulty

15

Rule of Thumb for Allocation

If number-of-read-only-queries is more than number-of-update-queries,

then replication is advantageous, otherwise replication may cause problems.

16

Information Requirements

Four categories Database information

Application information

Communication network information

Computer system information

17

Outline



18

Fragmentation

Horizontal fragmentation (HF) Primary horizontal fragmentation (PHF)

– based on predicates accessing the relation

Derived horizontal fragmentation (DHF)– based on predicates being defined on another logically related

relation

We shall first study algorithm for horizontal fragmentation, and then study issues related to derived horizontal fragmentation.

Vertical fragmentation (VF) Hybrid fragmentation (HVF)

19

PHF Information Requirements

Title, Sal

PAY

Eno,Ename, Title

EMP

Jno, Jname, Budget. Loc

PROJ

Eno, Jno, Resp, Dur

ASG

L1

L2 L3

Database Information > links

> Cardinality of each relation: card (R)

20

PHF Information Requirements (cont.)

Simple predicate: Given R(A1 , A2 , ..., An ), with each Ai

having domain of values dom(Ai), a simple predicate pj is

pj : Ai Value

where {<, >, , , } and Value dom(Ai).

Example Jname=“maintenance”Budget 200000

Follow ``80/20'' rule

Application Information 1

21

minterm predicate: Given R and a set of simple predicates Pr = {p1, p2, ..., pm} on R, the set of minterm

predicates M = {m 1, m 2 , ..., m z } is defined as

M = {mi | mi = Pj Pr p*j }, 1 i z, 1 j z

where p*j = pj or ¬pj

m1: (Jname=“maintenance”) (Budget 200000)

m2: ¬ (Jname=“maintenance”) (Budget 200000)

m3: (Jname=“maintenance”) ¬ (Budget 200000)

m4: ¬ (Jname=“maintenance”) ¬ (Budget 200000)

Example



22


Minterm selectivity: sel (mi)

– number of tuples of the relation that would be accessed by a

user query specified according to a given minterm predicate.

Access frequency: acc (qi)

– frequency with which user applications access data. If Q = {q1 ,

q2 , ..., qn } is the set of queries, acc (qi ) indicates access

frequency of query qi in a given period.


23

Primary Horizontal Fragmentation

Each horizontal fragment Ri of relation R is defined by

Ri = Fi (R), 1 i w, where Fi is a selection formula,

which is (preferably) a minterm predicate.

A horizontal fragment Ri of relation R consists of all the

tuples of R which satisfy a minterm predicate mi.

Given a set of minterm predicates M, there are as

many as horizontal fragments of relation R as there

are minterm predicates.

Set of horizontal fragments also referred to as

minterm fragments

24

PHF - Algorithm

Input: A relation R, a set of simple predicates Pr

Output: The set of fragments of R = {R1 , R2 , ..., Rw },

which obey the fragmentation rules.

Preliminaries:

Pr should be complete

Pr should be minimal

25

Completeness of Simple Predicates

A set of simple predicates Pr is said to be

complete if and only if the accesses to the tuples of the minterm fragments defined on Pr requires

that two tuples of the same minterm fragment have the same probability of being accessed by any application.

26

Completeness of Simple Predicates Example: PNO PNAME BUDGET LOC

P1 Instrumentation 150000 MntrealP2 Database 135000 New YorkP3 CAD/CAM 250000 New YorkP4 Maintenance 310000 Paris

Applications:Q1: Find the projects at each locationQ2: Find projects with budget less than $200,000

Predicates: Pr={ LOC=“Montreal”, LOC=“New York”, LOC=“Paris”} Pr={ LOC=“Montreal”, LOC=“New York”, LOC=“Paris”,

BUDGET 2000000, BUDGET >200000}

incomplete

27

Minimality of Simple Predicates

If a predicate influences how fragmentation is performed, (i.e., causes a fragment f to be further fragmented into, say, fi and fj), then there should be

at least one application that accesses fi and fj

differently. In other words, the simple predicate should be

relevant in determining a fragmentation.

If all the predicates of a set Pr are relevant, then Pr is

minimal.

28

Minimality of Simple Predicates

Example

Pr={ LOC=“Montreal”, LOC=“New York”, LOC=“Paris”, BUDGET 2000000, BUDGET >200000, PNAME=“Instrumentation”}

Applications:Q1: Find the projects at each locationQ2: Find projects with budget less than $200,000

Minimal??

Pr={ LOC=“Montreal”, LOC=“New York”, LOC=“Paris”, BUDGET 2000000, BUDGET >200000} minimal

29

COM_MIN Algorithm

Input: A relation R and a set of simple predicates Pr

Output: A complete and minimal set of simple predicates Pr ’ for Pr

Rule 1: A relation or fragment is partitioned into at least two parts which are accessed differently by at least one application.

30

COM_MIN Algorithm (cont.)

1. Initialization

• Find a pi Pr such that pi partitions R according to Rule 1

• Set Pr’ = pi ; Pr Pr - pi ; F fi

2. Iteratively add predicates to Pr ’ until it is complete

• Find a pj Pr such that pj partitions some fk defined according to

minterm predicate over Pr’ according to Rule 1

• Pr’ pj ; Pr = Pr - pj ; F fj

• If pk Pr ’ which is nonrelevant, then

Pr ’ = Pr ’ - pk ; F F - fk

31

PHORIZONTAL Algorithm

Make use of COM_MIN to perform fragmentation

Input: A relation R and a set of simple predicates Pr

Output: A set of minterm predicates M according to which relation R is to be fragmented

Steps: Pr ’ COM_MIN(R, Pr) Determine the set M of minterm predicates Determine the set I of implications among pi Pr ’

Eliminate the contradictory minterms from M

32

Contradictory Minterms

Given a minimal and complete set of simple predicates, containing n simple predicates

Not all the minterm fragments derived are valid A fragment can be self contradictory because of

implications among simple predicates.

Example: Dom(Sal): [10000, 200000]; Dom(Loc) = {HK,SF}

p1 : sal < 50000; p2 : Loc = HK; p3 : Loc = SF

Note: p2 (¬ p3 ); p3 (¬ p2 )

the minterm p1 p2 p3 is self contradictory.

33

PHORIZONTAL Algorithm (cont.)

Input: relation R and a set of simple predicates Pr

Output: a set of minterm fragments M

begin

Pr’ = COM-MIN(R, Pr );

M = set of minterm predicates from Pr’

I = set of implications among pi Pr’

for each mi M

if mi is contradictory according to I then

M = M mi

end

34

PHF Example: PAYTITLE SAL

Mech. Eng. 27000Programmer 24000Elec. Eng. 40000Syst. Anal. 34000

PAY

Application: Check the salary info and determine raiseEmployee records kept at two sites ==> application runs at two sites

Pr = {P1, P2}, which is complete and minimal Pr’ = PrMinterm predicates: m1: (SAL 30000); m2: (SAL > 30000)

TITLE SALElec. Eng. 40000Syst. Anal. 34000

PAY2

TITLE SALMech. Eng. 27000Programmer 24000

PAY1

Simple predicates: P1: SAL 30000, P2: SAL > 30000

35

PHF Example: PROJ

Applications Find the name and budget of

projects given their locations – Issued at three sites

Access project information according to budget

– One site access 200000; other accesses >200000

p1: LOC = “Montreal”

p2 : LOC = “New York”

p3 : LOC = “Paris”

p4: BUDGET 200000

p5 : BUDGET > 200000

m1: LOC = “Montreal” BUDGET 200000

m2 : LOC = “Montreal” BUDGET > 200000

m3 : LOC = “New York” BUDGET 200000

m4 : LOC = “New York” BUDGET > 200000

m5 : LOC = “Paris” BUDGET 200000

M6 : LOC = “Paris” BUDGET > 200000

PNO PNAME BUDGET LOCP1 Instrumentation 150000 MntrealP2 Database 135000 New YorkP3 CAD/CAM 250000 New YorkP4 Maintenance 310000 Paris

PROJ

36

PHF Example: PROJ - Result

PNO PNAME BUDGET LOCP1 Instrumentation 150000 MntrealP2 Database 135000 New YorkP3 CAD/CAM 250000 New YorkP4 Maintenance 310000 Paris

PROJ

m1: LOC = “Montreal” BUDGET 200000

m2 : LOC = “Montreal” BUDGET > 200000

m3 : LOC = “New York” BUDGET 200000

m4 : LOC = “New York” BUDGET > 200000

m5 : LOC = “Paris” BUDGET 200000

M6 : LOC = “Paris” BUDGET > 200000

PNO PNAME BUDGET LOCP1 Instrumentation 150000 Mntreal

PROJ1

PNO PNAME BUDGET LOCP2 Database 135000 New York

PNO PNAME BUDGET LOCP3 CAD/CAM 250000 New York

PNO PNAME BUDGET LOCP4 Maintenance 310000 Paris

PROJ2

PROJ4

PROJ6

37

PHF - Correctness

Completeness Since Pr is complete and minimal, the selection

predicates are complete

Reconstruction If relation R is fragmented into FR = {R1, R2, …, Rr}

R = Ri FR Ri

Disjointness Minterm predicates that form the basis of fragmentation

should be mutually exclusive

38

DHF: Derived Horizontal Fragmentation

Title, Sal

PAY

Eno, Ename, Title

EMP

Jno, Jname, Budget. Loc

PROJ

Eno, Jno, Resp, Dur

ASG

L1

L2 L3

Each link is an equijoin

Owner relation

Member relation

DHF: Defined on a member relation according to a selection operation on its owner

39

DHF – Example

TITLE SALMech. Eng. 27000Programmer 24000

PAY1= SAL 30000 PAY

ENO ENAME TITLEE1 J. Doe Elec.Eng.E2 M. Smith Syst. Anal.E3 A. Lee Mech. Eng.E4 J. Miller ProgrammerE5 B. Casey Syst. Anal.E6 L. Chu Elec.Eng.E7 R. Davis Mech. Eng.E8 J. Jones Syst. Anal.

EMP ENO ENAME TITLEE3 A. Lee Mech. Eng.E4 J. Miller ProgrammerE7 R. Davis Mech. Eng.

EMP1 = EMP PAY1

ENO ENAME TITLEE1 J. Doe Elec.Eng.E2 M. Smith Syst. Anal.E5 B. Casey Syst. Anal.E6 L. Chu Elec.Eng.E8 J. Jones Syst. Anal.

EMP2 = EMP PAY2

TITLE SALElec. Eng. 40000Syst. Anal. 34000

PAY2= SAL> 30000 PAY

DHF

Title, Sal

PAY

Eno, Ename, Title

EMP L1

40

DHF: Derived Horizontal Fragmentation Let S be horizontally fragmented and let there be a

link L with owner(L) = S, and member(L) = R, the derived horizontal fragments of R are defined as

Ri = R Si , 1 i w

where Si is the horizontal fragment of S, is the semi join operator, and w is the maximum number of fragments

Inputs to derived horizontal fragmentation: partitions of owner relation member relation the semi join condition

The algorithm is straight forward.

41

DHF: Correctness

Completeness primary horizontal fragmentation based on completeness

of selection predicates. For derived horizontal fragmentation based on referential integrity

Reconstruction Same as primary horizontal fragmentation (via union)

Disjointedness Simple join graphs between the owner and the member

fragments.

42

DHF: Issues

Multiple owners for a member relation; how should we derived horizontally fragments of a member relation?

There could be a chain of derived horizontal fragmentation.

43

Vertical Fragmentation (VF) Has been studied within the centralized context Vertical partitioning of a relation R produces

fragments R1, R2, ..., Rm, each of which contains a subset of R's attributes as well as the primary key of R

The object of vertical fragmentation is to reduce irrelevant attribute access, and thus irrelevant data access

“Optimal” vertical fragmentation is one that minimizes the irrelevant data access for user applications

44

VF Two Approaches

Grouping: each individual attribute one fragment, at each step join some of the fragments until some criteria being satisfied Attributes to fragments

Splitting: start with global relation, and generate beneficial partitions based on access behavior of the applications Relations to fragments

45

Vertical Fragmentation

Overlapping fragments grouping

Non-overlapping fragments Splitting

We do not consider the replicated key attributes to be overlapping Advantage

– easier to enforce functional dependencies (for integrity checking)

46

VF Information Requirements Application Information

Attribute affinities– A measure that indicates how closely related the attributes are

– This is obtained from more primitive usage data

Attribute usage values– Given a set of queries Q = {q1 , q2, ..., qm} that will run on relation

R (A1, A2, ..., An)

use (qi, Aj ) = 1 if attribute Aj is referenced by query qi

0 otherwise

use (qi, . ) can be defined accordingly.

47

VF Definition of use(qi, Aj)

Consider the following 4 queries for relation PROJ

q1: SELECT BUDGET FROM PROJ WHERE PNO = val;

q2: SELECT PNAME,BUDGET FROM PROJ;

q3: SELECT PNAME FROM PROJ WHERE LOC = val;

q4: SELECT SUM(BUDGET) FROM PROJ WHERE LOC=val;

Let A1= PNO, A2= PNAME, A3 = BUDGET, A4 = LOC

1100

1010

0110

0101

A1 A2 A3 A4

q1

q2

q3

q4

48

VF - Affinity Measure aff(Ai, Aj)

The attribute affinity measure between two attributes Ai and Aj of a relation R (A1, A2, ..., An) with respect to

the set of applications Q = {q1 , q2, ..., qm} is defined as:

Aff (Ai, Aj) = { k | use (qk, Ai)=1 use (qk, Aj)=1} l refl (qk) * accl (qk)

where refl (qk) is the number of accesses to attributes for each execution of application qk at site l; accl (qk) is the access frequency of query qk at site l .

49

VF - Affinity Measure aff(Ai, Aj)

Assume each query in the previous example accesses the attributes once during each execution.

Also assume the access frequency of query k at different sites is:

then Aff (A1, A3) = 15*1 + 20*1 + 10*1 = 45

and the attribute affinity matrix AA is:

003

252525

005

102015S1 S2 S3

q1

q2

q3

q4

783750

353545

755800

045045A1 A2 A3 A4

A1

A2

A3

A4

1100

1010

0110

0101

A1 A2 A3 A4

q1

q2

q3

q4

50

AA Matrix for Vertical Fragmentation

This affinity matrix will be used to guide the fragmentation effort. The process involves first clustering together the attributes with high affinity for each other, and then splitting the relation accordingly.

51

VF - Correctness

A relation R, defined over attribute set A and key K, generates vertical partitioning

FR = {R1 , R2 , … ,Rr} Completeness

Reconstruction

Disjointness Duplicate keys are not considered to be overlapping

AiRA

kR = Ri Ri FR

52

Hybrid Fragmentation

R

R1 R2

R11 R12 R21 R22 R23

HF

VF VF

53

Outline



54

Allocation

File Allocation vs. Database Allocation Fragments are not individual files

– Relationships have to be maintained

Access to databases is more complicated– Remote file access model not applicable

– Relationship between allocation and query processing

Cost of integrity enforcement should be considered

Cost of concurrency control should be considered

55

Allocation Problem

Assuming:

nFFFF ,,, 21 A set of fragments

mSSSS ,,, 21 A set of sites

qQQQQ ,,, 21 A set of applications

Problem

Find the optimal distribution of F over S

56

Optimality with Two Aspects

Minimal costStoring Fi at Sj

Querying Fi at Sj

Updating Fi at all Sj 's with a copy of Fi

Communication

PerformanceResponse time

Throughput

……Separate the two issues to reduce its complexity.

57

A Simple Formulation of the Cost Problem

For a single fragment Fi

R = {r1, r2, …, rm}

rj : read-only traffic generated at Sj for Fi

U = {u1, u2, …, um}

uj : update traffic generated at Sj for Fi

F, S, Q are defined as before

1

58

Assume the communication cost between any pair of sites Si and Sj is fixed

C(T ) = {c11,c12, c13, …, c1,m, …, cm-1,m}

cij : retrieval communication cost

C’(U ) = {c’11,c’12, c’13, …, c’1,m, …, c’m-1,m}

c’ij : update communication cost

A Simple Formulation of the Cost Problem (cont.)

59

D = {d1, d2, …, dm}

cost for storing Fi at Sj

No capacity constraints for sites and communication links

3


60

Let

The allocation problem is a cost minimization problem for finding the set

I i.e., the sites to store the fragment Fi, where

otherwise0

site toassigned is fragment theif1 jiij

SFx


mSSS ,,, 21

61

This formulation only considers one fragment Fi

at site Sj. It is NP-complete.


]dx)cminr'cux(min[

IS|jjj

m

1i

j,iIS|j

j

IS|jj,ijj

jj

j

i ijj jS I

u c'|

Updates :)(min }|{ cr ijIji s j

Reads :j

j jS Id

|

Storage :For queries from site i

Total cost

62

A precise formulation must consider: All fragments together

How query is processed

The enforcement of integrity constraint

The cost of concurrency control and transaction control

A Precise Formulation of the Cost Problem

63

Allocation Model in General Allocation Model

min (total Cost)

subject to – response time constraint

– storage constraint

– processing constraint

Decision variable

0

1xij

if fragment Fi is stored at site Sj

otherwise

64

Allocation Model (cont.) Total Cost

∑all queries query processing cost +

∑all sites ∑all fragments cost of storing a fragment at a site

Storage Cost (on fragment Fj at site Sk)

(unit storage cost at Sk) * (size of Fj) * xjk

Query processing Cost (for one query)

processing component + transmission component

65

Allocation Model (cont.) Query Processing Cost

Processing component

access cost + integrity enforcement cost + concurrency control cost

access cost: ∑all sites ∑all fragments (number of update accesses + number of read accesses) * xjk * local processing cost at site

integrity enforcement and concurrency control costs can be similarly calculated.

66

Allocation Model (cont.) Query Processing Cost

Transmission component

cost of processing updates + cost of processing retrievals

Cost of updates∑all sites ∑all fragments update message cost + ∑all sites ∑all fragments acknowledgement cost

Retrieval costs

∑all fragments minall sites (cost of retrieval command + cost of sending back the result)

67

Allocation Model (cont.)

Constraints Response time for each query not longer than maximally

allowed response time for that query

Storage constraint: The total size of all fragments allocated at a site must be less than the storage capacity at that site.

Processing constraint: The total processing load because of all queries at a site must be less than the processing capacity at that site.

68

Solution Methods

NP-complete Heuristics approaches Exploring techniques developed in operational research

(运筹学 )

Reduce problem complexity ignore replication first, and then improve with a greedy

algorithm

69

Question & Answer

5. distributed database design

Documents