Foundation of Computing Systems


Upload: simone

Post on 13-Jan-2016


Page 1: Foundation of Computing Systems

28.08.09 IT 60101: Lecture #14 1

Foundation of Computing Systems

Lecture 14

B Trees

Page 2: Foundation of Computing Systems

Indexing Mechanism

• m-way Search

• B-Tree Indexing

• Trie indexing

Page 3: Foundation of Computing Systems

[Figure: an example multi-way search tree of numeric keys, with the root node labelled.]

Page 4: Foundation of Computing Systems

m-way Search Tree

• Definition

1. An m-way search tree T is a tree in which every node has degree ≤ m.

2. Each node in the tree has the form

   $P_0, K_1, P_1, K_2, P_2, \ldots, K_n, P_n$

   where
   – 1 ≤ n < m
   – $K_i$ (1 ≤ i ≤ n) are the key values in the node
   – $P_i$ (0 ≤ i ≤ n) are pointers to the sub-trees of T
   – $K_i < K_{i+1}$, 1 ≤ i < n

3. All the key values in the sub-tree pointed to by $P_i$ are less than the key value $K_{i+1}$, 0 ≤ i < n (and, for i ≥ 1, greater than $K_i$).

4. All the key values in the sub-tree pointed to by $P_n$ are greater than $K_n$.

5. All the sub-trees pointed to by $P_i$ (0 ≤ i ≤ n) are also m-way search trees.
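The definition translates fairly directly into a search routine. The sketch below is only an illustration: the class name `MWayNode`, the use of `None` for an empty sub-tree, and the sample keys are assumptions, not part of the lecture.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MWayNode:
    # keys[i] plays the role of K_(i+1); children[i] plays the role of P_i.
    keys: List[int]
    children: List[Optional["MWayNode"]] = field(default_factory=list)

    def __post_init__(self):
        # A node with n keys has n + 1 pointers; None stands for an empty sub-tree.
        if not self.children:
            self.children = [None] * (len(self.keys) + 1)


def search(node: Optional[MWayNode], key: int) -> bool:
    """Return True if key occurs in the m-way search tree rooted at node."""
    while node is not None:
        i = 0
        # Skip every key that is smaller than the target.
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True
        node = node.children[i]  # descend through pointer P_i
    return False


# A small 3-way search tree (sample data chosen for illustration).
root = MWayNode([20, 40], [MWayNode([10, 15]), MWayNode([25, 30]), MWayNode([45, 50])])
root.children[1].children[2] = MWayNode([35])  # keys greater than 30, less than 40
```

Each step either finds the key in the current node or follows exactly one pointer, so the cost is proportional to the height of the tree.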

Page 5: Foundation of Computing Systems

m-way Search Tree: Example

[Figure: m-way search tree of order 3. Root A holds the keys 20 and 40. Its children B, C and D hold (10, 15), (25, 30) and (45, 50). Node E holds 35 and hangs below C. Each node is laid out as [P0] [K1 P1] [K2 P2].]

Page 6: Foundation of Computing Systems

B-Tree Indexing

• A B-tree T of order m is an m-way search tree that is either empty or satisfies the following properties:

– The root node has at least 2 children

– All nodes other than the root node have at least ⌈m/2⌉ children

– All failure nodes are at the same level.
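As a quick check on these properties, the per-node limits for a given order can be computed directly. This is a minimal sketch; the function name `btree_bounds` and the dictionary keys are made up for illustration.

```python
def btree_bounds(m: int) -> dict:
    """Per-node limits for a B-tree of order m (m >= 3 is assumed here)."""
    min_children = (m + 1) // 2  # integer form of ceil(m / 2)
    return {
        "max_children": m,
        "max_keys": m - 1,
        "min_children_root": 2,  # applies when the tree has more than one node
        "min_children_non_root": min_children,
        "min_keys_non_root": min_children - 1,
    }


print(btree_bounds(3))
print(btree_bounds(5))
```

For order 3 every non-root node has 2 or 3 children; for order 5 it has between 3 and 5.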

Page 7: Foundation of Computing Systems

Example: B-Tree of Order 3

[Figure: B-tree of order 3. Root: (30). Second level: (20) and (40). Leaves: (10, 15), (25), (35), (45, 50). Below the leaves are ten failure nodes, all at the same level.]

F = Failure node

Page 8: Foundation of Computing Systems

Operations on B-Trees

• Searching

• Insertion

• Deletion

Page 9: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 10

– Initially the B-tree is empty. Get a node (note that it is the root node) and insert the key 10 into it.

[Figure: root node N0 = (10).]

Page 10: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 20

– A node in a B-tree of order m can have at most (m – 1) key values. So, in this case, the root node can hold the key value 20 after 10.

[Figure: before, N0 = (10); after, N0 = (10, 20).]

Page 11: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 30

– A key value is to be inserted into a node which already has the maximum number of key values (that is, m – 1 for a B-tree of order m).

– Insert the value, say X, into the list of values in the node in ascending order

– Split the list of values into three parts: P1, P2 and P3

» P1 contains the first ⌈m/2⌉ – 1 key values

» P2 contains the ⌈m/2⌉-th value

» P3 contains the (⌈m/2⌉ + 1)-th, ..., m-th values

– With this splitting, the ⌈m/2⌉-th value is inserted into the parent node of the current node

– If the parent node is nil, then create a new node.

– Note: In place of the current node, two nodes are allotted, containing the key values in P1 and P3 respectively.
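The splitting rule can be sketched on a plain list of keys. The helper name `split_overfull` is hypothetical, and the function only handles the key list, not the child pointers.

```python
def split_overfull(keys, x, m):
    """Insert x into a full node (m - 1 sorted keys) and split the result.

    Returns (P1, P2, P3): the keys that stay on the left, the single key
    that moves up to the parent, and the keys that go to the new right node.
    """
    merged = sorted(keys + [x])  # m key values in ascending order
    mid = (m + 1) // 2           # ceil(m / 2), counted from 1
    p1 = merged[:mid - 1]        # first ceil(m/2) - 1 values
    p2 = merged[mid - 1]         # the ceil(m/2)-th value
    p3 = merged[mid:]            # values ceil(m/2) + 1 .. m
    return p1, p2, p3


print(split_overfull([10, 20], 30, 3))       # ([10], 20, [30])
print(split_overfull([1, 2, 3, 4], 5, 5))    # ([1, 2], 3, [4, 5])
```

With order 3 and keys 10, 20, inserting 30 sends 20 to the parent and leaves 10 and 30 in two separate nodes.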

Page 12: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 30

– A key value is to be inserted into a node which already has the maximum number of key values (that is, m – 1 for a B-tree of order m).

[Figure: before, N0 = (10, 20) with 30 arriving. After the split, the new root N1 = (20) has two children, N2 = (10) and N3 = (30).]

Page 13: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 40

– Search for the node where 40 should be placed.

[Figure: before, root N1 = (20) with children N2 = (10) and N3 = (30). After, N3 = (30, 40).]

Page 14: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 50

– 50 should go to node N3, but it is already full. So N3 is split, followed by rearrangement.

[Figure: before, root N1 = (20) with children N2 = (10) and N3 = (30, 40), and 50 arriving at N3. After, root N1 = (20, 40) with children N2 = (10), N4 = (30) and N5 = (50).]

Page 15: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 60

– Search for the node for 60 (it is N5) and insert it there.

[Figure: before, root N1 = (20, 40) with children N2 = (10), N4 = (30) and N5 = (50). After, N5 = (50, 60).]

Page 16: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 70

– The node for insertion of key value 70 is N5, which is already full

– So N5 must be split into N6 and N7

– This in turn requires inserting 60 into N1

» N1 is also full, so it requires another split, of N1 into N8, N9 and N10

[Figure: before, root N1 = (20, 40) with children N2 = (10), N4 = (30) and N5 = (50, 60), and 70 arriving at N5. After splitting N5, 60 moves up into N1, whose children are now N2 = (10), N4 = (30), N6 = (50) and N7 = (70). After splitting N1, the root is N8 = (40) with children N9 = (20) and N10 = (60); N9 has children N2 and N4, and N10 has children N6 and N7.]

Page 17: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 80

– Search for the node for 80 (it is N7) and insert it there.

[Figure: before, root N8 = (40) with children N9 = (20) and N10 = (60); leaves N2 = (10), N4 = (30), N6 = (50), N7 = (70). After, N7 = (70, 80).]

Page 18: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 90

– The node N7 is the right place where 90 has to be accommodated, but it is full.

[Figure: root N8 = (40) with children N9 = (20) and N10 = (60); leaves N2 = (10), N4 = (30), N6 = (50), N7 = (70, 80), with 90 arriving at N7.]

Page 19: Foundation of Computing Systems

Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90

• Insertion of 90

– The node N7 is the right place where 90 has to be accommodated, but it is full.

– So, splitting of N7 (into N11 and N12) is necessary.

» This requires the insertion of 80 into N10, the parent of N7

[Figure: final tree. Root N8 = (40) with children N9 = (20) and N10 = (60, 80). N9 has children N2 = (10) and N4 = (30). N10 has children N6 = (50), N11 = (70) and N12 = (90).]
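The whole insertion sequence can be replayed with a short program. This is a sketch under assumptions: the order is taken to be 3 (nodes in the walkthrough hold at most two keys), keys are assumed distinct, and the names `BNode`, `insert` and `levels` are invented for the example. It splits a node after it overflows, as in the slides.

```python
class BNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # an empty list marks a leaf


def _insert(node, key, m):
    """Insert key below node; return (middle_key, right_node) if node was split."""
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if node.children:  # internal node: insert into the proper sub-tree
        promoted = _insert(node.children[i], key, m)
        if promoted is None:
            return None
        mid_key, right = promoted
        node.keys.insert(i, mid_key)
        node.children.insert(i + 1, right)
    else:
        node.keys.insert(i, key)
    if len(node.keys) < m:  # at most m - 1 keys: no overflow
        return None
    mid = (m + 1) // 2  # ceil(m / 2), counted from 1
    mid_key = node.keys[mid - 1]
    right = BNode(node.keys[mid:], node.children[mid:])
    node.keys = node.keys[:mid - 1]
    node.children = node.children[:mid]
    return mid_key, right


def insert(root, key, m):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return BNode([key])
    promoted = _insert(root, key, m)
    if promoted is None:
        return root
    mid_key, right = promoted
    return BNode([mid_key], [root, right])  # the tree grows by one level


def levels(root):
    """Keys of every node, listed level by level."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


root = None
for k in [10, 20, 30, 40, 50, 60, 70, 80, 90]:
    root = insert(root, k, 3)
print(levels(root))
# [[[40]], [[20], [60, 80]], [[10], [30], [50], [70], [90]]]
```

The printed levels match the final tree of the walkthrough: root (40), then (20) and (60, 80), then five single-key leaves.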

Page 20: Foundation of Computing Systems

Deletion in B-Trees

• Case 1: Deletion of a key value from a leaf node

– Case 1.a: Removal of a key value leaves the number of keys ≥ ⌈m/2⌉ – 1.

– Case 1.b: Removal of a key value leaves the number of keys < ⌈m/2⌉ – 1.

• Case 2: Deletion of a key value from a non-leaf node.

Page 21: Foundation of Computing Systems

Deletion in B-Trees: Case 1.a

• Case 1: Deletion of a key value from a leaf node

– Case 1.a: Removal of a key value leaves the number of keys ≥ ⌈m/2⌉ – 1.

– Removal of a key value from the leaf node does not violate the requirement of a minimum number of key values in that node

[Figure (a): deletion of l, t and y. Root: (o). Second level: (c, g, k) and (r, v). Leaves: (a, b), (d, e, f), (h, i), (l, m, n), (p, q), (s, t, u), (w, x, y, z).]

[Figure (b): after deletion of l, t and y. Root: (o). Second level: (c, g, k) and (r, v). Leaves: (a, b), (d, e, f), (h, i), (m, n), (p, q), (s, u), (w, x, z).]

Page 22: Foundation of Computing Systems

Deletion in B-Trees: Case 1.b

• Case 1: Deletion of a key value from a leaf node

– Case 1.b: Removal of a key value leaves the number of keys < ⌈m/2⌉ – 1.

• Three situations are possible in this case:

1. The nearest right sibling contains more than ⌈m/2⌉ – 1 key values.

2. The nearest left sibling contains more than ⌈m/2⌉ – 1 key values.

3. Neither the nearest left sibling nor the nearest right sibling contains more than ⌈m/2⌉ – 1 key values.
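The case analysis for a leaf can be written as a small decision function. This is an illustrative sketch: the function name and return strings are invented, sibling sizes are passed as plain counts (`None` when a sibling does not exist), and the special case of a leaf that is also the root is ignored.

```python
def leaf_deletion_case(n_keys, left_keys, right_keys, m):
    """Classify the deletion of one key from a leaf of a B-tree of order m.

    n_keys     -- number of keys in the leaf before the deletion
    left_keys  -- number of keys in the nearest left sibling, or None
    right_keys -- number of keys in the nearest right sibling, or None
    """
    min_keys = (m + 1) // 2 - 1  # ceil(m / 2) - 1
    if n_keys - 1 >= min_keys:
        return "1.a: just remove the key"
    if right_keys is not None and right_keys > min_keys:
        return "1.b: borrow from the right sibling"
    if left_keys is not None and left_keys > min_keys:
        return "1.b: borrow from the left sibling"
    return "1.b: combine with a sibling"


# Order 5 is assumed here: at least 2 and at most 4 keys per non-root node.
print(leaf_deletion_case(3, 2, None, 5))  # three keys, one removed: case 1.a
print(leaf_deletion_case(2, 3, 2, 5))     # left sibling can spare a key
print(leaf_deletion_case(2, 2, 2, 5))     # no sibling can spare a key
```

Borrowing always goes through the parent: the separating key moves down into the underfull leaf and a key from the sibling moves up to replace it.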

Page 23: Foundation of Computing Systems

Deletion in B-Trees: Case 1.b

• Case 1: Deletion of a key value from a leaf node

– Case 1.b: Removal of a key value leaves the number of keys < ⌈m/2⌉ – 1.

The nearest right (or left) sibling contains more than ⌈m/2⌉ – 1 key values.

[Figure (a): deletion of h and s. Leaf (h, i) borrows from its left sibling (d, e, f): f moves up into the parent and g moves down ("move right"). Leaf (s, u) borrows from its right sibling (w, x, z): w moves up into the parent and v moves down ("move left").]

[Figure (b): after deletion of h and s. Root: (o). Second level: (c, f, k) and (r, w). Leaves: (a, b), (d, e), (g, i), (m, n), (p, q), (u, v), (x, z).]

Page 24: Foundation of Computing Systems

Deletion in B-Trees: Case 1.b

• Case 1: Deletion of a key value from a leaf node

– Case 1.b: Removal of a key value leaves the number of keys < ⌈m/2⌉ – 1.

Neither the nearest left sibling nor the nearest right sibling contains more than ⌈m/2⌉ – 1 key values.

[Figure (a): deletion of e, combining with the right sibling. Root: (o). Second level: (c, f, k) and (r, w). Leaves: (a, b), (d, e), (g, i), (m, n), (p, q), (u, v), (x, z). The leaves (d, e) and (g, i) are combined.]

[Figure (b): after deletion of e. Root: (o). Second level: (c, k) and (r, w). Leaves: (a, b), (d, f, g, i), (m, n), (p, q), (u, v), (x, z).]

Page 25: Foundation of Computing Systems

Deletion in B-Trees: Case 1.b

• Case 1: Deletion of a key value from a leaf node

– Case 1.b: Removal of a key value leaves the number of keys < ⌈m/2⌉ – 1.

Neither the nearest left sibling nor the nearest right sibling contains more than ⌈m/2⌉ – 1 key values.

[Figure: deletion of z. Starting tree: root (o); second level (c, k) and (r, w); leaves (a, b), (d, f, g, i), (m, n), (p, q), (u, v), (x, z). The leaves (u, v) and (x) are combined with w into (u, v, w, x); the node (r) is then combined with (c, k) and the root key o.]

[Figure (c): after deletion of z. Root: (c, k, o, r). Leaves: (a, b), (d, f, g, i), (m, n), (p, q), (u, v, w, x).]

Page 26: Foundation of Computing Systems

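Deletion in B-Trees: Case 2

• Case 2: Deletion of a key value from a non-leaf node

[Figure (a): deletion of r. Starting tree: root (o); second level (c, g, k) and (r, v); leaves (a, b), (d, e, f), (h, i), (l, m, n), (p, q), (s, t, u), (w, x, y, z). The successor s replaces r, giving the node (s, v) and the leaf (t, u).]

[Figure (b): after deletion of r, then deletion of g. Root: (o). Second level: (c, g, k) and (s, v). Leaves: (a, b), (d, e, f), (h, i), (l, m, n), (p, q), (t, u), (w, x, y, z). The key h replaces g; the leaf (i) is then refilled with k from the parent, and l moves up from the leaf (l, m, n).]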

Page 27: Foundation of Computing Systems

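Deletion in B-Trees: Case 2

• Case 2: Deletion of a key value from a non-leaf node

[Figure (c): after deletion of g, then deletion of o. Root: (o). Second level: (c, h, l) and (s, v). Leaves: (a, b), (d, e, f), (i, k), (m, n), (p, q), (t, u), (w, x, y, z). The key p replaces o; the leaves (q) and (t, u) are combined with s into (q, s, t, u), and l then moves up to the root while p moves down beside v.]

[Figure (d): after deletion of o. Root: (l). Second level: (c, h) and (p, v). Leaves: (a, b), (d, e, f), (i, k), (m, n), (q, s, t, u), (w, x, y, z).]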

Page 28: Foundation of Computing Systems

Some Properties of B-Trees

• A B-tree is always a height-balanced tree

• The degree of a B-tree of order m is m, that is, the maximum number of branches that can emanate from a node is m

• In a B-tree of order m and height h,

– The maximum number of nodes possible = $\sum_{i=0}^{h-1} m^i = \frac{m^h - 1}{m - 1}$

• The maximum number of key values that a node in a B-tree of order m can have is m – 1.

• The maximum number of key values that is possible in a B-tree of order m and height h is $m^h - 1$
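The two maxima can be checked numerically. The sketch below assumes the height h counts levels with the root at level 1; the function names are illustrative.

```python
def max_nodes(m: int, h: int) -> int:
    """Nodes in a completely full tree: every node has m children."""
    return sum(m ** i for i in range(h))  # m^0 + m^1 + ... + m^(h-1)


def max_keys(m: int, h: int) -> int:
    """Every one of those nodes holds m - 1 keys."""
    return (m - 1) * max_nodes(m, h)


print(max_nodes(3, 3), max_keys(3, 3))  # 13 26
```

For order 3 and height 3 this gives 1 + 3 + 9 = 13 nodes and 26 = 3^3 - 1 keys, in agreement with the closed forms.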

Page 29: Foundation of Computing Systems

Some Properties of B-Trees

• The root node contains at least 2 children

• All nodes other than the root node have at least ⌈m/2⌉ children.

• The minimum number of key values in the root node is 1 (if the B-tree is not empty).

• The minimum number of key values in any node other than the root node is ⌈m/2⌉ – 1.

• The minimum number of key values in a B-tree of order m and height h is $2\lceil m/2 \rceil^{h-1} - 1$
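The minimum follows from counting the sparsest possible tree level by level: one root with one key and two children, and every other node with ⌈m/2⌉ children and ⌈m/2⌉ – 1 keys. A short sketch (function name assumed, height counted in levels from the root):

```python
def min_keys(m: int, h: int) -> int:
    """Fewest keys a non-empty B-tree of order m with h levels can hold."""
    t = (m + 1) // 2  # ceil(m / 2)
    # Level 1 has the root; level k >= 2 has 2 * t^(k-2) nodes.
    non_root_nodes = sum(2 * t ** (k - 2) for k in range(2, h + 1))
    return 1 + (t - 1) * non_root_nodes


print(min_keys(3, 3))  # 7
```

For order 3 and height 3 the sparsest tree has 1 + 2 + 4 nodes with one key each, which is 7 = 2 · 2² – 1 keys.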

Page 30: Foundation of Computing Systems

TRIE Structure

• A trie is an m-way search tree

• Definition

– A "trie" is a tree of order m that is either empty or consists of an ordered sequence of exactly m tries, each of order m.

Page 31: Foundation of Computing Systems

TRIE Structure: Example

[Figure: example trie of order 3 over the symbols a, b and c, with nodes numbered 0 to 23. Each node has one link per symbol, and the paths from the root spell out key prefixes such as a, b, c, aa, ac, ca, cb and cc.]

Page 32: Foundation of Computing Systems

Structure of a node in TRIE

• Physical representation

[Figure: a trie node with one link field per symbol (a, b, c).]

• Trie indexing is suitable for maintaining variable-sized key values.

• The actual key value is never stored; key values are implied through the links.

• If the English alphabet is used, a trie of order 26 can maintain the whole English dictionary. (This is specially termed a lexicographic trie.)

• It allows multi-way branching based on a part of the key value, not the entire key value. The branching on the i-th level is determined by the i-th component of the key value.
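A node with one link per symbol can be sketched as follows. Everything named here is an assumption for illustration: the classes `TrieNode` and `Trie`, the three-symbol alphabet, and the `is_key` flag used to mark where a key ends (the slides do not say how key ends are marked).

```python
class TrieNode:
    def __init__(self, m):
        self.links = [None] * m  # exactly m sub-tries; None is the empty trie
        self.is_key = False      # hypothetical end-of-key marker


class Trie:
    def __init__(self, alphabet="abc"):
        self.index = {ch: i for i, ch in enumerate(alphabet)}
        self.root = TrieNode(len(alphabet))

    def insert(self, key):
        node = self.root
        for ch in key:  # the i-th symbol selects the branch on level i
            i = self.index[ch]
            if node.links[i] is None:
                node.links[i] = TrieNode(len(self.index))
            node = node.links[i]
        node.is_key = True

    def search(self, key):
        node = self.root
        for ch in key:
            node = node.links[self.index[ch]]
            if node is None:
                return False
        return node.is_key


t = Trie()
for word in ["ab", "abc", "cb"]:
    t.insert(word)
print(t.search("ab"), t.search("a"), t.search("cbb"))  # True False False
```

No key is stored in any node: a key is recovered only from the sequence of links followed, so keys of different lengths share their common prefixes.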

Page 33: Foundation of Computing Systems

TRIE Structure: Operations

• Operations

– Searching

– Insertion

– Deletion

For operations on the trie structure, see the book Classic Data Structures, Chapter 7 (PHI, 2nd Edn., 17th Reprint).