ss zg518-l10.ppt

Upload: anup-raghuveer

Post on 04-Jun-2018


  • 8/14/2019 SS ZG518-L10.ppt

    1/28

    BITS Pilani, Hyderabad Campus

    Dr. R. Gururaj, CS&IS Dept.

    Database Design & Applications

    (SS ZG 518)


    Lecture Session-10

    Indexing

    Content

    What is Indexing

    Primary and Secondary indexes

    Dense and Sparse Indexing

    Multilevel Indexing

    Designing Primary and Multilevel Indexes

    What is Tree Indexing

    B+ tree

    Inserting and deleting keys into B+ Trees

    Constructing a B+ tree

    Designing a B+ Tree node structure

    1 10/09/2013 SSZG 518 Database Design & Applications Dr.R.Gururaj


    Introduction to Indexing

    An index for a file works in much the same way as a catalog in a library. In a library, the cards are kept in alphabetical order, so we don't have to search through all of them.

    In real-world databases, indexes may be too large to be handled efficiently, so more sophisticated techniques are needed.

    Techniques for efficient retrieval of required records from disk are:

    Hashing

    Indexing


    The criteria for evaluating hashing or indexing techniques are:

    Access time

    Insertion time (new indexes or new records)

    Deletion time

    Space overhead

    Sometimes more than one index may be required for a file.

    The attribute (field) used for constructing the index structure for a file is called the indexing field or indexing attribute.


    If the indexing field is a key, it is called the search key or indexing key.

    Indexes on key attributes:

    1. Ordering key (PK): primary index

    2. Non-ordering key: secondary index on a key attribute

    Indexes on non-key attributes:

    1. Ordering non-key attribute: clustering index

    2. Non-ordering non-key attribute: secondary index on a non-key attribute

    A file can be physically ordered on only one field. Hence, a file can have at most one primary index or one clustering index, but not both.


    Indexing (classification by indexing field)

    Ordering field, key: primary index

    Ordering field, non-key: clustering index

    Non-ordering field, key or non-key: secondary index


    Index record: Like data records, index records are also stored in the database. An index record normally has two fields:

    Value: the key value

    Pointer: the location (address) of the record containing the key

    Data record: Records of the same kind (of one relation/table) are stored in a single file made up of blocks. These are called data records, and they have the fields specified on the relation.


    Dense index: an index record appears for every record in the data file.

    Sparse index: index records are created only for some of the data file records, so the index occupies less space. A sparse index can be on a primary or a secondary key.

    A primary index and a clustering index are non-dense (sparse).

    SQL commands to create indexes:

    Usually, when we declare a primary key (PK), an index is created automatically.

    CREATE INDEX EMP_IND ON EMP(eid);

    DROP INDEX EMP_IND;


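    Primary Indexing

    [Figure: a sparse primary index. The index file holds the key values 2, 15, 25, 30, 38, 45 and 60, each paired with a pointer to a data block. The data blocks shown contain the ordered keys (2, 5, 6, 9), (15, 17, 18, 19), (25, 27, 29) and (30, 35, ...); each index key is the first key of the block it points to.]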
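
A lookup through a sparse primary index like the one in the figure can be sketched as follows. This is a minimal illustration, not part of the lecture: the names `anchor_keys`, `data_blocks` and `lookup` are hypothetical, only the four data blocks visible in the figure are modelled, and the index is held in memory as a sorted list.

```python
from bisect import bisect_right

# First key of each data block (the index entries), taken from the figure.
anchor_keys = [2, 15, 25, 30]
# The data blocks visible in the figure, in the same order as the index entries.
data_blocks = [[2, 5, 6, 9], [15, 17, 18, 19], [25, 27, 29], [30, 35]]

def lookup(key):
    """Return (block_number, found) for key, using the sparse index."""
    # Binary search for the last index entry whose anchor key is <= key.
    i = bisect_right(anchor_keys, key) - 1
    if i < 0:
        return None, False            # key is smaller than every anchor key
    # One more block access: scan the selected data block for the key.
    return i, key in data_blocks[i]

print(lookup(18))   # key stored in the second block
print(lookup(28))   # key falls in the third block's range but is absent
```

The index only narrows the search to one block; whether the key exists is known only after that block is read.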


    Designing a Primary Index

    Ex 1:

    Assume that we have an ordered file with 80000 records stored on disk. The block size is 512 bytes. The record length is fixed at 70 bytes. The key field (PK) is 6 bytes long and a block pointer is 4 bytes. Assume unspanned record organization.

    Design a primary index on the primary key.


    Solution:

    Size of disk block = 512 bytes; record length = 70 bytes

    Block pointer = 4 bytes; key field = 6 bytes; total records = 80000

    No. of records per block (Bfr) = floor(512/70) = floor(7.31) = 7

    No. of data blocks needed = ceil(80000/7) = 11429

    Index record length = key + pointer = 6 + 4 = 10 bytes

    Blocking factor for index (Bfri) = floor(512/10) = 51 (known as the fan-out)

    No. of index blocks = ceil(11429/51) = 225 (one index entry per data block)

    No. of block accesses = ceil(log2 225) + 1 = 8 + 1 = 9 (binary search on the index blocks, plus one access for the data block)
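
The arithmetic above can be checked with a short script. This is only a sketch of the same steps; the function name `primary_index_design` and its parameter names are made up for illustration, and it assumes unspanned records and one index entry per data block, as in the exercise.

```python
import math

def primary_index_design(n_records, block_size, record_len, key_len, ptr_len):
    """Repeat the Ex 1 steps for a single-level primary index."""
    bfr = block_size // record_len                  # records per data block
    data_blocks = math.ceil(n_records / bfr)        # blocks in the data file
    bfr_i = block_size // (key_len + ptr_len)       # index entries per block (fan-out)
    index_blocks = math.ceil(data_blocks / bfr_i)   # one entry per data block
    # Binary search over the index blocks, then one access for the data block.
    accesses = math.ceil(math.log2(index_blocks)) + 1
    return bfr, data_blocks, bfr_i, index_blocks, accesses

print(primary_index_design(80000, 512, 70, 6, 4))
```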


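    Multilevel Indexing (Two levels)

    [Figure: a two-level index. The second (top) level holds the keys 2, 30 and 60, each with a pointer to a block of the next level. The first level holds the keys 2, 15, 25, 30, 38, 45 and 60, each with a pointer to a data block. The data blocks shown contain the ordered keys (2, 5, 6, 9), (15, 17, 18, 19), (25, 27, 29) and (30, 35, ...).]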


    Designing a Multilevel Index

    Ex 2:

    Assume that we have an ordered file with 80000 records stored on disk. The block size is 512 bytes. The record length is fixed at 70 bytes. The key field (PK) is 6 bytes long and a block pointer is 4 bytes. Assume unspanned record organization.

    Design a multilevel index on the primary key.

    How many levels are there?

    How many blocks are there at each index level?


    Solution:

    Size of disk block = 512 bytes; record length = 70 bytes

    Block pointer = 4 bytes; key field = 6 bytes; total records = 80000

    No. of records per block (Bfr) = floor(512/70) = floor(7.31) = 7

    No. of data blocks needed = ceil(80000/7) = 11429

    Index record length = key + pointer = 6 + 4 = 10 bytes

    Blocking factor for index = floor(512/10) = 51 (the fan-out)

    No. of index blocks in first level = ceil(11429/51) = 225

    No. of index blocks in 2nd level = ceil(225/51) = 5

    No. of index blocks in 3rd level = ceil(5/51) = 1 (top level)

    No. of levels in the indexing structure = t = 3

    No. of block accesses = no. of index levels + 1 = t + 1 = 4
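
The level-by-level computation can be written as a small loop: keep dividing by the fan-out until a single block remains. The sketch below assumes the data-block count and fan-out derived above; the function name `multilevel_index_levels` is hypothetical.

```python
def multilevel_index_levels(data_blocks, fan_out):
    """Return the number of index blocks at each level, first level first."""
    levels = []
    entries = data_blocks                  # first level: one entry per data block
    while True:
        blocks = -(-entries // fan_out)    # ceiling division
        levels.append(blocks)
        if blocks == 1:                    # a single block is the top level
            return levels
        entries = blocks                   # next level indexes this level's blocks

levels = multilevel_index_levels(11429, 51)
print("blocks per level:", levels)
print("block accesses:", len(levels) + 1)  # one per index level plus the data block
```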


    B+ Tree Indexing

    A B+ tree is a multilevel search tree used to implement dynamic multilevel indexing. The primary disadvantage of a static multilevel index is that performance degrades as the file grows. This can be remedied by reorganization, but frequent reorganization is not advisable.

    The B+ tree is well suited for multilevel indexing of files because it is dynamic: it adjusts itself as keys are inserted and deleted.

    B+ tree of order p

    It is a balanced tree (all leaves are at the same level). Each internal node is of the form (here with 4 keys and 5 child pointers):

    [ Child 1 | K1 = 24 | Child 2 | K2 = 32 | Child 3 | K3 = 40 | Child 4 | K4 = 60 | Child 5 ]
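
Choosing which child to follow from an internal node is a search among its keys. The sketch below uses the example node's keys and assumes the convention stated in Ex 3 of this lecture (keys less than or equal to K_i lie in the subtree to the left of K_i); the function name `child_index` is hypothetical.

```python
from bisect import bisect_left

def child_index(keys, search_key):
    """0-based index of the child pointer to follow for search_key.

    Assumes keys <= K_i are in the subtree left of K_i, so the answer is
    the number of node keys strictly smaller than search_key.
    """
    return bisect_left(keys, search_key)

keys = [24, 32, 40, 60]                    # K1..K4 from the example node
for k in (10, 24, 33, 60, 75):
    print(k, "-> Child", child_index(keys, k) + 1)
```

With the opposite convention (keys equal to K_i stored to the right of K_i), `bisect_right` would be used instead.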


    Note

    In a B+ tree, the record pointer for a record with a given key can be found only at a leaf node. In a B-tree, it can also be found at an intermediate node.

    Hence, in a B+ tree search, success or failure can be declared only after reaching the leaf level.

    In a B-tree, by contrast, a search can succeed at an intermediate level as well; only on failure does it go all the way to the leaf level.


    Constructing a B+ Tree

    Ex 3:

    Construct a B+ tree with the given specifications. The order of the tree is p = 3 and p_leaf = 2. The tree should be such that all the keys in the subtree pointed to by the pointer preceding a key are less than or equal to that key, and all the keys in the subtree pointed to by the pointer following the key are greater than the key.

    Insert the following keys in the given order: 56, 22, 78, 42, 102, 90, 96, 35. Show how the tree expands after each insertion, and show the final tree.

    Next, delete 56, 46, 22 in that order and show the state of the tree after each deletion.
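
The insertion part of this exercise can be explored with a small program. What follows is a sketch under stated assumptions, not the lecture's worked solution (the worked figures are not in this transcript): an overflowing leaf keeps the larger half on the left and copies its largest key up as the separator, an overflowing internal node moves its middle key up, and deletion and leaf-to-leaf links are left out. The class and method names are hypothetical.

```python
from bisect import bisect_left, insort

class Node:
    def __init__(self, leaf, keys=None, children=None):
        self.leaf = leaf
        self.keys = keys or []
        self.children = children or []     # stays empty for leaf nodes

class BPlusTree:
    def __init__(self, p=3, p_leaf=2):
        self.p = p                         # max tree pointers per internal node
        self.p_leaf = p_leaf               # max keys per leaf node
        self.root = Node(leaf=True)

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                          # root was split: grow a new root
            sep, right = split
            self.root = Node(False, [sep], [self.root, right])

    def _insert(self, node, key):
        """Insert key below node; return (separator, new_right_node) on split."""
        if node.leaf:
            insort(node.keys, key)
            if len(node.keys) <= self.p_leaf:
                return None
            mid = (len(node.keys) + 1) // 2          # left keeps the larger half
            right = Node(True, node.keys[mid:])
            node.keys = node.keys[:mid]
            return node.keys[-1], right              # separator is copied up
        i = bisect_left(node.keys, key)              # keys <= K_i go left of K_i
        split = self._insert(node.children[i], key)
        if split is None:
            return None
        sep, right = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, right)
        if len(node.children) <= self.p:
            return None
        mid = len(node.keys) // 2
        up = node.keys[mid]                          # middle key is moved up
        new_right = Node(False, node.keys[mid + 1:], node.children[mid + 1:])
        node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
        return up, new_right

    def leaves(self):
        """Leaf key lists from left to right."""
        out, stack = [], [self.root]
        while stack:
            node = stack.pop()
            if node.leaf:
                out.append(node.keys)
            else:
                stack.extend(reversed(node.children))
        return out

tree = BPlusTree(p=3, p_leaf=2)
for k in [56, 22, 78, 42, 102, 90, 96, 35]:
    tree.insert(k)
    print("after", k, "leaves:", tree.leaves(), "root keys:", tree.root.keys)
```

Under these split rules the final tree has a root with the single key 56 and five leaves. A different split policy, for example keeping the smaller half on the left, gives a different but equally valid tree, so the output may not match a hand-drawn solution step for step.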


    Node design for B+ tree

    Ex 5: We need to design a B+ tree index for the Student relation on the student_id attribute, which is the key of the relation. The student_id attribute is 4 bytes long. The other attributes are student_age (4 bytes), student_name (20 bytes), student_address (40 bytes) and student_branch (3 bytes). The disk block size is 1024 bytes. If a tree pointer takes 4 bytes, find the best possible number of pointers per internal node of this B+ tree. Each internal node is a disk block that contains search key values and pointers to subtrees.


    Solution:

    Disk block size = 1024 bytes

    Size of a B+ tree node = size of a disk block

    Each tree pointer points to a disk block and takes 4 bytes.

    Each key (student_id) takes 4 bytes.

    In a B+ tree node, no. of pointers = no. of keys + 1

    Assume that no. of keys = n; then no. of pointers = n + 1

    Size of a node = (no. of keys * size of each key) + (no. of pointers * size of each pointer) = 4n + 4(n + 1) = 8n + 4

    The node must fit in one block: 8n + 4 <= 1024, so n <= 127.5, i.e. n = 127 keys and n + 1 = 128 pointers per internal node.
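
The node-size inequality can be solved for any block, key and pointer size. The helper below is a sketch with a hypothetical name, `max_internal_pointers`; it assumes an internal node stores only keys and tree pointers, with no header or other overhead.

```python
def max_internal_pointers(block_size, key_len, ptr_len):
    """Largest number of tree pointers that fits in one internal node.

    n keys and n + 1 pointers must satisfy
    n * key_len + (n + 1) * ptr_len <= block_size.
    """
    n_keys = (block_size - ptr_len) // (key_len + ptr_len)
    return n_keys + 1

print(max_internal_pointers(1024, 4, 4))   # Ex 5 values
```

The sizes of the other Student attributes do not enter the calculation, because internal nodes hold only search keys and tree pointers.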


    Ex 6: A data file ordered on its key field has 2,38,000 (238,000) records. The record length is 140 bytes and the block size is 1024 bytes. The address of a disk block needs 8 bytes, and the key attribute of the file is 9 bytes long.

    If no indexing is done, give the number of block accesses needed (on average) to retrieve a record with a given key value from the file. Also give the number of data blocks needed.

    Now design a primary index for the file on the key attribute. Give the number of index blocks needed and the number of block accesses needed (on average) to retrieve a record with a given key value.

    Now design a multilevel index for the same file. Give the number of levels, the number of blocks at each level, and the number of block accesses needed (on average) to retrieve a record with a given key value.

    [Note: Complete working is required for your answer]


    Given data:

    Total no. of records = 2,38,000

    Record length = 140 bytes

    Block size = 1024 bytes

    Key size = 9 bytes

    Block pointer size = 8 bytes

    Without indexing:

    Blocking factor for records of the file (Bfr) = floor(1024/140) = 7 records/block

    Total no. of data blocks required = ceil(2,38,000/7) = 34,000 blocks

    No. of block accesses (on average) to reach a record, using binary search on the ordered file = ceil(log2 34000) = 16


    With primary indexing:

    Index entry size = 9 + 8 = 17 bytes

    Blocking factor for index = floor(1024/17) = 60 entries/block

    Total no. of index blocks required = ceil(34,000/60) = 567

    No. of block accesses needed to reach a data record (on average) = ceil(log2 567) + 1 = 10 + 1 = 11

    With multilevel indexing:

    Blocking factor for index (f0) = 60 entries/block

    B1 (first-level index blocks) = 567

    B2 (second-level index blocks) = ceil(B1/f0) = ceil(567/60) = 10 blocks

    B3 (third-level index blocks) = ceil(B2/f0) = ceil(10/60) = 1 block

    Total no. of block accesses to reach a data record = 3 + 1 = 4
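
The three access strategies of Ex 6 can be compared side by side in one function. This is a sketch that follows the same formulas as the worked solution; the name `access_costs` and the dictionary keys are made up for illustration.

```python
import math

def ceil_div(a, b):
    """Integer ceiling of a / b."""
    return -(-a // b)

def access_costs(n_records, block_size, record_len, key_len, ptr_len):
    """Block accesses per lookup: no index, primary index, multilevel index."""
    bfr = block_size // record_len                 # records per data block
    data_blocks = ceil_div(n_records, bfr)
    fan_out = block_size // (key_len + ptr_len)    # index entries per block
    levels = [ceil_div(data_blocks, fan_out)]      # first-level index blocks
    while levels[-1] > 1:                          # add levels until one block is left
        levels.append(ceil_div(levels[-1], fan_out))
    return {
        "data_blocks": data_blocks,
        "no_index": math.ceil(math.log2(data_blocks)),         # binary search on the file
        "primary_index": math.ceil(math.log2(levels[0])) + 1,  # binary search on the index
        "index_levels": levels,
        "multilevel_index": len(levels) + 1,                   # one block per level + data
    }

for name, value in access_costs(238000, 1024, 140, 9, 8).items():
    print(name, "=", value)
```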


    Summary

    What is Indexing and its importance

    How Primary and Secondary indexes work

    Examples of Dense and Sparse Indexes

    What is Multilevel Indexing

    Some example problems on designing Primary

    and Multilevel Indexes

    What is Tree Indexing

    B tree and B+ tree concepts

    Constructing a B+ tree (Insert/Delete operations)

    Designing a B+ Tree node structure