
MIT-533 Database Systems 2
Lecture 1: Conceptual, Logical and Physical DB Design and Tuning
Walailuk University

Objectives (I)
• To study the conceptual, logical, and physical design steps.
• To study how to use Entity–Relationship (ER) modeling to build a conceptual data model based on the information given in a user’s view of an enterprise.
• To learn how to map a conceptual model to a logical data model and how to derive relations from a logical data model.
• To learn how to merge local logical data models based on specific user views into a global logical data model of the enterprise.

Objectives (II)
• To learn how to map the logical DB design to a physical DB design.
• To understand how to design base relations and enterprise constraints for the target DBMS.
• To understand how to choose appropriate settings, such as file organizations and secondary indexes, based on an analysis of transactions.
• To understand physical settings in DB design.
• To understand how to monitor and tune the database.

Design Methodology
A structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design.
• Conceptual Database Design
• Logical Database Design
• Physical Database Design

Three Main Phases of Design Methodology
• Conceptual database design: the process of constructing a model of the information used in an enterprise, independent of all physical considerations.
• Logical database design: the process of constructing a model of the information used in an enterprise based on a specific data model (e.g. relational), but independent of a particular DBMS and other physical considerations.
• Physical database design: the process of producing a description of the implementation of the database on secondary storage; it describes the base relations, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security measures.

Critical Success Factors in DB Design
• Work interactively with users as much as possible.
• Follow a structured methodology throughout the data modelling process.
• Employ a data-driven approach.
• Incorporate structural and integrity considerations into the data models.
• Combine conceptualization, normalization, and transaction validation techniques into the data modelling methodology.
• Use diagrams to represent as much of the data models as possible.
• Use a Database Design Language (DBDL) to represent additional data semantics.
• Build a data dictionary to supplement the data model diagrams.
• Be willing to repeat steps.

Conceptual Database Design
Step 1 Build conceptual data model
• Step 1.1 Identify entity types
• Step 1.2 Identify relationship types
• Step 1.3 Identify/associate attributes with entity/relationship types
• Step 1.4 Determine attribute domains
• Step 1.5 Determine candidate and primary key attributes
• Step 1.6 Consider enhanced modeling concepts (optional)
• Step 1.7 Check model for redundancy
• Step 1.8 Validate conceptual model against user transactions
• Step 1.9 Review conceptual data model with user

Logical Database Design
Step 2 Build/validate logical data model
• Step 2.1 Derive relations for local logical data model
• Step 2.2 Validate relations using normalization
• Step 2.3 Validate relations against user transactions
• Step 2.4 Define integrity constraints
• Step 2.5 Review logical data model with user
• Step 2.6 Merge logical data models into global model (optional step)
• Step 2.7 Check for future growth

Physical Database Design
Step 3 Translate global logical data model for target DBMS
• Step 3.1 Design base relations
• Step 3.2 Design representation of derived data
• Step 3.3 Design general constraints
Step 4 Design file organizations and indexes
• Step 4.1 Analyze transactions
• Step 4.2 Choose file organizations
• Step 4.3 Choose indexes
• Step 4.4 Estimate disk space requirements
Step 5 Design user views
Step 6 Design security mechanisms
Step 7 Consider introduction of controlled redundancy
Step 8 Monitor and tune the operational system

Step 1 Build Local Conceptual Data Model (Steps 1.1–1.4)
To build a conceptual data model of an enterprise.
• Step 1.1 Identify entity types: to identify the main entity types that are required by the view.
• Step 1.2 Identify relationship types: to identify the important relationships that exist between the entity types that have been identified.
• Step 1.3 Identify and associate attributes with entity or relationship types: to identify and associate attributes with the appropriate entity or relationship types and document the details of each attribute.
• Step 1.4 Determine attribute domains: to determine domains for the attributes in the local conceptual model and document the details of each domain.

Figure: Identify Entity and Relationship Types (Staff View – DreamHome Example) (Steps 1.1 and 1.2). Tables of entity types and relationship types, each with a description.

Figure: Entity and Relationship Types (Staff View – DreamHome Example). ER diagram (UML).

Figure: Identify Attributes and their Domains (Staff View – DreamHome Example) (Steps 1.3 and 1.4). Tables of attributes and domains.

Step 1 Build Local Conceptual Data Model (Steps 1.5–1.9)
• Step 1.5 Determine candidate and primary key attributes: to identify the candidate key(s) for each entity and, if there is more than one candidate key, to choose one to be the primary key.
• Step 1.6 Consider enhanced modeling concepts (optional): to consider the use of enhanced modeling concepts, such as specialization/generalization, aggregation, and composition.
• Step 1.7 Check model for redundancy: to check for the presence of any redundancy in the model.
• Step 1.8 Validate conceptual model against user transactions: to ensure that the local conceptual model supports the transactions required by the view.
• Step 1.9 Review conceptual data model with user: to review the local conceptual data model with the user to ensure that the model is a ‘true’ representation of the user’s view of the enterprise.

Figure: Determine Primary Keys (Staff View – DreamHome Example) (Step 1.5).

Figure: Consider Enhanced Modeling Concepts (Staff View – DreamHome Example) (Step 1.6). Specialization/Generalization.

Figure: Non-Redundant Relationship FatherOf (An Example) (Step 1.7).

Check if the Model Supports User Transactions (An Example) (Step 1.8)
(a) List the details of staff supervised by a named Supervisor at the branch.
(b) List the details of all Assistants, alphabetically by name, at the branch.
(c) List the details of properties available for rent at the branch, along with the owners’ details.
(d) List the details of properties managed by a named member of staff at the branch.
(e) List the clients registering at the branch and the names of the staff who registered them.
(f) Identify properties located in Glasgow with rents < 450.

………

Section 2

Step 2 Build/Validate Logical Data Model
To build a logical data model from a conceptual data model, and then to validate this model to ensure it is structurally correct (using the technique of normalization) and to ensure it supports the required transactions.

Step 2.1 Derive relations for logical data model
To create relations for the logical data model to represent the entities, relationships, and attributes that have been identified.

Figure: Conceptual Data Model for Staff View Showing all Attributes.

Step 2 Build/Validate Logical Data Model (Step 2.1 – I)
Step 2.1 Derive relations for local logical data model: to create relations for the local logical data model to represent the entities, relationships, and attributes that have been identified.

1. Strong Entity Types
Create a relation that includes all simple attributes of that entity. For composite attributes, include only the constituent simple attributes.
   Staff (staffNo, fName, lName, position, sex, DOB)
   Primary Key staffNo

2. Weak Entity Types
Create a relation that includes all simple attributes of that entity. The primary key is partially or fully derived from each owner entity.
   Preference (prefType, maxRent)
   Primary Key None (at present)
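The two relations above can be sketched as tables. This is a minimal illustration that assumes SQLite (through Python's `sqlite3` module) as a stand-in DBMS; the column types and the helper names `columns` and `primary_key` are assumptions, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Strong entity type: one relation holding all simple attributes.
# The composite attribute "name" appears only as fName and lName.
conn.execute("""
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        fName    TEXT,
        lName    TEXT,
        position TEXT,
        sex      TEXT,
        DOB      TEXT
    )
""")

# Weak entity type: its key depends on the owner entity, which has not
# been mapped yet, so no primary key is declared at this point.
conn.execute("CREATE TABLE Preference (prefType TEXT, maxRent REAL)")


def columns(table):
    """Return the column names of a table in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def primary_key(table):
    """Return the columns that form the declared primary key, if any."""
    info = conn.execute(f"PRAGMA table_info({table})")
    return [row[1] for row in info if row[5] > 0]
```

Calling `primary_key("Preference")` returns an empty list, which mirrors the "None (at present)" note on the slide.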

Step 2 Build/Validate Logical Data Model (Step 2.1 – II)
3. 1:* Binary Relationship Types
• The entity on the ‘one side’ is designated the parent entity and the entity on the ‘many side’ is the child entity.
• Post a copy of the primary key attribute(s) of the parent entity into the relation representing the child entity, to act as a foreign key.
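As a rough sketch of posting the parent key into the child, the example below uses Staff as parent and Client as child, following the `staffNo` foreign key shown later in the Client relation. SQLite, the column types, and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled

# Parent entity (the 'one' side).
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, lName TEXT)")

# Child entity (the 'many' side) carries a copy of the parent's primary key.
conn.execute("""
    CREATE TABLE Client (
        clientNo TEXT PRIMARY KEY,
        lName    TEXT,
        staffNo  TEXT NOT NULL REFERENCES Staff(staffNo)
    )
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO Staff VALUES ('SG37', 'Beech')")
conn.execute("INSERT INTO Client VALUES ('CR56', 'Stewart', 'SG37')")


def insert_orphan_client():
    """Try to register a client against a staff number that does not exist."""
    try:
        conn.execute("INSERT INTO Client VALUES ('CR99', 'Nobody', 'SX00')")
        return True
    except sqlite3.IntegrityError:
        return False
```

The foreign key lets many clients point at one member of staff, while rejecting a client row whose `staffNo` has no matching parent.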

Step 2 Build/Validate Logical Data Model (Step 2.1 – III)
4. 1:1 Binary Relationship Types
More complex, as cardinality cannot be used to identify the parent and child entities in the relationship. Instead, participation constraints are used to decide whether to combine the entities into one relation, or to create two relations and post a copy of the primary key from one relation to the other. Consider the following cases:
(a) mandatory participation on both sides of the 1:1 relationship;
(b) mandatory participation on one side of the 1:1 relationship;
(c) optional participation on both sides of the 1:1 relationship.

Step 2 Build/Validate Logical Data Model (Step 2.1 – IV)
4. 1:1 Binary Relationship Types
(a) Mandatory participation on both sides of the 1:1 relationship
Combine the entities involved into one relation and choose one of the primary keys of the original entities to be the primary key of the new relation, while the other (if one exists) is used as an alternate key.
   Client (clientNo, fName, lName, telNo, prefType, maxRent, staffNo)
   Primary Key clientNo
   Foreign Key staffNo references Staff(staffNo)

Step 2 Build/Validate Logical Data Model (Step 2.1 – V)
4. 1:1 Binary Relationship Types
(b) Mandatory participation on one side of the 1:1 relationship
• Identify the parent and child entities using the participation constraints.
• The entity with optional participation is designated the parent entity, and the other entity is designated the child entity.
• A copy of the primary key of the parent is placed in the relation representing the child entity.
• If the relationship has one or more attributes, these attributes should follow the posting of the primary key to the child relation.

Step 2 Build/Validate Logical Data Model (Step 2.1 – VI)
4. 1:1 Binary Relationship Types
(c) Optional participation on both sides of the 1:1 relationship
• The designation of the parent and child entities is arbitrary unless more can be found out about the relationship.
• Consider the 1:1 Staff Uses Car relationship with optional participation on both sides. Assume the majority of cars, but not all, are used by staff, and only a minority of staff use cars.
• The Car entity, although optional, is closer to being mandatory than the Staff entity. Therefore designate Staff as the parent entity and Car as the child entity.
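The Staff Uses Car case might be represented as below. This is a sketch under assumptions: SQLite as the DBMS, a hypothetical `carNo` key for Car, and made-up sample rows. The posted key is declared `UNIQUE` to keep the relationship 1:1 and is left nullable to keep Car's participation optional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Staff is the parent; it holds no reference to Car.
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY)")

# Car is the child and carries the parent's key.
# UNIQUE: a staff member uses at most one car. Nullable: a car may be unused.
conn.execute("""
    CREATE TABLE Car (
        carNo   TEXT PRIMARY KEY,
        staffNo TEXT UNIQUE REFERENCES Staff(staffNo)
    )
""")

conn.executemany("INSERT INTO Staff VALUES (?)", [("SG37",), ("SG14",)])
conn.execute("INSERT INTO Car VALUES ('C1', 'SG37')")
conn.execute("INSERT INTO Car VALUES ('C2', NULL)")  # car not used by anyone


def assign(car_no, staff_no):
    """Add a car, optionally used by staff_no; return False if rejected."""
    try:
        conn.execute("INSERT INTO Car VALUES (?, ?)", (car_no, staff_no))
        return True
    except sqlite3.IntegrityError:
        return False
```

Placing the key in Car rather than Staff means few nulls are stored, since most cars are used while most staff have no car.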

Step 2 Build/Validate Logical Data Model (Step 2.1 – VII)
5. 1:1 Recursive Relationship Types
Follow the rules for participation for a 1:1 relationship:
• Mandatory participation on both sides: create a single relation with two copies of the primary key.
• Mandatory participation on only one side: either create a single relation with two copies of the primary key, or create a new relation to represent the relationship. The new relation would have only two attributes, both copies of the primary key.
• Optional participation on both sides: again, create a new relation as described above.

Step 2 Build/Validate Logical Data Model (Step 2.1 – VIII)
6. Superclass/Subclass Relationship Types
Identify the superclass as the parent entity and the subclass entity as the child entity. There are various options on how to represent such a relationship as one or more relations. The most appropriate option depends on a number of factors, such as:
• the disjointness and participation constraints on the superclass/subclass relationship;
• whether the subclasses are involved in distinct relationships;
• the number of participants in the superclass/subclass relationship.

Figure: Guidelines for Representation of Superclass/Subclass Relationship.

An Example of Representation of Superclass/Subclass Relationship

Option 1 – Mandatory & Nondisjoint
   AllOwner (ownerNo, address, telNo, fName, lName, bName, bType, contactName, pOwnerFlag, bOwnerFlag)
   Primary Key ownerNo

Option 2 – Optional & Nondisjoint
   Owner (ownerNo, address, telNo)
   Primary Key ownerNo
   OwnerDetails (ownerNo, fName, lName, bName, bType, contactName, pOwnerFlag, bOwnerFlag)
   Primary Key ownerNo
   Foreign Key ownerNo references Owner(ownerNo)

Option 3 – Mandatory & Disjoint
   PrivateOwner (ownerNo, fName, lName, address, telNo)
   Primary Key ownerNo
   BusinessOwner (ownerNo, bName, bType, contactName, address, telNo)
   Primary Key ownerNo

Option 4 – Optional & Disjoint
   Owner (ownerNo, address, telNo)
   Primary Key ownerNo
   PrivateOwner (ownerNo, fName, lName)
   Primary Key ownerNo
   Foreign Key ownerNo references Owner(ownerNo)
   BusinessOwner (ownerNo, bName, bType, contactName)
   Primary Key ownerNo
   Foreign Key ownerNo references Owner(ownerNo)
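Option 4 can be sketched as three tables. The example assumes SQLite, and the sample rows and the helper names `owners_in_both` and `owners_in_neither` are hypothetical. Foreign keys alone do not enforce disjointness, so the sketch checks it with a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Owner (ownerNo TEXT PRIMARY KEY, address TEXT, telNo TEXT);
    CREATE TABLE PrivateOwner (
        ownerNo TEXT PRIMARY KEY REFERENCES Owner(ownerNo),
        fName TEXT, lName TEXT);
    CREATE TABLE BusinessOwner (
        ownerNo TEXT PRIMARY KEY REFERENCES Owner(ownerNo),
        bName TEXT, bType TEXT, contactName TEXT);

    -- Hypothetical sample data: CO99 belongs to neither subclass,
    -- which optional participation allows.
    INSERT INTO Owner VALUES ('CO46', '2 Fergus Dr', '0141-357-7419'),
                             ('CO87', '6 Achray St', '0141-357-0001'),
                             ('CO99', '1 Park Pl',   '0141-225-7025');
    INSERT INTO PrivateOwner  VALUES ('CO46', 'Joe', 'Keogh');
    INSERT INTO BusinessOwner VALUES ('CO87', 'Bright Lets', 'Agency', 'A. Shaw');
""")


def owners_in_both():
    """Owners that violate disjointness by appearing in both subclasses."""
    sql = ("SELECT ownerNo FROM PrivateOwner "
           "INTERSECT SELECT ownerNo FROM BusinessOwner")
    return [row[0] for row in conn.execute(sql)]


def owners_in_neither():
    """Owners that belong to no subclass (allowed when participation is optional)."""
    sql = ("SELECT ownerNo FROM Owner "
           "WHERE ownerNo NOT IN (SELECT ownerNo FROM PrivateOwner) "
           "AND ownerNo NOT IN (SELECT ownerNo FROM BusinessOwner) "
           "ORDER BY ownerNo")
    return [row[0] for row in conn.execute(sql)]
```

Under Option 3 (mandatory and disjoint) the same check would also require `owners_in_neither()` to be empty.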

Step 2 Build/Validate Logical Data Model (Step 2.1 – IX)
7. *:* Binary Relationship Types
• Create a relation to represent the relationship and include any attributes that are part of the relationship.
• Post a copy of the primary key attribute(s) of the entities that participate in the relationship into the new relation, to act as foreign keys.
• These foreign keys will also form the primary key of the new relation, possibly in combination with some of the attributes of the relationship.
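A possible shape for such a relation is shown below. It assumes SQLite and a hypothetical *:* relationship in which clients view properties; the `Viewing` relation, its `viewDate` attribute, and the sample keys are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Client (clientNo TEXT PRIMARY KEY);
    CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY);

    -- New relation for the *:* relationship: both foreign keys together
    -- form the primary key; viewDate is an attribute of the relationship.
    CREATE TABLE Viewing (
        clientNo   TEXT NOT NULL REFERENCES Client(clientNo),
        propertyNo TEXT NOT NULL REFERENCES PropertyForRent(propertyNo),
        viewDate   TEXT,
        PRIMARY KEY (clientNo, propertyNo)
    );

    INSERT INTO Client VALUES ('CR56'), ('CR76');
    INSERT INTO PropertyForRent VALUES ('PG4'), ('PG36');
""")


def record_viewing(client_no, property_no, view_date):
    """Insert one (client, property) pair; return False if it is rejected."""
    try:
        conn.execute("INSERT INTO Viewing VALUES (?, ?, ?)",
                     (client_no, property_no, view_date))
        return True
    except sqlite3.IntegrityError:
        return False
```

If a client could view the same property more than once, `viewDate` would need to join the primary key, which is the "in combination with some of the attributes" case mentioned above.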

Step 2 Build/Validate Logical Data Model (Step 2.1 – X)
8. Complex Relationship Types
• Create a relation to represent the relationship and include any attributes that are part of the relationship.
• Post a copy of the primary key attribute(s) of the entities that participate in the complex relationship into the new relation, to act as foreign keys.
• Any foreign keys that represent a ‘many’ relationship (e.g., 1..*, 0..*) generally will also form the primary key of the new relation.

Step 2 Build/Validate Logical Data Model (Step 2.1 – XI)
9. Multi-Valued Attributes
• Create a new relation to represent the multi-valued attribute and include the primary key of the entity in the new relation, to act as a foreign key.
• Unless the multi-valued attribute is itself an alternate key of the entity, the primary key of the new relation is the combination of the multi-valued attribute and the primary key of the entity.
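The split can be sketched in plain Python. The function name `split_multivalued` and the sample branch data are hypothetical; the example assumes a Branch entity whose `telNo` attribute holds several values.

```python
def split_multivalued(rows, key, attr):
    """Split entity rows into (entity rows without attr, rows of the new relation).

    Each row of the new relation is a (value, key) pair, so the pair as a
    whole identifies the row.
    """
    entity_rows, value_rows = [], set()
    for row in rows:
        # The entity keeps everything except the multi-valued attribute.
        entity_rows.append({k: v for k, v in row.items() if k != attr})
        # One tuple per value, carrying the entity's primary key as foreign key.
        value_rows.update((value, row[key]) for value in row[attr])
    return entity_rows, sorted(value_rows)


# Hypothetical sample data.
branches = [
    {"branchNo": "B003", "city": "Glasgow",
     "telNo": ["0141-339-2178", "0141-339-4439"]},
    {"branchNo": "B005", "city": "London", "telNo": ["0171-886-1212"]},
]
branch_rel, telephone_rel = split_multivalued(branches, "branchNo", "telNo")
```

Here `telephone_rel` plays the role of a relation such as Telephone (telNo, branchNo), with `branchNo` acting as the foreign key back to Branch.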

Figure: Mapping Summary (Mapping Entities and Relationships to Relations).

Figure: Relations for Staff View of DreamHome.

Figure: Relations for Branch View of DreamHome (Another View of the Database).

Step 2 Build/Validate Logical Data Model (Steps 2.2–2.5)
• Step 2.2 Validate relations using normalization: to validate the relations in the local logical data model using the technique of normalization.
• Step 2.3 Validate relations against user transactions: to ensure that the relations in the local logical data model support the transactions required by the view.
• Step 2.4 Define integrity constraints: to define the integrity constraints given in the view (i.e. required data, entity and referential integrity, domains, and enterprise constraints).
• Step 2.5 Review logical data model with the user: to ensure that the logical data model and the supporting documentation that describes the model are a true representation of the view.
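For Step 2.2, one informal aid is to test candidate functional dependencies against sample data. The sketch below is an assumption-laden illustration: the function `fd_holds`, the relation, and its rows are hypothetical, and sample data can only refute a dependency, never prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        # on later visits, so any mismatch is a violation of lhs -> rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical unnormalized relation mixing staff and branch details.
staff_branch = [
    {"staffNo": "SL21", "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SG37", "branchNo": "B003", "bAddress": "163 Main St"},
    {"staffNo": "SG14", "branchNo": "B003", "bAddress": "163 Main St"},
]
```

In this sample, `branchNo` appears to determine `bAddress`, which would point to a transitive dependency on the key `staffNo` and suggest moving the branch details into their own relation.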

Figure: Step 2 Build/Validate Local Logical Data Model (Step 2.4 – Define Referential Integrity Constraints). Referential integrity constraints.

Step 2 Build/Validate Logical Data Model (Step 2.6)
Step 2.6 Merge local logical data models into global model: to merge the individual logical data models into a single global logical data model of the enterprise.
The activities in this step include:
• Step 2.6.1 Merge local logical data models into global model
• Step 2.6.2 Validate global logical data model
• Step 2.6.3 Review global logical data model with users

Step 2.6.1 Merge logical data models into global model
1. Review the names and contents of entities/relations and their candidate keys.
2. Review the names and contents of relationships/foreign keys.
3. Merge entities/relations from the local data models.
4. Include (without merging) entities/relations unique to each local data model.
5. Merge relationships/foreign keys from the local data models.
6. Include (without merging) relationships/foreign keys unique to each local data model.
7. Check for missing entities/relations and relationships/foreign keys.
8. Check foreign keys.
9. Check integrity constraints.
10. Draw the global ER/relation diagram.
11. Update the documentation.

Figure: An Example (Merging the Staff View and the Branch View – Entities).

Figure: An Example (Merging the Staff View and the Branch View – Foreign Keys).

Step 2 Build/Validate Logical Data Model (Steps 2.6.2–2.6.3 and 2.7)
• Step 2.6.2 Validate global logical data model: to validate the relations created from the global logical data model using the technique of normalization and to ensure they support the required transactions, if necessary.
• Step 2.6.3 Review global logical data model with users: to ensure that the global logical data model is a true representation of the enterprise.
• Step 2.7 Check for future growth: to determine whether there are any significant changes likely in the foreseeable future and to assess whether the logical data model can accommodate these changes.

Figure: Relations that Represent the Global Logical Data Model for DreamHome (Merge Staff and Branch Views).

Figure: Global Relation Diagram for DreamHome.

Logical vs. Physical DB Design
• Sources of information for the physical design process include the global logical data model and the documentation that describes the model.
• Logical database design is concerned with the what; physical database design is concerned with the how.

Physical Database Design
Process of producing a description of the implementation of the database on secondary storage; it describes the base relations, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security measures.

Step 3 Translate Logical Model for Target DBMS
To produce a relational database schema from the logical data model that can be implemented in the target DBMS.
Need to know the functionality of the target DBMS, such as how to create base relations and whether the system supports the definition of:
• PKs, FKs, and AKs;
• required data, i.e. whether the system supports NOT NULL;
• domains;
• relational integrity constraints;
• enterprise constraints.

Step 3.1 Design Base Relations
To decide how to represent the base relations identified in the global logical model in the target DBMS.
For each relation, need to define:
• the name of the relation;
• a list of simple attributes in brackets;
• the PK and, where appropriate, AKs and FKs;
• a list of any derived attributes and how they should be computed;
• referential integrity constraints for any FKs identified.
For each attribute, need to define:
• its domain, consisting of a data type, length, and any constraints on the domain;
• an optional default value for the attribute;
• whether the attribute can hold nulls.
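The per-attribute decisions above (domain, default, nullability) might translate into DDL roughly as follows. This sketch assumes SQLite as the target DBMS; the attribute list, the ranges in the `CHECK` clauses, and the default of 4 rooms are illustrative assumptions rather than the lecture's DBDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PropertyForRent (
        propertyNo TEXT    PRIMARY KEY NOT NULL,
        -- domain: integer in an assumed range; default 4; nulls not allowed
        rooms      INTEGER NOT NULL DEFAULT 4
                   CHECK (rooms BETWEEN 1 AND 15),
        -- domain: positive amount; no default; nulls not allowed
        rent       REAL    NOT NULL CHECK (rent > 0)
    )
""")


def try_insert(sql, params=()):
    """Run an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

Omitting `rooms` on insert picks up the default, while a null `rent` or an out-of-range `rooms` value is rejected by the DBMS itself rather than by application code.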

Figure: DBDL (Database Design Language) for the PropertyForRent Relation.

Step 3.2 Design Representation of Derived Data
To decide how to represent any derived data present in the global logical data model in the target DBMS.
Examine the logical data model and data dictionary, and produce a list of all derived attributes. A derived attribute can be stored in the database or calculated every time it is needed. The option selected is based on:
• the additional cost to store the derived data and keep it consistent with the operational data from which it is derived;
• the cost to calculate it each time it is required.
The less expensive option is chosen, subject to performance constraints.
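The two options can be compared side by side for the derived attribute `noOfProperties`. This is a sketch assuming SQLite; the trigger names and sample rows are hypothetical, and the triggers cover only inserts and deletes (an update that reassigns a property to another member of staff would need a third trigger).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (
        staffNo        TEXT PRIMARY KEY,
        noOfProperties INTEGER NOT NULL DEFAULT 0   -- stored derived attribute
    );
    CREATE TABLE PropertyForRent (
        propertyNo TEXT PRIMARY KEY,
        staffNo    TEXT REFERENCES Staff(staffNo)
    );

    -- Option 1: store the value and pay to keep it consistent on every change.
    CREATE TRIGGER property_added AFTER INSERT ON PropertyForRent
    BEGIN
        UPDATE Staff SET noOfProperties = noOfProperties + 1
        WHERE staffNo = NEW.staffNo;
    END;
    CREATE TRIGGER property_removed AFTER DELETE ON PropertyForRent
    BEGIN
        UPDATE Staff SET noOfProperties = noOfProperties - 1
        WHERE staffNo = OLD.staffNo;
    END;
""")

# Hypothetical sample activity.
conn.execute("INSERT INTO Staff (staffNo) VALUES ('SG37')")
conn.executemany("INSERT INTO PropertyForRent VALUES (?, 'SG37')",
                 [("PG21",), ("PG36",), ("PG4",)])
conn.execute("DELETE FROM PropertyForRent WHERE propertyNo = 'PG4'")


def stored_count(staff_no):
    """Option 1: read the stored derived attribute (cheap read, costly writes)."""
    sql = "SELECT noOfProperties FROM Staff WHERE staffNo = ?"
    return conn.execute(sql, (staff_no,)).fetchone()[0]


def computed_count(staff_no):
    """Option 2: calculate the value each time it is required."""
    sql = "SELECT COUNT(*) FROM PropertyForRent WHERE staffNo = ?"
    return conn.execute(sql, (staff_no,)).fetchone()[0]
```

Both functions return the same number; the design question is whether the write-time trigger cost is lower than the read-time aggregation cost for the expected transaction mix.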

Figure: Derived Attribute (PropertyForRent and Staff Relations with Derived Attribute noOfProperties).

Step 3.3 Design General Constraints
To design the enterprise constraints for the target DBMS. Some DBMSs provide more facilities than others for defining enterprise constraints. An example:

    CONSTRAINT StaffNotHandlingTooMuch
        CHECK (NOT EXISTS (SELECT staffNo
                           FROM PropertyForRent
                           GROUP BY staffNo
                           HAVING COUNT(*) > 100))
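Not every DBMS accepts a subquery inside `CHECK`; SQLite, for instance, does not. Where that is the case, the same rule might be approximated with a trigger, as in the sketch below. SQLite, the trigger body, and the helper `add_property` are assumptions; only the constraint name and the limit of 100 come from the example above. The trigger guards inserts only, so updates to `staffNo` would need a similar trigger.

```python
import sqlite3

LIMIT = 100  # maximum number of properties per member of staff

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY, staffNo TEXT)"
)

# Reject an insert if the member of staff already manages LIMIT properties.
conn.execute(f"""
    CREATE TRIGGER StaffNotHandlingTooMuch
    BEFORE INSERT ON PropertyForRent
    WHEN (SELECT COUNT(*) FROM PropertyForRent
          WHERE staffNo = NEW.staffNo) >= {LIMIT}
    BEGIN
        SELECT RAISE(ABORT, 'staff member already manages the maximum number of properties');
    END
""")


def add_property(property_no, staff_no):
    """Insert a property; return False if the enterprise constraint rejects it."""
    try:
        conn.execute("INSERT INTO PropertyForRent VALUES (?, ?)",
                     (property_no, staff_no))
        return True
    except sqlite3.DatabaseError:
        return False


# The first LIMIT inserts for one member of staff succeed; the next is refused.
results = [add_property(f"P{i}", "SG37") for i in range(LIMIT + 1)]
```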

Step 4 Design File Organization and Index
To determine the optimal file organizations to store the base relations and the indexes that are required to achieve acceptable performance; that is, the way in which relations and tuples will be held on secondary storage.
A number of factors may be used to measure efficiency:
• Transaction throughput: number of transactions processed in a given time interval.
• Response time: elapsed time for completion of a single transaction.
• Disk storage: amount of disk space required to store the database files.
However, no one factor is always correct. Typically, one factor has to be traded off against another to achieve a reasonable balance.

Step 4.1 Analyze transactions (I)
To understand the functionality of the transactions that will run on the database and to analyze the important transactions.
Attempt to identify performance criteria, such as:
• transactions that run frequently and will have a significant impact on performance;
• transactions that are critical to the business;
• times during the day/week when there will be a high demand made on the database (called the peak load).
Use this information to identify the parts of the database that may cause performance problems.
To select appropriate file organizations and indexes, also need to know the high-level functionality of the transactions, such as:
• attributes that are updated in an update transaction;
• criteria used to restrict tuples that are retrieved in a query.

Step 4.1 Analyze transactions (II)
It is often not possible to analyze all expected transactions, so investigate the most ‘important’ ones. To help identify which transactions to investigate, can use:
• a transaction/relation cross-reference matrix, showing the relations that each transaction accesses; and/or
• a transaction usage map, indicating which relations are potentially heavily used.
To focus on areas that may be problematic:
(1) Map all transaction paths to relations.
(2) Determine which relations are most frequently accessed by transactions.
(3) Analyze the data usage of selected transactions that involve these relations.
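Steps (1) and (2) above can be sketched as a small tally over a cross-reference structure. Everything in this example is hypothetical: the transaction labels, the frequencies per hour, and the function name `relation_load`.

```python
from collections import Counter


def relation_load(transactions):
    """Rank relations by the total frequency of the transactions that access them.

    transactions maps a transaction label to (runs per hour, relations accessed).
    """
    load = Counter()
    for frequency, relations in transactions.values():
        for relation in relations:
            load[relation] += frequency
    return load.most_common()


# Hypothetical cross-reference data: which relations each transaction touches.
sample = {
    "T1": (20, ["Staff"]),
    "T2": (100, ["Staff", "PropertyForRent"]),
    "T3": (60, ["PropertyForRent", "PrivateOwner"]),
}
ranking = relation_load(sample)
```

With these made-up figures, PropertyForRent carries the highest load, so its transactions would be the first candidates for the detailed data usage analysis in step (3).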

Figure: Cross-Referencing Transactions and Relations.

Figure: Transaction Usage Map for Some Sample Transactions Showing Expected Occurrences.

Figure: Example Transaction Analysis Form.

Step 4.2 Choose File Organizations
To determine an efficient file organization for each base relation.
• File organizations include Heap, Hash, Indexed Sequential Access Method (ISAM), B+-Tree, and Clusters.
• Some DBMSs may not allow selection of file organizations.


Step 4.3 Choose Indexes (I)

To determine whether adding indexes will improve the performance of the system.

One approach is to keep tuples unordered and create as many secondary indexes as necessary. Another approach is to order tuples in the relation by specifying a primary or clustering index. In this case, choose the attribute for ordering/clustering the tuples as:

- the attribute that is used most often for join operations, as this makes the join more efficient; or
- the attribute that is used most often to access the tuples in a relation in order of that attribute.

If the ordering attribute chosen is a key of the relation, the index will be a primary index; otherwise, the index will be a clustering index. Each relation can have either a primary index or a clustering index, but not both.


Step 4.3 Choose Indexes (II)

Secondary indexes provide a mechanism for specifying an additional key for a base relation that can be used to retrieve data more efficiently. The overhead involved in the maintenance and use of secondary indexes has to be balanced against the performance improvement gained when retrieving data. This overhead includes:

- adding an index record to every secondary index whenever a tuple is inserted;
- updating a secondary index when the corresponding tuple is updated;
- an increase in the disk space needed to store the secondary index;
- possible performance degradation during query optimization, as the optimizer has to consider all secondary indexes.


Step 4.3 Choose Indexes (III)

Guidelines for choosing a ‘wish-list’ of indexes:
(1) Do not index small relations.
(2) Index the PK of a relation if it is not a key of the file organization.
(3) Add a secondary index to an FK if it is frequently accessed.
(4) Add a secondary index to any attribute that is heavily used as a secondary key.
(5) Add a secondary index on attributes that are involved in: selection or join criteria; ORDER BY; GROUP BY; and other operations involving sorting (such as UNION or DISTINCT).
(6) Add a secondary index on attributes involved in built-in functions.
(7) Add a secondary index on attributes that could result in an index-only plan.
(8) Avoid indexing an attribute or relation that is frequently updated.
(9) Avoid indexing an attribute if the query will retrieve a significant proportion of the tuples in the relation.
(10) Avoid indexing attributes that consist of long character strings.
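As a small illustration of guideline (3), the sketch below uses Python's built-in sqlite3 module to compare the query plan before and after adding a secondary index on a foreign key. The Staff schema, the index name idx_staff_branch, and the data are assumptions made for the example. SQLite merely stands in for the target DBMS, and the plan wording differs between products and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Staff relation; branchNo plays the role of a frequently used FK.
conn.execute(
    "CREATE TABLE Staff (staffNo INTEGER PRIMARY KEY, lName TEXT, branchNo INTEGER)"
)
conn.executemany(
    "INSERT INTO Staff (staffNo, lName, branchNo) VALUES (?, ?, ?)",
    [(i, f"name{i}", i % 100) for i in range(1, 2001)],
)

query = "SELECT * FROM Staff WHERE branchNo = 7"


def plan(connection, sql):
    """Return SQLite's query plan as one string (last column holds the detail)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn, query)  # no suitable index exists yet

# Secondary index on the FK attribute used in the selection criterion.
conn.execute("CREATE INDEX idx_staff_branch ON Staff (branchNo)")
plan_after = plan(conn, query)

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing plans in this way is one inexpensive method of checking whether an index from the wish-list is actually used before accepting its maintenance overhead.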


Step 4.4 Estimate Disk Space Requirements

To estimate the amount of disk space that will be required by the DB.

It may be a requirement that the physical database implementation can be handled by the current hardware configuration. Even if this is not the case, the designer still has to estimate the amount of disk space that is required to store the database, in the event that new hardware has to be procured.

The objective of this step is to estimate the amount of disk space that is required to support the database implementation on secondary storage. Estimating the disk usage is highly dependent on the target DBMS and the hardware used to support the database. In general, the estimate is based on the size of each tuple and the number of tuples in the relation (estimated maximum).
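The general rule (tuple size multiplied by the estimated maximum number of tuples) can be written as a rough calculation. The tuple sizes and row counts below are hypothetical, and the sketch deliberately ignores page overhead, free space, and indexes, all of which depend on the target DBMS.

```python
def estimate_relation_bytes(tuple_size_bytes, max_tuples):
    """Rough size of one relation: size of each tuple times estimated maximum tuples."""
    return tuple_size_bytes * max_tuples


# Hypothetical figures: relation -> (assumed tuple size in bytes, estimated maximum tuples).
relations = {
    "Staff": (120, 20_000),
    "Sales": (40, 10_000_000),
}

estimates = {
    name: estimate_relation_bytes(size, rows) for name, (size, rows) in relations.items()
}
total_bytes = sum(estimates.values())

for name, size in estimates.items():
    print(f"{name}: {size / 1_000_000:.1f} MB")
print(f"Total: {total_bytes / 1_000_000:.1f} MB")
```

A real estimate would add the DBMS-specific overheads on top of this lower bound.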


Step 5 Design User Views

To design the user views that were identified during the requirements collection and analysis stage of the relational database application lifecycle.

The first phase of the DB design methodology (as described previously) involves the production of a conceptual data model for either the single user view or a number of combined user views identified during the requirements collection and analysis stage. In a multi-user DBMS, user views play a central role in defining the structure of the DB and enforcing security.



Step 6 Design Security Measures

To design the security measures for the database as specified by the users during the requirements collection and analysis stage of the database system development lifecycle.

The security of the DB resource is extremely important. During the requirements collection and analysis stage of the DB system development lifecycle, specific security requirements should have been documented in the system requirements specification.

Most relational DBMSs provide two types of database security:

- System security covers access and use of the database at the system level, such as user name and password.
- Data security covers access and use of DB objects (such as relations/views) and the actions that users can perform on the objects.


Step 7 Introduction of Controlled Redundancy (I)

To determine whether introducing redundancy in a controlled manner by relaxing normalization rules will improve the performance of the system.

It is sometimes necessary to introduce controlled redundancy. The result of normalization is a design that is structurally consistent with minimal redundancy. However, a normalized database sometimes does not provide maximum processing efficiency. It may be necessary to accept the loss of some benefits of a fully normalized design in favor of performance. We therefore need to consider denormalization.


Step 7 Introduction of Controlled Redundancy (II)

However, denormalization may have the following side effects:

- it makes implementation more complex;
- it often sacrifices flexibility;
- it may speed up retrievals, but it slows down updates.

Denormalization refers to a refinement of the relational schema such that the degree of normalization for a modified relation is less than the degree of at least one of the original relations. The term is also used more loosely to refer to situations where two relations are combined into one new relation, which is still normalized but contains more nulls than the original relations.


Step 7 Introduction of Controlled Redundancy (III)

Consider denormalization in the following situations, specifically to speed up frequent or critical transactions:

Step 7.1 Combining 1:1 relationships
Step 7.2 Duplicating non-key attributes in 1:* relationships to reduce joins
Step 7.3 Duplicating foreign key attributes in 1:* relationships to reduce joins
Step 7.4 Duplicating attributes in *:* relationships to reduce joins
Step 7.5 Introducing repeating groups
Step 7.6 Creating extract tables
Step 7.7 Partitioning relations


An Example of Controlled Redundancy (A Sample Relation Diagram)


An Example of Controlled Redundancy (The set of tables/relations)


Step 7.1 Combining 1:1 relationships

For example, combine Client and Interview.


Step 7.2 Duplicating non-key attributes in 1:* relationships to reduce joins

For example, duplicate the lName in the relation Owner to PropertyForRent as shown below.
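A minimal sketch of this step, using Python's sqlite3 module, is given here. Only the relation names Owner and PropertyForRent and the attribute lName come from the slide. The key names (ownerNo, propertyNo) and all data values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Owner (
    ownerNo TEXT PRIMARY KEY,               -- assumed key name
    lName   TEXT NOT NULL
);
CREATE TABLE PropertyForRent (
    propertyNo TEXT PRIMARY KEY,            -- assumed key name
    ownerNo    TEXT REFERENCES Owner(ownerNo),
    lName      TEXT                         -- copy of Owner.lName (controlled redundancy)
);
INSERT INTO Owner VALUES ('O1', 'Smith'), ('O2', 'Jones');
INSERT INTO PropertyForRent VALUES
    ('P1', 'O1', 'Smith'), ('P2', 'O2', 'Jones'), ('P3', 'O1', 'Smith');
""")


def owner_names_with_join(connection):
    """Normalized access path: join PropertyForRent to Owner."""
    return connection.execute(
        "SELECT p.propertyNo, o.lName FROM PropertyForRent p "
        "JOIN Owner o ON o.ownerNo = p.ownerNo ORDER BY p.propertyNo"
    ).fetchall()


def owner_names_without_join(connection):
    """Denormalized access path: read the duplicated column directly."""
    return connection.execute(
        "SELECT propertyNo, lName FROM PropertyForRent ORDER BY propertyNo"
    ).fetchall()


def rename_owner(connection, owner_no, new_name):
    """The price of the redundancy: every copy has to be updated together."""
    with connection:  # both updates commit as one transaction
        connection.execute(
            "UPDATE Owner SET lName = ? WHERE ownerNo = ?", (new_name, owner_no)
        )
        connection.execute(
            "UPDATE PropertyForRent SET lName = ? WHERE ownerNo = ?", (new_name, owner_no)
        )


rename_owner(conn, "O1", "Brown")
print(owner_names_without_join(conn))
```

The retrieval no longer needs a join, but the update now touches two relations, which is the trade-off described for denormalization in Step 7.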


Step 7.2 Duplicating non-key attributes in 1:* relationships to reduce joins (Another example)

For example, avoid looking up a table as shown below.




Step 7.3 Duplicating FK attributes in 1:* relationship to reduce joins


Step 7.4 Duplicating attributes in *:* relationships to reduce joins

Copy street from PropertyForRent to Viewing


Step 7.5 Introducing repeating groups

Place all telephone numbers in Branch.


Step 7.6 Creating extract tables

Reports can access derived data and perform multi-relation joins on the same set of base relations. However, the data the report is based on may be relatively static or may not have to be current. It is possible to create a single, highly denormalized extract table based on the relations required by the reports, and allow users to access the extract table directly instead of the base relations.
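One way to picture an extract table is sketched below with Python's sqlite3 module. The Branch and Staff schemas, the table name StaffBranchExtract, and the helper refresh_extract are hypothetical. The point is only that reports read a pre-joined copy, which stays as it was until the next refresh.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, city TEXT);
CREATE TABLE Staff (
    staffNo  TEXT PRIMARY KEY,
    lName    TEXT,
    branchNo TEXT REFERENCES Branch(branchNo)
);
INSERT INTO Branch VALUES ('B1', 'CityA'), ('B2', 'CityB');
INSERT INTO Staff VALUES ('S1', 'Smith', 'B1'), ('S2', 'Jones', 'B2');
""")


def refresh_extract(connection):
    """Rebuild the denormalized extract table from the base relations."""
    connection.executescript("""
    DROP TABLE IF EXISTS StaffBranchExtract;
    CREATE TABLE StaffBranchExtract AS
        SELECT s.staffNo, s.lName, b.branchNo, b.city
        FROM Staff s JOIN Branch b ON b.branchNo = s.branchNo;
    """)


def city_in_extract(connection, staff_no):
    """Report-style lookup that reads only the extract table (no join)."""
    return connection.execute(
        "SELECT city FROM StaffBranchExtract WHERE staffNo = ?", (staff_no,)
    ).fetchone()[0]


refresh_extract(conn)

# A later change to a base relation is not visible in the extract until it is refreshed.
conn.execute("UPDATE Branch SET city = 'CityC' WHERE branchNo = 'B2'")
conn.commit()
stale = city_in_extract(conn, "S2")

refresh_extract(conn)
fresh = city_in_extract(conn, "S2")
print(stale, fresh)
```

How often the extract is refreshed is a design choice that depends on how current the reports have to be.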


Step 7.7 Partitioning relations

Rather than combining relations together, an alternative approach is to decompose them into a number of smaller and more manageable partitions. There are two main types of partitioning: horizontal and vertical.
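The difference between the two types can be shown on a small in-memory relation. The Staff tuples and attribute names below are hypothetical. Horizontal partitioning distributes whole tuples across partitions, while vertical partitioning distributes attributes and repeats the key in every partition so that the tuples can be rejoined.

```python
from collections import defaultdict

# Hypothetical Staff tuples; attribute names are illustrative.
staff = [
    {"staffNo": "S1", "lName": "Smith", "salary": 30000, "branchNo": "B1"},
    {"staffNo": "S2", "lName": "Jones", "salary": 24000, "branchNo": "B2"},
    {"staffNo": "S3", "lName": "Brown", "salary": 18000, "branchNo": "B1"},
]


def horizontal_partition(rows, attribute):
    """Split whole tuples into partitions by the value of one attribute."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[attribute]].append(row)
    return dict(partitions)


def vertical_partition(rows, key, attributes):
    """Keep the key plus a subset of the attributes for every tuple."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


by_branch = horizontal_partition(staff, "branchNo")
public_part = vertical_partition(staff, "staffNo", ["lName", "branchNo"])
payroll_part = vertical_partition(staff, "staffNo", ["salary"])

print(sorted(by_branch))   # partition names for the horizontal split
print(payroll_part[0])     # key plus salary only
```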


Advantages and disadvantages of Denormalization


Step 8 Monitor and tune the operational system

To monitor the operational system and improve its performance, either to correct inappropriate design decisions or to reflect changing requirements.

A number of factors may be used to measure efficiency:

- Transaction throughput: the number of transactions processed in a given time interval.
- Response time: the elapsed time for the completion of a single transaction.
- Disk storage: the amount of disk space required to store the database files.

No one factor is always correct. We have to trade each off against another to achieve a reasonable balance. We also need to understand how the various hardware components interact and affect database performance.
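The first two measures can be computed from simple monitoring data, as in the sketch below. The function names and the sample measurements are assumptions. In practice the figures would come from the monitoring utilities of the DBMS.

```python
def throughput(transactions_completed, interval_seconds):
    """Transactions processed per second over a measurement interval."""
    return transactions_completed / interval_seconds


def mean_response_time(elapsed_seconds):
    """Average elapsed time for the completion of a single transaction."""
    return sum(elapsed_seconds) / len(elapsed_seconds)


# Hypothetical measurements: 600 transactions observed in one minute,
# and the elapsed times (in seconds) of three sampled transactions.
tps = throughput(600, 60)
avg_response = mean_response_time([0.25, 0.5, 0.75])

print(f"throughput: {tps} transactions/second")
print(f"mean response time: {avg_response} seconds")
```

Tracking both figures together matters because a change that raises throughput can still lengthen the response time of individual transactions.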


Why monitor and tune the operational system

The initial physical database design should not be regarded as static, but should be considered as an estimate of how the operational system might perform. Once the initial design has been implemented, it will be necessary to monitor the system and tune it as a result of observed performance and changing requirements. Many DBMSs provide the Database Administrator (DBA) with utilities to monitor the operation of the system and tune it.


Benefits of Tuning DB

There are many benefits to be gained from tuning the database:

- Tuning can avoid the procurement of additional hardware.
- It may be possible to downsize the hardware configuration, resulting in less, and cheaper, hardware and less expensive maintenance.
- A well-tuned system produces faster response times and better throughput, which in turn makes the users, and hence the organization, more productive.
- Improved response times can improve staff morale.
- Improved response times can increase customer satisfaction.


Main Memory

Main memory accesses are significantly faster than secondary storage accesses, sometimes 10,000 to 100,000 times faster. In general, the more main memory available to the DBMS and the database applications, the faster the applications will run.

However, it is sensible always to have a minimum of 5% of main memory available. Equally, it is advisable not to have more than 10% available, otherwise main memory is not being used optimally.

When there is insufficient memory to accommodate all processes, the operating system transfers pages of processes to disk to free up memory (paging). When one of these pages is next required, the operating system has to transfer it back from disk. Sometimes it is necessary to swap entire processes from memory to disk, and back again, to free up memory (swapping). Problems occur with main memory when paging or swapping becomes excessive.


CPU

The CPU controls the tasks of the other system resources and executes user processes. It is the most costly resource in the system, so it needs to be correctly utilized.

The main objective for this component is to prevent CPU contention, in which processes are waiting for the CPU.

CPU bottlenecks occur when either the OS or user processes make too many demands on the CPU. This is often a result of excessive paging.

It is necessary to understand the typical workload through a 24-hour period and ensure that sufficient resources are available for not only the normal workload but also the peak workload. One option is to ensure that during peak load no unnecessary jobs are being run and that such jobs are instead run in off-hours. Another option may be to consider multiple CPUs, which allows the processing to be distributed and operations to be performed in parallel.

CPU MIPS (millions of instructions per second) can be used as a guide in comparing platforms and determining their ability to meet the enterprise’s throughput requirements.


Disk I/O (I)

With any large DBMS, there is a significant amount of disk I/O involved in storing and retrieving data. Disks usually have a recommended I/O rate and, when this rate is exceeded, I/O bottlenecks occur. While CPU clock speeds have increased dramatically in recent years, I/O speeds have not increased proportionally.

The way in which data is organized on disk can have a major impact on the overall disk performance. One problem that can arise is disk contention. This occurs when multiple processes try to access the same disk simultaneously. Most disks have limits on both the number of accesses and the amount of data they can transfer per second, and when these limits are reached, processes may have to wait to access the disk.


Disk I/O (II)

To avoid disk contention, it is recommended that storage should be evenly distributed across available drives to reduce the likelihood of performance problems occurring. The basic principles of distributing the data across disks are:

- The operating system files should be separated from the database files.
- The main database files should be separated from the index files.
- The recovery log file should be separated from the rest of the database.

If a disk still appears to be overloaded, one or more of its heavily accessed files can be moved to a less active disk. Load balancing can be achieved by applying this principle to each of the disks until they all have approximately the same amount of I/O. Another option is RAID (Redundant Array of Independent Disks).

Once again, the physical database designer needs to understand how the DBMS operates, the characteristics of the HW, and the access patterns of the users.
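The load-balancing principle of moving heavily accessed files to a less active disk can be sketched as a greedy loop. This is only one possible interpretation under stated assumptions: the disk names, file names, and I/O rates are hypothetical, and the sketch ignores the separation rules for operating system, index, and log files.

```python
def disk_load(disks):
    """Total I/O rate per disk, given disk -> {file name: I/O rate}."""
    return {disk: sum(files.values()) for disk, files in disks.items()}


def rebalance(disks, max_moves=100):
    """Greedy sketch: move a file from the busiest disk to the least busy one
    for as long as that narrows the gap between the two."""
    for _ in range(max_moves):
        load = disk_load(disks)
        busiest = max(load, key=load.get)
        idlest = min(load, key=load.get)
        gap = load[busiest] - load[idlest]
        # Only a file whose rate is smaller than the gap improves the balance.
        candidates = {
            name: rate for name, rate in disks[busiest].items() if 0 < rate < gap
        }
        if not candidates:
            break
        # Move the most heavily accessed file that still narrows the gap.
        name = max(candidates, key=candidates.get)
        disks[idlest][name] = disks[busiest].pop(name)
    return disks


# Hypothetical I/O rates (accesses per second) for four files on two disks.
disks = {
    "disk1": {"file_a": 40, "file_b": 30, "file_c": 20, "file_d": 10},
    "disk2": {},
}
rebalance(disks)
final_load = disk_load(disks)
print(final_load)
```

A real redistribution would also have to respect disk capacity and the separation principles listed above, so the result of such a calculation is a starting point for the DBA rather than a final layout.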


Network

When the amount of traffic on the network is too great, or when the number of network collisions is large, network bottlenecks occur.

Each of the above resources may affect other system resources. Equally well, an improvement in one resource may effect an improvement in other system resources:

- Procuring more main memory should result in less paging, which should help avoid CPU bottlenecks.
- More effective use of main memory may result in less disk I/O.


Exercise 1

Complete the following E-R diagram.


Exercise 2

Complete the following E-R diagram.


Exercise 3 (I)

Given the ER diagram in the previous question, estimate the number of rows output by each query below. For the average, assume that all tables have a uniform distribution.

- There are 100 tuples in Branch.
- There are 2 tuples in BranchType.
- There are 20000 tuples in Staff.
- There are 10000000 tuples in Sales.
- There are 5000 tuples in Product.
- There are 50 tuples in ProductType.


Exercise 3 (II)

What are the minimum and maximum numbers of tuples when Product and ProductType are inner-joined as follows?

    SELECT *
    FROM Product P, ProductType PT
    WHERE P.PTypeID = PT.PTypeID;

What are the minimum and maximum numbers of tuples when Branch and Sales are left-outer-joined as follows?

    SELECT *
    FROM Branch LEFT JOIN Sales USING (BranchID);

Estimate the minimum, maximum and average numbers of rows output by the following SQL.

    SELECT *
    FROM Sales S, Product P, Branch B, ProductType PT
    WHERE S.BranchID = B.BranchID AND
          S.ProductID = P.ProductID AND
          P.PTypeID = PT.PTypeID AND
          B.BName = 'Rangsit' AND
          PT.PTDesc = 'Food';