dd lect 3b


Upload: hemant-sharma

Post on 07-Apr-2018


Page 1: DD Lect 3b

8/4/2019 DD Lect 3b

http://slidepdf.com/reader/full/dd-lect-3b 1/46


DD Lect 3b

Federated Databases


Definition

o A federated database system (FDBS) is a collection of cooperating but autonomous component database systems (DBSs).
o The software that provides controlled and coordinated manipulation of the component DBSs is called a federated database management system (FDBMS).
o A component database is the database of a component DBS.
o A component DBS can participate in more than one federation.
o The DBMS of a component DBS can be a centralized DBMS, a distributed DBMS, or another FDBMS.


FDBS

o The term FDBS was coined by Hammer and McLeod [1979] and Heimbigner and McLeod [1985].
o Political examples: the UN and the erstwhile Soviet Union.

FDBMS
Comp. DBMS 1    Comp. DBMS 2    ...    Comp. DBMS n
Comp. DBS 1     Comp. DBS 2     ...    Comp. DBS n


Federated DBMS

o The component DBMSs can differ in such aspects as data models, query languages, and transaction management capabilities.
o A component DBS can continue its local operations and at the same time participate in a federation.
o The integration of component DBSs may be managed either by the users of the federation or by the administrator of the FDBS together with the administrators of the component DBSs.
o The amount of integration depends on the needs of the federation users and on the willingness of the administrators of the component DBSs to participate in the federation and share their databases.


Characteristics

o Systems consisting of multiple DBSs are characterized along three orthogonal dimensions: distribution, heterogeneity, and autonomy.
o An alternative characterization uses the dimensions of networking environment, update-related functions of the participating DBSs, and types of heterogeneity.


Distribution

o Data may be distributed among multiple databases.
o The data may be distributed in different ways, including horizontal and vertical partitions.
o Multiple copies of some or all of the data may be maintained.
o These copies need not be identically structured.
o Increased availability and reliability, as well as improved access times, are well-known benefits of distribution.
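The difference between horizontal and vertical partitions can be sketched in a few lines of Python. This is only an illustration: the EMPLOYEE-style rows, the column names, and both helper functions are assumptions made for the example, not part of the lecture.

```python
# Hypothetical EMPLOYEE rows; the table and column names are illustrative only.
employees = [
    {"id": 1, "name": "Asha", "dept": "Sales", "salary": 50000},
    {"id": 2, "name": "Ravi", "dept": "HR", "salary": 45000},
    {"id": 3, "name": "Meena", "dept": "Sales", "salary": 52000},
]

def horizontal_partition(rows, predicate):
    """Split rows into (matching, non-matching) fragments; each keeps all columns."""
    inside = [r for r in rows if predicate(r)]
    outside = [r for r in rows if not predicate(r)]
    return inside, outside

def vertical_partition(rows, key, columns):
    """Keep only the key plus the chosen columns; the key lets fragments be rejoined."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]

sales, others = horizontal_partition(employees, lambda r: r["dept"] == "Sales")
public = vertical_partition(employees, "id", ["name", "dept"])
private = vertical_partition(employees, "id", ["salary"])

# Rejoin the vertical fragments on the shared key to recover the original rows.
by_id = {r["id"]: dict(r) for r in public}
for r in private:
    by_id[r["id"]].update(r)
rejoined = list(by_id.values())
```

A horizontal fragment holds whole rows that satisfy a condition, while a vertical fragment holds a subset of columns and repeats the key so that the fragments can be joined again.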


Heterogeneity

Database systems
    Differences in DBMS
        Data models (structures, constraints, query languages)
        System-level support (concurrency control, commit, recovery)
    Semantic heterogeneity
Operating systems
    File systems
    Naming, file types, operations
    Transaction support
    Interprocess communication (IPC)
Hardware/System
    Instruction set
    Data formats and representation
    Configuration
Communication


Differences in DBMS

o An enterprise may have multiple DBMSs. Different organizations within the enterprise may have different requirements and may select different DBMSs.
o DBMSs purchased over a period of time may differ because of changes in technology.
o Heterogeneity due to differences in DBMSs results from differences in data models and differences at the system level.


Differences in structure

o Different data models provide different structural primitives. For example, information modeled as a table in the relational model may be modeled as a record type in the CODASYL model.
o If two representations have the same information content, it is easier to deal with the differences in their structures. If the information content is not the same, it may be very difficult to deal with the difference.


Differences in constraints

o Two data models may support different constraints.

o For example, a set type in a CODASYL schema may be partially modeled as a referential integrity constraint in a relational schema.
o CODASYL, however, supports insertion and retention constraints that are not captured by the referential integrity constraint alone.
o Triggers (or some other mechanism) must be used in relational systems to capture such semantics.
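As a rough sketch of how a trigger can capture a rule that a foreign key alone does not, the following uses Python's built-in sqlite3 module. The tables and the rule itself (a member row may change owner but may never be disconnected) are assumptions chosen to resemble a CODASYL-style retention constraint; they are not taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    PRAGMA foreign_keys = ON;
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY);
    CREATE TABLE emp (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES dept(dept_id)  -- plain referential integrity
    );
    -- Assumed rule: once an emp row is connected to a dept, it may move to
    -- another dept but may never be disconnected (dept_id set back to NULL).
    CREATE TRIGGER emp_mandatory_retention
    BEFORE UPDATE OF dept_id ON emp
    WHEN OLD.dept_id IS NOT NULL AND NEW.dept_id IS NULL
    BEGIN
        SELECT RAISE(ABORT, 'member cannot be disconnected from its owner');
    END;
    INSERT INTO dept VALUES (10), (20);
    INSERT INTO emp VALUES (1, 10);
""")

# Moving the member to another owner is allowed.
conn.execute("UPDATE emp SET dept_id = 20 WHERE emp_id = 1")

# Disconnecting the member is rejected by the trigger, not by the foreign key.
try:
    conn.execute("UPDATE emp SET dept_id = NULL WHERE emp_id = 1")
    disconnected = True
except sqlite3.DatabaseError:
    disconnected = False
```

The foreign key on its own would accept the NULL update, so the extra semantics live entirely in the trigger.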


Differences in query languages

o Different languages are used to manipulate data represented in different data models.
o Even when two DBMSs support the same data model, differences in their query languages (e.g., QUEL and SQL), or different versions of SQL supported by two relational DBMSs, can contribute to heterogeneity.


Differences in System Aspects

o Differences in transaction management (concurrency control, commit protocols, and recovery)

o Hardware and software requirements

o Communication capabilities


Semantic Heterogeneity

o Semantic heterogeneity occurs when there is a disagreement about the meaning, interpretation, or intended use of the same or related data.
o MEAL_COST of relation RESTAURANT: average cost of a meal per person, without service cost and tax.
o MEAL_COST of relation BOARDING: average cost of a meal per person, with service cost and tax.
o The two attributes share a name but are semantically heterogeneous because of the differences in their definitions.
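The MEAL_COST example can be made concrete with a small Python sketch. The service and tax rates, the sample amounts, and the `MealCost` type are all assumptions introduced for illustration; the point is only that the two values must be brought to one definition before they are compared.

```python
from dataclasses import dataclass

# Assumed rates, for illustration only; real values would come from the
# administrators of the component databases.
SERVICE_RATE = 0.10
TAX_RATE = 0.05

@dataclass(frozen=True)
class MealCost:
    amount: float
    includes_service_and_tax: bool  # the semantics the column name alone hides

def to_inclusive(cost: MealCost) -> MealCost:
    """Normalize a cost to the 'with service cost and tax' definition."""
    if cost.includes_service_and_tax:
        return cost
    return MealCost(cost.amount * (1 + SERVICE_RATE + TAX_RATE), True)

restaurant = MealCost(100.0, includes_service_and_tax=False)  # RESTAURANT.MEAL_COST
boarding = MealCost(110.0, includes_service_and_tax=True)     # BOARDING.MEAL_COST

# Comparing the raw numbers is misleading; comparing normalized values is not.
naive_cheaper = restaurant.amount < boarding.amount
fair_cheaper = to_inclusive(restaurant).amount < boarding.amount
```

With these assumed figures the raw comparison says the restaurant is cheaper, while the normalized comparison says it is not.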


Autonomy

o The organizational entities that manage different DBSs are often autonomous; the DBSs are under separate and independent control.
o Those who control a database are often willing to let others share the data only if they retain control. Thus component autonomy is important and needs to be addressed when a DBS participates in an FDBS.
o Three types of component autonomy: design, communication, and execution.


Design Autonomy

o Refers to the ability of a component DBS to choose its own design with respect to any matter, including:
    o The data being managed
    o The representation and naming of data elements
    o The conceptualization and semantic interpretation of the data
    o Constraints (semantic integrity constraints, serializability constraints)
    o Functions/operations supported by the system
    o The implementation (record and file structures)


Communication Autonomy

o The ability of a component DBMS to decide whether to communicate with other component DBMSs.
o A component DBMS with communication autonomy is able to decide when and how it responds to a request from another component DBMS.


Execution Autonomy

o Refers to the ability of a component DBMS to execute local operations (commands or transactions submitted by a local user of the component DBMS) without interference from external operations (commands or transactions submitted by other component DBMSs or FDBMSs), and to decide the order in which to execute external operations.
o Thus an external FDBMS cannot enforce an order of execution of commands on a component DBMS with execution autonomy.
o Further, a component DBMS can abort any operation that does not meet its local constraints, and its local operations are logically unaffected by its participation in a federation.


Association Autonomy

o Implies that a component DBS has the ability to decide whether and how much to share its functionality (i.e., the operations it supports) and resources (i.e., the data it manages) with others.
o This includes the ability to associate or disassociate itself from the federation and to participate in more than one federation.


Taxonomy of MDBSs

Multidatabase systems
    Nonfederated systems (e.g., UNIBASE)
    Federated systems
        Loosely coupled (e.g., MRDSM)
        Tightly coupled
            Single federation (e.g., DDTS)
            Multiple federations (e.g., MERMAID)


Non Federated Database System

o A nonfederated database system is an integration of component DBMSs that are not autonomous.
o It has only one level of management, and all operations are performed uniformly.
o It does not distinguish between local and nonlocal users.
o A particular type of nonfederated database system, in which all databases are fully integrated to provide a single global schema, can be called a unified MDBS.
o It logically appears like a distributed DBS.


Federated Database System

o An FDBS consists of component DBSs that are autonomous yet participate in a federation to allow partial and controlled sharing of data.
o Association autonomy implies that the component DBSs have control over the data they manage.
o They cooperate to allow different degrees of integration.
o There is no centralized control in a federated architecture.
o An FDBS represents a compromise between no integration and total integration.
o The approach is suitable for migrating a set of autonomous, stand-alone DBSs to a system that allows partial and controlled sharing of data without affecting existing applications.


Loosely coupled / Tightly coupled FDBSs

o An FDBS is loosely coupled if it is the users' responsibility to create and maintain the federation and there is no control enforced by the federation system and its administrators.
o A loosely coupled FDBS supports multiple federated schemas.
o An FDBS is tightly coupled if it is the responsibility of the federation system and its administrators to create and maintain the federation and they actively control access to the component DBSs.
o A tightly coupled FDBS can have one or more federated schemas.


Processor Types in Reference Architecture

Transforming Processor

Transforming processors translate commands from one language, called the source language, to another language, called the target language, or transform data from one format (the source format) to another format (the target format). Transforming processors provide a type of data independence called data model transparency, in which the data structures and commands used by one processor are hidden from other processors.

Data model transparency hides the differences in query languages and data formats.


o For example, the data structures used by one processor can be modified to improve overall efficiency without requiring changes to other processors.
o A command transformer that translates SQL commands into CODASYL data manipulation language commands [Onuegbe et al. 1983; Zaniolo 1979], allowing a CODASYL DBS to be processed using SQL commands.
o A program generator that translates SQL commands into equivalent COBOL programs, allowing a file system to be processed using SQL commands.


Command-Transforming Processors

o For some command-transforming processors, there may exist companion data-transforming processors that convert data produced by the transformed commands into data compatible with the commands in the source format.
o For example, a data-transforming processor that is the companion to the above SQL-to-CODASYL command-transforming processor is a table builder that accepts individual database records produced by the CODASYL DBMS and builds complete tables for display to the SQL user.
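The table builder can be sketched as follows. This is a minimal illustration, assuming that the record-at-a-time DBMS delivers each owner record followed by its member records; the record types, field names, and the `build_table` helper are invented for the example.

```python
# Hypothetical record stream from a record-at-a-time DBMS: each owner record
# is followed by its member records. Record types and field names are invented.
records = [
    ("DEPT", {"dname": "Sales"}),
    ("EMP", {"ename": "Asha"}),
    ("EMP", {"ename": "Meena"}),
    ("DEPT", {"dname": "HR"}),
    ("EMP", {"ename": "Ravi"}),
]

def build_table(stream, owner_type, member_type):
    """Flatten an owner/member record stream into relational rows."""
    rows, current_owner = [], None
    for rec_type, fields in stream:
        if rec_type == owner_type:
            current_owner = fields                      # remember the current owner
        elif rec_type == member_type and current_owner is not None:
            rows.append({**current_owner, **fields})    # one row per member
    return rows

table = build_table(records, "DEPT", "EMP")
```

Each output row repeats the owner's fields next to one member's fields, which is the shape an SQL user expects from a join.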


Schema and Command Translation

 

To perform these transformations, a transforming processor needs mappings between the objects of each schema.

The task of schema translation involves transforming a schema (schema A) describing data in one data model into an equivalent schema (schema B) describing the same data in a different data model.

This task also generates the mappings that correlate the schema objects in one schema (schema B) to the schema objects in another schema (schema A).

The task of command transformation entails using these mappings to translate commands involving the schema objects of one schema (schema B) into commands involving the schema objects of the other schema (schema A).
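Command transformation driven by such mappings might look like the sketch below. The mapping table, the object names, and the dictionary form of a command are all hypothetical; a real transforming processor would work on a full query language rather than on this toy structure.

```python
# Mappings produced by schema translation: schema B object -> schema A object.
# All object names here are hypothetical.
B_TO_A = {
    "EMPLOYEE": "EMP_REC",
    "EMPLOYEE.name": "EMP_REC.ENAME",
    "EMPLOYEE.salary": "EMP_REC.SAL",
}

def transform_command(command: dict, mappings: dict) -> dict:
    """Rewrite a command on schema B into the same command on schema A."""
    source = command["from"]
    return {
        "op": command["op"],
        "from": mappings[source],
        # Map each qualified field name, then keep only the part after the dot.
        "fields": [mappings[f"{source}.{f}"].split(".", 1)[1]
                   for f in command["fields"]],
    }

command_on_b = {"op": "retrieve", "from": "EMPLOYEE", "fields": ["name", "salary"]}
command_on_a = transform_command(command_on_b, B_TO_A)
```

The command logic stays the same; only the schema object names change, which is why the mappings are the essential input.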


Filtering Processor

Filtering processors constrain the commands and associated data that can be passed to another processor. Associated with each filtering processor are mappings that describe the constraints on commands and data. These constraints may either be embedded into the code of the filtering processor or be specified in a separate data structure.

Examples of filtering processors include the following:
o Syntactic constraint checker, which checks commands to verify that they are syntactically correct.
o Semantic integrity constraint checker, which performs one or more of the following functions: (a) checks commands to verify that they will not violate semantic integrity constraints, (b) modifies commands in such a manner that when the commands are interpreted, semantic integrity constraints will automatically be enforced, or (c) verifies that data produced by another processor does not violate any semantic integrity constraint.
o Access controller, which verifies that the user is permitted to perform the command on the indicated data or verifies that the user is permitted to use data produced by another processor.
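Two of these filters, an access controller and a semantic integrity checker of type (a), can be sketched in Python. The users, operations, and the salary rule are assumptions made for the example, and the constraints are held in separate data structures rather than embedded in the code.

```python
# Constraints kept in separate data structures rather than in code.
# Users, operations, and the salary rule are assumptions for illustration.
PERMISSIONS = {"clerk": {"retrieve"}, "manager": {"retrieve", "update"}}
INTEGRITY_RULES = {"salary": lambda value: value >= 0}

def access_controller(user, command):
    """Pass the command through only if the user may perform its operation."""
    if command["op"] not in PERMISSIONS.get(user, set()):
        raise PermissionError(f"{user} may not {command['op']}")
    return command

def integrity_checker(command):
    """Reject updates whose new values would violate a semantic integrity rule."""
    for field, value in command.get("set", {}).items():
        rule = INTEGRITY_RULES.get(field)
        if rule is not None and not rule(value):
            raise ValueError(f"{field}={value} violates an integrity constraint")
    return command

def filter_command(user, command):
    """Chain the filters; only commands that pass both reach the next processor."""
    return integrity_checker(access_controller(user, command))

accepted = filter_command("manager", {"op": "update", "set": {"salary": 100}})
```

A rejected command never reaches the next processor, which is what makes the filter a constraint on what can be passed along.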


Constructing Processor

Constructing processors partition and/or replicate an operation submitted by a single processor into operations that are accepted by two or more other processors. Constructing processors also merge data produced by several processors into a single data set for consumption by another single processor.

o They can support location, distribution, and replication transparencies because a processor submitting a command does not need to know the location, distribution, and number of processors participating in processing that command.
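A constructing processor that replicates one command to several processors and merges their results can be sketched as below. The two "sites", their rows, and the predicate-style command are hypothetical stand-ins for component processors.

```python
# Two hypothetical component processors, each holding one horizontal fragment.
def site_a(predicate):
    rows = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Delhi"}]
    return [r for r in rows if predicate(r)]

def site_b(predicate):
    rows = [{"id": 3, "city": "Pune"}]
    return [r for r in rows if predicate(r)]

def constructing_processor(predicate, processors):
    """Replicate one command to every processor and merge the partial results."""
    merged = []
    for run in processors:
        merged.extend(run(predicate))   # the submitter never sees which site answered
    return sorted(merged, key=lambda r: r["id"])

result = constructing_processor(lambda r: r["city"] == "Pune", [site_a, site_b])
```

The caller submits a single command and receives a single data set, without knowing how many processors took part or where the rows came from.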


Tasks that can be handled by constructing processors include the following:

o Schema integration: integrating multiple schemas into a single schema
o Negotiation: determining what protocol should be used among the owners of the various schemas to be integrated in determining the contents of an integrated schema
o Query (command) decomposition and optimization: decomposing and optimizing a query (command) expressed on an integrated schema
o Global transaction management: performing concurrency and atomicity control


Accessing Processor

o An accessing processor accepts commands and produces data by executing the commands against a database. It may accept commands from several processors and interleave the processing of those commands.

Examples of accessing processors include the following:
o A file management system that executes access procedures against stored files
o A special application program that accepts commands and generates data to be returned to the processor generating the commands
o A data manager of a DBMS containing data access methods
o A dictionary manager that manages access to dictionary data


o Issues that are addressed by accessing processors include local concurrency control, commitment, backup, and recovery.
o These problems and their solutions are extensively discussed in the literature for centralized and distributed DBMSs.


A Five-Level Schema Architecture for Federated Databases

The three-level schema architecture is adequate for describing the architecture of a centralized DBMS. It is, however, inadequate for describing the architecture of an FDBS.

The three-level schema must be extended to support the three dimensions of a federated database system: distribution, heterogeneity, and autonomy.

Examples of extended schema architectures include a four-level schema architecture in Mermaid [Templeton et al. 1987], five-level schema architectures in DDTS [Devor et al. 1982b] and SIRIUS-DELTA [Litwin et al. 1982], and others [Blakey 1987; Ram and Chastain 1989].


The five-level schema architecture of an FDBS includes the following:

Local Schema: A local schema is the conceptual schema of a component DBS. A local schema is expressed in the native data model of the component DBMS, and hence different local schemas may be expressed in different data models.

Component Schema: A component schema is derived by translating a local schema into a data model called the canonical or common data model (CDM) of the FDBS.

Two reasons for defining component schemas in a CDM are (1) they describe the divergent local schemas using a single representation and (2) semantics that are missing in a local schema can be added to its component schema.

Thus they facilitate negotiation and integration tasks performed when developing a tightly coupled FDBS. Similarly, they facilitate negotiation and specification of views and multidatabase queries in a loosely coupled FDBS.


o The process of schema translation from a local schema to a component schema generates the mappings between the component schema objects and the local schema objects.
o Transforming processors use these mappings to transform commands on a component schema into commands on the corresponding local schema.


Export Schema

o Not all data of a component DBS may be available to the federation and its users. An export schema represents a subset of a component schema that is available to the FDBS.
o It may include access control information regarding its use by specific federation users.
o The purpose of defining export schemas is to facilitate control and management of association autonomy.
o A filtering processor can be used to provide the access control specified in an export schema by limiting the set of allowable operations that can be submitted on the corresponding component schema. Such filtering processors and the export schemas support the autonomy feature of an FDBS.
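An export schema and the filtering processor that enforces it can be sketched as follows. The relation, attributes, federation user name, and the `export_filter` function are assumptions made for the example.

```python
# Hypothetical component schema and export schema; all names are illustrative.
COMPONENT_SCHEMA = {"EMP": {"id", "name", "dept", "salary"}}
EXPORT_SCHEMA = {
    "EMP": {
        "attributes": {"id", "name", "dept"},            # salary is not exported
        "allowed_ops": {"hr_federation": {"retrieve"}},  # per federation user
    }
}

def export_filter(federation_user, op, relation, attributes):
    """Return the command if the export schema allows it, otherwise raise."""
    entry = EXPORT_SCHEMA.get(relation)
    if entry is None:
        raise PermissionError(f"{relation} is not exported")
    if op not in entry["allowed_ops"].get(federation_user, set()):
        raise PermissionError(f"{op} on {relation} is not allowed")
    hidden = set(attributes) - entry["attributes"]
    if hidden:
        raise PermissionError(f"not exported: {sorted(hidden)}")
    return {"op": op, "relation": relation, "attributes": list(attributes)}

allowed = export_filter("hr_federation", "retrieve", "EMP", ["id", "name"])
```

The component DBS keeps control because anything outside the export schema, whether an attribute or an operation, is refused before it reaches the component schema.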


Federated Schema

o A federated schema is an integration of multiple export schemas.
o A federated schema also includes the information on data distribution that is generated when integrating export schemas.
o Some systems use a separate schema, called a distribution schema or an allocation schema, to contain this information.
o A constructing processor transforms commands on the federated schema into commands on one or more export schemas.
o Constructing processors and the federated schemas support the distribution feature of an FDBS.
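How distribution information lets a constructing processor turn one federated command into commands on export schemas can be sketched briefly. The allocation table and every relation name in it are hypothetical.

```python
# Hypothetical distribution (allocation) information kept with a federated
# schema: each federated relation lists the export schema objects that supply it.
ALLOCATION = {
    "CUSTOMER": ["export_north.CUST", "export_south.CLIENT"],
    "PRODUCT": ["export_hq.ITEM"],
}

def decompose(command):
    """Turn one command on the federated schema into one per export schema."""
    targets = ALLOCATION[command["relation"]]
    return [{"op": command["op"], "relation": target} for target in targets]

subcommands = decompose({"op": "retrieve", "relation": "CUSTOMER"})
```

Whether this table lives inside the federated schema or in a separate distribution schema does not change the decomposition step itself.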


o There may be multiple federated schemas in an FDBS, one for each class of federation users.
o A class of federation users is a group of users and/or applications performing a related set of activities. For example, in a corporate environment, all managers may be one class of federation users, and all employees and applications in the accounting department may be another class of federation users.
o A concept similar to that of federated schema is represented by the terms import schema [Heimbigner and McLeod 1985], global schema [Landers and Rosenberg 1982], global conceptual schema [Litwin et al. 1982], unified schema, and enterprise schema, although the terms other than import schema are usually used when there is only one such schema in the system.


External Schema

An external schema defines a schema for a user and/or application or a class of users/applications.

Reasons for the use of external schemas are as follows:
o Customization: A federated schema can be quite large, complex, and difficult to change. An external schema can be used to specify a subset of information in a federated schema that is relevant to the users of the external schema. External schemas can be changed more readily to meet changing users' needs. The data model for an external schema may be different from that of the federated schema.
o Additional integrity constraints: Additional integrity constraints can also be specified in the external schema.
o Access control: Export schemas provide access control with respect to the data managed by the component databases. Similarly, external schemas provide access control with respect to the data managed by the FDBS.
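A relational view is a familiar analogue of an external schema, and the customization and access control points can be illustrated with one. The sketch below uses Python's built-in sqlite3 module; the single table standing in for a federated schema, and all names and values in it, are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for one relation of a federated schema (names are invented).
    CREATE TABLE fed_employee (id INTEGER, name TEXT, dept TEXT, salary REAL);
    INSERT INTO fed_employee VALUES
        (1, 'Asha', 'Accounts', 50000),
        (2, 'Ravi', 'Sales', 45000);
    -- External schema for accounting users: only their department, no salary.
    CREATE VIEW accounts_staff AS
        SELECT id, name FROM fed_employee WHERE dept = 'Accounts';
""")

rows = conn.execute("SELECT * FROM accounts_staff").fetchall()
```

Users of the view see a smaller, simpler schema, and the definition can be changed without touching the underlying relation.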


A filtering processor analyzes the commands on an external schema to ensure their conformance with the access control and integrity constraints of the federated schema.

If an external schema is in a different data model from that of the federated schema, a transforming processor is also needed to transform commands on the external schema into commands on the federated schema.

Most existing prototype FDBSs support only one data model for all the external schemas and one query language interface.

Exceptions are a version of Mermaid that supported two query language interfaces, SQL and ARIEL, and a version of DDTS that supported SQL and GORDAS (a query language for an extended ER model).


An FDBS may be required to support local and external schemas expressed in different data models. To facilitate their design, integration, and maintenance, however, all component, export, and federated schemas should be in the same data model.

This data model is called the canonical or common data model (CDM). A language associated with the CDM is called an internal command language. All commands on federated, export, and component schemas are expressed using this internal command language.


The five-level schema architecture presented above has several possible redundancies.

Redundancy between external and federated schemas:
o External schemas can be considered redundant with federated schemas since a federated schema could be generated for every different federation user.
o This is the case in the schema architecture of Heimbigner and McLeod [1985] (they use the term import schema rather than federated schema).
o In loosely coupled FDBSs, a user defines the federated schema by integrating export schemas. Thus there is usually no need for an additional level.
o In tightly coupled FDBSs, however, it may be desirable to generate a few federated schemas for widely different classes of users and to customize these further by defining external schemas.
o Such external schemas can also provide additional access control.


Redundancy between an external schema of a component DBS and an export schema:

If a component DBMS supports proper access control security features for its external schemas and if translating a local schema into a component schema is not required (e.g., the data model of the component DBMS is the same as the CDM of the FDBS), then the external schemas of a component DBS may be used as export schemas in the five-level schema architecture (external schemas of component DBSs are not shown in the five-level schema architecture).


Redundancy between component schemas and local schemas:

When the component DBSs use the CDM of the FDBS and have the same functionality, it is unnecessary to define component schemas.
