dd lect 3b
TRANSCRIPT
8/4/2019 DD Lect 3b
http://slidepdf.com/reader/full/dd-lect-3b 1/46
DD Lect 3b
Federated Databases
Definition
o A federated database system (FDBS) is a collection of
cooperating but autonomous component database
systems (DBSs).
o The software that provides controlled and coordinated
manipulation of component DBSs is called a federated
database management system (FDBMS).
o A component database is the database of a
component DBS.
o A component DBS can participate in more than one
federation.
o The DBMS of a component DBS can be a centralized or
distributed DBMS, or another FDBMS.
FDBS
o The term FDBS was coined by Hammer and
McLeod [1979] and Heimbigner and McLeod [1985].
o Political examples: the UN and the erstwhile Soviet Union.
[Figure: the FDBMS sits above Comp. DBMS 1, Comp. DBMS 2, ..., Comp. DBMS n, which sit above Comp. DBS 1, Comp. DBS 2, ..., Comp. DBS n]
Federated DBMS
o The component DBMSs can differ in such aspects as data
models, query languages, and transaction management
capabilities.
o A component DBS can continue its local operations
and at the same time participate in a federation.
o The integration of component DBSs may be managed
either by the users of the federation or by the
administrator of the FDBS together with the
administrators of the component DBSs.
o The amount of integration depends on the needs of
federation users and the desires of the administrators of the
component DBSs to participate in the federation and
share their databases.
Characteristics
o An FDBS is characterized along three orthogonal dimensions:
distribution, heterogeneity, and autonomy.
o Further dimensions sometimes added:
the networking environment, the update-related functions of
participating DBSs, and the types of heterogeneity.
Distribution
o Data may be distributed among multiple databases.
o Data may be distributed among multiple databases in
different ways, including horizontal and vertical
partitions.
o Multiple copies of some or all of the data may be
maintained.
o These copies need not be identically structured.
o Increased availability and reliability as well as improved
access time are well-known benefits.
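The two partitioning styles named above can be illustrated with a minimal Python sketch. The EMPLOYEES rows, the column names, and all function names are hypothetical and chosen only for illustration: a horizontal partition splits rows by a predicate, a vertical partition splits columns while repeating the key, and the original data is recoverable from the fragments in both cases.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]

def horizontal_partition(rows: List[Row], predicate: Callable[[Row], bool]) -> Tuple[List[Row], List[Row]]:
    """Split rows into two fragments: those matching the predicate and the rest."""
    inside = [r for r in rows if predicate(r)]
    outside = [r for r in rows if not predicate(r)]
    return inside, outside

def vertical_partition(rows: List[Row], key: str, columns: Sequence[str]) -> List[Row]:
    """Project every row onto the key plus the chosen columns."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]

def join_vertical(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Rebuild full rows from two vertical fragments that share the key."""
    by_key = {r[key]: r for r in right}
    return [{**l, **by_key[l[key]]} for l in left]

# Hypothetical sample data.
EMPLOYEES: List[Row] = [
    {"id": 1, "name": "Ada", "site": "north", "salary": 120},
    {"id": 2, "name": "Lin", "site": "south", "salary": 95},
    {"id": 3, "name": "Sam", "site": "north", "salary": 80},
]
```

Keeping the key in every vertical fragment is what makes the join back to full rows possible.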
Heterogeneity
Database systems
    Differences in DBMS
        Data models (structures, constraints, query languages)
        System-level support (concurrency control, commit, recovery)
    Semantic heterogeneity
Operating systems
    File systems
    Naming, file types, operations
    Transaction support
    Interprocess communication (IPC)
Hardware/system
    Instruction set
    Data formats and representation
    Configuration
Communication
Differences in DBMS
o An enterprise may have multiple DBMSs. Different
organizations within the enterprise may have different
requirements and may select different DBMSs.
o DBMSs purchased over a period of time may be different
due to changes in technology.
o Heterogeneity due to differences in DBMSs results from
differences in data models and differences at the system
level.
Differences in Data Models
o DIFFERENCES IN STRUCTURE
o Different DBMSs provide different structural primitives.
E.g., information modeled as a table in the relational model
may be modeled as a record type in the CODASYL model.
o If two representations have the same information
content, it is easier to deal with the differences in the
structures. If the information content is not the same, it
may be very difficult to deal with the difference.
Differences in constraints
o Two data models may support different constraints.
o E.g., a set type in a CODASYL schema may be partially
modeled as a referential integrity constraint in a
relational schema.
o CODASYL, however, supports insertion and retention constraints
that are not captured by the referential integrity constraint
alone.
o Triggers must be used in relational systems to capture
such semantics.
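As an illustration of the last point, the sketch below uses Python's standard sqlite3 module. It assumes a hypothetical dept/emp pair standing in for a CODASYL owner/member set, and it picks FIXED retention (a member may never change its owner) as the constraint to emulate. The foreign key covers the referential part; the trigger covers the retention part that the foreign key alone cannot express. Table, column, and trigger names are invented for the example.

```python
import sqlite3

def build_db() -> sqlite3.Connection:
    """Create an in-memory database with a foreign key plus a retention trigger."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript("""
        CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE emp (
            emp_id  INTEGER PRIMARY KEY,
            name    TEXT,
            dept_id INTEGER NOT NULL REFERENCES dept(dept_id)
        );
        -- Emulates FIXED retention: a member record may not move to another owner.
        CREATE TRIGGER emp_fixed_retention
        BEFORE UPDATE OF dept_id ON emp
        WHEN NEW.dept_id <> OLD.dept_id
        BEGIN
            SELECT RAISE(ABORT, 'FIXED retention: member cannot change owner');
        END;
        INSERT INTO dept VALUES (1, 'Sales'), (2, 'Research');
        INSERT INTO emp VALUES (10, 'Ada', 1);
    """)
    return conn

def try_move(conn: sqlite3.Connection, emp_id: int, new_dept: int) -> bool:
    """Attempt to reassign an employee; return False if the database rejects it."""
    try:
        conn.execute("UPDATE emp SET dept_id = ? WHERE emp_id = ?", (new_dept, emp_id))
        return True
    except sqlite3.DatabaseError:
        return False
```

Moving employee 10 to department 2 satisfies the foreign key (department 2 exists) but is still rejected, which is the extra semantics the trigger supplies.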
Differences in query languages
o Different languages are used to manipulate data
represented in different models.
o When two DBMSs support the same data model,
differences in their query languages (e.g., QUEL and SQL) or
different versions of SQL supported by two relational
DBMSs could contribute to heterogeneity.
Differences in System Aspects
o Differences in transaction management (concurrency
control, commit protocols, and recovery)
o Hardware and software requirements
o Communication capabilities
Semantic Heterogeneity
o This occurs when there is a disagreement about the
meaning, interpretation or the intended use of the same
or related data
o MEAL_COST of relation RESTAURANT:
average cost of a meal per person, without service cost and
tax.
o MEAL_COST of relation BOARDING:
average cost of a meal per person, with service cost and tax.
o The two MEAL_COST attributes are semantically heterogeneous:
they share a name but differ in definition.
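A minimal sketch of how such a difference might be reconciled once it has been detected. The slides give no rates, so the service and tax fractions below are purely hypothetical, as is the assumption that both are charged as a fraction of the base price; the function and class names are also invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CostRates:
    service: float  # hypothetical service charge, as a fraction of the base price
    tax: float      # hypothetical tax, as a fraction of the base price

def boarding_to_restaurant(meal_cost_with_extras: float, rates: CostRates) -> float:
    """Convert BOARDING.MEAL_COST (with service and tax) to the RESTAURANT definition."""
    return meal_cost_with_extras / (1.0 + rates.service + rates.tax)

def restaurant_to_boarding(base_meal_cost: float, rates: CostRates) -> float:
    """Convert RESTAURANT.MEAL_COST (without service and tax) to the BOARDING definition."""
    return base_meal_cost * (1.0 + rates.service + rates.tax)
```

The point of the sketch is that the two attributes cannot be compared or merged directly: a conversion that encodes the difference in definition has to sit between them.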
Autonomy
o The organizational entities that manage DBSs are often
autonomous. DBSs are often under separate and
independent control.
o Those who control a database are often willing to let
others share the data only if they retain control. Thus
component autonomy is important and needs to be
addressed when a DBS participates in an FDBS.
o Three types of component autonomy: design,
communication, and execution.
Design Autonomy
o Refers to the ability of a component DBS to choose its own
design with respect to any matter, including:
o The data being managed
o The representation and naming of data elements
o The conceptualization and semantic interpretation of the
data
o Constraints (semantic integrity constraints, serializability
constraints)
o Functions/operations supported by the system
o The implementation (record and file structures)
Communication Autonomy
o Ability of a component DBMS to decide whether to
communicate with other component DBMSs.
o A component DBMS with communication autonomy is
able to decide when and how it responds to a request
from another component DBMS.
Execution Autonomy
o Refers to the ability of a component DBMS to execute local
operations (commands or transactions submitted by a local user
of the component DBMS) without interference from external
operations (commands or transactions submitted by other
component DBMSs or FDBMSs) and to decide the order in which
to execute external operations.
o Thus an external FDBMS cannot enforce an order of execution of the commands on a component DBMS with execution autonomy.
o Further, a component DBMS can abort any operation that does
not meet its local constraints, and its local operations are logically
unaffected by its participation in a federation.
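A toy sketch of these properties, under stated assumptions: the ComponentDBMS class, the "local first" ordering policy, and the single-predicate local constraint are all hypothetical simplifications and not part of any real system. The component, not the submitter, decides the execution order, and it aborts any operation that violates its local constraint.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class Operation:
    origin: str  # "local" or "external"
    key: str
    value: int

@dataclass
class ComponentDBMS:
    constraint: Callable[[str, int], bool]           # local constraint every write must satisfy
    data: Dict[str, int] = field(default_factory=dict)
    pending: List[Operation] = field(default_factory=list)

    def submit(self, op: Operation) -> None:
        """Accept an operation; the submitter has no say in when it runs."""
        self.pending.append(op)

    def run(self) -> List[Tuple[Operation, str]]:
        """Execute pending operations in an order chosen by the component itself."""
        # Assumed policy: local operations first; the stable sort keeps arrival order otherwise.
        ordered = sorted(self.pending, key=lambda op: op.origin != "local")
        self.pending = []
        log: List[Tuple[Operation, str]] = []
        for op in ordered:
            if self.constraint(op.key, op.value):
                self.data[op.key] = op.value
                log.append((op, "committed"))
            else:
                log.append((op, "aborted"))  # operation violates a local constraint
        return log
```

An external operation submitted before a local one still runs after it here, which is the sense in which the FDBMS cannot enforce an execution order.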
Association Autonomy
o Implies that a component DBS has the ability to decide
whether and how much to share its functionality (i.e., the
operations it supports) and resources (the data it
manages) with others.
o This includes the ability to associate or disassociate itself
from the federation and to participate in more than one
federation.
Taxonomy of MDBSs
Multidatabase systems
    Nonfederated systems (e.g., UNIBASE)
    Federated systems
        Loosely coupled (e.g., MRDSM)
        Tightly coupled
            Single federation (e.g., DDTS)
            Multiple federations (e.g., Mermaid)
Non Federated Database System
o It is an integration of component DBMSs that are not
autonomous.
o Has only one level of management, and all operations are
performed uniformly.
o Does not distinguish between local and nonlocal users.
o A particular type of nonfederated database system in
which all databases are fully integrated to provide a
single global schema can be called a unified MDBS.
o It logically appears like a distributed DBS.
Federated Database System
o It consists of component DBSs that are autonomous yet
participate in a federation to allow partial and controlled
sharing of data.
o Association autonomy implies that component DBSs
have control over the data they manage.
o They cooperate to allow different degrees of integration.
o There is no centralized control in a federated architecture.
o Represents a compromise between no integration and
total integration.
o Well suited for migrating a set of autonomous, stand-alone
DBSs to a system that allows partial and controlled
sharing of data without affecting existing applications.
Loosely coupled / Tightly coupled FDBSs
o Loosely coupled: it is the users' responsibility to create and maintain the
federation, and no control is enforced by the
federation system and its administrators.
o Supports multiple federated schemas.
o Tightly coupled: it is the responsibility of the federation system and its
administrators to create and maintain the federation, and they
actively control access to the component DBSs.
o Supports one or more federated schemas.
Processor Types in Reference
Architecture
Transforming Processor
Transforming processors translate commands from one language, called
the source language, to another language, called the target language,
or transform data from one format (the source format) to another
format (the target format). They provide a type of
data independence called data model transparency, in which the data
structures and commands used by one processor are hidden from
other processors.
Data model transparency hides the differences in query languages
and data formats.
o For example, the data structures used by one processor
can be modified to improve overall efficiency without
requiring changes to other processors.
Examples of command-transforming processors:
o A command transformer that translates SQL commands
into CODASYL data manipulation language commands
[Onuegbe et al. 1983; Zaniolo 1979], allowing a
CODASYL DBS to be processed using SQL commands.
o A program generator that translates SQL commands
into equivalent COBOL programs, allowing a file system
to be processed using SQL commands.
Command-Transforming Processors
For some command-transforming processors, there may
exist companion data-transforming processors that convert
data produced by the transformed commands
into data compatible with the commands in the source
format.
o For example, a data-transforming processor that is the
companion to the above SQL-to-CODASYL command-transforming
processor is a table builder that accepts
individual database records produced by the CODASYL
DBMS and builds complete tables for display to the SQL
user.
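The table builder can be sketched in a few lines of Python. The record layout (a dict per record), the column names, and the generator standing in for record-at-a-time retrieval are assumptions made for illustration; no real CODASYL interface is implied.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, object]

def record_stream() -> Iterator[Record]:
    """Hypothetical stand-in for a DBMS that returns one record per call."""
    yield {"EMP_NO": 1, "NAME": "Ada"}
    yield {"EMP_NO": 2, "NAME": "Lin"}

def build_table(records: Iterable[Record], columns: List[str]) -> Tuple[List[str], List[tuple]]:
    """Collect individual records into a complete table (header plus rows)."""
    rows = [tuple(rec.get(col) for col in columns) for rec in records]
    return columns, rows

def render(columns: List[str], rows: List[tuple]) -> str:
    """Format the table as plain text for display to the user."""
    lines = [" | ".join(columns)]
    lines += [" | ".join(str(value) for value in row) for row in rows]
    return "\n".join(lines)
```

The caller sees only a table, never the one-record-at-a-time interface underneath, which is the data model transparency described earlier.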
Schema and Command Translation
To perform these transformations, a transforming processor needs
mappings between the objects of each schema.
The task of schema translation involves transforming a schema
(schema A) describing data in one data model into an equivalent
schema (schema B) describing the same data in a different data
model.
This task also generates the mappings that correlate the schema
objects in one schema (schema B) to the schema objects in the other
schema (schema A).
The task of command transformation entails using these
mappings to translate commands involving the schema objects of
one schema (schema B) into commands involving the schema
objects of the other schema (schema A).
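A minimal sketch of mapping-driven command transformation. It assumes commands are already parsed into a small structure rather than given as query text, and the schema objects (Employee, EMP, ENAME, SAL) and the mapping table are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class ObjectMapping:
    target_object: str        # name of the corresponding object in schema A
    fields: Dict[str, str]    # schema B field name -> schema A field name

@dataclass(frozen=True)
class Command:
    action: str
    schema_object: str
    fields: Tuple[str, ...]

# Hypothetical mappings, as produced by schema translation (schema B -> schema A).
MAPPINGS: Dict[str, ObjectMapping] = {
    "Employee": ObjectMapping("EMP", {"name": "ENAME", "salary": "SAL"}),
}

def transform(cmd: Command, mappings: Dict[str, ObjectMapping]) -> Command:
    """Rewrite a command on schema B objects into a command on schema A objects."""
    mapping = mappings[cmd.schema_object]
    return Command(
        action=cmd.action,
        schema_object=mapping.target_object,
        fields=tuple(mapping.fields[f] for f in cmd.fields),
    )
```

A command that names an object or field with no mapping raises KeyError here; a fuller processor would report that as an untranslatable command.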
Filtering Processor
Filtering processors constrain the commands and associated
data that can be passed to another processor. Associated with
each filtering processor are mappings that describe the
constraints on commands and data. These constraints may
either be embedded into the code of the filtering processor or
be specified in a separate data structure.
Examples of filtering processors include the following:
o Syntactic constraint checker, which checks commands to
verify that they are syntactically correct.
o Semantic integrity constraint checker, which performs one
or more of the following functions: (a) checks commands to
verify that they will not violate semantic integrity constraints,
(b) modifies commands in such a manner that when the
commands are interpreted, semantic integrity constraints will
automatically be enforced, or
(c) verifies that data produced by another processor does not
violate any semantic integrity constraint.
o Access controller, which verifies that the user is permitted to
perform the command on the indicated data or verifies that the
user is permitted to use data produced by another processor.
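The sketch below chains two of the filters listed above, a syntactic checker and an access controller, with the constraints held in separate data structures rather than embedded in the code. The command shape, the KNOWN_ACTIONS set, and the PERMISSIONS table are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Command = Tuple[str, str, str]                 # (user, action, schema_object)
Check = Callable[[Command], Optional[str]]     # returns a violation message, or None

# Constraints kept in separate data structures (hypothetical contents).
KNOWN_ACTIONS: Set[str] = {"retrieve", "update"}
PERMISSIONS: Dict[str, Set[Tuple[str, str]]] = {
    "alice": {("retrieve", "EMP"), ("update", "EMP")},
    "bob": {("retrieve", "EMP")},
}

def syntactic_check(cmd: Command) -> Optional[str]:
    """Reject commands whose action is not part of the command language."""
    _, action, _ = cmd
    return None if action in KNOWN_ACTIONS else f"unknown action {action!r}"

def access_check(cmd: Command) -> Optional[str]:
    """Reject commands the user is not permitted to perform on the object."""
    user, action, obj = cmd
    allowed = PERMISSIONS.get(user, set())
    return None if (action, obj) in allowed else f"{user} may not {action} {obj}"

CHECKS: List[Check] = [syntactic_check, access_check]

def filter_command(cmd: Command, checks: List[Check]) -> List[str]:
    """Return all violations; an empty list means the command may be passed on."""
    violations = []
    for check in checks:
        message = check(cmd)
        if message is not None:
            violations.append(message)
    return violations
```

Because the permissions live in a table, changing who may do what does not require touching the filter code.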
Constructing Processor
Constructing processors partition and/or replicate an
operation submitted by a single processor into operations
that are accepted by two or more other processors.
Constructing processors also merge data produced by
several processors into a single data set for consumption by
another single processor.
o They can support location, distribution, and replication
transparencies because a processor submitting a
command does not need to know the location,
distribution, and number of processors participating in
processing that command.
Tasks that can be handled by constructing processors
include the following:
o Schema integration: integrating multiple schemas into
a single schema.
o Negotiation: determining what protocol should be used
among the owners of the various schemas to be integrated in
determining the contents of an integrated schema.
o Query (command) decomposition and optimization:
decomposing and optimizing a query (command)
expressed on an integrated schema.
o Global transaction management: performing
concurrency and atomicity control.
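A minimal sketch of the replicate-and-merge behavior of a constructing processor. It assumes each downstream processor is a callable that evaluates a selection predicate over its own rows, and that a shared key identifies replicated rows; both are simplifications, and no optimization or transaction control is attempted.

```python
from typing import Callable, Dict, Hashable, List

Row = Dict[str, object]
Predicate = Callable[[Row], bool]
Processor = Callable[[Predicate], List[Row]]

def make_processor(rows: List[Row]) -> Processor:
    """Hypothetical downstream processor holding one fragment of the data."""
    def run(predicate: Predicate) -> List[Row]:
        return [r for r in rows if predicate(r)]
    return run

def construct(predicate: Predicate, processors: List[Processor], key: str) -> List[Row]:
    """Send the same operation to every processor and merge the results into one data set."""
    merged: Dict[Hashable, Row] = {}
    for run in processors:
        for row in run(predicate):
            merged.setdefault(row[key], row)  # keep one copy of replicated rows
    return [merged[k] for k in sorted(merged)]
```

The submitter passes one predicate and receives one result set; how many processors took part, and where the rows lived, stays hidden.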
Accessing Processor
o An accessing processor accepts commands and produces
data by executing the commands against a database. It
may accept commands from several processors and
interleave the processing of those commands.
Examples of accessing processors include the following:
o A file management system that executes access procedures
against stored files
o A special application program that accepts commands and
generates data to be returned to the processor generating the
commands
o A data manager of a DBMS containing data access methods
o A dictionary manager that manages access to dictionary data
o Issues that are addressed by accessing
processors include local concurrency control,
commitment, backup, and recovery.
o These problems and their solutions are extensively
discussed in the literature for centralized
and distributed DBMSs.
A Five-Level Schema Architecture for
Federated Databases
The three-level schema architecture is adequate for
describing the architecture of a centralized DBMS. It is,
however, inadequate for describing the architecture of
an FDBS.
The three-level schema architecture must be extended to support the
three dimensions of a federated database system:
distribution, heterogeneity, and autonomy.
Examples of extended schema architectures include a four-level
schema architecture in Mermaid [Templeton et al. 1987],
five-level schema architectures in DDTS [Devor
et al. 1982b] and SIRIUS-DELTA [Litwin et al. 1982], and
others [Blakey 1987; Ram and Chastain 1989].
The five-level schema architecture of an
FDBS includes the following:
Local Schema: A local schema is the conceptual schema of a component DBS. A
local schema is expressed in the native data model of the component DBMS, and
hence different local schemas may be expressed in different data models.
Component Schema: A component schema is derived by translating a local
schema into a data model called the canonical or common data model (CDM) of
the FDBS.
Two reasons for defining component schemas in a CDM are that
(1) they describe the divergent local schemas using a single representation
and (2) semantics that are missing in a local schema can be added to
its component schema.
Thus they facilitate negotiation and integration tasks performed when developing
a tightly coupled FDBS. Similarly, they facilitate negotiation and specification of
views and multidatabase queries in a loosely coupled
FDBS.
o The process of schema translation from
a local schema to a component schema
generates the mappings between the component
schema objects and the local schema objects.
o Transforming processors use
these mappings to transform commands on
a component schema into commands on the
corresponding local schema.
Export Schema
o Not all data of a component DBS may be available to
the federation and its users. An export schema
represents a subset of a component schema that is
available to the FDBS.
o It may include access control information regarding its
use by specific federation users.
o The purpose of defining export schemas is to facilitate
control and management of association autonomy.
o A filtering processor can be used to provide the access
control as specified in an export schema by limiting the
set of allowable operations that can be submitted on
the corresponding component schema. Such filtering
processors and the export schemas support the
autonomy feature of an FDBS.
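A small sketch of an export schema as a checked subset of a component schema, plus the test a filtering processor might apply to incoming requests. Schemas are reduced to "object name -> attribute names", which is a deliberate simplification, and the EMP/DEPT contents are hypothetical.

```python
from typing import Dict, List

Schema = Dict[str, List[str]]  # schema object -> attribute names

def derive_export_schema(component: Schema, exported: Dict[str, List[str]]) -> Schema:
    """Build an export schema, verifying that it is a subset of the component schema."""
    export: Schema = {}
    for obj, attrs in exported.items():
        missing = set(attrs) - set(component.get(obj, []))
        if missing:
            raise ValueError(f"{obj}: not in component schema: {sorted(missing)}")
        export[obj] = list(attrs)
    return export

def is_allowed(export: Schema, obj: str, attrs: List[str]) -> bool:
    """Filter test: only objects and attributes in the export schema may be requested."""
    return obj in export and set(attrs) <= set(export[obj])

# Hypothetical component schema; salary and the DEPT object are kept private.
COMPONENT: Schema = {"EMP": ["id", "name", "salary"], "DEPT": ["id", "budget"]}
EXPORT: Schema = derive_export_schema(COMPONENT, {"EMP": ["id", "name"]})
```

What is left out of EXPORT is how the component DBS exercises its association autonomy: the federation simply cannot name those objects.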
Federated Schema
o A federated schema is an integration of multiple export
schemas.
o A federated schema also includes the information on
data distribution that is generated when integrating
export schemas.
o Some systems use a separate schema called a
distribution schema or an allocation schema to contain
this information.
o A constructing processor transforms commands on the
federated schema into commands on one or more
export schemas.
o Constructing processors and the federated schemas
support the distribution feature of an FDBS.
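The sketch below shows a deliberately naive integration step that produces both a federated schema and the accompanying distribution information. It matches objects and attributes purely by name, which real schema integration cannot assume, since semantic conflicts must be resolved first; the export schema contents and source names are hypothetical.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, List[str]]                     # schema object -> attribute names
Distribution = Dict[Tuple[str, str], List[str]]   # (object, attribute) -> export schemas holding it

def integrate(export_schemas: Dict[str, Schema]) -> Tuple[Schema, Distribution]:
    """Merge export schemas by name and record where each attribute is available."""
    federated: Schema = {}
    distribution: Distribution = {}
    for source, schema in export_schemas.items():
        for obj, attrs in schema.items():
            fed_attrs = federated.setdefault(obj, [])
            for attr in attrs:
                if attr not in fed_attrs:
                    fed_attrs.append(attr)
                distribution.setdefault((obj, attr), []).append(source)
    return federated, distribution

# Hypothetical export schemas from two component DBSs.
EXPORTS: Dict[str, Schema] = {
    "dbs1": {"EMP": ["id", "name"]},
    "dbs2": {"EMP": ["id", "salary"]},
}
```

A constructing processor would consult the distribution information to decide which export schemas a command on the federated schema must be sent to.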
o There may be multiple federated schemas in an FDBS,
one for each class of federation users.
o A class of federation users is a group of users and/or
applications performing a related set of activities. For
example, in a corporate environment, all managers may
be one class of federation users, and all employees and
applications in the accounting department may be
another class of federation users.
o A concept similar to that of federated schema is
represented by the terms import schema [Heimbigner
and McLeod 1985], global schema [Landers and
Rosenberg 1982], global conceptual schema [Litwin et
al. 1982], unified schema, and enterprise schema,
although the terms other than import schema are
usually used when there is only one such schema in
the system.
External Schema
An external schema defines a schema for a user and/or application or
a class of users/applications.
Reasons for the use of external schemas are as follows:
o Customization: A federated schema can be quite large, complex,
and difficult to change. An external schema can be used to specify
a subset of information in a federated schema that is relevant to
the users of the external schema. External schemas can be changed more
readily to meet changing users' needs. The data model for an
external schema may be different from that of the federated
schema.
o Additional integrity constraints: Additional integrity constraints can
also be specified in the external schema.
o Access control: Export schemas provide access control with
respect to the data managed by the component databases.
Similarly, external schemas provide access control with respect to
the data managed by the FDBS.
A filtering processor analyzes the commands on an external
schema to ensure their conformance with the access control and
integrity constraints of the federated schema.
If an external schema is in a different data model from that of
the federated schema, a transforming processor is also needed
to transform commands on the external schema into commands
on the federated schema.
Most existing prototype FDBSs support only one data model for
all the external schemas and one query language interface.
Exceptions are a version of Mermaid that supported two query
language interfaces, SQL and ARIEL, and a version of DDTS
that supported SQL and GORDAS (a query language for an
extended ER model).
An FDBS may be required to support local and external
schemas expressed in different data models. To facilitate
their design, integration, and maintenance, however, all
component, export, and federated schemas should be in
the same data model.
This data model is called the canonical or common data
model (CDM). A language associated with the CDM is
called an internal command language. All commands on
federated, export, and component schemas are
expressed using this internal command language.
The five-level schema architecture presented above has
several possible redundancies.
Redundancy between external and federated schemas:
o External schemas can be considered redundant with federated
schemas since a federated schema could be generated for every
different federation user.
o This is the case in the schema architecture of Heimbigner and
McLeod [1985] (they use the term import schema rather than
federated schema).
o In loosely coupled FDBSs, a user defines the federated schema
by integrating export schemas. Thus there is usually no need for
an additional level.
o In tightly coupled FDBSs, however, it may be desirable to
generate a few federated schemas for widely different classes of
users and to customize these further by defining external
schemas.
o Such external schemas can also provide additional access control.
Redundancy between an external schema of a
component DBS and an export schema:
If a component DBMS supports proper access control
security features for its external schemas and if translating
a local schema into a component schema is not required
(e.g., the data model of the component DBMS is the same as
the CDM of the FDBS), then the external schemas of a
component DBS may be used as export schemas in the
five-level schema architecture (external schemas of
component DBSs are not shown in the five-level schema
architecture).
Redundancy between component
schemas and local schemas:
When component DBSs use the CDM of the FDBS and
have the same functionality, it is unnecessary to
define component schemas.