data modeling mit2
8/2/2019 Data Modeling MIT2
Data modeling
Prof. Amos DAVID
http://ui-n2.loria.fr
-
Course content
Why data modeling ?
Entity-Relation model
Relational model
We will focus on
WHAT
WHY
HOW
-
Why data modeling ?
Illustration with an example
-
From information problem statement to data specification
A case study
Problem statement
We want an information system on students
Questions
What is an information system ?
What do we mean by students ?
Why do we want the information system ?
The system should provide answers to what questions ?
-
What is an information system ?
Functional characteristics of an information system
Store information (creation)
Retrieve information (access)
Update information (modification, deletion)
Components of an information system
Users
End-user
Information system manager/administrator
Information base (database)
User interface
To implement the functional characteristics
Between the end-user and the information base
-
General schema of an IRS : functional approach
[Figure: diagram with the elements User, Information problem, Information problem transformation into access expression, Matching, Information base, Objects, Store / Update and Results. Common access methods: navigation, query.]
-
What do we mean by students ?
Students viewed by whom ?
Admission office ?
Post graduate school ?
Registrar's office ?
A university department ?
Alumni association ?
By a state government ?
Foreign government ?
By other structures ?
Students considered over what period ?
From admission to graduation only ?
Consider ex-students ?
University structures
Government structures
-
Who are the end-users ?
This will determine how students are viewed
This will determine the final use of the information to be accessed
Examples of end-users in the university structure
The VC
The registrar
The Dean
The HOD
Any category of student (in-course, ex-student)
-
Why data modeling ?
Represent the real world
Focus on the use of the elements
Represent only the necessary elements
Represent the relationships between the selected elements
DO NOT ignore or neglect necessary elements or relations
-
Why data modeling ?
For efficient computerization
Reduce data redundancy
Disk space problem
Volume of data transfer
Object of documentation
Describe the computerized elements (user manual, technical manual)
For the programmer
For the system designer / manager
For the end-users
Guarantee data integrity
Valid information irrespective of context
-
Example on data integrity
Admission office
STUDENT (N°, names, marital status, gender, age, degree, address)
The department
STUDENT (N°, names, date of birth, courses, address)
The administrative office
PERSONNEL (N°, names, marital status, faculty, department, address)
-
General schema of an IRS : users and usage centered approach
[Figure: diagram with the elements Real world, Operation, Event, Data, Data modeling, Database Management System (DBMS) and IRS (what is seen by the end-user), connected by links labeled "Developed using", "Implemented for" and "Determined by".]
-
Entity-Relation model
-
Represent the real world elements with four main concepts
Entity
Attributes
Relation
Cardinality
Employs graphic representation
Intuitive approach
-
Entity
The basic conceptual or real element
Examples
A student
A member of staff
A town
Entities have real existence (the instances)
They are identifiable
Amos DAVID
Charles ROBERT
Ibadan
-
Entity
Each entity is associated with a set of attributes
The instances of an entity have the same characteristics
They have the same set of attributes
Examples
All students have the same set of attributes
All members of staff have the same set of attributes
-
Attributes
Attributes are used for describing the entities
The entities and their attributes are determined according to the database project
Taking into account the functions to be accomplished
Examples
Represent students at the department for course registration
Represent members of staff for salaries and promotions
One of the attributes must be an IDENTIFIER
Its value is unique for each entity
-
Attributes
How to reduce redundancy
Avoid structured attributes
Structured attributes should be decomposed
Example
Names: First name, last name
Address: Street n°, street name, town, local government, state
Decomposing structured attributes allows easy access to the component elements
Example
The town element of an address can be easily extracted instead of performing string extraction on the structured element
-
Attributes
Examples
Name, Address
Amos DAVID; Dept computer science, UI Ibadan, Ibadan, Oyo state
Olu OJO; 23 Aderemo street, Agbowo, Agbowo LGA, Ibadan, Oyo state
Uche KALU; 5 market road, Anambra, Anambra state
Problems with this representation
The addresses do not have the same number of elements, so how can one obtain a specific component ?
The nth element ?
Starting from the nth character ?
How can one locate the town within an address ?
-
Attributes
Examples
Dissociate structured elements
Name; Street number; Street name; Town; Local government; State
Amos DAVID; Dept computer science; UI Ibadan; Ibadan; ; Oyo state
Olu OJO; 23; Aderemo street; Ibadan; Agbowo LGA; Oyo state
Uche KALU; 5; market road; Anambra; ; Anambra state
Efficiency
Each entry has the same number of elements
A component element can be easily extracted using its position
Example
The town value is always at the 4th position
The state value is always at the last position
The position can be used in string functions or as the column number in tables
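Positional extraction from decomposed attributes can be sketched as follows. This is only an illustration: the field names in `FIELDS` and the helper `parse` are hypothetical, and the two records are taken from the example entries with an empty local government.

```python
# Layout assumed from the slide:
# Name; Street number; Street name; Town; Local government; State
FIELDS = ["name", "street_number", "street_name",
          "town", "local_government", "state"]

records = [
    "Amos DAVID; Dept computer science; UI Ibadan; Ibadan; ; Oyo state",
    "Uche KALU; 5; market road; Anambra; ; Anambra state",
]


def parse(record: str) -> dict:
    """Split one record on ';' and label each component by its position."""
    parts = [part.strip() for part in record.split(";")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} components, got {len(parts)}")
    return dict(zip(FIELDS, parts))


# The town is always the 4th component and the state always the last one.
towns = [parse(record)["town"] for record in records]
states = [parse(record)["state"] for record in records]
print(towns)
print(states)
```

Because every entry has the same number of components, no string searching is needed: an empty component simply yields an empty string at its position.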
-
Attributes : reducing redundancy
Avoid attributes whose value is a list; a new entity should be created
Example (memory redundancy)
Courses as an attribute of Degree
We do not know the number of courses for a degree
Create DEGREE and COURSE
Associate the two entities (to be seen later)
-
Example (memory redundancy)
Computer science, course 1, course 2, course 3
Biology, course 3, course 6, course 7, course 20
Chemistry, course 7, course 3, course 8, course 9, course 10
In terms of memory allocation, how many courses should be anticipated ?
Because of the unknown number of courses, the anticipated number will be either too few or too many
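A minimal sketch of the alternative, assuming Python sets stand in for the entities: the list-valued attribute is replaced by two entities, DEGREE and COURSE, plus one pair per degree/course link. The names `degree_course` and `courses_of` are hypothetical, and only the first two example rows are used.

```python
# Rows with a list-valued attribute: the number of course slots varies.
degree_rows = [
    ("Computer science", "course 1", "course 2", "course 3"),
    ("Biology", "course 3", "course 6", "course 7", "course 20"),
]

degrees = set()        # entity DEGREE
courses = set()        # entity COURSE
degree_course = set()  # association: one (degree, course) pair per link

for name, *course_list in degree_rows:
    degrees.add(name)
    for course in course_list:
        courses.add(course)
        degree_course.add((name, course))


def courses_of(degree: str) -> set:
    """Return the courses associated with one degree."""
    return {c for d, c in degree_course if d == degree}


print(len(courses), "distinct courses,", len(degree_course), "links")
print(courses_of("Biology"))
```

No slot count has to be anticipated: a degree with one more course only adds one more pair, and "course 3" is stored once even though two degrees use it.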
-
Attributes
How to ensure data integrity
Identify the functional dependency between attributes
Example of dependency
A town belongs to only one state
Towns are unique
There is a dependency between town and state: if the town is known, the state can be determined unambiguously
In a case of functional dependency, create a new entity to group the dependent attributes
Create a relation between the new entity and the original one
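One way to see why the dependency matters is to test it on data. The sketch below is an assumption-laden illustration: the helper `fd_violations` and the sample rows (including the deliberate typo) are hypothetical. It reports every town that appears with more than one state, then shows the same data once the (town, state) pair is stored in its own entity.

```python
from collections import defaultdict


def fd_violations(rows, determinant, dependent):
    """Return determinant values that appear with more than one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    return {key: values for key, values in seen.items() if len(values) > 1}


# State repeated in every row: a typing error breaks town -> state.
people = [
    {"name": "Amos DAVID", "town": "Ibadan", "state": "Oyo state"},
    {"name": "Olu OJO", "town": "Ibadan", "state": "Oyo state"},
    {"name": "Samuel UCHE", "town": "Ibadan", "state": "Oyo sate"},  # typo
]
print(fd_violations(people, "town", "state"))

# After decomposition the state is stored once per town.
towns = {"Ibadan": "Oyo state"}
persons = [{"name": p["name"], "town": p["town"]} for p in people]
rebuilt = [{**p, "state": towns[p["town"]]} for p in persons]
print(fd_violations(rebuilt, "town", "state"))
```

With the separate TOWN entity, the state of a town is entered and corrected in exactly one place.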
-
Attributes
Examples
Name, Street number, Street name, Town, Local government, State
1. Amos DAVID; Dept computer science; UI Ibadan; Ibadan; Oyo state
2. Olu OJO; 23; Aderemo street; Agbowo; Ibadan; Oyo state
3. Samuel UCHE; 213 Sango road; Dugbe; Ibadan; Oyo state
4. Uche KALU; 5; market road; Anambra; ; Anambra state
Entities 1, 2 and 3 are redundant and prone to loss of integrity
Entering (town, state) three times for entities 1, 2 and 3 may produce typographical errors
Should a town change from one state to another, all the entities are no longer valid
All the entities must be modified
-
Entity : graphical representation
An entity is represented by a rectangle divided into two parts
The name of the entity is represented at the upper part
The names of the attributes are represented at the lower part
The identifier is underlined
PERSON (Number, Last name, First name, Date of birth)
TOWN (Town name, State, Local government)
-
Relation
A relation specifies the association between two or more entities
Example
Town and Person
The relation should specify the semantics of the association
A person lives in a town
-
Relation
A relation is symbolized by an oval with its semantic inside the oval
A relation is further specified by cardinalities that indicate the number of associated instances
Example
A person lives in a minimum of one town and in a maximum of one town
A town is inhabited by a minimum of one person and a maximum of n (indicating several)
PERSON (Number, Last name, First name, Date of birth) --(1,1)-- Lives in --(1,n)-- TOWN (Name, Surface area, State, Local government)
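Cardinalities can be read as bounds on how many times each instance takes part in the relation. The following sketch checks them on a small set of "Lives in" pairs; the helper `cardinality_violations` and the identifiers P1 to P4 are hypothetical, and `None` is used here to stand for the unbounded maximum n.

```python
from collections import Counter


def cardinality_violations(pairs, instances, side, minimum, maximum):
    """List instances whose number of links falls outside (minimum, maximum).

    side is 0 for the first entity of each pair, 1 for the second;
    maximum=None means 'n' (no upper bound).
    """
    counts = Counter(pair[side] for pair in pairs)
    return sorted(
        item for item in instances
        if counts[item] < minimum
        or (maximum is not None and counts[item] > maximum)
    )


persons = {"P1", "P2", "P3", "P4"}
towns = {"Ibadan", "Lagos"}
lives_in = [("P1", "Ibadan"), ("P2", "Ibadan"), ("P3", "Lagos"),
            ("P3", "Ibadan")]

# PERSON side is (1,1): P3 has two towns and P4 has none.
print(cardinality_violations(lives_in, persons, 0, 1, 1))
# TOWN side is (1,n): every town has at least one inhabitant.
print(cardinality_violations(lives_in, towns, 1, 1, None))
```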
-
Relation
A relation may sometimes have an attribute
The attribute describes the relation and not the associated entities
Example
The quantity of an article bought by a client as well as the date are attributes neither of the client nor of the article, but of the association
The attributes of the relation are indicated in the lower part of the oval that represents the relation
-
Relation
CLIENT (Number, Last name, First name, Date of birth) --(1,n)-- Bought [Quantity, Date] --(1,m)-- ARTICLE (Name, Unit price)
-
Relation
Maximum cardinality
This indicates the maximum cardinalities on either side of a relation
Example
PERSON (Number, Last name, First name, Date of birth) --(1,1)-- Lives in --(1,n)-- TOWN (Name, Surface area, State, Local government)
Maximum cardinality : [n:1]
-
Relation
How to read the relations : recall
A person lives in a minimum of 1 town and in a maximum of 1 town
A town is inhabited by a minimum of one person and a maximum of n persons (several)
PERSON (Number, Last name, First name, Date of birth) --(1,1)-- Lives in --(1,n)-- TOWN (Name, Surface area, State, Local government)
Maximum cardinality : [n:1]
-
Proposed methodology
Identify an entity
Develop fully the entity and choose the identifier
Associate the entity with existing ones if and where necessary
Establish the cardinalities
-
Exercise
In a shopping center, a client can buy one or more articles of various quantities. Propose an ER model for representing the elements of information necessary for billing the client.
-
Relational model
-
Relational model
The basic concepts
Relation
Domain
Attribute
Key
N-tuple (n-uplet)
-
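Relation (Table)
STUDENT (Last name, First name, Date of birth, Degree)
[Figure: the STUDENT table, with its column headings labeled as attributes, its rows labeled as n-tuples, and the values of one column (same type of value: names) labeled as a domain.]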
-
Domain
Represents the data type of a column
Can be defined in the form of an intension or an extension
In the form of an intension, it is specified by a formal definition
Example
Integer values
Character strings of fewer than 20 characters
In the form of an extension, it is specified as a finite list of values
Example
Town : {Oyo, Ibadan, Lagos}
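The two ways of defining a domain can be sketched as membership tests: a rule for the formally defined domain and a finite set for the listed one. The function names are hypothetical; the rule and the list of towns come from the examples above.

```python
def in_short_string_domain(value) -> bool:
    """Domain given by a formal definition: strings of fewer than 20 characters."""
    return isinstance(value, str) and len(value) < 20


# Domain given as a finite list of values.
TOWN_DOMAIN = {"Oyo", "Ibadan", "Lagos"}


def in_town_domain(value) -> bool:
    """A value belongs to the domain only if it is one of the listed towns."""
    return value in TOWN_DOMAIN


print(in_short_string_domain("Ibadan"))
print(in_town_domain("Abuja"))
```

The rule accepts values nobody has listed in advance, whereas the finite list rejects anything not explicitly enumerated.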
-
Relation
A relation $R$ is represented as $R(A_1, \ldots, A_n)$
Where
$A_1$ takes its values from $D_1$, ..., $A_n$ takes its values from $D_m$
$m \le n$
-
Attribute
An attribute specifies a constituent of the relation (a particular column of the table)
Attributes are unique within a relation (each column must be distinguished from the others)
Two columns should not have the same name
Two attributes may have the same domain
Example
First name, Last name : NAMES
-
Cartesian product of a relation
Let R(A1, A2) be a relation
The Cartesian product of the relation represents all the possible combinations of the values of the attributes
Example
Cars-parked (Make, Color)
Where
Make : makes of cars (Toyota, Peugeot)
Color : colors of cars (red, black, white)
-
Cartesian product of a relation
Toyota, red
Toyota, black
Toyota, white
Peugeot, red
Peugeot, black
Peugeot, white
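The six combinations listed above can be reproduced with the standard library, as a quick check of the definition. The variable names are illustrative only.

```python
from itertools import product

makes = ["Toyota", "Peugeot"]
colors = ["red", "black", "white"]

# Every (make, color) pair: 2 makes x 3 colors = 6 combinations.
combinations = list(product(makes, colors))
for make, color in combinations:
    print(f"{make}, {color}")
```

The cars actually parked form a subset of these combinations: the product only enumerates what is possible.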
-
The intension of a relation
The intension of a relation specifies how the relation should be interpreted
Example
Cars-parked (Make, Color)
Cars parked in front of the department of computer science, University of Ibadan
-
N-tuple
The n-tuples make up the extension of a relation
Each n-tuple is an element of the Cartesian product of the attribute domains
A row of the table
Also called a record
Example
Person (First name, Surname)
Amos, David
John, Olaoye
-
Schema of a relation
The schema of a relation specifies the intension of the relation and the associated integrity constraints
-
Constraint of data integrity
Constraint on a single attribute
Example
The values of vehicle makes should be German makes
Constraint based on two attributes
Example
The date of marriage should be > the date of birth
Constraint on the n-tuples of the table
Example
The number of registrations for a degree in one year should be limited to one
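The three kinds of constraint can be sketched with Python's built-in `sqlite3` module. Everything specific here is an assumption made for illustration: the table and column names, the short list of German makes, dates stored as ISO text (so that text comparison matches date order), and the reading of the last constraint as "one registration per student, degree and year". The helper `rejected` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle (
    plate TEXT PRIMARY KEY,
    -- constraint on a single attribute
    make  TEXT NOT NULL CHECK (make IN ('Volkswagen', 'Mercedes', 'BMW'))
);
CREATE TABLE person (
    id               INTEGER PRIMARY KEY,
    date_of_birth    TEXT NOT NULL,
    date_of_marriage TEXT,
    -- constraint based on two attributes
    CHECK (date_of_marriage IS NULL OR date_of_marriage > date_of_birth)
);
CREATE TABLE registration (
    student_id INTEGER NOT NULL,
    degree     TEXT NOT NULL,
    year       INTEGER NOT NULL,
    -- constraint on the rows of the table
    UNIQUE (student_id, degree, year)
);
""")


def rejected(sql, params):
    """Return True if the database refuses the row."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


print(rejected("INSERT INTO vehicle VALUES (?, ?)", ("V1", "Toyota")))
print(rejected("INSERT INTO person VALUES (?, ?, ?)",
               (1, "1990-05-01", "1985-01-01")))
```

Declaring the constraints in the schema means the check is applied whichever program inserts the data.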
-
Maximum key
A set of attributes of a relation whose values are distinct for each n-tuple
Example
Person (Matriculation number, First name, Last name, Date of birth, email)
All the attributes combined can form the maximum key
-
Minimum key (the key)
A key is the minimum set of attributes of a relation whose values are distinct for each n-tuple
Examples
Student N°, first name, last name
Student N° and first name combined can be used as a key, but Student N° alone is sufficient
Student N°, first name, last name, email
Either Student N° or email can be used as the key
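The idea of a minimum key can be explored on sample data with a small search, sketched below. The helpers `is_key` and `minimum_keys` and the sample rows are hypothetical. Note the limitation: a key is a property of the schema, so uniqueness in a sample can only suggest candidates, not prove them.

```python
from itertools import combinations


def is_key(rows, attrs):
    """True if the values of attrs are distinct for every row."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def minimum_keys(rows, attributes):
    """Return the attribute sets that are keys and contain no smaller key."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_key(rows, attrs):
                keys.append(attrs)
    return keys


students = [
    {"student_no": 1, "first_name": "Amos", "last_name": "DAVID",
     "email": "a@example.org"},
    {"student_no": 2, "first_name": "Amos", "last_name": "OJO",
     "email": "b@example.org"},
    {"student_no": 3, "first_name": "Amos", "last_name": "OJO",
     "email": "c@example.org"},
]
attributes = ["student_no", "first_name", "last_name", "email"]

print(is_key(students, ("student_no", "first_name")))  # a key, but not minimal
print(minimum_keys(students, attributes))
```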
-
Relations in 1st normal form
A relation is said to be in 1st normal form if all the attributes are single-valued
-
Relations in 2nd normal form
A relation is in 2nd normal form if and only if it is in 1st normal form and there is no functional dependency (FD) between a proper subset of the key and the rest of the attributes
This means that the key of the relation must be a minimum key
-
Relations in 3rd normal form
A relation is in 3rd normal form if it is in 2nd normal form and there is no FD between non-key attributes
-
From ERM to RM
1. To each entity corresponds a relation (Entity name → relation name)
2. To each attribute of an entity corresponds an attribute of the relation
3. The identifier of the entity becomes the key of the relation
4. For associations of maximum cardinality [1:n], add the key of the relation on the n side to the relation on the 1 side
5. For associations of maximum cardinality [n:m], a new relation should be created using the concatenation of the keys of the associated relations as the key. The attributes of the association should be added as attributes of the new relation
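The five rules above can be sketched on the two examples from the entity-relation part, PERSON "Lives in" TOWN and CLIENT "Bought" ARTICLE, using Python's built-in `sqlite3` module. The table names, column names and column types are assumptions chosen for the illustration, not a prescribed translation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;

-- Rules 1 to 3: one relation per entity, the identifier becomes the key.
CREATE TABLE town (
    name             TEXT PRIMARY KEY,
    surface_area     REAL,
    state            TEXT,
    local_government TEXT
);
-- Rule 4, 'Lives in' [n:1]: the key of TOWN is added to PERSON.
CREATE TABLE person (
    number        INTEGER PRIMARY KEY,
    last_name     TEXT,
    first_name    TEXT,
    date_of_birth TEXT,
    town_name     TEXT NOT NULL REFERENCES town(name)
);
CREATE TABLE client (
    number        INTEGER PRIMARY KEY,
    last_name     TEXT,
    first_name    TEXT,
    date_of_birth TEXT
);
CREATE TABLE article (
    name       TEXT PRIMARY KEY,
    unit_price REAL
);
-- Rule 5, 'Bought' [n:m]: new relation keyed by both keys,
-- carrying the attributes of the association.
CREATE TABLE bought (
    client_number INTEGER NOT NULL REFERENCES client(number),
    article_name  TEXT NOT NULL REFERENCES article(name),
    quantity      INTEGER NOT NULL,
    date          TEXT NOT NULL,
    PRIMARY KEY (client_number, article_name)
);
""")

conn.execute("INSERT INTO town (name, state) VALUES ('Ibadan', 'Oyo state')")
conn.execute("INSERT INTO person VALUES (1, 'DAVID', 'Amos', NULL, 'Ibadan')")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The `NOT NULL` on `town_name` reflects the minimum cardinality of 1 on the PERSON side. With the concatenated key as written, a client cannot be recorded as buying the same article twice, which may or may not suit the billing requirement.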
-
REMARKS
A collection of relations obtained from an entity-relation model as described above will have the following characteristics
Each attribute is single-valued
The key contains the least number of attributes
There are no dependencies between the non-key attributes
A collection of relations that has the above characteristics is considered to be in 3rd normal form
-
Important problem
Some relations resulting from the translation may not have keys
In this case, define a new key
This happens occasionally, particularly in the transformation of [n:m] associations
-
Graph of relations
Specify by pointed arrows the origin of imported attributes
Redraw the relations in the form of rectangles
Use the pointed arrows to link the relations
REMARKS
There should be no cycle among the linked relations
-
Practical
Model the following types of person in the university
the students
the members of staff
Be sure to apply the methods for reducing redundancy and guaranteeing data integrity
-
Practical : University students
Description
Each student has a number, a name, an address
A student is registered for a degree
A student may not register for more than one degree simultaneously
A student may take several degrees from the university
To a degree is associated a set of courses
A degree is managed by a department
A course is offered by only one department
Example of questions
What are the courses associated with a degree ?
What are the courses taken by a student for a degree ?
What are the courses offered by a department ?
Propose an ER model
-
Practical : Documentary information system
A library contains the following types of document
Books
Journals that contain articles
Proceedings that contain articles
Write-ups for master and PhD works
Books and write-ups are described using title, authors, and a list of keywords
Journals and proceedings are described using the title, editor and year of publication
Articles are represented using the title, authors, their addresses and a list of keywords
The authorized keywords for describing the documents are represented using a thesaurus
Propose an ER model and the associated RM to manage the information on the various types of document in the library as well as the thesaurus
-
A thesaurus
List of concepts linked by semantic links
The semantic links are
Specific/Generic link (hierarchy)
See also (association)
Used for (synonymy)
-
Example of a thesaurus
[Figure: thesaurus with the terms Transport, Plane, Boat, Car, Vehicle, Boeing, Airbus, Mercedes and Peugeot, connected by Specific/generic links (hierarchy) and a See also link.]