data modeling mit2

Upload: skyjj8

Post on 05-Apr-2018


  • 8/2/2019 Data Modeling MIT2

    1/62

    Data modeling

    Prof. Amos DAVID

    http://ui-n2.loria.fr


    Course content

    Why data modeling ?

    Entity-Relation model

    Relational model

    We will focus on

    WHAT

    WHY

    HOW


    Why data modeling ?

    Illustration with an example


    From information problem statement to data specification

    A case study

    Problem statement: We want an information system on students

    Questions

    What is an information system ?

    What do we mean by students ?

    Why do we want the information system ?

    The system should provide answers to what questions ?


    What is an information system ?

    Functional characteristics of an information system

    Store information (creation)

    Retrieve information (access)

    Update information (modification, deletion)

    Components of an information system

    Users

    End-user

    Information system manager/administrator

    Information base (database)

    User interface

    To implement the functional characteristics

    Between the end-user and the information base


    General schema of an IRS : functional approach

    [Diagram. Labels: User; Information problem; Information problem transformation into access expression; Common access methods (navigation, query); Matching; Information base; Objects; Store / Update; Results]


    What do we mean by students ?

    Students viewed by whom ?

    Admission office ?

    Postgraduate school ?

    Registrar's office ?

    A university department ?

    Alumni association ?

    By a state government ?

    Foreign government ?

    By other structures ?

    Students considered over what period ?

    From admission to graduation only ?

    Consider ex-students ?

    University structures

    Government structures


    Who are the end-users ?

    This will determine how students are viewed

    This will determine the final use of the information to be accessed

    Examples of end-users in the university structure

    The Vice-Chancellor (VC)

    The Registrar

    The Dean

    The Head of Department (HOD)

    Any category of student (in-course, ex-student)


    Why data modeling ?

    Represent the real world

    Focus on the use of the elements

    Represent only the necessary elements

    Represent the relationships between the selected elements

    DO NOT ignore or neglect necessary elements or relations


    Why data modeling ?

    For efficient computerization

    Reduce data redundancy

    Disk space problem

    Volume of data transfer

    Objects of documentation

    Describe the computerized elements

    User manual

    Technical manual

    For the programmer

    For the system designer / manager

    For the end-users

    Guarantee data integrity

    Valid information irrespective of context


    Example on data integrity

    Admission office

    STUDENT (Number, names, marital status, gender, age, degree, address)

    The department

    STUDENT (Number, names, date of birth, courses, address)

    The administrative office

    PERSONNEL (Number, names, marital status, faculty, department, address)


    General schema of an IRS : users and usage centered approach

    [Diagram. Labels: Real world; Operation; Event; Data; Data modeling; Database Management System (DBMS); IRS (what is seen by the end-user). Arrow labels: Developed using; Implemented for; Determined by]


    Entity-Relation model


    Represent the real world elements with four main concepts

    Entity

    Attributes

    Relation

    Cardinality

    Employs graphic representation

    Intuitive approach


    Entity

    The basic conceptual or real element

    Examples

    A student

    A member of staff

    A town

    Entities have real existence (the instances)

    They are identifiable

    Amos DAVID

    Charles ROBERT

    Ibadan


    Entity

    Each entity is associated with a set of attributes

    The instances of an entity have the same characteristics

    They have the same set of attributes

    Examples

    All students have the same set of attributes

    All members of staff have the same set of attributes


    Attributes

    Attributes are used for describing the entities

    The entities and their attributes are determined according to the database project

    Taking into account the functions to be accomplished

    Examples

    Represent students at the department for course registration

    Represent members of staff for salaries and promotions

    One of the attributes must be an IDENTIFIER

    Its value is unique for each entity


    Attributes

    How to reduce redundancy

    Avoid structured attributes

    Structured attributes should be decomposed

    Example

    Names → First name, last name

    Address → Street number, street name, town, local government, state

    Decomposing structured attributes allows easy access to the component elements

    Example

    The town element of an address can be easily extracted instead of performing string extraction on the structured element


    Attributes

    Examples

    Name, Address

    Amos DAVID; Dept computer science, UI Ibadan, Ibadan, Oyo state

    Olu OJO; 23 Aderemo street, Agbowo, Agbowo LGA, Ibadan, Oyo state

    Uche KALU; 5 market road, Anambra, Anambra state

    Problems with this representation

    The addresses do not have the same number of elements, so how can one obtain a specific component ?

    The nth element ?

    Starting from the nth character ?

    How can one locate the town within an address ?


    Attributes

    Examples

    Dissociate structured elements

    Name, Street number, Street name, Town, Local government, State

    Amos DAVID; Dept computer science; UI Ibadan; Ibadan; ; Oyo state

    Olu OJO; 23; Aderemo street; Agbowo; Ibadan; Oyo state

    Uche KALU; 5; market road; Anambra; ; Anambra state

    Efficiency

    Each entry has the same number of elements

    A component element can be easily extracted using its position

    Example

    The town value is always at the 4th position

    The state value is always at the last position

    The position can be used in string functions or as the column number in tables
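    The positional access described above can be sketched in Python. This is only an illustration: the two rows are copied from the slide, while the helper names (split_record, town_of, state_of) and the constant TOWN_POSITION are assumptions introduced here.

```python
# Two rows from the slide, in the six-field layout:
# name; street number; street name; town; local government; state
RECORDS = [
    "Amos DAVID; Dept computer science; UI Ibadan; Ibadan; ; Oyo state",
    "Uche KALU; 5; market road; Anambra; ; Anambra state",
]

TOWN_POSITION = 3  # the 4th field, counting from zero (assumed constant name)


def split_record(record: str) -> list[str]:
    """Split a semicolon-separated record and trim each field."""
    return [field.strip() for field in record.split(";")]


def town_of(record: str) -> str:
    """The town is always the 4th field."""
    return split_record(record)[TOWN_POSITION]


def state_of(record: str) -> str:
    """The state is always the last field."""
    return split_record(record)[-1]


towns = [town_of(r) for r in RECORDS]
states = [state_of(r) for r in RECORDS]
```

    Because every row has the same number of fields, an empty local government is kept as an empty field rather than dropped, which is what keeps the positions stable.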


    Attributes: reducing redundancy

    Avoid attributes whose value is a list; a new entity should be created

    Example (memory redundancy)

    Courses as an attribute of Degree

    We do not know the number of courses for a degree

    Create DEGREE and COURSE

    Associate the two entities (to be seen later)


    Example (memory redundancy)

    Computer science, course 1, course 2, course 3

    Biology, course 3, course 6, course 7, course 20

    Chemistry, course 7, course 3, course 8, course 9, course 10

    In terms of memory allocation, how many courses should be anticipated ?

    Because of the unknown number of courses, the anticipated number will either be too few or too many
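    One way to picture the alternative, sketched in Python under stated assumptions: instead of reserving a fixed number of course slots per degree, each (degree, course) link is stored as its own pair. The degree and course values come from the slide; the names DEGREE_COURSE, courses_of and degrees_with are illustrative only.

```python
# One (degree, course) pair per link, so no slot count is fixed in advance.
DEGREE_COURSE = [
    ("Computer science", "course 1"),
    ("Computer science", "course 2"),
    ("Computer science", "course 3"),
    ("Biology", "course 3"),
    ("Biology", "course 6"),
    ("Biology", "course 7"),
    ("Biology", "course 20"),
    ("Chemistry", "course 7"),
    ("Chemistry", "course 3"),
    ("Chemistry", "course 8"),
    ("Chemistry", "course 9"),
    ("Chemistry", "course 10"),
]


def courses_of(degree: str) -> list[str]:
    """Courses linked to a degree, in the order the pairs were recorded."""
    return [c for d, c in DEGREE_COURSE if d == degree]


def degrees_with(course: str) -> list[str]:
    """Degrees that include a given course."""
    return [d for d, c in DEGREE_COURSE if c == course]
```

    Adding a sixth course to Chemistry is then one more pair, with no change to the layout of the other degrees.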


    Attributes

    How to ensure data integrity

    Identify the functional dependency between attributes

    Example of dependency

    A town belongs to only one state

    Towns are unique

    There is a dependency between town and state

    If the town is known, the state can be determined unambiguously

    In a case of functional dependency, create a new entity to regroup the dependent attributes

    Create a relation between the new entity and the original one
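    The town/state dependency can be checked and factored out with a short Python sketch. The sample rows are simplified, hypothetical data based on the names used in the slides, and the functions fd_holds and factor_out are assumptions made for illustration. Note that sample data can only refute a dependency; whether it really holds is a modeling decision.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each lhs value maps to a single rhs value in rows."""
    seen = {}
    for row in rows:
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def factor_out(rows, lhs, rhs):
    """Split rows into (rows without rhs, lookup table lhs -> rhs)."""
    if not fd_holds(rows, lhs, rhs):
        raise ValueError(f"{lhs} does not determine {rhs} in this data")
    lookup = {row[lhs]: row[rhs] for row in rows}
    slim = [{k: v for k, v in row.items() if k != rhs} for row in rows]
    return slim, lookup


# Hypothetical, simplified sample rows
PEOPLE = [
    {"name": "Amos DAVID", "town": "Ibadan", "state": "Oyo state"},
    {"name": "Olu OJO", "town": "Ibadan", "state": "Oyo state"},
    {"name": "Uche KALU", "town": "Anambra", "state": "Anambra state"},
]

people, town_state = factor_out(PEOPLE, "town", "state")
```

    After the split, the state of Ibadan is stored once in town_state, so a change of state is a single update.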


    Attributes

    Examples

    Name, Street number, Street name, Town, Local government, State

    1. Amos DAVID; Dept computer science; UI Ibadan; Ibadan; Oyo state

    2. Olu OJO; 23; Aderemo street; Agbowo; Ibadan; Oyo state

    3. Samuel UCHE; 213 Sango road; Dugbe; Ibadan; Oyo state

    4. Uche KALU; 5; market road; Anambra; ; Anambra state

    Entries 1, 2 and 3 are redundant and prone to loss of integrity

    Entering the same (town, state) pair three times for entries 1, 2 and 3 may produce typographical errors

    Should a town change from one state to another, all these entries are no longer valid

    All of them must be modified


    Entity: graphical representation

    An entity is represented by a rectangle divided into two parts

    The name of the entity is represented at the upper part

    The names of the attributes are represented at the lower part

    The identifier is underlined

    PERSON (Number, Last name, First name, Date of birth)

    TOWN (Town name, State, Local government)


    Relation

    A relation specifies the association between two or more entities

    Example

    Town and Person

    The relation should specify the semantics of the association

    A person lives in a town


    Relation

    A relation is symbolized by an oval with its semantic inside the oval

    A relation is further specified by cardinalities that indicate the number of associated instances

    Example

    A person lives in a minimum of one town and in a maximum of one town

    A town is inhabited by a minimum of one person and a maximum of n persons (n indicating several)

    PERSON (Number, Last name, First name, Date of birth)

    (1,1) Lives in (1,n)

    TOWN (Name, Surface area, State, Local government)


    Relation

    A relation may sometimes have an attribute

    The attribute describes the relation and not the associated entities

    Example

    The quantity of an article bought by a client, as well as the date of purchase, is an attribute neither of the client nor of the article, but of the association

    The attributes of the relation are indicated in the lower part of the oval that represents the relation


    Relation

    CLIENT (Number, Last name, First name, Date of birth)

    (1,n) Bought [Quantity, Date] (1,m)

    ARTICLE (Name, Unit price)


    Relation

    Maximum cardinality

    This indicates the maximum cardinalities on either side of a relation

    Example

    PERSON (Number, Last name, First name, Date of birth)

    (1,1) Lives in (1,n)

    TOWN (Name, Surface area, State, Local government)

    Maximum cardinality: [n:1]


    Relation

    How to read the relations : recall

    A person lives in a minimum of 1 town and in a maximum of 1 town

    A town is inhabited by a minimum of one person and a maximum of n persons (several)

    PERSON (Number, Last name, First name, Date of birth)

    (1,1) Lives in (1,n)

    TOWN (Name, Surface area, State, Local government)

    Maximum cardinality: [n:1]


    Proposed methodology

    Identify an entity

    Develop the entity fully and choose the identifier

    Associate the entity with existing ones if and where necessary

    Establish the cardinalities



    Exercise

    In a shopping center, a client can buy one or more articles in various quantities. Propose an ER model for representing the elements of information necessary for billing the client.



    Relational model


    Relational model

    The basic concepts

    Relation

    Domain

    Attribute

    Key

    N-tuple (n-uplet)


    Relation (Table)

    STUDENT (Last name, First name, Date of birth, Degree)

    Attributes: the columns of the table

    Domain: the values of a column are of the same type (e.g. names)

    N-tuples: the rows of the table


    Domain

    Represents the data type of a column

    Can be defined by intension or by extension

    Defined by intension, it is specified by a formal definition

    Example

    Integer values

    Character strings of fewer than 20 characters

    Defined by extension, it is specified as a finite list of values

    Example

    Town : {Oyo, Ibadan, Lagos}
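    The two ways of defining a domain can be contrasted in a small Python sketch. A domain given by a formal definition becomes a membership test, while a domain given as a finite list becomes a set of allowed values. The function and constant names are assumptions made for illustration; the example values are those of the slide.

```python
# Domain given by a formal definition: a rule that any value is tested against.
def is_integer(value) -> bool:
    """Integer values (bool is excluded even though it subclasses int)."""
    return isinstance(value, int) and not isinstance(value, bool)


def is_short_string(value) -> bool:
    """Character strings of fewer than 20 characters."""
    return isinstance(value, str) and len(value) < 20


# Domain given as a finite list: the allowed values are enumerated.
TOWN_DOMAIN = frozenset({"Oyo", "Ibadan", "Lagos"})


def in_town_domain(value) -> bool:
    return value in TOWN_DOMAIN
```

    The enumerated form is easy to audit but must be updated whenever a new town is allowed; the rule-based form accepts any value that satisfies the definition.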


    Relation

    A relation R is represented as R(A1, ..., An)

    Where

    A1 takes its values from D1

    An takes its values from Dm

    m ≤ n (two attributes may share the same domain)


    Attribute

    An attribute specifies a constituent of the relation (a particular column of the table)

    Attributes are unique within a relation (each column must be distinguished from the others)

    Two columns should not have the same name

    Two attributes may have the same domain

    Example

    First name, Last name : NAMES


    Cartesian product of a relation

    Let R(A1, A2) be a relation

    The Cartesian product of the relation represents all the possible combinations of the values of the attributes

    Example

    Cars-parked (Make, Color)

    Where

    Make : makes of cars (Toyota, Peugeot)

    Color : colors of cars (red, black, white)


    Cartesian product of a relation

    Toyota, red

    Toyota, black

    Toyota, white

    Peugeot, red

    Peugeot, black

    Peugeot, white
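    The six combinations listed above can be generated with the standard library function itertools.product. This is a minimal sketch: the variable names are assumptions, and the cars_parked set is a hypothetical observation added only to show that the actual content of a relation is a subset of the Cartesian product.

```python
from itertools import product

MAKES = ["Toyota", "Peugeot"]
COLORS = ["red", "black", "white"]

# Every (make, color) combination, in the same order as on the slide
all_pairs = list(product(MAKES, COLORS))

# Hypothetical observation: the cars actually parked today
cars_parked = {("Toyota", "red"), ("Peugeot", "white")}
is_valid = cars_parked <= set(all_pairs)
```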


    The intension of a relation

    The intension of a relation specifies how the relation should be interpreted

    Example

    Cars-parked (Make, Color)

    Cars parked in front of the department of computer science, University of Ibadan


    N-tuple (n-uplet)

    The n-tuples of a relation make up its extension

    An n-tuple is an element of the Cartesian product of the attribute domains

    A row of the table

    Also called a record

    Example

    Person (First name, Surname)

    Amos, David

    John, Olaoye


    Schema of a relation

    The schema of a relation specifies the intension of the relation and the associated integrity constraints


    Constraint of data integrity

    Constraint on a single attribute

    Example

    The values of vehicle makes should be German vehicles

    Constraint based on two attributes

    Example

    The date of marriage should be later than the date of birth

    Constraint on the table n-tuples

    Example

    The number of registrations for a degree in one year should be limited to one
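    The three kinds of constraint can be sketched with Python's built-in sqlite3 module. Everything below is an illustrative assumption rather than the course's own schema: the table and column names, the short list of German makes, and the reading of the third constraint as "one registration per student per year". Dates are stored as ISO strings so that text comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Constraint on a single attribute (hypothetical list of German makes)
CREATE TABLE vehicle (
    plate TEXT PRIMARY KEY,
    make  TEXT NOT NULL CHECK (make IN ('Volkswagen', 'BMW', 'Mercedes'))
);
-- Constraint based on two attributes
CREATE TABLE person (
    number           INTEGER PRIMARY KEY,
    date_of_birth    TEXT NOT NULL,
    date_of_marriage TEXT CHECK (date_of_marriage > date_of_birth)
);
-- Constraint on the n-tuples of the table (assumed reading: per student, per year)
CREATE TABLE registration (
    student_number INTEGER NOT NULL,
    degree         TEXT NOT NULL,
    year           INTEGER NOT NULL,
    UNIQUE (student_number, year)
);
""")


def try_insert(sql: str, params: tuple) -> bool:
    """Run one INSERT; return False if an integrity constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

    For instance, try_insert("INSERT INTO vehicle VALUES (?, ?)", ("A2", "Toyota")) is rejected by the CHECK clause, and a second registration for the same student and year is rejected by the UNIQUE clause.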


    Maximum Key

    A set of attributes of a relation whose values are distinct for each n-tuple

    Example

    Person (Matriculation number, First name, Last name, Date of birth, email)

    All the attributes combined can form the maximum key


    Minimum key (the key)

    A key is a minimum set of attributes of a relation whose values are distinct for each n-tuple

    Examples

    Student number, first name, last name

    Student number and first name combined can be used as a key, but the student number alone is sufficient

    Student number, first name, last name, email

    Either student number or email can be used as the key
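    The difference between a set of attributes that merely identifies each row and a minimum key can be explored with a brute-force Python sketch. The sample students are hypothetical, and the functions is_unique and minimal_keys are names introduced here. Uniqueness in a sample is only evidence: choosing a key remains a design decision about all possible rows.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows share the same values on attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def minimal_keys(rows, attributes):
    """Attribute sets that are unique on rows and contain no smaller such set."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # already contains a smaller key, so not minimal
            if is_unique(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample data: two students share the same names
STUDENTS = [
    {"student_number": 1, "first_name": "Amos", "last_name": "David", "email": "amos@example.org"},
    {"student_number": 2, "first_name": "John", "last_name": "Olaoye", "email": "john@example.org"},
    {"student_number": 3, "first_name": "John", "last_name": "Olaoye", "email": "j.olaoye@example.org"},
]
ATTRIBUTES = ["student_number", "first_name", "last_name", "email"]

keys = minimal_keys(STUDENTS, ATTRIBUTES)
```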


    Relations in 1st normal form

    A relation is said to be in 1st normal form if all the attributes are single-valued


    Relations in 2nd normal form

    A relation is in 2nd normal form if and only if

    it is in 1st normal form and there is no FD

    between a subset of the key and the rest of theattributes

    This mean that the key of the relation must be a

    minimum key


    Relations in 3rd normal form

    A relation is in 3rd normal form if it is in 2nd normal form and there is no FD between non-key attributes
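    As an illustration of the 2nd normal form condition, the Python sketch below looks for non-key attributes that depend on only part of a composite key. The purchase rows are hypothetical, and the functions determines and partial_dependencies are names introduced here. As with any check on sample data, a dependency found this way is only a candidate to be confirmed by the designer.

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """True if equal values on lhs always go with equal values on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        if seen.setdefault(left, row[rhs]) != row[rhs]:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (subset of key, attribute) pairs where a proper subset of the key suffices."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            for attr in non_key:
                if determines(rows, subset, attr):
                    found.append((subset, attr))
    return found


# Hypothetical purchases keyed by (client, article)
PURCHASES = [
    {"client": 1, "article": "pen", "quantity": 2, "unit_price": 50},
    {"client": 1, "article": "book", "quantity": 1, "unit_price": 900},
    {"client": 2, "article": "pen", "quantity": 5, "unit_price": 50},
    {"client": 2, "article": "book", "quantity": 3, "unit_price": 900},
]

violations = partial_dependencies(
    PURCHASES, ("client", "article"), ("quantity", "unit_price")
)
```

    Here unit_price depends on article alone, so it belongs in a separate ARTICLE relation, while quantity genuinely depends on the whole key.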


    From ERM to RM

    1. To each entity corresponds a relation (entity name → relation name)

    2. To each attribute of an entity corresponds an attribute of the relation

    3. The identifier of the entity becomes the key of the relation

    4. For associations of maximum cardinality [1:n], add the key of the relation on the n side to the relation on the 1 side

    5. For associations of maximum cardinality [n:m], a new relation should be created using the concatenation of the keys of the associated relations as the key. The attributes of the association should be added as attributes of the new relation
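    The five rules can be tried out on the PERSON/TOWN and CLIENT/ARTICLE examples from the earlier slides, here sketched with Python's built-in sqlite3 module. The column types, the snake_case names and the inserted sample values are assumptions made for illustration. In the "Lives in" association, TOWN carries the (1,n) cardinality and PERSON the (1,1), so PERSON receives the key of TOWN.

```python
import sqlite3

SCHEMA = """
-- Rules 1 to 3: one relation per entity, the identifier becomes the key
CREATE TABLE town (
    name TEXT PRIMARY KEY,
    surface_area REAL,
    state TEXT,
    local_government TEXT
);
-- Rule 4: PERSON (1,1) lives in TOWN (1,n), so PERSON imports the key of TOWN
CREATE TABLE person (
    number INTEGER PRIMARY KEY,
    last_name TEXT,
    first_name TEXT,
    date_of_birth TEXT,
    town_name TEXT NOT NULL REFERENCES town(name)
);
CREATE TABLE client (
    number INTEGER PRIMARY KEY,
    last_name TEXT,
    first_name TEXT,
    date_of_birth TEXT
);
CREATE TABLE article (
    name TEXT PRIMARY KEY,
    unit_price REAL
);
-- Rule 5: the [n:m] association becomes a relation keyed by both imported keys
CREATE TABLE bought (
    client_number INTEGER NOT NULL REFERENCES client(number),
    article_name TEXT NOT NULL REFERENCES article(name),
    quantity INTEGER NOT NULL,
    purchase_date TEXT NOT NULL,
    PRIMARY KEY (client_number, article_name)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when enabled
conn.executescript(SCHEMA)

# Hypothetical sample rows
conn.execute("INSERT INTO client VALUES (1, 'OJO', 'Olu', '1990-01-01')")
conn.execute("INSERT INTO article VALUES ('pen', 50.0)")
conn.execute("INSERT INTO bought VALUES (1, 'pen', 3, '2019-08-02')")

# Amount to bill client 1: quantity times unit price, summed over purchases
total = conn.execute(
    "SELECT SUM(b.quantity * a.unit_price) FROM bought b "
    "JOIN article a ON a.name = b.article_name WHERE b.client_number = 1"
).fetchone()[0]
```

    With the key (client_number, article_name), a client cannot buy the same article on two different dates; if that must be allowed, the key has to be extended or redefined.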


    REMARKS

    A collection of relations obtained from an entity-relation model as described above will have the following characteristics

    Each attribute is single-valued

    The key contains the least number of attributes

    There are no dependencies between the non-key attributes

    A collection of relations that has the above characteristics is considered to be in 3rd normal form


    Important problem

    Some relations resulting from the translation may not have keys

    In this case, define a new key

    This happens occasionally, particularly in the transformation of [n:m] associations


    Graph of relations

    Specify by pointed arrows the origin of imported attributes

    Redraw the relations in form of rectangles

    Use the pointed arrows to link the relations

    REMARKS

    The graph should contain no cycle (no closed loop of arrows)


    Practical

    Model the following types of person in the university

    the students

    the members of staff

    Be sure to apply the methods for reducing redundancy and guaranteeing data integrity


    Practical : University students

    Description

    Each student has a number, a name, an address

    A student is registered for a degree

    A student may not register for more than one degree simultaneously

    A student may take several degrees from the university

    To a degree is associated a set of courses

    A degree is managed by a department

    A course is offered by only one department

    Example of questions

    What are the courses associated with a degree ?

    What are the courses taken by a student for a degree ?

    What are the courses offered by a department ?

    Propose an ER model


    Practical : Documentary information system

    A library contains the following types of document

    Books

    Journals that contain articles

    Proceedings that contain articles

    Write-ups for master and PhD works

    Books and write-ups are described using title, authors, and a list of keywords

    Journals and proceedings are described using the title, editor and year of publication

    Articles are represented using the title, authors, their addresses and a list of keywords

    The authorized keywords for describing the documents are represented using a thesaurus

    Propose an ER model and the associated relational model (RM) to manage the information on the various types of document in the library as well as the thesaurus


    A thesaurus

    List of concepts linked by semantic links

    The semantic links are

    Specific/Generic link (hierarchy)

    See also (association)

    Used for (synonymy)



    Example of a thesaurus

    [Diagram. Concepts: Transport; Plane, Boat, Car, Vehicle; Boeing, Airbus, Mercedes, Peugeot. Link labels: three Specific/generic links and one See also link]