IST210 1
Chapter 2. The Relational Model
Chapter 1 Review
• Potential problems with lists
– Deletion
– Update
– Insertion
• Avoid these problems using a relational database:
– Break a list into several tables, one table for each theme
– Tables can be joined together using the value of the data
• So far, we have been vague about what we mean by “table”, “theme”, and “value of the data”…
Student Course
Chapter 2 Objective
• Learn the concept of the relational model
• Learn the meaning and importance of keys, foreign keys, and related terminology
• Learn the meaning of functional dependencies
• Learn to apply a process for normalizing relations
Entity
• An entity is something of importance to a user that needs to be represented in a database
• An entity represents one theme or topic
– An object (e.g., book) with various characteristics (e.g., authors, publisher, ISBN)
Relation
• A relation is a two-dimensional table that has specific characteristics
• The table dimensions, like a matrix, consist of rows and columns
Characteristics of a Relation
• Columns contain data about attributes of the entity class
– Each column has a unique name
– All entries in a column are the same kind
• Rows contain data about an entity instance
– Cells of the table hold a single value
– No two rows may be identical
• The order of the columns is unimportant
• The order of the rows is unimportant

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989
Exercise: Relation or Non-Relation?
EmployeeNumber  Phone               LastName
100             335-6421, 454-9744  Abernathy
101             215-7789            Cadley
104             610-9850            Copley
107             299-9090            Jackson

Not a relation: a cell of the table holds multiple values
Exercise: Relation or Non-Relation?
EmployeeNumber Phone LastName
100 335-6421 Abernathy
101 215-7789 Cadley
104 610-9850 Copley
100 335-6421 Abernathy
107 299-9090 Jackson
Not a relation: the row for employee 100 appears twice, and no two rows may be identical
Exercise: Relation or Non-Relation?
EmployeeNumber  FirstName  LastName
100             Mary       Abernathy
101             Jerry      Cadley
104             Alex       Copley
107             Megan      Jackson
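
The three exercises above can also be checked mechanically. The sketch below is only an illustration: the function name `is_relation` and the choice to store a multi-valued cell as a Python tuple are assumptions made for this example, and it does not test that all entries in a column are the same kind.

```python
def is_relation(columns, rows):
    """Check a table against some of the characteristics of a relation.

    columns: list of column names; rows: list of tuples, one per row.
    Returns (ok, reason).
    """
    if len(set(columns)) != len(columns):
        return False, "column names are not unique"
    for row in rows:
        if len(row) != len(columns):
            return False, "row width does not match the columns"
        for cell in row:
            # Assumption: a multi-valued cell is stored as a list, tuple or set.
            if isinstance(cell, (list, tuple, set)):
                return False, "a cell holds multiple values"
    if len(set(rows)) != len(rows):
        return False, "two rows are identical"
    return True, "relation"


# Exercise 1: employee 100 has two phone numbers in one cell.
exercise_1 = [
    (100, ("335-6421", "454-9744"), "Abernathy"),
    (101, "215-7789", "Cadley"),
    (104, "610-9850", "Copley"),
    (107, "299-9090", "Jackson"),
]

# Exercise 2: the row for employee 100 appears twice.
exercise_2 = [
    (100, "335-6421", "Abernathy"),
    (101, "215-7789", "Cadley"),
    (104, "610-9850", "Copley"),
    (100, "335-6421", "Abernathy"),
    (107, "299-9090", "Jackson"),
]

# Exercise 3: single-valued cells, no repeated rows.
exercise_3 = [
    (100, "Mary", "Abernathy"),
    (101, "Jerry", "Cadley"),
    (104, "Alex", "Copley"),
    (107, "Megan", "Jackson"),
]

phone_columns = ["EmployeeNumber", "Phone", "LastName"]
name_columns = ["EmployeeNumber", "FirstName", "LastName"]

result_1 = is_relation(phone_columns, exercise_1)
result_2 = is_relation(phone_columns, exercise_2)
result_3 = is_relation(name_columns, exercise_3)
print(result_1, result_2, result_3)
```

Under these assumptions, only the third table passes every check.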
Presenting Relation Structure
Original table:

RELATION_NAME
Column 1  Column 2  …  Column n

Relation representation:

RELATION_NAME(Column 1, Column 2, …, Column n)

Example, original table:

STUDENT
StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

Example, relation representation:

STUDENT(StudentID, FirstName, LastName, DOB)

From now on, we will frequently use this presentation for relations
Terminology
Synonyms…
Table Row Column
File Record Field
Relation Tuple Attribute
Key
• A key is one (or more) columns of a relation that is (are) used to uniquely identify a row
– E.g.: OrderID is a key to uniquely identify an order
– A composite key is a key that contains two or more attributes
• E.g.: (BuildingNumber, ApartmentNumber) is a composite key to uniquely identify an apartment
Example: Key

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

What attribute(s) form a key?
• StudentID: yes
• FirstName: no
• (FirstName, LastName): yes (in this table), but no (if there are thousands of records, there could be students with the same first name and last name)
• (FirstName, LastName, DOB): yes (in this table), but no (if more records)
• (StudentID, FirstName): yes, but …
• (StudentID, FirstName, LastName, DOB): yes, but …

A key is one (or more) columns of a relation that is (are) used to uniquely identify a row.
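
The "yes (in this table)" answers can be reproduced with a small uniqueness test. This is a minimal sketch: the helper name `is_unique` is an assumption for illustration, and it can only say whether a column set identifies rows in the sample data. Whether the columns stay unique as more records arrive is a judgment about the domain that code cannot make.

```python
STUDENT = [
    {"StudentID": 9123450, "FirstName": "John", "LastName": "Smith", "DOB": "Jan. 1, 1989"},
    {"StudentID": 9123451, "FirstName": "John", "LastName": "Adam", "DOB": "Jun. 1, 1988"},
    {"StudentID": 9123452, "FirstName": "Jane", "LastName": "Adam", "DOB": "Aug. 1, 1989"},
    {"StudentID": 9123453, "FirstName": "Josh", "LastName": "Cohen", "DOB": "Aug. 1, 1989"},
]


def is_unique(rows, columns):
    """Return True if no two rows share the same values in the given columns."""
    seen = set()
    for row in rows:
        value = tuple(row[c] for c in columns)
        if value in seen:
            return False
        seen.add(value)
    return True


for columns in (
    ["StudentID"],
    ["FirstName"],
    ["FirstName", "LastName"],
    ["StudentID", "FirstName"],
):
    print(columns, is_unique(STUDENT, columns))
```

FirstName alone fails because "John" appears twice, while (FirstName, LastName) passes only because this sample happens to contain no two students with the same full name.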
Candidate Key
StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

What attribute(s) form a key?
• StudentID: yes, a candidate key
• FirstName: no
• (FirstName, LastName): yes (in this table), but no (if there are thousands of records, there could be students with the same first name and last name)
• (FirstName, LastName, DOB): yes (in this table), but no (if more records)
• (StudentID, FirstName): yes, but not a candidate key
• (StudentID, FirstName, LastName, DOB): yes, but not a candidate key

• A candidate key is called “candidate” because it is a candidate to become the primary key
– A special key
– If a proper subset of a key is also a key, we usually don’t consider it as a candidate key
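
The subset rule can be sketched in code: list every column combination that is unique in the sample, then keep only those with no unique proper subset. The function names `unique_column_sets` and `minimal_sets` are assumptions for this illustration. On four rows the result includes name-based and date-based combinations that would stop being unique with more records, so the output is a list of suspects, not a list of true candidate keys.

```python
from itertools import combinations

COLUMNS = ("StudentID", "FirstName", "LastName", "DOB")
ROWS = [
    (9123450, "John", "Smith", "Jan. 1, 1989"),
    (9123451, "John", "Adam", "Jun. 1, 1988"),
    (9123452, "Jane", "Adam", "Aug. 1, 1989"),
    (9123453, "Josh", "Cohen", "Aug. 1, 1989"),
]


def unique_column_sets(columns, rows):
    """Return every combination of columns whose values are unique in rows."""
    index = {name: i for i, name in enumerate(columns)}
    result = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            projected = {tuple(row[index[c]] for c in combo) for row in rows}
            if len(projected) == len(rows):
                result.append(combo)
    return result


def minimal_sets(column_sets):
    """Keep only the sets that have no proper subset in column_sets."""
    as_sets = [frozenset(c) for c in column_sets]
    return [
        combo for combo in column_sets
        if not any(other < frozenset(combo) for other in as_sets)
    ]


keys_in_sample = unique_column_sets(COLUMNS, ROWS)
minimal_in_sample = minimal_sets(keys_in_sample)
for combo in minimal_in_sample:
    print(combo)
```

(StudentID, FirstName) is dropped because StudentID alone is already unique, which matches the "not a candidate key" answers above. Of the remaining combinations, only StudentID would be expected to stay unique in a real student database.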
Primary Key
• A primary key is a candidate key chosen to be the main key for the relation
– A relation can only have one primary key
– Each candidate key could be chosen as a primary key, but we usually have preferences

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

Primary key: StudentID
Primary Key: Discussion
STUDENT(StudentID, FirstName, LastName, DOB, SSN)
SSN: Candidate key? Good to be a primary key?
Even if SSN is a candidate key, we still prefer choosing StudentID as the primary key, because SSN is sensitive information.

STUDENT(StudentID, FirstName, LastName, DOB, HomeAddress)
HomeAddress: Candidate key? Good to be a primary key?
Even if HomeAddress could be a candidate key, we still prefer choosing StudentID as the primary key, because (1) HomeAddress might have duplicates, and (2) HomeAddress is a string, which is harder to index and query, while StudentID is a numeric value.
Presenting Primary Key
• Single Key
RELATION_NAME(Column 1, Column 2, …, Column n)
• STUDENT(StudentID, FirstName, LastName, DOB)
– The underline of StudentID indicates StudentID is the primary key of this relation
• Composite Key
RELATION_NAME(Column 1, Column 2, …, Column n)
• REGISTRATION(StudentID, CourseID, RegistrationDate)
– The underline of StudentID and CourseID indicates (StudentID, CourseID) is the composite primary key of this relation
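
As an illustration of what these declarations mean in a running database, the sketch below creates both relations in an in-memory SQLite database using Python's standard `sqlite3` module. The column types, the course IDs and the registration dates are assumptions made up for the example. The helper `try_insert` is likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-column primary key.
conn.execute("""
    CREATE TABLE STUDENT (
        StudentID INTEGER PRIMARY KEY,
        FirstName TEXT, LastName TEXT, DOB TEXT
    )""")

# Composite primary key: the pair (StudentID, CourseID) must be unique.
conn.execute("""
    CREATE TABLE REGISTRATION (
        StudentID INTEGER, CourseID TEXT, RegistrationDate TEXT,
        PRIMARY KEY (StudentID, CourseID)
    )""")


def try_insert(sql, params):
    """Return True if the row was accepted, False if a key constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


insert_reg = "INSERT INTO REGISTRATION VALUES (?, ?, ?)"
first = try_insert(insert_reg, (9123450, "IST210", "2024-01-10"))
other_course = try_insert(insert_reg, (9123450, "IST220", "2024-01-10"))
duplicate = try_insert(insert_reg, (9123450, "IST210", "2024-02-01"))
print(first, other_course, duplicate)  # True True False
```

The same student may register for a second course, but a second row with the same (StudentID, CourseID) pair is rejected, because the pair as a whole identifies a registration.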
How to Choose a Primary Key?

CustomerName  HomeAddress     DOB
John Smith    293 Main St     Jan. 1, 1989
John Adam     10 Green Rd     Jun. 1, 1988
Jane White    111 University  Aug. 1, 1989
Josh Cohen    12 Beaver       Aug. 1, 1989

Candidate keys: CustomerName? HomeAddress? DOB?
Primary key?

What if none of the existing attributes is appropriate?
Answer: artificially create a new attribute
A Surrogate Key
• A surrogate key is a unique numeric value that is added to a relation to serve as the primary key
– System generated
– Contains no semantic meaning
• Surrogate keys are very commonly used. A surrogate key is often used to replace a composite primary key or a non-numeric primary key
– (FirstName, LastName, DOB) → StudentID
– HomeAddress → CustomerID
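
A minimal sketch of "system generated": the code below adds a CustomerID to the customer rows from the earlier slide by counting upward from 1. The function name `add_surrogate_key` and the starting value are assumptions for illustration. A real database system would normally generate these values itself.

```python
from itertools import count

customers = [
    {"CustomerName": "John Smith", "HomeAddress": "293 Main St", "DOB": "Jan. 1, 1989"},
    {"CustomerName": "John Adam", "HomeAddress": "10 Green Rd", "DOB": "Jun. 1, 1988"},
    {"CustomerName": "Jane White", "HomeAddress": "111 University", "DOB": "Aug. 1, 1989"},
    {"CustomerName": "Josh Cohen", "HomeAddress": "12 Beaver", "DOB": "Aug. 1, 1989"},
]


def add_surrogate_key(rows, key_name, start=1):
    """Return copies of rows with a new numeric key column placed first."""
    ids = count(start)
    return [{key_name: next(ids), **row} for row in rows]


customers_with_id = add_surrogate_key(customers, "CustomerID")
for row in customers_with_id:
    print(row["CustomerID"], row["CustomerName"])
```

The generated numbers say nothing about the customer, which is the point: they stay valid even if a name or address changes.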
Surrogate Key Examples
• Penn State database
– StudentID
• Membership database
– MembershipID
• Online shopping
– OrderNumber
Review
• Key: StudentID, (StudentID, FirstName), …
• Candidate key: StudentID
• Primary key: StudentID
• Surrogate key: StudentID

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989
Exercise: True/False
• Candidate keys may or may not be unique.
• The primary key is used to identify unique rows in a relation.
• Surrogate key values have no meaning to the users.
• A surrogate key is not a primary key.
Exercise: True/False
• Suppose we have the following table:
BOOK(BookID, Title, Publisher, Year)
• Q1. Title is a key in BOOK?
• Q2. (BookID, Title) is a key?
• Q3. (BookID, Title) is a candidate key?
• Q4. BookID is a surrogate key added to serve as a primary key?
And one more question…
• What’s the primary key of this table?
EmployeeID  EmployeeName  Skill           Work Location
101         Jones         Typing          114 Main Street
101         Jones         Shorthand       114 Main Street
101         Jones         Whittling       114 Main Street
102         Bravo         Light Cleaning  73 Industrial Way
103         Ellis         Alchemy         73 Industrial Way
103         Ellis         Flying          73 Industrial Way
104         Harrison      Light Cleaning  73 Industrial Way