Post on 19-Dec-2015
CSC443 Database Management
Course Introduction
Professor Pepper
Adapted from presentations given by Professor Juliana Freire, Karl Aberer, and Yan Chen, and by Silberschatz, Korth and Sudarshan
Major Course Objectives
Design and diagram relational databases
Create Access and Oracle databases
Use SQL commands
Be able to design a good relational database
Know how to get information out of a database to answer any question
Books
Database System Concepts 5th Ed
  Theory
  Cross reference for fourth ed
Oracle 9i Programming - A Primer
  Practical examples
See course syllabus
Available in Library
Learning Resources
Blackboard: my.adelphi.edu
Web site for Database System Concepts: www.db-book.com/
My office hours: Tuesday & Thursday 12:15-1:30; Wed 12-12:30, Alumni 114 or Science Lab
My email: pepper@adelphi.edu
My phone: 516-747-2362
My Web: www.adelphi.edu/~pepperk
Projects / Grading
Projects: 40% (Access: 15, Oracle: 25)
Homework assignments: 20%
Midterm: 20%
Final: 20%
Delivering assignments
Email
ftp
Drop box
Discussion board
Mailbox in math department
E-mail me if making a change in delivery place.
Forward your email from Adelphi
What is a Database Management System?
Database Management System = DBMS
  A collection of files that store the data
  A big program written by someone else that accesses and updates those files for you
Relational DBMS = RDBMS
  Data files are structured as relations (tables)
What is behind this Web Site?
http://www.ticketmaster.com/
Search on a large database
Specify search conditions
Many users
Updates
Access through a web interface
Central to Modern Computer Science
Other databases you may use
Databases are EVERYWHERE
Current Commercial Outlook
A major part of the software industry:
  Oracle, IBM, Microsoft, Sybase
  also Informix (now IBM), Teradata
  smaller players: Java-based DBMS, devices, OO, …
Well-known benchmarks (esp. TPC)
Lots of related industries
  data warehouse, document management, storage, backup, reporting, business intelligence, app integration
Relational products dominant and evolving
  adapting for extensibility (user-defined types), adding native XML support
Open Source coming on strong
  MySQL, PostgreSQL, BerkeleyDB
Why Study Databases??
Need has exploded
  Corporate: retail swipe/clickstreams, “customer relationship mgmt”, “supply chain mgmt”, “data warehouses”, etc.
  Scientific: digital libraries, Human Genome project, NASA Mission to Planet Earth, physical sensors, grid physics network
Why study databases?
Data is valuable: bank account records, tax records, student records…
Protect It! - no matter what
• Hurricane
• Flood
• Human error
Why study databases?
Data is often structured:
Example: Bank account records all follow the same structure
We can exploit this regular structure
  To retrieve data in useful ways (that is, we can use a query language)
  To store data efficiently
Why Study Databases Summary
Central to modern computer science
Databases are everywhere
Commercially successful
Fast moving technology
Plethora of structured data that businesses and people need
Database Definition
Database – a very large, integrated collection of data. (the stuff)
Models a real-world enterprise
  Entities (e.g., teams, games)
  Relationships (e.g., The Forty-Niners are playing in The Superbowl)
Database Management System – software that stores and manages databases (the tools)
A database is better than a simple file system, because a file system suffers from:
Data redundancy, inconsistency and isolation
Difficulty accessing data
Integrity problems
Non-atomic updates (change one file and die before the other completes)
Multiple-user issues
So a Database Has:
representing information
  data modeling
languages and systems for querying data
  complex queries with real semantics* over massive data sets
concurrency control for data manipulation
  controlling concurrent access
  ensuring transactional semantics
reliable data storage
  maintain data semantics even if you pull the plug
• * semantics: the meaning or relationship of meanings of a sign or set of signs
Describing Data: Data Models
A data model is a collection of concepts for describing data.
A schema is a description of a particular collection of data, using a given data model.
A relation is the data stored in a certain schema.
The relational model of data is the most widely used model today.
  Entities and relations among them
  Integrity constraints and business rules
  Perspective dependent (warehouse & sales view an item differently)
Database Design
The process of designing the general structure of the database:
Logical Design – Deciding on the database schema.
  Business decision – What attributes
  Computer Science decision – What relation schemas
Physical Design – Deciding on the physical layout of the database
Data Models
A collection of tools for describing
  Data
  Data relationships
  Data semantics
  Data constraints
Relational model
Entity-Relationship data model (mainly for database design)
Object-based data models (Object-oriented and Object-relational)
Semistructured data model (XML)
Other older models:
  Network model
  Hierarchical model
The Entity-Relationship Model
Models an enterprise as a collection of entities and relationships
  Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects
  • Described by a set of attributes
  Relationship: an association among several entities
Represented diagrammatically by an entity-relationship diagram:
Relational Model
ER for concept; map to Algebraic Relational Model
Relations (tables of possible data)
Instance (actual data at a given time)
Schema (description of those tables, their relations)
Relational Model Look
Notation: $\sigma_p(r)$
$p$ is called the selection predicate
Defined as:
$\sigma_p(r) = \{t \mid t \in r \text{ and } p(t)\}$
where $p$ is a formula in propositional calculus consisting of terms connected by: $\land$ (and), $\lor$ (or), $\lnot$ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: $=, \neq, >, \geq, <, \leq$
Example of selection:
$\sigma_{branch\_name = \text{“Perryridge”}}(account)$
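The selection operator can be mimicked in a few lines of Python as a rough sketch. The helper name `select` and the sample `account` rows are made up for illustration; a real DBMS evaluates selections very differently (indexes, query plans), so this only mirrors the set definition.

```python
# Sketch of relational selection: keep the tuples t of r for which p(t) holds.
# The rows below are invented sample data, not taken from the course.

def select(p, r):
    """Return the tuples of relation r (a list of dicts) that satisfy predicate p."""
    return [t for t in r if p(t)]

account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 400},
    {"account_number": "A-201", "branch_name": "Perryridge", "balance": 900},
]

# Selection with the predicate branch_name = "Perryridge"
perryridge = select(lambda t: t["branch_name"] == "Perryridge", account)

# Terms can be combined with and / or / not, as in the definition.
big_perryridge = select(
    lambda t: t["branch_name"] == "Perryridge" and t["balance"] > 500,
    account,
)

for t in perryridge:
    print(t["account_number"], t["balance"])
```

The predicate is passed as an ordinary function, which keeps the code close to the notation: the relation goes in, a smaller relation with the same attributes comes out.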
Object-Relational Data Models
Extend the relational data model by including object orientation and constructs to deal with added data types.
Allow attributes of tuples to have complex types, including non-atomic values such as nested relations.
Preserve relational foundations, in particular the declarative access to data, while extending modeling power.
Provide upward compatibility with existing relational languages.
Design Goals
Avoid redundant data
Ensure that relationships among attributes are represented
Ensure constraints are properly modeled: updates check for violation of database integrity constraints.
Some Basic SQL Commands
Select – Get rows of data
* – everything
From – the name of the table (relation) will follow
Where – Only get the stuff that matches
Example: Select * from movies where theater = 'Loews'
Exercise – Write down the query to select all of your friends that live in NY State
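To try the movies example, here is a small sketch using Python's built-in `sqlite3` module rather than Access or Oracle, which the course actually uses. The `movies` table layout (`title`, `theater`) and its rows are assumptions made up for the demonstration.

```python
import sqlite3

# In-memory SQLite database; the table layout and rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, theater TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Casablanca", "Loews"), ("Vertigo", "Regal"), ("Psycho", "Loews")],
)

# Select = which columns (* means all), From = which table, Where = which rows.
rows = conn.execute("SELECT * FROM movies WHERE theater = 'Loews'").fetchall()

for row in rows:
    print(row)
```

Note the quotes around 'Loews': without them, SQL would treat Loews as a column name rather than a text value.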
Example: University Database
Conceptual schema:
  Students(sid: string, name: string, login: string, age: integer, gpa: real)
  Courses(cid: string, cname: string, credits: integer)
  Enrolled(sid: string, cid: string, grade: string)
External Schema (View):
  Course_info(cid: string, enrollment: integer)
Physical schema:
  Relations stored as unordered files.
  Index on first column of Students.
Key to good performance
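[Figure: levels of abstraction. View 1, View 2, and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema, which sits on top of the DB.]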
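The three levels of the university example can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: SQLite type names stand in for string/integer/real, the definition of the `Course_info` view (a count of enrollments per course) is a guess at what the slide intends, the index name `students_sid` is invented, and the rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema: the three relations from the slide.
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- External schema: a view (assumed definition: enrollment count per course).
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;

-- Physical schema: an index on the first column of Students.
CREATE INDEX students_sid ON Students (sid);
""")

# Invented sample rows.
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
                 [("s1", "Ann", "ann", 20, 3.5), ("s2", "Bob", "bob", 21, 3.1)])
conn.executemany("INSERT INTO Courses VALUES (?, ?, ?)",
                 [("CSC443", "Database Management", 3), ("MTH101", "Calculus", 4)])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("s1", "CSC443", "A"), ("s2", "CSC443", "B"), ("s1", "MTH101", "A")])

# Users of the view never see the Enrolled table directly.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)
```

Dropping or adding the index changes only the physical level: the view and the queries against it stay the same.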
Data Independence (levels of abstraction)
Applications insulated from how data is structured and stored.
Logical data independence: Protection from changes in logical structure of data – stabilize views.
Physical data independence: Protection from changes in physical structure of data.
Q: Why are these particularly important for DBMS?
Queries
Change and get data from a database
Run over data model
Easy & efficient
Not good for complex calculations
DML and DDL
Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by the appropriate data model
DML also known as query language
Two classes of languages
  Procedural – user specifies what data is required and how to get those data
  Declarative (nonprocedural) – user specifies what data is required without specifying how to get those data
SQL is the most widely used query language
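The difference between the two classes can be shown side by side. The sketch below uses Python's built-in `sqlite3` module; the `account` rows and the 450 threshold are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-102", 400), ("A-201", 900)],  # invented sample rows
)

# Procedural style: we spell out HOW to get the data
# (visit every row, test it, collect the matches).
procedural = []
for number, balance in conn.execute("SELECT account_number, balance FROM account"):
    if balance > 450:
        procedural.append(number)

# Declarative style: we state only WHAT we want;
# the DBMS decides how to find it (scan, index, ...).
declarative = [
    row[0]
    for row in conn.execute("SELECT account_number FROM account WHERE balance > 450")
]

print(sorted(procedural), sorted(declarative))
```

Both produce the same answer, but only the declarative version leaves the DBMS free to pick a faster access path later without changing the program.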
Data Definition Language (DDL)
Specification notation for defining the database schema
Example:
create table account (
    account_number char(10),
    balance integer)
DDL compiler generates a set of tables stored in a data dictionary
Data dictionary contains metadata (i.e., data about data)
  Database schema
  Data storage and definition language
  • Specifies the storage structure and access methods used
  Integrity constraints
  • Domain constraints
  • Referential integrity (references constraint in SQL)
  • Assertions
  Authorization
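To see a data dictionary in action, the sketch below runs the `account` DDL through Python's built-in `sqlite3` module and then reads the metadata back. The catalog table `sqlite_master` and `PRAGMA table_info` are SQLite-specific; Oracle and Access expose their dictionaries differently, so treat this only as an illustration of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema (same table as the slide's example).
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# The DBMS records the definition in its own catalog: data about data.
# In SQLite the catalog is the table sqlite_master.
catalog = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()

# Column-level metadata: each row is (cid, name, type, notnull, default, pk).
columns = conn.execute("PRAGMA table_info(account)").fetchall()

for entry in catalog:
    print(entry)
for column in columns:
    print(column[1], column[2])
```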
Queries - What does it look like?
System handles query plan generation & optimization; ensures correct execution.
SELECT eid, ename, title
FROM Emp E
WHERE E.sal > 50000

SELECT E.loc, AVG(E.sal)
FROM Emp E
GROUP BY E.loc
HAVING COUNT(*) > 5

SELECT COUNT(DISTINCT E.eid)
FROM Emp E, Proj P, Asgn A
WHERE E.eid = A.eid
  AND P.pid = A.pid
  AND E.loc <> P.loc
Issues: view reconciliation, operator ordering, physical operator choice, memory management, access path (index) use, …
[Figure: query plan trees for the three queries, built from the operators Select, Group(agg), Having, Join, and Count distinct over Emp, Asgn, and Proj; the schema relates Employees and Projects through Assignments.]
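The three queries can be run against a toy database to see what each returns. This sketch uses Python's built-in `sqlite3` module; the column lists for `Emp`, `Proj`, and `Asgn` are guessed from the queries, and all rows are invented. `EXPLAIN QUERY PLAN` is SQLite's way of showing the plan it chose, and its output format varies between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp  (eid INTEGER, ename TEXT, title TEXT, sal INTEGER, loc TEXT);
CREATE TABLE Proj (pid INTEGER, loc TEXT);
CREATE TABLE Asgn (eid INTEGER, pid INTEGER);
""")

# Invented data: six employees in NY (salaries 50000..100000), two in LA.
emps = [(i, f"emp{i}", "Engineer", 40000 + 10000 * i, "NY") for i in range(1, 7)]
emps += [(7, "emp7", "Analyst", 70000, "LA"), (8, "emp8", "Analyst", 70000, "LA")]
conn.executemany("INSERT INTO Emp VALUES (?, ?, ?, ?, ?)", emps)
conn.executemany("INSERT INTO Proj VALUES (?, ?)", [(10, "NY"), (20, "LA")])
conn.executemany("INSERT INTO Asgn VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 20), (7, 10), (7, 20)])

# Query 1: a simple selection.
high_paid = conn.execute(
    "SELECT eid, ename, title FROM Emp E WHERE E.sal > 50000").fetchall()

# Query 2: average salary per location, only for locations with more than 5 employees.
by_loc = conn.execute(
    "SELECT E.loc, AVG(E.sal) FROM Emp E GROUP BY E.loc HAVING COUNT(*) > 5"
).fetchall()

# Query 3: how many distinct employees work on a project outside their own location.
join_sql = """
SELECT COUNT(DISTINCT E.eid)
FROM Emp E, Proj P, Asgn A
WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc
"""
cross_loc = conn.execute(join_sql).fetchone()[0]

# Ask SQLite which plan it picked for the join (output format is version dependent).
plan = conn.execute("EXPLAIN QUERY PLAN " + join_sql).fetchall()

print(len(high_paid), by_loc, cross_loc)
for step in plan:
    print(step)
```

The programmer wrote no loops and picked no join order; those choices belong to the optimizer.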
SQL
SQL: widely used non-procedural language
Example: Find the name of the customer with customer-id 192-83-7465
select customer.customer_name
from customer
where customer.customer_id = '192-83-7465'
Example: Find the balances of all accounts held by the customer with customer-id 192-83-7465
select account.balance
from depositor, account
where depositor.customer_id = '192-83-7465' and
      depositor.account_number = account.account_number
Application programs generally access databases through one of
  Language extensions to allow embedded SQL
  Application program interface (e.g., ODBC/JDBC) which allows SQL queries to be sent to a database
For us: Oracle and Access SQL languages
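As a rough illustration of the application-program-interface route, the sketch below sends the two queries to a database from a Python program. Python's built-in `sqlite3` module plays the role that ODBC or JDBC would play elsewhere; the helper names `customer_name` and `balances_for`, the table layouts, and the rows are all assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer  (customer_id TEXT, customer_name TEXT);
CREATE TABLE account   (account_number TEXT, balance INTEGER);
CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
-- Invented sample rows.
INSERT INTO customer  VALUES ('192-83-7465', 'Johnson');
INSERT INTO account   VALUES ('A-101', 500), ('A-201', 900), ('A-305', 350);
INSERT INTO depositor VALUES ('192-83-7465', 'A-101'), ('192-83-7465', 'A-201');
""")

def customer_name(conn, customer_id):
    """Name of the customer with the given id, or None if there is no such customer."""
    row = conn.execute(
        "SELECT customer.customer_name FROM customer "
        "WHERE customer.customer_id = ?",
        (customer_id,),  # the value is passed separately, never pasted into the SQL text
    ).fetchone()
    return row[0] if row else None

def balances_for(conn, customer_id):
    """Balances of all accounts held by the given customer, smallest first."""
    rows = conn.execute(
        "SELECT account.balance FROM depositor, account "
        "WHERE depositor.customer_id = ? "
        "AND depositor.account_number = account.account_number",
        (customer_id,),
    ).fetchall()
    return sorted(balance for (balance,) in rows)

print(customer_name(conn, "192-83-7465"), balances_for(conn, "192-83-7465"))
```

Passing the customer id as a `?` parameter instead of building the SQL string by hand avoids quoting mistakes and SQL injection.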
Concurrency Control
Concurrent execution of user programs: key to good DBMS performance.
  Disk accesses frequent, pretty slow
  Keep the CPU working on several programs concurrently.
Interleaving actions of different programs: trouble!
  e.g., account-transfer & print statement at same time
DBMS ensures such problems don’t arise.
  Users/programmers can pretend they are using a single-user system. (called “Isolation”)
  Thank goodness! Don’t have to program “very, very carefully”.
Transactions: ACID Properties
Key concept is a transaction: a sequence of database actions (reads/writes).
DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle.
Each transaction, executed completely, must take the DB between consistent states or must not run at all.
DBMS ensures that concurrent transactions appear to run in isolation.
DBMS ensures durability of committed transactions even if system crashes.
DBMS can enforce simple integrity constraints on the data.
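Atomicity is easiest to appreciate with a money transfer. The sketch below uses Python's built-in `sqlite3` module, where `with conn:` commits if the block finishes and rolls back if it raises. The `transfer` and `balances` helpers, the account numbers, and the "no negative balance" rule are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")
with conn:  # commit the invented starting balances
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("A-101", 500), ("A-102", 400)])

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE account_number = ?",
            (amount, src))
        (remaining,) = conn.execute(
            "SELECT balance FROM account WHERE account_number = ?", (src,)).fetchone()
        if remaining < 0:
            # Failing here, halfway through, undoes the debit above.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE account_number = ?",
            (amount, dst))

def balances(conn):
    """Current balances as a dict keyed by account number."""
    return dict(conn.execute("SELECT account_number, balance FROM account"))

transfer(conn, "A-101", "A-102", 100)       # succeeds: both updates are committed
try:
    transfer(conn, "A-101", "A-102", 1000)  # fails midway: neither update survives
except ValueError as err:
    print("rolled back:", err)

print(balances(conn))
```

The failed transfer leaves the database in the same consistent state it was in before the attempt, which is the all-or-nothing property in miniature.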
Structure of a DBMS
A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components.
Each database system has its own variations.
Query Optimization and Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
DB
These layers must consider concurrency control and recovery
Databases make these folks happy ...
DBMS vendors, programmers
  $20 million industry
  Oracle, IBM, MS, Sybase, …
End users
  Business, education, science, …
DB application programmers
  E.g., smart webmasters
  Build web services that run off DBMSs
Database administrators (DBAs)
  Design logical/physical schemas
  Handle security and authorization
  Data availability, crash recovery
  Database tuning as needs evolve
  …must understand how a DBMS works
Summary
What is a database? – lots of data organized into entities and schemas, with a manager
Why study databases? – common use, needed for programming apps
Why use databases? – all the advantages over flat file systems
Intro to Databases
Logical layer:
Query language, data models, transactions
Physical layer
Actual files with indexes, query processing, concurrency, recovery & logs