SIT772 - Database and Information Retrieval
WEEK 1. INTRODUCTION
Administrative Information
Outline
Administrative Information
Session 01. Introduction
◦ Concepts of Data Modelling
◦ Concepts of Database Technology
◦ Tools and techniques to create a Database Design and Implementation project
◦ Use the SQL language for Relational Database development
◦ Understand methods for capturing, representing, storing, organising, and retrieving data
◦ Gain experience in the development of structured, unstructured or loosely structured information
◦ Large Data Analytics and NoSQL
Administrative Information
Unit Staff:
Ms Thao Pham
Coordinator for Week 1 to Week 7
Campus: Burwood Campus
Room: T2.04.4
Telephone: Internal: 17486 / National: 03-9251-7486
Fax: Internal: 46440 / National: 03-9244-6440
Email:
Dr Michael Hobbs
Unit Chair and Coordinator for Week 8 to Week 11
Campus: Geelong Campus at Waurn Ponds
Telephone: Internal: 73342 / National: 03-5227-3342
Fax: Internal: 72028 / National: 03-5227-2028
Email:
Administrative Information
Unit Materials:
◦ No prescribed texts
◦ Recommended textbook for the database part of this unit:
● Carlos Coronel, Steven Morris, Database Systems: Design, Implementation, and Management, 11th edition, Cengage Learning.
◦ For the information retrieval part of this unit, related materials will be provided in the CloudDeakin Resources module.
◦ Reference textbooks:
● Thomas Connolly, Carolyn Begg, Database Systems: A practical approach to Design, Implementation, and Management, 5th edition, Pearson.
● Ricardo Baeza-Yates, Berthier Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley.
Additional references will be posted on CloudDeakin from time to time under the Resources module
Administrative Information
Unit Materials:
◦ CloudDeakin
◦ Check it often (at least twice a week)
◦ Contains:
◦ Unit Guide (read this first!)
◦ News, e.g., content updates, changed due dates/requirements, etc.
◦ Discussion forums
◦ Administrative information
◦ Lecture slides and recordings
◦ Weekly exercises and solutions
◦ Assessments
◦ Bb Collaborate
Timetable
Lecture
Wednesday 13:00 - 14:00 LT 6 (B.4.12)
Practical
Wednesday 14:00 - 15:50 B.4.06
Bb Collaborate
- Date and time will be arranged via the unit site discussion forum on CloudDeakin
PASS
Wednesday 11:00 - 11:50 Y.1.20
Teaching and Learning
Students should
◦ pay attention
◦ make notes
◦ listen for important information
◦ ask questions
◦ think
◦ study 10 hours per week (average)
- 3 hrs at university, 3 hrs reading and summarising the textbook and other resources, 2 hrs practice, 2 hrs assignment work
- Campus: 1 x 1 hour class per week, 1 x 2 hour practical per week.
- Cloud (online): Learning experiences are via CloudDeakin. Students will have the opportunity to participate in online consultation sessions.
Notes:
Students will on average spend 150 hours over the trimester on learning and assessment activities in this unit.
Administrative Information
20% – Assignment 1 (9:00am, Monday, 16 January 2017)
◦ Database Design and Implementation. This is an individual assessment task.
◦ A written report on the design and implementation of a small database project, together with SQL scripts.
20% – Assignment 2 (9:00am, Monday, 30 January 2017)
◦ Information Retrieval Techniques. This is an individual assessment task.
◦ Written answers to a number of Information Retrieval questions.
◦ Late submissions are penalised as per Faculty regulations
◦ See CloudDeakin for further information, including submission instructions
◦ Results will be returned within 10 working days
60% – Final Examination
There are no hurdles to pass this unit; you only need to achieve a final mark of 50/100 or more.
Administrative Information
Plagiarism:
◦ Plagiarism is the copying of another person’s ideas or expressions without appropriate acknowledgment and presenting these ideas or forms of expression as your own.
◦ It includes not only written works such as books or journals but data or images that may be presented in tables, diagrams, designs, plans, photographs, film, music, formulae, web sites and computer programs.
◦ Plagiarism also includes the use of (or passing off) the work of lecturers or other students as your own.
Administrative Information
Plagiarism (cont.)
◦ Please be aware that if the Faculty Academic Progress and Discipline Committee finds a student has committed an act of academic misconduct (plagiarism and/or exam cheating) it may impose one or more of the following penalties:
◦ A reprimand;
◦ A fine not exceeding $500;
◦ Allocate a zero mark in the relevant task or such other mark as is appropriate;
◦ Allocate a zero mark in the relevant unit or such other mark as is appropriate;
◦ Allocate a zero mark in such other units in which the student is enrolled as the Faculty Academic Progress and University Discipline Committee may determine;
◦ Suspend the student for up to one year;
◦ Exclude the student for a minimum period of one year.
Unit Weekly Activities
1. Introduction
2. Data Modelling
3. Entity Relationship Modelling
4. Normalization
5. Relational Algebra
6. The Relational Database and SQL
7. Introduction to SQL & SQL*Plus
8. Information Retrieval
9. Statistical Properties of Text and Boolean Model
10. Vector Model and IR Evaluation
11. Relevance Feedback, Unit Review
Some Final Thoughts
The unit follows a carefully planned progression.
You are strongly encouraged to attend/study lectures!
[Figure: Lecture Sessions, Practical Laboratories, Assignments & Exam]
WEEK 1. Introduction: Databases and DBMS
Outline
• Difference between Data and Information
• What is a Database
• Various Types of Databases
• Importance of Database Design
• Components of Database System (DBS)
• Main Functions of Database Management System (DBMS)
Why Databases
• Did you
– use Google to search for required information on the Web?
– use a credit card to buy something?
– log on to CloudDeakin?
– access an enrolled unit site?
– use your bank card to withdraw money from an ATM?
– use library catalogues to look for books?
– etc.
• Are these transactions possible without a database?
Why Databases
• Virtually all modern business systems rely on databases.
• It is vital for any IS professional to have a good understanding of how databases are:
– created
– used
Why Databases
• Databases are specialized structures that allow computer-based systems to store, manage and retrieve data quickly.
• Various Types of Databases
◦ Single-user Database
◦ Desktop Database
◦ Multiuser Database
◦ Workgroup Database
◦ Enterprise Database
◦ Centralised Database
◦ Distributed Database
Data VS. Information
Data
• Raw facts
– Raw data: not yet processed to reveal the meaning
• Building blocks of information
• Data management
– Data generation, storage, and retrieval
Information
• Produced by processing data
• Reveals the meaning of data
• Enables knowledge creation
• Should be accurate, relevant, and timely to enable good decision making
Data VS. Information
• Unit report (data or information?)
• Unit profile (data or information?)
Student_ID  Name           Major  Result
8912345     Lewis, A. D.   MG     66
9023456     Baker, G. P.   CS     82
9134567     Hunter, S. L.  IS     76
9145678     Grant, G. D.   CS     90
…           …              …      …

Grade  %     No. of Students
HD     12    15
D      17.6  22
C      28.8  36
P      31.2  39
N      10.4  13
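The step from the first table (raw data) to the second (information) can be sketched in a few lines of Python. This is only an illustration: the grade cut-offs in `GRADE_BANDS` are an assumption made for the example and are not given in the slides, and only the four sample rows shown above are used, so the percentages differ from the full-class figures.

```python
from collections import Counter

# Raw data: the four sample rows from the student results table.
results = [
    (8912345, "Lewis, A. D.", "MG", 66),
    (9023456, "Baker, G. P.", "CS", 82),
    (9134567, "Hunter, S. L.", "IS", 76),
    (9145678, "Grant, G. D.", "CS", 90),
]

# ASSUMPTION: illustrative cut-offs, not taken from the slides.
GRADE_BANDS = [(80, "HD"), (70, "D"), (60, "C"), (50, "P"), (0, "N")]


def grade_for(mark):
    """Map a numeric mark to a grade using the assumed bands."""
    for cutoff, grade in GRADE_BANDS:
        if mark >= cutoff:
            return grade
    raise ValueError(f"mark out of range: {mark}")


def grade_profile(rows):
    """Process raw rows into information: count and percentage per grade."""
    counts = Counter(grade_for(mark) for _, _, _, mark in rows)
    total = len(rows)
    return {
        grade: (counts[grade], 100 * counts[grade] / total)
        for _, grade in GRADE_BANDS
    }


profile = grade_profile(results)
for grade, (count, percent) in profile.items():
    print(f"{grade:<3}{percent:>6.1f}%{count:>4}")
```

The individual rows answer "what did this student score?", while the processed profile answers "how did the class perform?", which is the kind of summary that supports decision making.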
What is a Database?
• A shared, integrated computer structure that stores a collection of:
– End user data (raw facts)
– Meta-data (data about data)
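The split between end user data and meta-data can be seen in a small sketch. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in DBMS; the `student` table and its columns are hypothetical, modelled on the sample report earlier in these slides.

```python
import sqlite3

# An in-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table, modelled on the sample student report.
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        major      TEXT,
        result     INTEGER
    )
    """
)
conn.execute(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    (9023456, "Baker, G. P.", "CS", 82),
)

# End user data: the raw facts stored in the table.
rows = conn.execute(
    "SELECT student_id, name, major, result FROM student"
).fetchall()

# Meta-data: the database also stores a description of its own structure.
# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
columns = [
    (name, col_type)
    for _, name, col_type, *_ in conn.execute("PRAGMA table_info(student)")
]
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

print(rows)
print(columns)
print(tables)
```

Other DBMSs expose their meta-data through different catalog views, so the `PRAGMA` and `sqlite_master` queries here are specific to SQLite.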
Types of Databases
User Type       Single-user, Multi-user
Size            Desktop, Workgroup, Enterprise
Location        Centralised, Distributed, Cloud
Data Usage      Operational (a.k.a. transactional or production), Data
Data Type       General-purpose, Discipline-specific
Data Structure  Structured, Semi-structured, Unstructured
New Type        NoSQL (Not only SQL): not a traditional database; built to handle (e.g. for social media on the Internet):
- Unprecedented volume of data
- Variety of data types and structures
- Velocity of data operations
What is a DBMS?
• DBMS = Database Management System
• A collection of programs that
– manage database structures;
– control access to data;
– facilitate the sharing of data among multiple users and applications;
– enable efficient and effective data management.
• DB ~ e-filing cabinet
• DBMS helps manage the cabinet’s contents
The Benefits of DBMS
The DBMS manages the interaction between the end user and the database
The Benefits of DBMS
• The end users will have better access to data
• How is this done?
– The DBMS provides an integrated view of data and operations
– It reduces data inconsistencies and errors
– Enables ad-hoc queries
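An ad-hoc query is a question put to the database on the spot, without writing a dedicated program for it. A minimal sketch, again assuming SQLite through Python's `sqlite3` module and a hypothetical `student` table filled with the sample rows from the earlier report:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, name TEXT, major TEXT, result INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (8912345, "Lewis, A. D.", "MG", 66),
        (9023456, "Baker, G. P.", "CS", 82),
        (9134567, "Hunter, S. L.", "IS", 76),
        (9145678, "Grant, G. D.", "CS", 90),
    ],
)

# Ad-hoc question 1: what is the average result for each major?
avg_by_major = conn.execute(
    "SELECT major, AVG(result) FROM student GROUP BY major ORDER BY major"
).fetchall()

# Ad-hoc question 2: which students scored 80 or more?
high_scorers = conn.execute(
    "SELECT name FROM student WHERE result >= ? ORDER BY name", (80,)
).fetchall()

print(avg_by_major)
print(high_scorers)
```

Both questions are answered by the DBMS from the same stored data; neither required a change to the table or a purpose-built report.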
Functions of DBMS
• Performs functions that guarantee integrity and consistency of data
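One way a DBMS enforces integrity is through constraints declared with the table, which it checks on every insert or update. The sketch below assumes SQLite via Python's `sqlite3` module; the table definition and the 0 to 100 range for `result` are illustrative assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,                     -- no duplicate IDs
        name       TEXT NOT NULL,                           -- name is required
        result     INTEGER CHECK (result BETWEEN 0 AND 100) -- assumed range
    )
    """
)
conn.execute("INSERT INTO student VALUES (8912345, 'Lewis, A. D.', 66)")
conn.commit()


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


bad_rows = [
    (8912345, "Duplicate, I. D.", 70),  # violates PRIMARY KEY
    (9999999, None, 70),                # violates NOT NULL
    (9999998, "Range, O. O.", 150),     # violates CHECK
]
accepted = [try_insert(row) for row in bad_rows]
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(accepted)
print(row_count)
```

The rules live in the database itself, so every application that writes to the table is held to them.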
The DBS Environment
Why Database Design?
• Focuses on the database structure that will be used to store and manage data
• A database that meets all user requirements does not just happen; its structure must be designed carefully
• An easy-to-use DBMS does not mean a good database design
Why Database Design?
• Even a good DBMS will perform poorly with a bad DB design
• Defines the database’s expected use
• General goals of DB design
– Avoid redundancy
– Provide efficient but controlled access to data
– Enable a fast response to a query
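Why redundancy is worth avoiding can be shown with two small designs for the same facts. This is a sketch under stated assumptions: SQLite via Python's `sqlite3` module, and hypothetical tables and major names (`student_flat`, `major`, 'Computer Science') invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Design 1 (redundant): the major name is repeated in every student row.
    CREATE TABLE student_flat (
        student_id INTEGER PRIMARY KEY,
        major_code TEXT,
        major_name TEXT
    );
    INSERT INTO student_flat VALUES
        (9023456, 'CS', 'Computer Science'),
        (9145678, 'CS', 'Computer Science');

    -- Design 2: the major name is stored once and referenced by code.
    CREATE TABLE major (
        major_code TEXT PRIMARY KEY,
        major_name TEXT NOT NULL
    );
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        major_code TEXT REFERENCES major(major_code)
    );
    INSERT INTO major VALUES ('CS', 'Computer Science');
    INSERT INTO student VALUES (9023456, 'CS'), (9145678, 'CS');
    """
)

# Renaming the major in the redundant design: one row is updated, one is missed.
conn.execute(
    "UPDATE student_flat SET major_name = 'Computing' WHERE student_id = 9023456"
)
flat_names = {
    name
    for (name,) in conn.execute(
        "SELECT major_name FROM student_flat WHERE major_code = 'CS'"
    )
}

# The same rename in the second design is a single update in a single place.
conn.execute("UPDATE major SET major_name = 'Computing' WHERE major_code = 'CS'")
joined_names = {
    name
    for (name,) in conn.execute(
        "SELECT m.major_name FROM student s "
        "JOIN major m ON m.major_code = s.major_code"
    )
}

print(flat_names)    # two conflicting names for the same major
print(joined_names)  # one consistent name
```

The DBMS behaves identically in both cases; only the design differs, which is why a good DBMS cannot compensate for a poor database design.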
Database Career Opportunities
Review Questions
• Describe each of the following:
– Data
– Information
– Database
– DBMS
– Database System Environment
• What are the DBMS functions, and why are they important?
• Why is database design so important?
• What are the problems of a computer file system?