laudon ch.6

FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Chapter 6 Eng. Rasha Al Ababseh

Upload: german-jordanian-university

Post on 19-Jan-2017

TRANSCRIPT

Page 1: Laudon ch.6

FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT

Chapter 6

Eng. Rasha Al Ababseh

Page 2: Laudon ch.6

Learning Objectives

• Describe how the problems of managing data resources in a traditional file environment are solved by a database management system

• Describe the capabilities and value of a database management system

• Apply important database design principles

• Evaluate tools and technologies for accessing information from databases to improve business performance and decision making

• Assess the role of information policy, data administration, and data quality assurance in the management of a firm’s data resources

Page 3: Laudon ch.6

• An effective information system provides users with accurate, timely, and relevant information.

– Accurate information is free of errors.

– Information is timely when it is available to decision makers when it is needed.

– Information is relevant when it is useful and appropriate for the types of work and decisions that require it.

• To deliver such information, data must be organized and maintained properly.

Page 4: Laudon ch.6

Organizing Data in a Traditional File Environment

• File organization concepts

– Database: Group of related files

– File: Group of records of same type

– Record: Group of related fields

– Field: Group of characters forming a word, a group of words, or a number

• Record: Describes an entity (a person, place, or thing on which we store information)

• Attribute: Each characteristic, or quality, describing entity

– E.g., Attributes Date or Grade belong to entity COURSE

Page 5: Laudon ch.6

THE DATA HIERARCHY

A computer system organizes data in a hierarchy that starts with the bit, which represents either a 0 or a 1. Bits can be grouped to form a byte to represent one character, number, or symbol. Bytes can be grouped to form a field, and related fields can be grouped to form a record. Related records can be collected to form a file, and related files can be organized into a database.
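
The hierarchy above can be traced in a few lines of Python. This is only an illustrative sketch: the student record, the field names, and the file names are invented for the example and do not come from the slides.

```python
# Hypothetical student record, used only for illustration.
name = "Ann"

# Bits -> bytes: in ASCII each character is stored as one byte (8 bits).
name_bytes = name.encode("ascii")
name_bits = [format(b, "08b") for b in name_bytes]

# Bytes -> field -> record: related fields describe one entity.
record = {"student_id": 1001, "name": name, "course": "IS 101"}

# Records -> file: a group of records of the same type.
course_file = [record, {"student_id": 1002, "name": "Omar", "course": "IS 101"}]

# Files -> database: a group of related files.
database = {"COURSE": course_file, "FINANCIAL_AID": []}

print(name_bits)
```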

Page 6: Laudon ch.6

• Problems with the traditional file environment (files maintained separately by different departments)

– Data redundancy: Presence of duplicate data in multiple files

• Data inconsistency: Same attribute has different values

• Waste of storage

– Program-data dependence: When changes in a program require changes to the data accessed by the program
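
A small sketch of how redundancy turns into inconsistency when departments keep separate files. The two files, the customer IDs, and the phone numbers are all hypothetical.

```python
# Hypothetical files kept separately by two departments.
sales_file = {
    "C001": {"name": "Dana Hill", "phone": "555-0101"},
    "C002": {"name": "Sam Ortiz", "phone": "555-0102"},
}
billing_file = {
    "C001": {"name": "Dana Hill", "phone": "555-0199"},
    "C003": {"name": "Lee Wong", "phone": "555-0103"},
}

# Redundancy: the same customer is stored in more than one file.
redundant_ids = sorted(sales_file.keys() & billing_file.keys())

# Inconsistency: a duplicated attribute holds different values.
inconsistent = [
    (customer_id, field)
    for customer_id in redundant_ids
    for field in sales_file[customer_id]
    if sales_file[customer_id][field] != billing_file[customer_id].get(field)
]

print(redundant_ids, inconsistent)
```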

Page 7: Laudon ch.6

– Lack of flexibility: A traditional file system can deliver routine scheduled reports after extensive programming effort, but it cannot deliver ad hoc reports.

– Poor security: Because there is little control or management of data, access to and dissemination of information may be out of control.

– Lack of data sharing and availability: It is virtually impossible for information to be shared or accessed in a timely manner.

Page 8: Laudon ch.6

The use of a traditional approach to file processing encourages each functional area in a corporation to develop specialized applications. Each application requires a unique data file that is likely to be a subset of the master file. These subsets of the master file lead to data redundancy and inconsistency, processing inflexibility, and wasted storage resources.

TRADITIONAL FILE PROCESSING

Page 9: Laudon ch.6

The Database Approach to Data Management

• Database

– Serves many applications by centralizing data and controlling redundant data

• Database management system (DBMS)

– Software that permits an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs

– Interfaces between applications and physical data files

– Separates logical and physical views of data

– Solves problems of traditional file environment

• Controls redundancy

• Eliminates inconsistency

• Uncouples programs and data

• Enables organization to centrally manage data and data security

Page 10: Laudon ch.6

A single human resources database provides many different views of data, depending on the information requirements of the user. Illustrated here are two possible views, one of interest to a benefits specialist and one of interest to a member of the company’s payroll department.
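
One way to picture multiple views over a single database is with SQL views. The sketch below uses Python's built-in sqlite3 module rather than any product named in the slides, and the table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT,
    ssn         TEXT,
    health_plan TEXT,
    gross_pay   REAL,
    net_pay     REAL
);
INSERT INTO employee VALUES (1, 'A. Haddad', '000-00-0001', 'Plan A', 5000.0, 4100.0);

-- Each view exposes only the columns one group of users needs.
CREATE VIEW benefits_view AS SELECT name, ssn, health_plan FROM employee;
CREATE VIEW payroll_view  AS SELECT name, ssn, gross_pay, net_pay FROM employee;
""")

# cursor.description lists the column names each view returns.
benefits_cols = [d[0] for d in conn.execute("SELECT * FROM benefits_view").description]
payroll_cols = [d[0] for d in conn.execute("SELECT * FROM payroll_view").description]
print(benefits_cols, payroll_cols)
```

Both views read the same stored rows, so a change to the employee table shows up in each of them.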

HUMAN RESOURCES DATABASE WITH MULTIPLE VIEWS

Page 11: Laudon ch.6

• Relational DBMS

– Represents data as two-dimensional tables called relations or files

– Each table contains data on an entity and its attributes

• Table: Grid of columns and rows

– Rows (tuples): Records for different entities

– Fields (columns): Each represents an attribute of the entity

– Key field: Field used to uniquely identify each record

– Primary key: The key field that serves as the unique identifier for all the information in a row of the table

– Foreign key: Primary key used in second table as look-up field to identify records from original table

Page 12: Laudon ch.6

RELATIONAL DATABASE TABLES

A relational database organizes data in the form of two-dimensional tables. Illustrated here are tables for the entities SUPPLIER and PART showing how they represent each entity and its attributes. Supplier Number is a primary key for the SUPPLIER table and a foreign key for the PART table.
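
The primary key and foreign key roles described in this caption could be declared roughly as follows. This is a sketch in SQLite via Python, not the figure's actual tables: the column names and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (
    supplier_number INTEGER PRIMARY KEY,   -- primary key of SUPPLIER
    supplier_name   TEXT NOT NULL
);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,   -- primary key of PART
    part_name       TEXT NOT NULL,
    unit_price      REAL,
    supplier_number INTEGER REFERENCES supplier(supplier_number)  -- foreign key
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO supplier VALUES (8259, 'Supplier A')")
conn.execute("INSERT INTO part VALUES (137, 'Door latch', 22.00, 8259)")

# A primary key value cannot be duplicated.
try:
    conn.execute("INSERT INTO supplier VALUES (8259, 'Duplicate supplier')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```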

Page 13: Laudon ch.6

RELATIONAL DATABASE TABLES (cont.)

Page 14: Laudon ch.6

• Operations of a Relational DBMS

– Three basic operations used to develop useful sets of data

• SELECT: Creates subset of data of all records that meet stated criteria

• JOIN: Combines relational tables to provide user with more information than available in individual tables

• PROJECT: Creates subset of columns in table, creating tables with only the information specified

Page 15: Laudon ch.6

The select, join, and project operations enable data from two different tables to be combined and only selected attributes to be displayed.
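
To make the three operations concrete, here is a minimal sketch that implements each one as a plain Python function over lists of dictionaries. A real relational DBMS does this internally and far more efficiently; the rows and supplier names are hypothetical.

```python
parts = [
    {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259},
    {"part_number": 145, "part_name": "Side mirror", "supplier_number": 8444},
    {"part_number": 150, "part_name": "Door molding", "supplier_number": 8263},
]
suppliers = [
    {"supplier_number": 8259, "supplier_name": "Supplier A"},
    {"supplier_number": 8263, "supplier_name": "Supplier B"},
    {"supplier_number": 8444, "supplier_name": "Supplier C"},
]


def select(rows, predicate):
    """SELECT: keep only the rows that meet the stated criteria."""
    return [row for row in rows if predicate(row)]


def join(left, right, key):
    """JOIN: combine rows from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(rows, columns):
    """PROJECT: keep only the named columns."""
    return [{column: row[column] for column in columns} for row in rows]


wanted = select(parts, lambda row: row["part_number"] in (137, 150))
combined = join(wanted, suppliers, "supplier_number")
result = project(combined, ["part_number", "part_name", "supplier_name"])
print(result)
```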

THE THREE BASIC OPERATIONS OF A RELATIONAL DBMS

Page 16: Laudon ch.6

• Non-relational database management systems

– Useful for accelerating simple queries against large volumes of structured and unstructured data, including Web, social media, graphics, and other forms of data that are difficult to analyze with traditional SQL-based tools

– Use a more flexible data model and are designed for managing large data sets across many distributed machines and for easily scaling up or down

• Databases in the cloud

– Offer scalability; you pay only for the services you need

– E.g., Amazon Web Services, Microsoft SQL Azure

Page 17: Laudon ch.6

• Capabilities of Database Management Systems

– Data definition capability: Specifies structure of database content; used to create tables and define characteristics of fields. This information about the database is documented in a data dictionary.

– Data dictionary: Automated or manual file storing definitions of data elements and their characteristics

– Data manipulation language: Used to add, change, delete, and retrieve data from the database

• Structured Query Language (SQL)

• Microsoft Access user tools for generating SQL

– Many DBMS have report generation capabilities for creating polished reports (e.g., Crystal Reports)

Page 18: Laudon ch.6

MICROSOFT ACCESS DATA DICTIONARY FEATURES

Microsoft Access has a rudimentary data dictionary capability that displays information about the size, format, and other characteristics of each field in a database. Displayed here is the information maintained in the SUPPLIER table. The small key icon to the left of Supplier Number indicates that it is a key field.
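
Other database systems expose similar field-level information. As a rough analogue only (this is SQLite through Python, not Microsoft Access, and the SUPPLIER columns are assumed), the built-in PRAGMA table_info statement lists each field's name, type, and whether it is required or part of the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE supplier (
    supplier_number INTEGER PRIMARY KEY,
    supplier_name   TEXT NOT NULL,
    supplier_city   TEXT
)""")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
dictionary = [
    {"field": name, "type": col_type, "required": bool(notnull), "key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(supplier)")
]

for entry in dictionary:
    print(entry)
```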

Page 19: Laudon ch.6

Illustrated here are the SQL statements for a query to select suppliers for parts 137 or 150. They produce a list with the same results as Figure 6-5.

EXAMPLE OF AN SQL QUERY

Page 20: Laudon ch.6

AN ACCESS QUERY

Illustrated here is how the query in Figure 6-7 would be constructed using Microsoft Access query building tools. It shows the tables, fields, and selection criteria used for the query.

Page 21: Laudon ch.6

• Designing Databases:

– To design a database, you must understand the relationships among the data, the type of data that will be maintained in the database, how the data will be used, and how the organization will need to change to manage data from a company-wide perspective.

– Conceptual (logical) design: Abstract model from business perspective

– Physical design: How database is arranged on direct-access storage devices

– Normalization

• Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships

• Most efficient way to group data elements to meet business requirements and the needs of application programs

Page 22: Laudon ch.6

An unnormalized relation contains repeating groups. For example, there can be many parts and suppliers for each order. There is only a one-to-one correspondence between Order-Number and Order-Date.

AN UNNORMALIZED RELATION FOR ORDER

Page 23: Laudon ch.6

After normalization, the original relation ORDER has been broken down into four smaller relations. The relation ORDER is left with only two attributes and the relation LINE_ITEM has a combined, or concatenated, key consisting of Order_Number and Part_Number.
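
The split into four relations can be sketched in plain Python. The unnormalized rows below are hypothetical, and the attribute names are assumptions; the point is only that repeated order, part, and supplier details collapse into one entry each, while LINE_ITEM is keyed by the pair (order number, part number).

```python
# Hypothetical unnormalized ORDER rows: order and supplier details repeat for every part.
unnormalized = [
    {"order_number": 1, "order_date": "2017-01-10", "part_number": 137,
     "part_name": "Door latch", "quantity": 2,
     "supplier_number": 8259, "supplier_name": "Supplier A"},
    {"order_number": 1, "order_date": "2017-01-10", "part_number": 150,
     "part_name": "Door molding", "quantity": 5,
     "supplier_number": 8263, "supplier_name": "Supplier B"},
    {"order_number": 2, "order_date": "2017-01-12", "part_number": 137,
     "part_name": "Door latch", "quantity": 1,
     "supplier_number": 8259, "supplier_name": "Supplier A"},
]

orders, parts, suppliers, line_items = {}, {}, {}, {}
for row in unnormalized:
    # ORDER keeps only two attributes: its number (the key) and its date.
    orders[row["order_number"]] = {"order_date": row["order_date"]}
    suppliers[row["supplier_number"]] = {"supplier_name": row["supplier_name"]}
    parts[row["part_number"]] = {
        "part_name": row["part_name"],
        "supplier_number": row["supplier_number"],
    }
    # LINE_ITEM uses the concatenated key (order_number, part_number).
    line_items[(row["order_number"], row["part_number"])] = {"quantity": row["quantity"]}

print(len(orders), len(parts), len(suppliers), len(line_items))
```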

NORMALIZED TABLES CREATED FROM ORDER

Page 24: Laudon ch.6

– Referential integrity

• Rules to ensure that relationships between coupled tables remain consistent

– Entity-relationship diagram

• Used by database designers to document the data model

• Illustrates relationships between entities
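
A referential integrity rule can be seen in action with a foreign key constraint. The sketch below uses SQLite through Python (where foreign key enforcement must be switched on per connection); table names and rows are hypothetical. An insert that points at a supplier that does not exist is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,
    part_name       TEXT,
    supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number)
);
INSERT INTO supplier VALUES (8259, 'Supplier A');
""")


def try_insert(part_number, supplier_number):
    """Return True if the part row was accepted, False if the DBMS rejected it."""
    try:
        conn.execute(
            "INSERT INTO part VALUES (?, 'Test part', ?)",
            (part_number, supplier_number),
        )
        return True
    except sqlite3.IntegrityError:
        return False


valid_accepted = try_insert(137, 8259)    # supplier 8259 exists
orphan_accepted = try_insert(150, 9999)   # no supplier 9999
print(valid_accepted, orphan_accepted)
```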

Page 25: Laudon ch.6

This diagram shows the relationships between the entities SUPPLIER, PART, LINE_ITEM, and ORDER that might be used to model the database in Figure 6-10.

AN ENTITY-RELATIONSHIP DIAGRAM

Page 26: Laudon ch.6

Using Databases to Improve Business Performance and Decision Making

THE CHALLENGE OF BIG DATA

• Big data: Data sets whose volume is beyond the ability of a typical DBMS to capture, store, and analyze.

• Often billions to trillions of records, all from different sources.

• Businesses are interested in big data because they can reveal more patterns and interesting anomalies than smaller data sets, with the potential to provide new insights into customer behavior, weather patterns, financial market activity, or other phenomena.

• To derive business value from these data, organizations need new technologies and tools capable of managing and analyzing non-traditional data along with their traditional enterprise data.

Page 28: Laudon ch.6

• Business Intelligence:

– Tools for obtaining useful information from all the different types of data used by businesses today, including semi-structured and unstructured big data in vast quantities

– Used for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions

– E.g., Harrah’s Entertainment analyzes customers to develop gambling profiles and identify most profitable customers

– Principal tools include:

• Data warehouses and data marts

• Hadoop

• In-memory computing

• Analytic platforms

Page 29: Laudon ch.6

BUSINESS INTELLIGENCE INFRASTRUCTURE

• Data warehouse:

– Stores current and historical data from many core operational transaction systems

– Consolidates and standardizes information for use across enterprise, but data cannot be altered

– Data warehouse system will provide query, analysis, and reporting tools

– The data warehouse extracts current and historical data from multiple operational systems inside the organization.

– These data are combined with data from external sources and transformed by correcting inaccurate and incomplete data and restructuring the data for management reporting and analysis before being loaded into the data warehouse.

– A data warehouse system also provides a range of ad hoc and standardized query tools, analytical tools, and graphical reporting facilities.

Page 30: Laudon ch.6

• Data marts:

– Subset of data warehouse

– Summarized or highly focused portion of firm’s data for use by specific population of users

– Typically focuses on single subject or line of business

– E.g., bookseller Barnes & Noble used to maintain a series of data marts: one for point-of-sale data in retail stores, another for college bookstore sales, and a third for online sales

Page 31: Laudon ch.6

• Hadoop

– An open source software framework managed by the Apache Software Foundation

– Handles unstructured and semi-structured data in vast quantities, as well as structured data

– Enables distributed parallel processing of huge amounts of data across inexpensive computers (servers)

– Breaks a big data problem down into sub-problems, distributes them among up to thousands of inexpensive computer processing nodes, and then combines the result into a smaller data set that is easier to analyze

– Hadoop consists of several key services:

• Hadoop Distributed File System (HDFS) for data storage

• MapReduce for high-performance parallel data processing

– Facebook announced that the data gathered in its warehouse grows by roughly half a petabyte (PB) per day; 1 PB is 1000⁵ (10¹⁵) bytes
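
The split-then-combine idea behind MapReduce can be imitated on one machine. The sketch below is plain Python, not the Hadoop API, and the text chunks are made up: each chunk stands in for data held on a separate node, the map step counts words per chunk, and the reduce step merges the partial counts into one smaller result.

```python
from collections import Counter
from functools import reduce

# Hypothetical input, already split into chunks as if stored on separate nodes.
chunks = [
    "big data needs new tools",
    "new tools for big data",
    "data data everywhere",
]


def map_chunk(chunk):
    """Map step: each node counts the words in its own chunk."""
    return Counter(chunk.split())


def combine(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


# On a real cluster the map calls would run in parallel on different nodes.
partial_counts = [map_chunk(chunk) for chunk in chunks]
word_counts = reduce(combine, partial_counts, Counter())
print(word_counts.most_common(3))
```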

Page 32: Laudon ch.6

• In-Memory Computing

– Another way of facilitating big data analysis.

– relies primarily on a computer’s main memory (RAM) for data storage. (Conventional DBMS use disk storage systems.)

Page 33: Laudon ch.6

• Analytic Platforms

– Produce analytical information based on current data records

– Use tightly integrated database, server, and storage components that handle complex analytic queries 10 to 100 times faster than traditional systems

– Analytic platforms also include in-memory systems and NoSQL non-relational database management systems.

Page 34: Laudon ch.6

A contemporary business intelligence infrastructure features capabilities and tools to manage and analyze large quantities and different types of data from multiple sources. Easy-to-use query and reporting tools for casual business users and more sophisticated analytical toolsets for power users are included.

COMPONENTS OF A DATA WAREHOUSE

Page 35: Laudon ch.6

ANALYTICAL TOOLS: RELATIONSHIPS, PATTERNS, TRENDS

• Once data have been captured and organized using the business intelligence technologies, they are available for further analysis using software for database querying and reporting

– Online analytical processing (OLAP)

– Data Mining

– Text Mining and Web Mining

Page 36: Laudon ch.6

• Online analytical processing (OLAP)

– Supports multidimensional data analysis

• Viewing data using multiple dimensions

• Each aspect of information (product, pricing, cost, region, time period) is a different dimension

• E.g., how many washers were sold in the East in June compared with other regions?

• A company would use either a specialized multidimensional database or a tool that creates multidimensional views of data in relational databases.

– OLAP enables rapid, online answers to ad hoc queries
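
The washer question above amounts to fixing two dimensions (product and time period) and totaling along a third (region). A minimal sketch with invented sales figures, using only plain Python rather than an OLAP product:

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units_sold).
sales = [
    ("washer", "East", "June", 120),
    ("washer", "West", "June", 95),
    ("washer", "East", "May", 80),
    ("dryer", "East", "June", 60),
    ("washer", "South", "June", 70),
]


def units_by_region(facts, product, month):
    """Total units per region for one product in one time period."""
    totals = defaultdict(int)
    for fact_product, region, fact_month, units in facts:
        if fact_product == product and fact_month == month:
            totals[region] += units
    return dict(totals)


washers_in_june = units_by_region(sales, "washer", "June")
print(washers_in_june)
```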

Page 37: Laudon ch.6

• Data mining:

– More discovery driven than OLAP

– Finds hidden patterns, relationships in large databases and infers rules to predict future behavior

– E.g., Finding patterns in customer data for one-to-one marketing campaigns or to identify profitable customers.

– Types of information obtainable from data mining

• Associations: Occurrences linked to single event

• Sequences: Events linked over time

• Classification: Recognizes patterns that describe the group to which an item belongs. E.g., businesses such as credit card or telephone companies worry about the loss of steady customers. Classification helps discover the characteristics of customers who are likely to leave and can provide a model to help managers predict who those customers are so that the managers can devise special campaigns to retain such customers.

• Clustering: Similar to classification when no groups have been defined; finds groupings within data. E.g., partitioning a database into groups of customers based on demographics and types of personal investments.

• Forecasting: Uses a series of existing values to forecast what other values will be; used to estimate the future value of continuous variables
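
As one illustration of the association type, the sketch below measures how often one item appears in baskets that contain another. The baskets are hypothetical, and this ratio (often called confidence) is just one common way to quantify an association, not a method taken from the slides.

```python
# Hypothetical shopping baskets.
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "salsa"},
    {"corn chips", "bread"},
    {"cola", "bread"},
    {"corn chips", "cola"},
]


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    both = [t for t in with_antecedent if consequent in t]
    return len(both) / len(with_antecedent)


chips_then_cola = confidence(baskets, "corn chips", "cola")
print(chips_then_cola)
```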

Page 38: Laudon ch.6

• Text mining

– Extracts key elements from large unstructured data sets (e.g., stored e-mails, IM), discovers patterns and relationships, and summarizes the information

– Sentiment analysis software is able to mine text comments in an e-mail message, blog, social media conversation, or survey form to detect favorable and unfavorable opinions about specific subjects.
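
A toy version of sentiment detection, assuming a tiny hand-made word list. Real sentiment analysis software uses far richer language models; the word sets and comments here are invented for the example.

```python
import re

# Tiny hypothetical word lists.
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"poor", "slow", "broken", "hate"}


def sentiment(comment):
    """Classify a comment by counting positive and negative words."""
    words = re.findall(r"[a-z']+", comment.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"


print(sentiment("Great service, I love it!"))
```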

Page 39: Laudon ch.6

• Web mining

– Discovery and analysis of useful patterns and information from the World Wide Web

• E.g., to understand customer behavior, evaluate effectiveness of a Web site, etc.

– Web content mining

• The process of extracting knowledge from the content of Web pages, which may include text, image, audio, and video data

– Web structure mining

• Examines the links to and from a Web page; e.g., links pointing to a document indicate the popularity of the document

– Web usage mining

• Examines user interaction data recorded by a Web server to determine the value of particular customers, cross-marketing strategies across products, and the effectiveness of promotional campaigns

Page 40: Laudon ch.6

• Databases and the Web

– Many companies use Web to make some internal databases available to customers or partners

– Typical configuration includes:

• Web server

• Application server/middleware/CGI scripts

• Database server (hosting DBMS)

– Advantages of using Web for database access:

• Ease of use of browser software

• Web interface requires few or no changes to database

• Less expensive to add a Web interface to an existing system than to redesign and rebuild the system to improve user access

• Accessing corporate databases through the Web is creating new efficiencies, opportunities, and business models.

Page 41: Laudon ch.6

Users access an organization’s internal database through the Web using their desktop PCs and Web browser software.

LINKING INTERNAL DATABASES TO THE WEB

Page 42: Laudon ch.6

Managing Data Resources

• Establishing an information policy

– Firm’s rules, procedures, and roles for sharing, managing, and standardizing data

– Specifies which users and organizational units can share information, where information can be distributed, and who is responsible for updating and maintaining the information

• Data administration:

– Is responsible for the specific policies and procedures through which data can be managed as an organizational resource

Page 43: Laudon ch.6

• Data governance:

– Deals with the policies and processes for managing the availability, usability, integrity, and security of the data employed in an enterprise, with special emphasis on promoting privacy, security, data quality, and compliance with government regulations

• Database administration:

– Defining, organizing, implementing, and maintaining the database; performed by database design and management group

Page 44: Laudon ch.6

• Ensuring data quality

– More than 25% of critical data in Fortune 1000 company databases are inaccurate or incomplete

– Most data quality problems stem from faulty input

– Before a new database is in place, organizations need to:

• Identify and correct faulty data

• Establish better routines for editing data once the database is in operation

Page 45: Laudon ch.6

• Data quality audit:

– Structured survey of the accuracy and level of completeness of the data in an information system, carried out by:

• Surveying entire data files,

• Surveying samples from data files, or

• Surveying end users for their perceptions of data quality

• Data cleansing

– Software to detect and correct data that are incorrect, incomplete, improperly formatted, or redundant

– Enforces consistency among different sets of data from separate information systems
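
The detection half of data cleansing can be sketched as a simple audit pass. Everything here is assumed for illustration: the customer records, the choice of e-mail as the field to check, and the deliberately loose format pattern. Commercial cleansing tools apply many more rules and also correct what they find.

```python
import re

# Hypothetical customer records merged from two systems.
records = [
    {"id": 1, "name": "Dana Hill", "email": "dana@example.com"},
    {"id": 2, "name": "Sam Ortiz", "email": ""},                    # incomplete
    {"id": 3, "name": "Lee Wong", "email": "lee.wong(at)example"},  # improperly formatted
    {"id": 4, "name": "dana hill", "email": "DANA@example.com"},    # redundant with id 1
]

# Loose pattern: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def audit(rows):
    """Return {record id: problem} for every record that fails a check."""
    issues, seen = {}, set()
    for row in rows:
        email = row["email"].strip().lower()
        if not email:
            issues[row["id"]] = "incomplete"
        elif not EMAIL_PATTERN.match(email):
            issues[row["id"]] = "improperly formatted"
        elif email in seen:
            issues[row["id"]] = "redundant"
        else:
            seen.add(email)
    return issues


issues = audit(records)
print(issues)
```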
