Data Warehousing & Business Intelligence in the Cloud
Seoul, Korea COEX Convention Centre 24th October 2013
Data Analytics in the Cloud
Blair Layton, Business Development Manager (Databases), Amazon Web Services (APAC)
The Explosion of Data
Existing Challenges with Analytics
The Cloud
We are constantly producing more data
From all types of industries
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Take a look at a data processing “pipeline”
Data is available everywhere, contains customer insight and costs little to generate, but...
What has changed in this pipeline
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Highly constrained
Everything else has constraints
Big gap in turning data into actionable information
The Explosion of Data
Existing Challenges with Analytics
The Cloud
Provision all your infrastructure and tools before you get results
Challenge 1: Capex Intensive
Source: Oracle technology global price list 11/1/2012
Cost of your infrastructure dictates what analytics you can perform
Most data never makes it to a data warehouse
The Data Analysis Gap (chart: Enterprise Data vs. Data in Warehouse, 1990 to 2020)
Enterprise Data is growing at over 50% yearly
Data Warehousing growing at less than 10% yearly
Most data is left on the floor
Sources: Gartner, User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011; IDC, Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
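To make the gap concrete, here is a minimal sketch that compounds the two growth rates quoted above. It assumes, purely for illustration, that enterprise data and warehoused data start at the same size and grow at exactly 50% and 10% per year; the slide only gives these figures as bounds ("over 50%", "less than 10%"), and the function names are hypothetical.

```python
def compound(start: float, yearly_rate: float, years: int) -> float:
    """Size after `years` of growth at a constant yearly rate."""
    return start * (1.0 + yearly_rate) ** years


def share_in_warehouse(years: int, data_rate: float = 0.50, dw_rate: float = 0.10) -> float:
    """Fraction of enterprise data held in the warehouse after `years`.

    Assumes both volumes start at the same size (an illustrative
    assumption, not a figure from the slide).
    """
    return compound(1.0, dw_rate, years) / compound(1.0, data_rate, years)


if __name__ == "__main__":
    for years in (1, 5, 10):
        share = share_in_warehouse(years)
        print(f"after {years:2d} years: {share:.1%} of the data is in the warehouse")
```

Under these assumptions the warehoused share shrinks every year, which is the "data left on the floor" effect the slide describes.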
Setup takes months of planning and work
Challenge 2: Hard to set up, manage and scale
Enterprises average between 3 and 4 DBAs per data warehouse
Source: Gartner, Critical factors in calculating the data warehouse TCO, July 2009
Extending your data warehouse can be heavy on time and cost
Managing a data analytics platform requires expensive staff
Complex tuning and management skills required
Very hard to move up the stack
These make it extremely hard to move up the Business Intelligence Maturity Stack
The Explosion of Data
Existing Challenges with Analytics
The Cloud
AWS Services
AWS Global Infrastructure
Application Services
Networking
Deployment & Administration
Database Storage Compute
AWS Global Infrastructure
9 Regions
25 Availability Zones
Continuous Expansion
• $5.2B retail business
• 7,800 employees
• A whole lot of servers
Every day, AWS adds enough server capacity to power that whole $5B enterprise
Powering the Most Popular Internet Businesses
Broad ecosystem of consulting partners
We have partners and technologies ready to help
Solving Problems for Organizations Around the World
No Upfront Investment
Replace capital expenditure with variable expense
Low ongoing cost
Customers leverage our economies of scale
Flexible capacity
No need to guess capacity requirements and over-provision
Speed and agility
Infrastructure in minutes not weeks
Focus on business
Not undifferentiated heavy lifting
Global Reach
Go global in minutes and reach a global audience
37 PRICE REDUCTIONS
Value proposition of the AWS cloud
Architected for Enterprise Security Requirements
“The Amazon Virtual Private Cloud [Amazon VPC] was a unique option that offered an additional level of security and an ability to integrate with other aspects of our infrastructure.”
Dr. Michael Miller, Head of HPC for R&D
(August 19, 2013)
Gartner “Magic Quadrant for Cloud Infrastructure as a Service,” Lydia Leong, Douglas Toombs, Bob Gill, Gregor Petri, Tiny Haynes, August 19, 2013. This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from Steven Armstrong. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
Gartner Magic Quadrant for Cloud Infrastructure as a Service
Summarizing the problem and the opportunity
• The Explosion of Data: data is a competitive edge
• Existing challenges with analytics: hard and expensive to set up, manage and scale
• The Cloud: lowers cost and improves agility
Data Analytics in the Cloud
Easy and inexpensive to get started
Easy to set up, scale and manage
Low cost to enable analytics on all your data
Open and flexible
The Solution
Technology Process View
• Source systems: Data source 1 ... Data source n, unstructured data sources
• Extract, Transform, Load and Cleanse
• Data warehouse
• Analytics
The diagram above shows functional architecture components of any data warehousing project.
Source systems
Data Integration
The Data Warehouse
Business Intelligence and Analytics
Data Analytics - Technology Stack
Data Integration
Data Warehouse
Business Intelligence
AWS Cloud
Amazon Redshift
Data warehousing done the AWS way
• Pay as you go, no upfront costs
• Fast, cheap, easy to use
• SQL
• Easy to provision and deploy
Customer quotes
“[Amazon Redshift] took an industry famous for its opaque pricing, high TCO and unreliable results and completely turned it on its head.”
“Redshift is twenty times faster than Hive…The cost saving is even more impressive…Our analysts like [it] so much they don’t want to go back.”
“Team played with Redshift today and concluded it is awesome. Un-indexed complex queries returning in < 10s.”
“Queries that used to take hours came back in seconds. Our analysts are orders of magnitude more productive.”
Amazon Redshift lets you start small and grow big
• Extra Large Node (HS1.XL): 3 spindles, 2 TB, 16 GB RAM, 2 cores
– Single node (2 TB)
– Cluster of 2-32 nodes (4 TB – 64 TB)
• Eight Extra Large Node (HS1.8XL): 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
– Cluster of 2-100 nodes (32 TB – 1.6 PB)
Note: Nodes not to scale
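The capacity ranges above follow from multiplying per-node storage by node count. The sketch below reproduces that arithmetic and picks the smallest multi-node cluster for a given data size. The class and function names are hypothetical, the figures are raw capacity taken from the slide (compression and free-space headroom are ignored), and the single-node XL option is left out for simplicity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    storage_tb: int   # raw storage per node, from the slide
    min_nodes: int    # smallest multi-node cluster
    max_nodes: int    # largest cluster


HS1_XL = NodeType("HS1.XL", storage_tb=2, min_nodes=2, max_nodes=32)
HS1_8XL = NodeType("HS1.8XL", storage_tb=16, min_nodes=2, max_nodes=100)


def cluster_capacity_tb(node: NodeType, count: int) -> int:
    """Raw capacity of a cluster with `count` nodes of the given type."""
    if not node.min_nodes <= count <= node.max_nodes:
        raise ValueError(f"{node.name} clusters take {node.min_nodes}-{node.max_nodes} nodes")
    return node.storage_tb * count


def smallest_cluster(node: NodeType, data_tb: float) -> int:
    """Fewest nodes of this type whose raw capacity covers `data_tb`."""
    needed = max(node.min_nodes, math.ceil(data_tb / node.storage_tb))
    if needed > node.max_nodes:
        raise ValueError(f"{data_tb} TB does not fit in a {node.name} cluster")
    return needed


if __name__ == "__main__":
    print(cluster_capacity_tb(HS1_XL, 32), "TB at the top of the XL range")
    print(cluster_capacity_tb(HS1_8XL, 100), "TB at the top of the 8XL range")
    print(smallest_cluster(HS1_XL, 10), "XL nodes for 10 TB of data")
```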
Amazon Redshift Pricing – Singapore & Sydney
Price per hour for XL node (US$):
• On-Demand: $1.25
• 1 Year Reservation: $0.75
• 3 Year Reservation: $0.45
Simple Pricing
Number of Nodes x Cost per Hour
No charge for Leader Node
Pay as you go
So, for example...
• 1 XL node reserved for 3 years:
= $0.45 x number of hours in a month
= about $340 per month
• 1 XL node cluster gives you:
– 2 cores
– 16 GB RAM
– 2 TB disk
– Plus 2 TB storage in S3 for backups & snapshots
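The pricing rule above (number of nodes x cost per hour, no charge for the leader node) can be written as a small calculator. This is a sketch using the rounded hourly prices from the Singapore & Sydney table; the plan keys and the 31-day default are assumptions made for illustration.

```python
# Hourly prices per XL node in US$, as listed on the pricing slide.
HOURLY_PRICE_USD = {
    "on_demand": 1.25,
    "reserved_1y": 0.75,
    "reserved_3y": 0.45,
}


def monthly_cost(nodes: int, plan: str, hours_in_month: int = 24 * 31) -> float:
    """Cost for one month: nodes x hourly price x hours.

    The leader node is not billed, so only compute nodes are counted.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes * HOURLY_PRICE_USD[plan] * hours_in_month


if __name__ == "__main__":
    for plan in HOURLY_PRICE_USD:
        print(f"1 XL node, {plan}: ${monthly_cost(1, plan):,.2f} per month")
```

With the rounded $0.45 rate and a 31-day month this gives roughly $335, a little under the $340 quoted above; the gap presumably comes from rounding in the slide, which is an assumption rather than something the deck states.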
Amazon Redshift is easy to use
• Provision in minutes
• Monitor query performance
• Point and click resize
• Built in security
• Automatic backups
Use cases
• Reporting data warehouse behind an OLTP system
• Data Mart to take load off the existing data warehouse
• Log file analysis for clickstream or gaming data (e.g. Advertising, Retail, Gaming)
• Queryable archive for data compliance (e.g. Telco - Call Detail Records)
• Machine generated sensor data analysis (e.g. Utility - smart meters, Resources - equipment failure prediction)
• As a data analytics system for live data (Gaming, Advertising)
Amazon Partner Network
(Technology Partners)
Flexibility & choice are key in the Cloud
Application Services
Compute Storage Database
Networking
AWS Global Infrastructure
Deployment & Administration
Thank you
Extending data integration into the Cloud
Colm Daniel, World Wide Cloud Alliances
Ron Lunasin, Sr. Director, Cloud Product Management
Today’s Agenda
• Informatica Cloud Overview
• Informatica Cloud Amazon Redshift Connector
• Demonstration
• Next Steps
• Q&A
Informatica: The Industry Leader in Cloud Integration
• #1 by Customer Count: 2000+ companies
• #1 by Customers/Analysts: Gartner AppExchange
• #1 by Data Processed: +40B transactions/month
• #1 by Connectivity: Informatica Cloud Marketplace
Top Right @ the Core: Gartner Magic Quadrants
Employees in 26 Countries…. and growing!
Global Presence & Global Perspective
New Cloud Connectors
http://www.informaticacloud.com/connectivity
Cloud Integration Customer Success Stories
• App Integration
– Synchronizing Salesforce CRM with Netsuite and other business apps
– 1.5M rows of data synchronized daily
• Data Replication
– Decreased operational issues from 70% to 30% of IT workload
– Enabled faster, more accurate decision-making based on timely, trusted data
• Data Migration
– Consolidated Smith Barney and Morgan Stanley data on Day 1 of merger
– Managers didn’t lose momentum in ongoing recruiting efforts
• Extend PowerCenter
– Hybrid deployment gives integration flexibility and scalability to meet various use cases
– Lowered time and resources needed for integrations by 80%
• iPaaS (Build)
– Reduce time to build and distribute connectivity to 3rd party data sources
– Customize cloud integration templates to execute sophisticated integration workflows
Informatica Cloud
The Industry’s Most Comprehensive Cloud Integration and Data Management Solution
• Cloud Integration: Connecting your cloud apps
• Cloud Data Quality and MDM: Delivering the “Single Customer View”
• Cloud Process Automation: Guiding users to work efficiently with the data
Our Mission:
Unleash the Potential Of the Cloud
Cloud Amazon Redshift Connector
Ron Lunasin, Cloud Platform Adoption
Recognition of “The Next Wave” back in 2004
Cloud based Integration
Client / Server based Integration
Challenges with Traditional Approaches to Cloud Integration
Prism ETI
Mainframe based Integration
Move to the Cloud…
IT transitions from skeptic to partner to driver
Increasing IT involvement in Cloud decision making
• Pre-2010: LOB Owned (Outside of IT)
• 2010-2012: LOB Led (IT Approved)
• 2012-2013: Business-IT Collaboration
• 2013: Cloud First (IT Led)
Cloud is the Reality in the Enterprise
Themes: Driven by IT; Led by Large Enterprises; Large, Accelerating Market
• 90% of Cloud decisions and operations involve IT (IDC)
• 66% of SaaS POs signed by IT (IDC)
• 76% of enterprises have a formal cloud strategy (Forrester)
• 74% of those using cloud will increase cloud spend > 20% (IDC)
• 4-6x growth rate of on-premise IT
• 20-27% CAGR
• $20-40B market (Forrester, IDC, Gartner, 451Group)
• 84% of net new software is now SaaS (IDC)
• 60% of all companies using SaaS w/in 12 months (Forrester)
• SaaS largest category, PaaS fastest growing (Forrester)
Informatica Cloud and Amazon Redshift:
Enabling cost-effective data warehousing
• Redshift Connector pre-release announced in February
• General availability in August 2013
InformaticaCloud.com/Amazon-Redshift
What it used to take…
• Budget large capital expenditure
• Schedule a sales meeting with Oracle, IBM, Teradata, etc…
• Formal POC (Proof of Concept)
• Procure software and hardware
• Install and setup
• Start project
What it takes now…
• Go to the web and sign-up
• Start project!
Informatica Cloud Architecture Overview
(Diagram components, numbered 1 to 4: Your Company, Secure Agent, Marketplace, Amazon Redshift)
Informatica Cloud Amazon Redshift demonstration
(Diagram components: Firewall, Informatica Cloud Secure Agent, Metadata Mappings)
1. Build mapping and execute job
2. Retrieve Account Data
3. Put Account Data into Flat File
4. Transfer compressed Flat File to S3
5. Initiate copy from S3
6. Load data into Amazon Redshift
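The middle of the load flow above (flat file, compression, copy from S3) can be sketched locally without touching AWS. The code below is not the Informatica connector's implementation, which the deck does not show; it is a minimal illustration that writes rows to a gzip-compressed CSV and builds the text of a Redshift `COPY` statement pointing at it. The table, bucket and key names are hypothetical, the credentials are left as a placeholder, and the actual upload to S3 (which would need an AWS SDK) is omitted.

```python
import csv
import gzip
import io
from typing import Iterable, Sequence


def rows_to_gzipped_csv(rows: Iterable[Sequence[str]]) -> bytes:
    """Steps 3-4, local part: serialize rows to a flat file and gzip it."""
    text = io.StringIO()
    csv.writer(text, lineterminator="\n").writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def build_copy_statement(table: str, bucket: str, key: str) -> str:
    """Step 5: text of the COPY command that loads the file from S3.

    The credentials string is a placeholder on purpose; never hard-code
    real keys in a script.
    """
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        "CREDENTIALS '<aws-credentials>' "
        "GZIP DELIMITER ',';"
    )


if __name__ == "__main__":
    accounts = [["1", "Acme"], ["2", "Globex"]]  # made-up sample rows
    payload = rows_to_gzipped_csv(accounts)
    print(f"compressed flat file: {len(payload)} bytes")
    print(build_copy_statement("account", "example-bucket", "account.csv.gz"))
```

Compressing before the transfer keeps the upload small, and `COPY` lets the cluster pull the file itself rather than receiving rows one insert at a time.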
Best practices to remember…
• The Amazon S3 bucket that holds the data files must be created in the same region as your cluster
– Files are deleted from the Amazon S3 bucket when the upload is complete
• Choose a batch size where the number of batches matches the number of slices in your cluster
– Each XL node has 2 slices, each 8XL node has 16
– If you have a 2 node XL cluster and 40,000 rows of data, choose a batch size of 10,000
– The Informatica Cloud Redshift connector can maximize Amazon’s parallel processing capabilities this way
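The batch-size rule above is simple division: total slices is nodes times slices per node, and the batch size is the row count divided by that. A minimal sketch follows; the function names are hypothetical, and rounding up when the rows do not divide evenly is this sketch's choice, not something the slide specifies (for very small row counts it can yield fewer batches than slices).

```python
import math

# Slices per node, as stated on the slide.
SLICES_PER_NODE = {"XL": 2, "8XL": 16}


def total_slices(node_type: str, node_count: int) -> int:
    """Number of slices across the whole cluster."""
    return SLICES_PER_NODE[node_type] * node_count


def batch_size(row_count: int, node_type: str, node_count: int) -> int:
    """Rows per batch so that the number of batches matches the slice count."""
    return math.ceil(row_count / total_slices(node_type, node_count))


if __name__ == "__main__":
    # The slide's example: 2 XL nodes (4 slices) and 40,000 rows.
    print(batch_size(40_000, "XL", 2))
```

For the slide's example this returns 10,000, giving four batches for the four slices of a 2 node XL cluster.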
Next Steps
• Get started with Amazon Redshift
• Get started with Informatica Cloud
– InformaticaCloud.com
• Learn more about our Redshift Connector
– InformaticaCloud.com/Amazon-Redshift
Q&A: Colm Daniel and Ron Lunasin
Thank you
AWS Reporting & Analysis
Ben Connors, Worldwide Head of Alliances, Jaspersoft
Session Overview
• Analysis of Cloud market motivations
• Overview of Cloud trends
• Cloud User category expectations
• How BI/Jaspersoft fits into Cloud strategies
• Demos
• Summary
© 2013 Jaspersoft Corporation
Industry Movement to the Cloud
• Cloud Growth:
– Cloud IT spend will grow from 3% to 17% of total (Morgan Stanley)
• Motivations:
– Agility
– Lower cost
– Faster time to value
– Less risk
• Use cases:
– CRM, ERP, HR, Online Gaming, Manufacturing, Expense Reporting, Big Data, Consumer Applications, Etc.
• Workloads:
– Dev/Test
– ‘Spiky’
– High Growth
– Reliable production
• BI usage matches these Cloud trends
Cloud Computing Growth
Source: http://www.forbes.com/sites/louiscolumbus/2013/02/19/gartner-predicts-infrastructure-services-will-accelerate-cloud-computing-growth/
Asia/Pacific Cloud Growth
Source: http://techaisle.com/blog/2012/11/lots-of-clouds-in-the-forecast-and-a-holiday-story/
Top Cloud Applications
(Bar chart, scale 0 to 50: Deployed vs. In 12 months)
• Internal business applications top the list; mobile sites next
Survey question: What kinds of applications have you delivered using a cloud environment? Which do you plan to deliver during the next 12 months?
Source: Forrester Cloud Developer Survey, Q3 2012
2013: Current/future BI Cloud adoption trends
TechTarget 2013 Analytics & Data Warehousing Reader Challenges & Priorities Survey (N = 559)
Does your organization run or plan to run any part of its BI, analytics and data warehousing systems in the cloud?
• Yes, active cloud user: 15%
• Plan to start using the cloud in the next 12 months: 13%
• Considering, but no set plans: 32%
• No: 41%
60% planning, considering, or actively using
• The cloud continues to play a critical role in supporting BI, analytics, and DW initiatives with 3 out of 5 respondents reporting that they are planning, considering or actively using the cloud.
Constituents - Cloud Expectations
• Business User
– Efficient access to IT resources w/o red tape and delays
• Application Developer
– Platform with dev tools, middleware, capacity, configuration mgt.
• IT Operations
– Elastic capacity, secure, standard, keep users happy
• Management
– Control expenses & risk, delight customers/partners, move fast
Example Industry Use Cases for Business Intelligence
Industry: Data Analyzed
• Online Gaming: # players vs. time, spend/player, popularity of weapons, scene usage
• Education: Student attendance, test scores, teacher performance, spend/student
• Telecom: Customer churn, data traffic patterns, billing per service
• Government: Crime data, demographics, health trends, economic
• Advertising: Click-through rates, conversion rates, regional variation
• Retail: Product sales, Profits, Customer traffic, Product correlations
• Manufacturing: Inventory, quality, vendor performance, logistics
Current State of Business Intelligence
• Standalone
• Expensive
• Desktop-based
• High Latency
Competing on Time and Information
“The New Factors of Production: Time and Information” Brian Gentile, Jaspersoft
But business users don’t have access to timely, actionable data
Why?
Most don’t spend their day inside a BI tool... nor do they want to!
Embedded BI - Why?
• For Best Decisions, Information Should Be:
– Relevant
– Timely
– Actionable
Embedded BI
• Maintains
– Context/Relevance
– Motivation/Timeliness
– Train of thought/Timeliness
– Actionable/Within application or beyond
– Security
• Broadens User Community
– Executives
– More knowledge workers
– Self-serve, Interactive
4xC Barriers to Embedded BI Adoption
• Cost. NEED: Develop for free. Pay only for what you use when you deploy
• Complex to Deploy. NEED: Deploy with push-button ease or use as a service
• Complex to Embed. NEED: Embed self-service BI through standard APIs
• Complex to Use. NEED: Easy to build and use BI assets
Simple, Low-Cost Embedded BI
3rd Gen Embedded BI Breaks Barriers
• Cost: Free + usage-based pricing
• Complex to Deploy: Push-button on-premises deployment and Cloud BI service
• Complex to Embed: HTML5/CSS + RESTful web services
• Complex to Use: Easy to build for BI Builders on any data and self-serve for BI Consumers on any device
3rd Generation Embedded BI
We Need “Intelligence Inside”
We want information to FIND US, not the other way round
“We need Intelligence Inside the applications and business processes we use every day.”
– Pipeline dashboard inside SaaS CRM app
– Performance report inside partner portal
– Salary data visualizations inside HR intranet
– Portfolio analytics inside client website
– Tickets crosstab inside custom helpdesk app
– Interactive charts inside native mobile app
Jaspersoft: The Intelligence Inside
• Embeddable Architecture: Open web standard architecture makes integration with any app easy to perform
• Cloud Ready: Multi-tenant architecture, 100’s of SaaS customers, top selling BI solution on Amazon
• Affordable: Up to 80% less than traditional BI platforms while delivering significant power & capabilities
• Proven Platform: Millions of users, 380,000 community members, deployed in 130,000+ applications
• Full Self-Service BI Suite: Address all user requirements with interactive reports, dashboards, analysis, and data integration
Product Overview
Jaspersoft Products
• Reporting Engine
• Visual Report Design Environment
• Ad Hoc Reports, Dashboards, In-Memory Analysis Server
• Powerful OLAP Data Analysis
• Studio
Design Any Report . . .
… Dashboard
… or Analytic View
... Using Any Data Type
POJO files
Relational Files Relational Big Data Files
Redshift
… bringing Intelligence to Any App
… with a World-Class BI Platform
Reporting, Dashboards, Visualization, OLAP Analysis
Columnar-Based In-Memory Engine
Data Connectivity to Any Data
100% Web Standards: CSS, JS, JSP, Java
Extensive APIs: HTTP, SOAP, REST
HTML5 Browser, Native Mobile Apps
Business Metadata Layer
Data Integration
Data Virtualization Direct
Redshift EMR On-Premises RDS SaaS
Jaspersoft Customers
Software & Technology
Financial Services
Public Sector
Telecommunications
Travel & Transportation
Manufacturing
Healthcare/Pharmaceutical
Jaspersoft AWS Hourly: 500+ Customers in 6 Months!
Jaspersoft/AWS Customer:
BizFlow/Samsung Korea
• Business Process Management (BPM)
• Challenge
– Monitor/Analyze Business Activities
• Solution
– Jaspersoft on Cloud
• Results
– Customers avoid infrastructure
– Increased BizFlow revenue
– Self-service BI
– Higher value analytics
http://www.bizflow.com/business-process-management/samsung-heavy-industries
Jaspersoft/AWS Customer:
Sage Human Capital
• Recruiting Firm for High Tech companies
• Challenge
– Visibility for recruiting process status
• Internal
• External
• Solution
– Jaspersoft on AWS
• Results
– Dashboards set up in two hours
– Disrupting the industry
“Jaspersoft for AWS allows me to have big company analytics for a small business price. With this information, we can be proactive instead of reactive.” - Paul Grewal, CEO Sage Human Capital
Jaspersoft/AWS Customer:
Blue Consulting
• Administration Systems for Schools
• Challenge
– Data from many systems
– Difficult for everyone, including teachers, to access
• Solution
– Jaspersoft on AWS, Amazon Redshift
• Results
– Over 200 schools provide reporting to teachers, even at home
– More informed decisions, educational approaches, resource optimization
“Our users LOVE Jaspersoft ad hoc reporting, and the performance of the system with Redshift.” -Russ Davis, Founder & CEO
Jaspersoft BI for AWS Overview
Jaspersoft 5 Demo
Jaspersoft Integrated with Amazon Redshift
• Jaspersoft is the first BI service that you can buy per hour
– No user limitations, no monthly fee, less than $1 per hour
• First BI service to automatically connect to your AWS data
– 10 minutes from launch to visualizing your data in RDS or Redshift
– AWS Security Integration
• Released February 2013
– Over 500 customers
Jaspersoft Pro on AWS
Thank you