[Azure Big Data Services and Hortonworks Study Session] Azure HDInsight

Microsoft Naoki Sato (さとうなおき, @satonaoki)

Upload: naoki-neo-sato

Post on 23-Jan-2018

Category: Data & Analytics



TRANSCRIPT

Microsoft

Naoki Sato (さとうなおき, @satonaoki)

Why Deploy To the Cloud?

Microsoft’s Solution

How Do I Get Started?

Breaking points of traditional approach

What if you could handle big data?

Introducing Apache Hadoop

Data volume

Data variety

Data velocity

Hadoop is a platform with portfolio of projects

A Hadoop distribution is a package of projects

With many contributors

Business applications of Hadoop

New analytic applications from new data

What Is Hadoop?

Microsoft’s Solution

How Do I Get Started?

Challenges with implementing Hadoop

Why Cloud + Big Data?

Speed, Scale, Economics

Always Up, Always On

Open and flexible

Time to value

Data of all Volume, Variety, Velocity

Massive Compute and Storage

Deployment expertise

Why Hadoop in the Cloud?

Scenarios For Deploying Hadoop As Hybrid

What Is Hadoop?

Why Deploy To the Cloud?

Microsoft’s Solution

Introducing Azure HDInsight

Microsoft contributions to Hadoop

Microsoft + Hortonworks

HDInsight Built for Windows or Linux

HDInsight Supports Hive

Hadoop 2.0

HDInsight Supports HBase

[Diagram: Name Node and Job Tracker; four Data Nodes, each with a Task Tracker; HMaster; Coordination; four Region Servers]

HDInsight Supports Mahout

HDInsight Supports Storm

Stream processing

Search and query

Data analytics (Excel)

Web/thick client dashboards

Devices to take action

RabbitMQ / ActiveMQ

Spark for Azure HDInsight: In-Memory Processing on Multiple Workloads

• Single execution model for multiple tasks

• Up to 100x faster processing performance

• Developer friendly (Java, Python, Scala)

• BI tool of choice (Power BI, Tableau, Qlik, SAP)

• Notebook experience (Jupyter/IPython, Zeppelin)

R Server for HDInsight

• Familiarity of R (most popular language for data scientists)

• Scalability of Hadoop and Spark

• Up to 7x faster using Spark engine

• Train and run ML models on datasets of any size

• Cloud managed solution (easy setup, elastic, SLA)

HDInsight Allows You To Add Hadoop Projects

Microsoft Makes Hadoop Easier

Deep Visual Studio Integration

• Debug Hive jobs through YARN logs or troubleshoot Storm topologies

• Visualize Hadoop clusters, tables, and storage

• Submit Hive queries, Storm topologies (C# or Java spouts/bolts)

• IntelliSense

Introducing Azure HDInsight

Big Opportunities in the Cloud

3X

Spending on cloud-based Big Data and analytics solutions will grow 3 times faster than on-premises solutions [1]

52%

52% of surveyed organizations plan to use or continue to deploy Hadoop in the cloud (IaaS and PaaS) [2]

128%

Microsoft is the only company with cloud revenue at large scale that grew triple digits in its fifth consecutive quarter

18

Microsoft Azure running Hadoop in more datacenters around the world than anyone else

Source:

[1] http://www.idc.com/getdoc.jsp?containerId=prUS25329114

[2] Gartner Market Guide For Hadoop. December, 2015

Operational:

Central US: Iowa

West US: California

East US: Virginia

North Central US: Illinois

US Gov: Iowa

South Central US: Texas

Brazil South: Sao Paulo State

West Europe: Netherlands

Sovereign Cloud: China North *: Beijing

Sovereign Cloud: China South *: Shanghai

Japan East: Tokyo, Saitama

Japan West: Osaka

India South: Chennai

East Asia: Hong Kong

SE Asia: Singapore

Australia South East: Victoria

Australia East: New South Wales

India Central: Pune

India West: Mumbai

North Europe: Ireland

East US 2: Virginia

Hadoop is being run everywhere

More Datacenters than any other vendor

Why Microsoft Azure?

Azure Storage

No hardware challenges

Deployed in minutes

Mission Critical, Enterprise Ready

Maintenance done for you

Low Cost

*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”

Introducing Azure HDInsight

Bringing Hadoop to a billion people

Making advanced analytics accessible to Hadoop

Cloud

What Is Hadoop?

Why Deploy To the Cloud?

Microsoft’s Solution

Get Started

http://azure.microsoft.com/en-us/documentation/services/hdinsight/

http://azure.microsoft.com/en-us/documentation/articles/hdinsight-learn-map/

http://www.microsoftvirtualacademy.com/training-courses/getting-started-with-microsoft-big-data

http://channel9.msdn.com/Shows/Data-Exposed

http://azure.microsoft.com/en-us/pricing/free-trial/

© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.

MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.