Putting Hadoop on Any Cloud: Big Data Spain

The Elephant in the Cloud: Putting Hadoop on Any Cloud (@natishalom)

Upload: nati-shalom

Post on 15-Jan-2015

DESCRIPTION

The massive computing and storage resources needed to support big data applications make cloud environments an ideal fit. Now more than ever, there is a growing number of choices of cloud infrastructure providers, from Amazon AWS, OpenStack offered by the likes of HP, Rackspace and soon even Dell, VMware vCloud as well a... Including:
- Effectively managing your Hadoop stack in any data center (on-premise, cloud, hybrid…)
- Maintaining the flexibility to choose the right cloud for the job in an ever-changing environment
- Consistently managing your Hadoop deployment alongside other elements of your Big Data system, such as NoSQL DB, Web Tier, etc.

TRANSCRIPT

Page 1: Putting hadoop on any cloud  big data spain

The Elephant in the Cloud

Putting Hadoop on Any Cloud

@natishalom

Page 2: Putting hadoop on any cloud  big data spain

Columbus & The Cloud

THE DISCOVERY OF AMERICA

THE THING THAT MADE IT POSSIBLE

Page 3: Putting hadoop on any cloud  big data spain

Why Cloud Portability Matters

Page 4: Putting hadoop on any cloud  big data spain

Cloud Portability Myth #1

No one really needs cloud portability

Page 5: Putting hadoop on any cloud  big data spain

Cloud Portability Facts

Zynga moved ~80% of their workload from Amazon to their private zCloud

“own the base, rent the spike”

http://code.zynga.com/2012/02/the-evolution-of-zcloud/
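One way to read "own the base, rent the spike" is as a capacity split: steady load runs on owned servers and only the excess is rented. The sketch below illustrates that arithmetic. The demand figures and the `split_capacity` helper are invented for illustration and are not Zynga's numbers.

```python
def split_capacity(hourly_demand, owned_servers):
    """Split each hour's demand into server-hours served by owned vs. rented capacity."""
    owned_hours = 0
    rented_hours = 0
    for demand in hourly_demand:
        owned_hours += min(demand, owned_servers)
        rented_hours += max(demand - owned_servers, 0)
    return owned_hours, rented_hours


# Hypothetical demand (servers needed per hour): a flat base with one spike.
demand = [80, 80, 80, 200, 80, 80]
owned, rented = split_capacity(demand, owned_servers=80)
owned_share = owned / (owned + rented)
print(f"owned server-hours: {owned}, rented: {rented}, owned share: {owned_share:.0%}")
```

Sizing the owned fleet to the base load keeps it busy nearly all the time, while the rented capacity is paid for only during the spike.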

Page 6: Putting hadoop on any cloud  big data spain

Cloud Portability Facts

Mixpanel started with Linode, then moved to Rackspace, then to AWS

http://code.mixpanel.com/2010/11/08/amazon-vs-rackspace/

Page 7: Putting hadoop on any cloud  big data spain

Cloud Portability Facts

• You want the flexibility to choose what’s right for you, when it’s right for you

• Based on pricing, features, availability, performance, etc.

Page 8: Putting hadoop on any cloud  big data spain

Cloud Portability Myth #2

Cloud Portability == Cloud API Standardization

Page 9: Putting hadoop on any cloud  big data spain

Cloud APIs, Today

Standard APIs (?): OCCI, vCloud

OSS Frameworks: OpenStack, CloudStack, Eucalyptus

Abstraction Frameworks: jclouds, Deltacloud, Fog, Libvirt

Page 10: Putting hadoop on any cloud  big data spain

Cloud APIs, Today

Standard APIs: Not practical in the foreseeable future

OSS Projects: Need a couple more years to converge & mature

Abstraction Frameworks: Probably the only practical (near-term) option
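The idea behind an abstraction framework can be sketched as a provider-neutral interface that application code targets, with one driver per cloud behind it. Everything below is hypothetical: `CloudDriver`, `FakeCloudA`, `FakeCloudB` and `deploy_cluster` are illustrative names backed by in-memory stand-ins, not the API of jclouds, Deltacloud, Fog or Libvirt.

```python
from abc import ABC, abstractmethod
from itertools import count


class CloudDriver(ABC):
    """Hypothetical provider-neutral interface (not a real library's API)."""

    @abstractmethod
    def start_server(self, name: str, size: str) -> str:
        """Start a server and return a provider-specific id."""

    @abstractmethod
    def stop_server(self, server_id: str) -> None:
        """Stop a previously started server."""


class InMemoryCloud(CloudDriver):
    """Stand-in for a real provider driver; keeps servers in memory only."""

    prefix = "mem"

    def __init__(self):
        self._ids = count(1)
        self.servers = {}

    def start_server(self, name, size):
        server_id = f"{self.prefix}-{next(self._ids)}"
        self.servers[server_id] = (name, size)
        return server_id

    def stop_server(self, server_id):
        self.servers.pop(server_id, None)


class FakeCloudA(InMemoryCloud):
    prefix = "a"


class FakeCloudB(InMemoryCloud):
    prefix = "b"


def deploy_cluster(driver: CloudDriver, node_count: int) -> list:
    """Application code depends only on the interface, not on the provider."""
    return [driver.start_server(f"node-{i}", "medium") for i in range(node_count)]


# The same deployment code runs against either provider.
print(deploy_cluster(FakeCloudA(), 3))
print(deploy_cluster(FakeCloudB(), 3))
```

Switching clouds then means swapping the driver instance, while `deploy_cluster` stays unchanged.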

Page 11: Putting hadoop on any cloud  big data spain

Realization: What You Really Care about Is App Portability

OS is the same on any cloud

Most clouds have compute & storage

Elasticity & scaling have same effects on the app, regardless of the cloud

Page 12: Putting hadoop on any cloud  big data spain

Cloud Portability Myth #3

All infrastructure clouds were born equal

Page 13: Putting hadoop on any cloud  big data spain

Food for Thought

Offerings can vary quite a bit:

• Amazon guarantees only 99.5% uptime

• RackSpace will give you $$$ every time they crash

• Joyent claims to be significantly faster than both

Page 14: Putting hadoop on any cloud  big data spain

And Some Features Are Unique…

Amazon is the only major vendor to offer SSD storage. Netflix says it’s:

• ½ the price for the same throughput

• ⅕ the latency on avg.

• Even slowest requests are 6x faster

http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html

Page 15: Putting hadoop on any cloud  big data spain

Let’s Talk Big Data on the Cloud

Page 16: Putting hadoop on any cloud  big data spain

A Typical Big Data App…

Page 17: Putting hadoop on any cloud  big data spain

Managing Big Data on the Cloud

• Auto start VMs

• Install and configure app components

• Monitor

• Repair

• (Auto) Scale

• Burst…
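The monitor, repair and scale tasks listed above are often combined into a single reconcile pass that compares the running nodes with a desired count. The sketch below assumes exactly that model. `Node`, `Cluster` and `reconcile` are hypothetical names, and the cluster is simulated in memory rather than backed by real VMs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True


@dataclass
class Cluster:
    nodes: list = field(default_factory=list)
    next_id: int = 1

    def start_node(self):
        """Simulate starting a VM and installing the app components on it."""
        node = Node(f"node-{self.next_id}")
        self.next_id += 1
        self.nodes.append(node)
        return node


def reconcile(cluster, desired):
    """One pass: drop failed nodes (repair), then start or stop nodes (scale)."""
    actions = []
    for node in [n for n in cluster.nodes if not n.healthy]:
        cluster.nodes.remove(node)
        actions.append(f"removed failed {node.name}")
    while len(cluster.nodes) < desired:
        actions.append(f"started {cluster.start_node().name}")
    while len(cluster.nodes) > desired:
        actions.append(f"stopped {cluster.nodes.pop().name}")
    return actions


cluster = Cluster()
print(reconcile(cluster, 3))       # initial start
cluster.nodes[1].healthy = False   # simulate a failure seen by monitoring
print(reconcile(cluster, 3))       # repair
print(reconcile(cluster, 2))       # scale in
```

Running the same pass on a timer or on monitoring events covers start-up, repair and scaling with one code path.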

Page 18: Putting hadoop on any cloud  big data spain

The Challenges ..

Consistent Management

Making deployment, installation, scaling, and fail-over look the same throughout the entire stack

Page 19: Putting hadoop on any cloud  big data spain

The Challenges (Cont)..

Cloud Portability

Choosing the Right Cloud for the Job

Running bare metal for high-I/O workloads, public cloud for sporadic workloads..
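Choosing the right cloud for the job can be expressed as a small placement rule over workload traits. This is a minimal sketch of the rule stated on the slide. The `Workload` type, the `choose_target` helper and the fallback to a private cloud are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    io_intensive: bool
    sporadic: bool


def choose_target(workload):
    """Map a workload to an environment type using the slide's two rules."""
    if workload.io_intensive:
        return "bare-metal"      # high-I/O workloads
    if workload.sporadic:
        return "public-cloud"    # sporadic workloads
    return "private-cloud"       # assumed default, not stated in the talk


jobs = [
    Workload("hdfs-datanodes", io_intensive=True, sporadic=False),
    Workload("monthly-report", io_intensive=False, sporadic=True),
]
for job in jobs:
    print(job.name, "->", choose_target(job))
```

Keeping the rule in one function makes it easy to revisit as pricing, features or performance change.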

Page 20: Putting hadoop on any cloud  big data spain

Hadoop

• Available under different distributions

• Cloudera

• IBM BigInsights

• MapR

• Hortonworks

Page 21: Putting hadoop on any cloud  big data spain

Big Data Apps, on Any Cloud, Your Way

Open source (Apache2)

Page 22: Putting hadoop on any cloud  big data spain

Putting Cloudify and Hadoop Together

• Run on Any Cloud

• Consistent MGT

• Dynamic Scaling

• Auto Recovery

• Auto Scaling

• Role Assignments

• Monitoring

• Simple maintenance

Page 23: Putting hadoop on any cloud  big data spain

How it works..

1 Upload your recipe.

2 Cloudify creates VMs & installs agents

3 Agents install and manage your app

4 Cloudify automates the scaling
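The four steps above can be pictured with a small simulation. This is not Cloudify's recipe format or API: the recipe dictionary, `Agent` and `Orchestrator` are hypothetical stand-ins that only mirror the flow of upload, provision, install and scale.

```python
class Agent:
    """Runs on a VM and applies the recipe's install steps (simulated)."""

    def __init__(self, vm_name):
        self.vm_name = vm_name
        self.log = []

    def install(self, recipe):
        for step in recipe["install_steps"]:
            self.log.append(f"{self.vm_name}: {step}")


class Orchestrator:
    def __init__(self):
        self.recipe = None
        self.agents = []

    def upload(self, recipe):
        """Step 1: register the recipe."""
        self.recipe = recipe

    def _provision(self, n):
        """Steps 2 and 3: create VMs with agents, which install the app."""
        for _ in range(n):
            agent = Agent(f"vm-{len(self.agents) + 1}")
            agent.install(self.recipe)
            self.agents.append(agent)

    def deploy(self):
        self._provision(self.recipe["instances"])

    def scale_to(self, target):
        """Step 4: grow or shrink to the target number of instances."""
        if target > len(self.agents):
            self._provision(target - len(self.agents))
        else:
            del self.agents[target:]


# Hypothetical recipe; field names are illustrative only.
recipe = {"name": "hadoop-worker", "instances": 2,
          "install_steps": ["install packages", "write config", "start service"]}
orchestrator = Orchestrator()
orchestrator.upload(recipe)
orchestrator.deploy()
orchestrator.scale_to(4)
print([agent.vm_name for agent in orchestrator.agents])
```

Because the recipe carries the install steps, newly added instances are configured the same way as the original ones.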

Page 24: Putting hadoop on any cloud  big data spain

A Few Snippets..