Putting Hadoop on Any Cloud. Nati Shalom at Big Data Spain 2012


Upload: big-data-spain

Post on 15-Jan-2015


DESCRIPTION

Session presented at Big Data Spain 2012 Conference, 16th Nov 2012, ETSI Telecomunicacion UPM, Madrid (www.bigdataspain.org). More info: http://www.bigdataspain.org/es-2012/conference/putting-hadoop-cloud/nati-shalom

TRANSCRIPT

Page 1: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012

The Elephant in the Cloud

Putting Hadoop on Any Cloud

@natishalom

Page 2:

Columbus & The Cloud

THE DISCOVERY OF AMERICA

THE THING THAT MADE IT POSSIBLE

Page 3:

Why Cloud Portability Matters

Page 4:

Cloud Portability Myth #1

No one really needs cloud portability

Page 5:

Cloud Portability Facts

Zynga moved ~80% of their workload from Amazon to their private zCloud

“own the base, rent the spike”

http://code.zynga.com/2012/02/the-evolution-of-zcloud/
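One way to read "own the base, rent the spike" as arithmetic: whatever demand fits within owned capacity runs there, and only the excess is rented. The sketch below is illustrative only; the demand figures, the capacity value, and the function names are assumptions, not Zynga's data or tooling.

```python
def split_demand(demand, owned_capacity):
    """Split each demand sample into (served_on_owned, rented_from_public)."""
    if owned_capacity < 0:
        raise ValueError("owned_capacity must be non-negative")
    plan = []
    for d in demand:
        owned = min(d, owned_capacity)  # the base runs on owned hardware
        rented = d - owned              # only the spike is rented
        plan.append((owned, rented))
    return plan


def owned_share(plan):
    """Fraction of total demand that was served by owned capacity."""
    total = sum(owned + rented for owned, rented in plan)
    if total == 0:
        return 0.0
    return sum(owned for owned, _ in plan) / total


# Hypothetical hourly demand, in arbitrary server units.
hourly_demand = [40, 50, 60, 120, 200, 80]
plan = split_demand(hourly_demand, owned_capacity=100)
share = owned_share(plan)
```

Raising `owned_capacity` pushes `share` toward 1.0 but leaves more owned hardware idle off-peak, which is the trade-off the slogan describes.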

Page 6:

Cloud Portability Facts

Mixpanel started with Linode, then moved to Rackspace, then to AWS

http://code.mixpanel.com/2010/11/08/amazon-vs-rackspace/

Page 7:

Cloud Portability Facts

• You want the flexibility to choose what’s right for you, when it’s right for you

• Based on pricing, features, availability, performance, etc.

Page 8:

Cloud Portability Myth #2

Cloud Portability == Cloud API Standardization

Page 9:

Cloud APIs, Today

Standard APIs (?): OCCI, vCloud

OSS Frameworks: OpenStack, CloudStack, Eucalyptus

Abstraction Frameworks: jclouds, Deltacloud, Fog, Libvirt

Page 10:

Cloud APIs, Today

Standard APIs: Not practical in the foreseeable future

OSS Projects: Need a couple more years to converge & mature

Abstraction Frameworks: Probably the only practical (near-term) option
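To make the abstraction-framework idea concrete, here is a minimal sketch of the pattern: the application codes against one small interface, and a thin adapter per provider translates to that provider's native calls. Everything here is hypothetical; `FakeCloudA` and `FakeCloudB` are in-memory stand-ins, not real provider SDKs, and the interface is not the API of jclouds, Deltacloud, Fog, or Libvirt.

```python
from typing import Dict, List, Protocol


class ComputeDriver(Protocol):
    """The portable interface the application codes against (hypothetical)."""

    def start_node(self, name: str) -> str: ...
    def list_nodes(self) -> List[str]: ...


class FakeCloudA:
    """Stand-in for a provider whose native API speaks of 'instances'."""

    def __init__(self) -> None:
        self.instances: Dict[str, str] = {}

    def run_instance(self, tag: str) -> str:
        instance_id = "i-%04d" % (len(self.instances) + 1)
        self.instances[instance_id] = tag
        return instance_id


class FakeCloudB:
    """Stand-in for a provider whose native API speaks of 'servers'."""

    def __init__(self) -> None:
        self.servers: List[dict] = []

    def create_server(self, label: str) -> dict:
        server = {"uuid": "srv-%d" % len(self.servers), "label": label}
        self.servers.append(server)
        return server


class DriverA:
    """Adapter: maps the portable interface onto FakeCloudA's calls."""

    def __init__(self, cloud: FakeCloudA) -> None:
        self._cloud = cloud

    def start_node(self, name: str) -> str:
        return self._cloud.run_instance(tag=name)

    def list_nodes(self) -> List[str]:
        return sorted(self._cloud.instances.values())


class DriverB:
    """Adapter: maps the portable interface onto FakeCloudB's calls."""

    def __init__(self, cloud: FakeCloudB) -> None:
        self._cloud = cloud

    def start_node(self, name: str) -> str:
        return self._cloud.create_server(label=name)["uuid"]

    def list_nodes(self) -> List[str]:
        return sorted(s["label"] for s in self._cloud.servers)


def deploy_cluster(driver: ComputeDriver, workers: int) -> List[str]:
    """Provider-agnostic deployment: one master plus N workers."""
    driver.start_node("master")
    for i in range(workers):
        driver.start_node("worker-%d" % i)
    return driver.list_nodes()


# The same deployment code runs unchanged against either provider.
nodes_a = deploy_cluster(DriverA(FakeCloudA()), workers=2)
nodes_b = deploy_cluster(DriverB(FakeCloudB()), workers=2)
```

Only the adapters know provider-specific names such as `run_instance` or `create_server`; `deploy_cluster` never does, which is what makes a move between clouds a driver swap rather than an application rewrite.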

Page 11:

Realization: What You Really Care about Is App Portability

OS is the same on any cloud

Most clouds have compute & storage

Elasticity & scaling have same effects on the app, regardless of the cloud

Page 12:

Cloud Portability Myth #3

All infrastructure clouds were born equal

Page 13:

Food for Thought

Offerings can vary quite a bit:

• Amazon guarantees only 99.5% uptime

• Rackspace will give you $$$ every time they crash

• Joyent claims to be significantly faster than both

Page 14:

And Some Features Are Unique…

Amazon is the only major vendor to offer SSD storage. Netflix says it’s:

• ½ the price for the same throughput

• ⅕ the latency on avg.

• Even slowest requests are 6x faster

http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html

@uri1803

Page 15:

Let’s Talk Big Data on the Cloud

Page 16:

A Typical Big Data App…

Page 17:

Managing Big Data on the Cloud

• Auto start VMs

• Install and configure app components

• Monitor

• Repair

• (Auto) Scale

• Burst…
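The monitor, repair, and (auto) scale steps above can be sketched as one planning function that looks at the observed nodes and returns the actions a manager should take. This is a minimal sketch under stated assumptions: the `Node` fields, the thresholds, and the action names are all hypothetical, and it is not how any particular tool implements these steps.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    name: str
    healthy: bool
    cpu_load: float  # fraction of capacity in use, 0.0 to 1.0


def plan_actions(nodes: List[Node], min_nodes: int = 2, max_nodes: int = 10,
                 high: float = 0.8, low: float = 0.2) -> List[Tuple[str, object]]:
    """Return the repair and scaling actions for one monitoring pass."""
    actions: List[Tuple[str, object]] = []

    # Repair: every unhealthy node gets replaced in place.
    for node in nodes:
        if not node.healthy:
            actions.append(("repair", node.name))

    # Auto start: bring the cluster up to its minimum size first.
    size = len(nodes)
    if size < min_nodes:
        actions.append(("start", min_nodes - size))
        return actions

    # Auto scale: decide on the average load of the healthy nodes only.
    # With no healthy node at all, treat the cluster as fully loaded.
    healthy = [n for n in nodes if n.healthy]
    avg = sum(n.cpu_load for n in healthy) / len(healthy) if healthy else 1.0
    if avg > high and size < max_nodes:
        actions.append(("scale_out", 1))
    elif avg < low and size > min_nodes:
        actions.append(("scale_in", 1))
    return actions


cluster = [
    Node("hadoop-1", healthy=True, cpu_load=0.90),
    Node("hadoop-2", healthy=False, cpu_load=0.0),
    Node("hadoop-3", healthy=True, cpu_load=0.85),
]
actions = plan_actions(cluster)
```

Keeping the decision a pure function of observed state means it can run on every monitoring tick and be tested without touching any cloud.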

Page 18:

The Challenges...

Consistent Management

Making deployment, installation, scaling, and fail-over look the same throughout the entire stack

Page 19:

The Challenges (Cont.)...

Cloud Portability

Choosing the Right Cloud for the Job

Running on bare metal for high-I/O workloads, on a public cloud for sporadic workloads...

Page 20:

Hadoop

• Available under different distributions:

• Cloudera

• IBM BigInsights

• MapR

• Hortonworks

Page 21:

Big Data Apps, on Any Cloud, Your Way

Open source (Apache License 2.0)

Page 22:

Putting Cloudify and Hadoop Together

• Run on Any Cloud

• Consistent Management

• Dynamic Scaling

• Auto Recovery

• Auto Scaling

• Role Assignments

• Monitoring

• Simple Maintenance

Page 23:

A Few Snippets...