Performance and Energy Efficiency Evaluation of Big Data Systems Presented by Yingjie Shi Institute of Computing Technology, CAS 2013-10-31

Page 1: Performance and Energy Efficiency Evaluation of Big Data Systems

Performance and Energy Efficiency Evaluation of Big Data Systems

Presented by Yingjie Shi

Institute of Computing Technology, CAS

2013-10-31

Page 2: Performance and Energy Efficiency Evaluation of Big Data Systems

BPOE 2013 | HPCChina 2013

Goals of Big Data Systems

Larger

Greener

Faster

Page 3: Performance and Energy Efficiency Evaluation of Big Data Systems

Performance VS Energy Efficiency

Performance: Faster & More Powerful
• More servers
• Bigger clusters
• Powerful processors
• Sophisticated processing algorithms

Energy Efficiency: Greener & Cheaper
• Lightweight servers
• Efficient processors
• Simpler processing algorithms
• …

Tradeoff

Evaluation

Page 4: Performance and Energy Efficiency Evaluation of Big Data Systems

Evaluation of Performance & Energy Efficiency Tradeoff

How to measure?
AxPUE: Application Level Metrics for Power Usage Effectiveness in Big Data Systems

How to get balance?
The Implications from Benchmarking Three Big Data Systems

Page 5: Performance and Energy Efficiency Evaluation of Big Data Systems

Motivation

"If you cannot measure it, you cannot improve it." – Lord Kelvin

PUE (Power Usage Effectiveness): a measure of how efficiently a computer data center uses its power; specifically, how much of the total power is actually used by the information technology equipment.

Page 6: Performance and Energy Efficiency Evaluation of Big Data Systems

PUE & Its Variants

Metric | Time | Organization | Computing Formula
PUE | 2007 | GreenGrid | $\frac{\text{Total Facility Energy}}{\text{IT Equipment Energy}}$
DCiE | 2008 | GreenGrid | $\frac{\text{IT Equipment Energy}}{\text{Total Facility Energy}} \times 100\%$
DCeP | 2008 | GreenGrid | $\frac{\text{Useful Work Produced}}{\text{Total Quantity of Resource Consumed Producing this Work}}$
pPUE | 2012 | GreenGrid | $\frac{\text{Total Facility Energy inside the Boundary}}{\text{IT Equipment Energy inside the Boundary}}$
PUE Scalability | 2013 | GreenGrid | $\frac{m_{\text{Actual}}}{m_{\text{PUE}}} \times 100\%$
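
The two simplest formulas on this slide, PUE and DCiE, can be sketched in a few lines of Python. This is only an illustration: the function names and the energy readings are hypothetical and do not come from the talk.

```python
def pue(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy (dimensionless)."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy_kwh / it_equipment_energy_kwh


def dcie(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """DCiE = IT Equipment Energy / Total Facility Energy * 100%."""
    if total_facility_energy_kwh <= 0:
        raise ValueError("total facility energy must be positive")
    return it_equipment_energy_kwh / total_facility_energy_kwh * 100.0


# Hypothetical meter readings over the same interval, for illustration only.
total_kwh, it_kwh = 1500.0, 1000.0
print(pue(total_kwh, it_kwh))   # 1.5
print(dcie(total_kwh, it_kwh))  # about 66.7 (percent)
```

Both values depend only on facility-level and IT-level energy, so they do not change when the software running on the IT equipment becomes more efficient.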

Page 7: Performance and Energy Efficiency Evaluation of Big Data Systems

Motivation
• Scenario 1

Data Management Researcher

An Improved Data Classification Algorithm
Does it contribute to greening the data centers?

Run the Algorithms on Data Center

Compare the PUEs

No Obvious Variations!

PUE cannot measure the effectiveness of any changes made upon the data center infrastructure!

Page 8: Performance and Energy Efficiency Evaluation of Big Data Systems

Motivation
• Scenario 2

Data Center Administrators

Give a budget plan of the data center energy consumption in the next year

Estimate the data volume based on the business development

How to estimate the energy increase?

PUE provides little reference information for data center planning according to data scale and application complexity

Page 9: Performance and Energy Efficiency Evaluation of Big Data Systems

Calculation Framework

PUE

AxPUE

Page 10: Performance and Energy Efficiency Evaluation of Big Data Systems

Definition - ApPUE
• ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment, specifically, how much of the power entering IT equipment is used to improve the application performance.

• Computation Formulas:

$$\mathrm{ApPUE} = \frac{\text{Application Performance}}{\text{IT Equipment Power}}$$

Application Performance: data processing performance of applications
IT Equipment Power: the average rate of IT Equipment Energy consumed
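
A minimal sketch of the ApPUE calculation, assuming the IT equipment energy is metered in joules over a known run time. The function name, the units (MB/s, joules, seconds), and the sample numbers are assumptions made for illustration; the talk does not prescribe them.

```python
def ap_pue(application_performance: float,
           it_energy_joules: float,
           duration_seconds: float) -> float:
    """ApPUE = Application Performance / IT Equipment Power.

    IT Equipment Power is taken as the average rate of IT equipment
    energy consumption, i.e. energy divided by elapsed time (watts).
    """
    if it_energy_joules <= 0 or duration_seconds <= 0:
        raise ValueError("energy and duration must be positive")
    it_power_watts = it_energy_joules / duration_seconds
    return application_performance / it_power_watts


# Hypothetical run: 200 MB/s sustained while the servers drew
# 360,000 J over 600 s (an average of 600 W).
print(ap_pue(200.0, 360_000.0, 600.0))  # MB/s per watt
```

The resulting unit is the performance unit per watt, so ApPUE values are only comparable between runs that use the same performance metric.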

Page 11: Performance and Energy Efficiency Evaluation of Big Data Systems

Definition - AoPUE
• AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system, specifically, how much of the total facility power is used to improve the application performance.

• Computation Formulas:

$$\mathrm{AoPUE} = \frac{\text{Application Performance}}{\text{Total Facility Power}}$$

Total Facility Power: the average rate of Total Facility Energy used

$$\mathrm{AoPUE} = \frac{\mathrm{ApPUE}}{\mathrm{PUE}}$$
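
The relation between AoPUE, ApPUE, and PUE follows directly from the two definitions. The short derivation below assumes that facility power and IT equipment power are averaged over the same measurement interval, so that the ratio of powers equals the ratio of energies used in PUE.

```latex
\begin{align*}
\mathrm{AoPUE}
  &= \frac{\text{Application Performance}}{\text{Total Facility Power}} \\
  &= \frac{\text{Application Performance}}{\text{IT Equipment Power}}
     \cdot \frac{\text{IT Equipment Power}}{\text{Total Facility Power}} \\
  &= \mathrm{ApPUE} \cdot \frac{1}{\mathrm{PUE}}
   = \frac{\mathrm{ApPUE}}{\mathrm{PUE}}
\end{align*}
```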

Page 12: Performance and Energy Efficiency Evaluation of Big Data Systems

Acquisition – Application Performance

Application Category | Examples | Metric
Service Application | Search engine, Ad-hoc queries | Number of requests answered in unit time
Data Analysis Application | Data mining, Reporting, Decision support, Log analysis | Volume of data processed in unit time
Interactive Real-time Application | E-commerce, Profile data management | Number of transactions completed in unit time
High Performance Computing | Scientific Computing | Number of floating-point operations in unit time
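
All four metrics in this table share one shape: an amount of work divided by elapsed time, with only the unit of work changing per category. A small sketch of that idea follows; the enum, the function names, and the sample numbers are hypothetical.

```python
from enum import Enum


class AppCategory(Enum):
    """Unit of work counted for each application category in the table."""
    SERVICE = "requests answered"
    DATA_ANALYSIS = "data volume processed"
    INTERACTIVE_REALTIME = "transactions completed"
    HPC = "floating-point operations"


def application_performance(work_done: float, elapsed_seconds: float) -> float:
    """Work completed per unit time (here: per second)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return work_done / elapsed_seconds


def describe(category: AppCategory, work_done: float, elapsed_seconds: float) -> str:
    """Human-readable performance figure for one measurement."""
    rate = application_performance(work_done, elapsed_seconds)
    return f"{rate:g} {category.value} per second"


# Hypothetical measurement: 12,000 requests answered in 60 seconds.
print(describe(AppCategory.SERVICE, 12000, 60))
```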

Page 13: Performance and Energy Efficiency Evaluation of Big Data Systems

Acquisition – Benchmark

• Requirements of Benchmarks
– Provide representative workloads for big data applications
– Provide a scalable data generation tool

• BigDataBench
– A big data benchmark suite open-sourced recently and publicly available
– All the requirements are well fulfilled

Page 14: Performance and Energy Efficiency Evaluation of Big Data Systems

Experiment Overview
• Testbed
– Data center of 18 racks, 362 servers
– Sample 8 servers

• Workloads

• Two experiments
– Different Applications
– Different Implementation Algorithms

Page 15: Performance and Energy Efficiency Evaluation of Big Data Systems

Experiments on Different Applications

[Figure: bar chart comparing PUE, ApPUE, and AoPUE across workloads; x-axis labels: BigDataBench SVM, Sort, Grep, Linpack; y-axis: 0 to 9; data labels: 17.2, 11.5, 269.9, 179.7]

Page 16: Performance and Energy Efficiency Evaluation of Big Data Systems

Experiments on Different Algorithms
• Two Implementations for Sort
– Several reducers with random sampling partitioning
– One reducer without partitioning

[Figure: PUE and ApPUE of the two Sort implementations (Sort1, Sort2); x-axis: Data Size (10G, 25G, 50G, 100G); y-axis: 0 to 30]
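
The two Sort variants can be illustrated with a single-process sketch. This is not the MapReduce code used in the experiments; it only shows the general idea of random-sampling range partitioning versus a single reducer, and every name and parameter below is hypothetical.

```python
import bisect
import random


def sample_split_points(keys, num_reducers, sample_size, rng):
    """Pick num_reducers - 1 split points from a random sample of the keys."""
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    # Evenly spaced positions in the sorted sample approximate
    # the quantiles of the full input.
    return [sample[(i * len(sample)) // num_reducers]
            for i in range(1, num_reducers)]


def partitioned_sort(keys, num_reducers, sample_size=100, seed=0):
    """Several 'reducers', each sorting one key range chosen by sampling."""
    if not keys:
        return []
    rng = random.Random(seed)
    splits = sample_split_points(keys, num_reducers, sample_size, rng)
    partitions = [[] for _ in range(num_reducers)]
    for key in keys:
        # bisect_right routes each key to the reducer that owns its range.
        partitions[bisect.bisect_right(splits, key)].append(key)
    # Ranges do not overlap, so concatenating the sorted partitions
    # yields a globally sorted result.
    return [k for part in partitions for k in sorted(part)]


def single_reducer_sort(keys):
    """One 'reducer' sorts the whole input without partitioning."""
    return sorted(keys)
```

Both functions return the same sorted output; in a real cluster the partitioned variant spreads the work across reducers, which is what makes its performance and energy profile differ from the single-reducer variant.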

Page 17: Performance and Energy Efficiency Evaluation of Big Data Systems

Conclusions

• We analyze the requirements of application-level energy effectiveness metrics AxPUE in data centers.

• We propose two novel application-level metrics ApPUE and AoPUE to measure the energy consumed to improve the application performance.

• The experiment results show that AxPUE could provide meaningful guidance to data center design and optimization.

Page 18: Performance and Energy Efficiency Evaluation of Big Data Systems

Evaluation of Performance & Energy Efficiency Tradeoff

How to measure?
AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers

How to get balance?
The Implications from Benchmarking Three Big Data Systems

Page 19: Performance and Energy Efficiency Evaluation of Big Data Systems

New Solutions

……

Page 20: Performance and Energy Efficiency Evaluation of Big Data Systems

Experimental Platforms

Brief Comparison
Xeon (Common processor)
Atom (Low power processor)
Tilera (Many core processor)

Basic Information

CPU Type | Intel Xeon E5310 | Intel Atom D510 | Tilera TilePro36
CPU Core | 4 cores @ 1.6GHz | 2 cores @ 1.66GHz | 36 cores @ 500MHz
L1 I/D Cache | 32KB | 24KB | 16KB/8KB
L2 Cache | 4096KB | 512KB | 64KB

Page 21: Performance and Energy Efficiency Evaluation of Big Data Systems

Benchmark Selection

BigDataBench
• A big data benchmark suite from big data applications
• Representative applications
• An innovative data generation tool

Application | Time Complexity | Characteristics
Sort | $O(n \log_2 n)$ | Integer comparison
WordCount | $O(n)$ | Integer comparison and calculation
Grep | $O(n)$ | String comparison
Naïve Bayes | $O(m \cdot n)$ | Floating-point computation
SVM | $O(n^3)$ | Floating-point computation

Page 22: Performance and Energy Efficiency Evaluation of Big Data Systems

Metrics

Performance: Data processed per second (DPS)

Energy Efficiency: Application Performance Power Usage Effectiveness, measured as data processed per joule (DPJ)
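
A minimal sketch of the two metrics, assuming DPJ stands for data processed per joule and that the energy drawn during the run is metered directly. The function names, units, and sample numbers are hypothetical.

```python
def dps(data_bytes: float, elapsed_seconds: float) -> float:
    """Data processed per second (bytes/s)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return data_bytes / elapsed_seconds


def dpj(data_bytes: float, energy_joules: float) -> float:
    """Data processed per joule (bytes/J)."""
    if energy_joules <= 0:
        raise ValueError("energy must be positive")
    return data_bytes / energy_joules


# Hypothetical run: 10 GB processed in 500 s at an average draw of 200 W.
data_bytes = 10e9
elapsed_s = 500.0
avg_power_w = 200.0
energy_j = avg_power_w * elapsed_s  # energy = average power * time

print(dps(data_bytes, elapsed_s))  # bytes per second
print(dpj(data_bytes, energy_j))   # bytes per joule
```

Since energy equals average power times elapsed time, DPJ is the same as DPS divided by average power, which matches the form of the ApPUE formula (application performance over IT equipment power).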

Page 23: Performance and Energy Efficiency Evaluation of Big Data Systems

General Observations

[Figure: DPS and DPJ results on Xeon, Atom, and Tilera]

Page 24: Performance and Energy Efficiency Evaluation of Big Data Systems

General Observations

Data scale has a significant impact on the performance and energy efficiency of big data systems.

The performance and energy efficiency trends of different applications are diverse.

[Figure: panels for Xeon, Atom, and Tilera]

Page 25: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Atom – DPS

Page 26: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Atom – DPJ

Page 27: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Atom – DPS & DPJ

Application | Metric | 500MB | 1GB | 10GB | 25GB | 50GB | 100GB
Sort | DPS | 3.67 | 4.51 | 1.89 | 1.54 | 1.36 | 1.40
Sort | DPJ | 0.87 | 1.08 | 0.45 | 0.36 | 0.32 | 0.33
WordCount | DPS | 2.27 | 2.38 | 2.74 | 2.84 | 2.82 | 2.79
WordCount | DPJ | 0.55 | 0.58 | 0.61 | 0.61 | 0.62 | 0.60
Grep | DPS | 1.83 | 1.82 | 2.30 | 2.79 | 2.87 | 2.89
Grep | DPJ | 0.48 | 0.46 | 0.54 | 0.62 | 0.63 | 0.64
Naïve Bayes | DPS | 3.83 | 3.89 | 4.52 | 4.64 | 4.54 | 4.58
Naïve Bayes | DPJ | 0.89 | 0.87 | 1.01 | 0.99 | 0.97 | 0.90
SVM | DPS | 3.19 | 3.06 | 3.17 | 3.14 | |
SVM | DPJ | 0.69 | 0.64 | 0.66 | 0.67 | |

Xeon is more powerful than Atom on processing capacity.
Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.
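
One way to read the numbers on this slide is sketched below. It rests on an assumption the slide does not state: that each cell is the Xeon value divided by the Atom value, so a DPJ ratio below 1 means Atom processes more data per joule. The values are the Sort and Naïve Bayes DPJ rows from the slide; the function and variable names are hypothetical.

```python
SIZES = ["500MB", "1GB", "10GB", "25GB", "50GB", "100GB"]

# DPJ rows copied from the slide (assumed to be Xeon/Atom ratios).
XEON_VS_ATOM_DPJ = {
    "Sort": [0.87, 1.08, 0.45, 0.36, 0.32, 0.33],
    "Naïve Bayes": [0.89, 0.87, 1.01, 0.99, 0.97, 0.90],
}


def efficiency_winner(dpj_ratio: float,
                      baseline: str = "Xeon",
                      challenger: str = "Atom") -> str:
    """Name the platform with the higher DPJ, given baseline/challenger ratio."""
    if dpj_ratio > 1.0:
        return baseline
    if dpj_ratio < 1.0:
        return challenger
    return "tie"


for app, ratios in XEON_VS_ATOM_DPJ.items():
    winners = [efficiency_winner(r) for r in ratios]
    print(app, dict(zip(SIZES, winners)))
```

Under this reading, a DPS ratio above 1 combined with a DPJ ratio below 1 means Xeon is faster while Atom is more energy-efficient for that workload and data size.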

Page 28: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Atom -- Summary

Xeon is more powerful than Atom on processing capacity.

Atom is more energy-efficient than Xeon when dealing with applications with simple computation logic.

Atom shows no energy advantage when dealing with complex applications.

Page 29: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Tilera – DPS

Page 30: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Tilera – DPJ

Page 31: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Tilera – DPS & DPJ

Application | Metric | 500MB | 1GB | 10GB | 25GB
Sort | DPS | 3.67 | 3.39 | 2.41 | 2.60
Sort | DPJ | 0.48 | 0.45 | 0.31 | 0.34
WordCount | DPS | 5.19 | 5.04 | 7.35 | 7.78
WordCount | DPJ | 0.67 | 0.65 | 0.87 | 0.92
Grep | DPS | 3.60 | 3.52 | 7.45 | 9.93
Grep | DPJ | 0.51 | 0.48 | 0.94 | 1.21
Naïve Bayes | DPS | 5.91 | 5.78 | 7.59 | 7.94
Naïve Bayes | DPJ | 0.75 | 0.70 | 0.89 | 0.92

Xeon is more powerful than Tilera on processing capacity.
Tilera is more energy-saving than Xeon when dealing with simple computation logic and I/O intensive applications.
Tilera shows no energy advantage when dealing with complex applications.

Page 32: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Tilera

[Figure panels: The DPS of Xeon; The DPS of Atom; The DPS of Tilera]

Page 33: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Tilera

The DPS of Tilera

Tilera is more suitable to process I/O intensive applications

Page 34: Performance and Energy Efficiency Evaluation of Big Data Systems

Xeon VS Tilera -- Summary

Xeon is more powerful than Tilera on processing capacity.

Tilera is more energy-efficient than Xeon when dealing with simple computation logic and I/O intensive applications.

Tilera shows no energy advantage when dealing with complex applications.

Tilera is more suitable to process I/O intensive applications.

Page 35: Performance and Energy Efficiency Evaluation of Big Data Systems

Implications

The performance of a big data system is related not only to the hardware itself, but also to the application type and data volume of its workloads.

Weak processors are not well suited to complex applications. Even though they have a lower TDP, they show no energy cost advantage.

Page 36: Performance and Energy Efficiency Evaluation of Big Data Systems

Implications (Cont.)

Xeon generally has better processing capacity, accompanied by high energy consumption, especially for some light scale-out applications.

Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications.

Tilera shows an energy advantage when processing I/O intensive applications.

Page 37: Performance and Energy Efficiency Evaluation of Big Data Systems
