performance and energy efficiency evaluation of big data systems

Post on 24-Feb-2016



Performance and Energy Efficiency Evaluation of Big Data Systems

Presented by Yingjie Shi, Institute of Computing Technology, CAS

2013-10-31

BPOE 2013 | HPCChina 2013

Goals of Big Data Systems

Larger

Greener

Faster


Performance vs. Energy Efficiency

Performance: Faster & More Powerful
• More servers
• Bigger clusters
• Powerful processors
• Sophisticated processing algorithms

Energy Efficiency: Greener & Cheaper
• Lightweight servers
• Efficient processors
• Simpler processing algorithms
• …

Tradeoff Evaluation


Evaluation of Performance & Energy Efficiency Tradeoff

How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Big Data Systems

How to get balance? The Implications from Benchmarking Three Big Data Systems


Motivation

If you cannot measure it, you cannot improve it. – Lord Kelvin

PUE (Power Usage Effectiveness): a measure of how efficiently a computer data center uses its power; specifically, how much of the power is actually used by the information technology equipment.


PUE & Its Variants

Metric | Time | Organization | Computing Formula
PUE | 2007 | GreenGrid | $\frac{\text{Total Facility Energy}}{\text{IT Equipment Energy}}$
DCiE | 2008 | GreenGrid | $\frac{\text{IT Equipment Energy}}{\text{Total Facility Energy}} \times 100\%$
DCeP | 2008 | GreenGrid | $\frac{\text{Useful Work Produced}}{\text{Total Quantity of Resource Consumed Producing this Work}}$
pPUE | 2012 | GreenGrid | $\frac{\text{Total Facility Energy inside the Boundary}}{\text{IT Equipment Energy inside the Boundary}}$
PUE Scalability | 2013 | GreenGrid | $\frac{m_{\text{Actual}}}{m_{\text{PUE}}} \times 100\%$
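The PUE and DCiE formulas listed above can be sketched in a few lines of Python. This is only an illustration: the function names and the meter readings are hypothetical and do not come from the slides.

```python
def pue(total_facility_energy: float, it_equipment_energy: float) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy (same unit for both)."""
    if it_equipment_energy <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy / it_equipment_energy


def dcie_percent(total_facility_energy: float, it_equipment_energy: float) -> float:
    """DCiE = IT Equipment Energy / Total Facility Energy * 100%."""
    if total_facility_energy <= 0:
        raise ValueError("total facility energy must be positive")
    return it_equipment_energy / total_facility_energy * 100.0


# Hypothetical meter readings in kWh, chosen only for illustration.
TOTAL_KWH = 1500.0
IT_KWH = 1000.0

if __name__ == "__main__":
    print(f"PUE  = {pue(TOTAL_KWH, IT_KWH):.2f}")
    print(f"DCiE = {dcie_percent(TOTAL_KWH, IT_KWH):.1f}%")
```

Because DCiE is the reciprocal of PUE expressed as a percentage, both numbers describe the facility only; neither changes when the applications running on the IT equipment change.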


Motivation

• Scenario 1

Data Management Researcher

An Improved Data Classification Algorithm: does it contribute to greening the data centers?

Run the Algorithms on Data Center

Compare the PUEs

No Obvious Variations!

PUE cannot measure the effectiveness of changes made on top of the data center infrastructure!


Motivation

• Scenario 2

Data Center Administrators

Give a budget plan of the data center energy consumption in the next year

Estimate the data volume based on the business development

How to estimate the energy increase?

PUE provides little reference information for data center planning according to data scale and application complexity


Calculation Framework

PUE

AxPUE


Definition - ApPUE

• ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment, specifically, how much of the power entering IT equipment is used to improve the application performance.

• Computation Formula:

$$\mathrm{ApPUE} = \frac{\text{Application Performance}}{\text{IT Equipment Power}}$$

– Application Performance: data processing performance of applications
– IT Equipment Power: the average rate of IT Equipment Energy consumed


Definition - AoPUE

• AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system, specifically, how much of the total facility power is used to improve the application performance.

• Computation Formulas:

$$\mathrm{AoPUE} = \frac{\text{Application Performance}}{\text{Total Facility Power}}$$

– Total Facility Power: the average rate of Total Facility Energy used

$$\mathrm{AoPUE} = \frac{\mathrm{ApPUE}}{\mathrm{PUE}}$$
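A minimal Python sketch of the two definitions may help show how they relate. The function names, the unit of application performance (MB/s), and the power figures are assumptions made for illustration, not values from the talk.

```python
def ap_pue(application_performance: float, it_equipment_power: float) -> float:
    """ApPUE = Application Performance / IT Equipment Power."""
    if it_equipment_power <= 0:
        raise ValueError("IT equipment power must be positive")
    return application_performance / it_equipment_power


def ao_pue(application_performance: float, total_facility_power: float) -> float:
    """AoPUE = Application Performance / Total Facility Power."""
    if total_facility_power <= 0:
        raise ValueError("total facility power must be positive")
    return application_performance / total_facility_power


# Hypothetical measurements, chosen only for illustration.
PERF_MB_PER_S = 50.0      # application performance (data processed per second)
IT_POWER_W = 2000.0       # average IT equipment power
TOTAL_POWER_W = 3000.0    # average total facility power

if __name__ == "__main__":
    pue = TOTAL_POWER_W / IT_POWER_W
    print(f"ApPUE = {ap_pue(PERF_MB_PER_S, IT_POWER_W):.4f} MB/s per W")
    print(f"AoPUE = {ao_pue(PERF_MB_PER_S, TOTAL_POWER_W):.4f} MB/s per W")
    # The second formula on the slide: AoPUE = ApPUE / PUE.
    print(f"ApPUE / PUE = {ap_pue(PERF_MB_PER_S, IT_POWER_W) / pue:.4f}")
```

Unlike PUE, both values move when the application gets faster at the same power draw, which is the property the two motivation scenarios ask for.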


Acquisition – Application Performance

Application Category | Examples | Metric
Service Application | Search engine, Ad-hoc queries | Number of requests answered in unit time
Data Analysis Application | Data mining, Reporting, Decision support, Log analysis | Volume of data processed in unit time
Interactive Real-time Application | E-commerce, Profile data management | Number of transactions completed in unit time
High Performance Computing | Scientific Computing | Number of floating-point operations in unit time


Acquisition – Benchmark

• Requirements of Benchmarks
– Provide representative workloads for big data applications
– Provide a scalable data generation tool

• BigDataBench
– A big data benchmark suite open-sourced recently and publicly available
– All the requirements are well fulfilled


Experiment Overview

• Testbed
– Data center of 18 racks, 362 servers
– Sample 8 servers

• Workloads

• Two experiments
– Different Applications
– Different Implementation Algorithms


Experiments on Different Applications

[Figure: bar chart of PUE, ApPUE, and AoPUE per workload; x-axis: BigDataBench (SVM, Sort, Grep) and Linpack; y-axis: 0 to 9; off-scale bar labels: 17.2, 11.5, 269.9, 179.7]


Experiments on Different Algorithms

• Two Implementations for Sort
– Several reducers with random sampling partitioning
– One reducer without partitioning

[Figure: PUE(Sort1), ApPUE(Sort1), PUE(Sort2), and ApPUE(Sort2) versus Data Size (10G, 25G, 50G, 100G); y-axis: 0 to 30]


Conclusions

• We analyze the requirements of application-level energy effectiveness metrics AxPUE in data centers.

• We propose two novel application-level metrics ApPUE and AoPUE to measure the energy consumed to improve the application performance.

• The experimental results show that AxPUE can provide meaningful guidance for data center design and optimization.


Evaluation of Performance & Energy Efficiency Tradeoff

How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers

How to get balance? The Implications from Benchmarking Three Big Data Systems


New Solutions

……


Experimental Platforms

Basic Information
• Xeon (Common processor)
• Atom (Low power processor)
• Tilera (Many core processor)

Brief Comparison

CPU Type | Intel Xeon E5310 | Intel Atom D510 | Tilera TilePro36
CPU Core | 4 cores @ 1.6GHz | 2 cores @ 1.66GHz | 36 cores @ 500MHz
L1 I/D Cache | 32KB | 24KB | 16KB/8KB
L2 Cache | 4096KB | 512KB | 64KB


Benchmark Selection

BigDataBench
• A big data benchmark suite from big data applications
• Representative applications
• An innovative data generation tool

Application | Time Complexity | Characteristics
Sort | $O(n \log_2 n)$ | Integer comparison
WordCount | $O(n)$ | Integer comparison and calculation
Grep | $O(n)$ | String comparison
Naïve Bayes | $O(m \cdot n)$ | Floating-point computation
SVM | $O(n^3)$ | Floating-point computation


Metrics

Performance: Data processed per second (DPS)

Energy Efficiency: Application Performance Power Usage Effectiveness, measured as data processed per joule (DPJ)
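The two metrics can be sketched as follows, assuming DPJ means data processed per joule and that energy is obtained as average power multiplied by elapsed time. The function names and the example run (data size, duration, power) are hypothetical and are not measurements from the talk.

```python
def dps(data_bytes: float, elapsed_seconds: float) -> float:
    """Data processed per second, in bytes/s."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return data_bytes / elapsed_seconds


def energy_joules(avg_power_watts: float, elapsed_seconds: float) -> float:
    """Energy consumed during the run: average power (W) times duration (s)."""
    return avg_power_watts * elapsed_seconds


def dpj(data_bytes: float, joules: float) -> float:
    """Data processed per joule, in bytes/J."""
    if joules <= 0:
        raise ValueError("energy must be positive")
    return data_bytes / joules


# Hypothetical run, chosen only for illustration.
DATA_BYTES = 10e9        # 10 GB of input
ELAPSED_S = 500.0        # job duration
AVG_POWER_W = 200.0      # average power drawn by the node

if __name__ == "__main__":
    joules = energy_joules(AVG_POWER_W, ELAPSED_S)
    print(f"DPS = {dps(DATA_BYTES, ELAPSED_S) / 1e6:.1f} MB/s")
    print(f"DPJ = {dpj(DATA_BYTES, joules) / 1e3:.1f} KB/J")
```

Under these assumptions DPJ equals DPS divided by average power, so a slower processor can still win on DPJ if its power draw is proportionally lower.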


General Observations

[Figure: DPS and DPJ on Xeon, Atom, and Tilera]


General Observations

Data scale has a significant impact on the performance and energy efficiency of big data systems.

The performance and energy efficiency trends of different applications are diverse.


Xeon VS Atom – DPS


Xeon VS Atom – DPJ


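Xeon VS Atom – DPS & DPJ

Application | Metric | 500MB | 1GB | 10GB | 25GB | 50GB | 100GB
Sort | DPS | 3.67 | 4.51 | 1.89 | 1.54 | 1.36 | 1.40
Sort | DPJ | 0.87 | 1.08 | 0.45 | 0.36 | 0.32 | 0.33
Wordcount | DPS | 2.27 | 2.38 | 2.74 | 2.84 | 2.82 | 2.79
Wordcount | DPJ | 0.55 | 0.58 | 0.61 | 0.61 | 0.62 | 0.60
Grep | DPS | 1.83 | 1.82 | 2.30 | 2.79 | 2.87 | 2.89
Grep | DPJ | 0.48 | 0.46 | 0.54 | 0.62 | 0.63 | 0.64
Naïve Bayes | DPS | 3.83 | 3.89 | 4.52 | 4.64 | 4.54 | 4.58
Naïve Bayes | DPJ | 0.89 | 0.87 | 1.01 | 0.99 | 0.97 | 0.90

SVM (four data sizes reported): DPS 3.19, 3.06, 3.17, 3.14; DPJ 0.69, 0.64, 0.66, 0.67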

Xeon is more powerful than Atom in processing capacity.

Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.
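One way to read the table above is sketched below. It rests on an assumption the slides do not state: each entry is taken to be the Xeon value divided by the Atom value, so a DPS entry above 1 means Xeon is faster and a DPJ entry below 1 means Atom processes more data per joule. The helper name is hypothetical; the numbers are copied from the table.

```python
SIZES = ["500MB", "1GB", "10GB", "25GB", "50GB", "100GB"]

# (DPS, DPJ) pairs per data size, copied from the Xeon VS Atom table.
# ASSUMPTION: each value is a Xeon-to-Atom ratio; the slides do not say so.
TABLE = {
    "Sort": [(3.67, 0.87), (4.51, 1.08), (1.89, 0.45),
             (1.54, 0.36), (1.36, 0.32), (1.40, 0.33)],
    "Wordcount": [(2.27, 0.55), (2.38, 0.58), (2.74, 0.61),
                  (2.84, 0.61), (2.82, 0.62), (2.79, 0.60)],
    "Grep": [(1.83, 0.48), (1.82, 0.46), (2.30, 0.54),
             (2.79, 0.62), (2.87, 0.63), (2.89, 0.64)],
    "Naive Bayes": [(3.83, 0.89), (3.89, 0.87), (4.52, 1.01),
                    (4.64, 0.99), (4.54, 0.97), (4.58, 0.90)],
}


def sizes_where_dpj_below_one(app: str) -> list[str]:
    """Data sizes at which the DPJ entry is below 1 for the given application."""
    return [size for size, (_, dpj) in zip(SIZES, TABLE[app]) if dpj < 1.0]


if __name__ == "__main__":
    for app in TABLE:
        print(app, sizes_where_dpj_below_one(app))
```

Under that reading, every DPS entry is above 1 while most DPJ entries are below 1, which matches the two statements on the slide.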


Xeon VS Atom -- Summary

Xeon is more powerful than Atom in processing capacity.

Atom is more energy-efficient than Xeon when dealing with applications with simple computation logic.

Atom shows no energy advantage when dealing with complex applications.


Xeon VS Tilera – DPS


Xeon VS Tilera – DPJ


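Xeon VS Tilera – DPS & DPJ

Application | Metric | 500MB | 1GB | 10GB | 25GB
Sort | DPS | 3.67 | 3.39 | 2.41 | 2.60
Sort | DPJ | 0.48 | 0.45 | 0.31 | 0.34
Wordcount | DPS | 5.19 | 5.04 | 7.35 | 7.78
Wordcount | DPJ | 0.67 | 0.65 | 0.87 | 0.92
Grep | DPS | 3.60 | 3.52 | 7.45 | 9.93
Grep | DPJ | 0.51 | 0.48 | 0.94 | 1.21
Naïve Bayes | DPS | 5.91 | 5.78 | 7.59 | 7.94
Naïve Bayes | DPJ | 0.75 | 0.70 | 0.89 | 0.92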

Xeon is more powerful than Tilera in processing capacity.

Tilera is more energy-saving than Xeon when dealing with I/O-intensive applications and applications with simple computation logic.

Tilera shows no energy advantage when dealing with complex applications.


Xeon VS Tilera

[Figure: the DPS of Xeon, the DPS of Atom, and the DPS of Tilera]


Xeon VS Tilera

The DPS of Tilera

Tilera is better suited to processing I/O-intensive applications.


Xeon VS Tilera -- Summary


Xeon is more powerful than Tilera in processing capacity.

Tilera is more energy-efficient than Xeon when dealing with I/O-intensive applications and applications with simple computation logic.

Tilera shows no energy advantage when dealing with complex applications.

Tilera is better suited to processing I/O-intensive applications.


Implications

The performance of a big data system depends not only on the hardware itself, but also on the application type and data volume of the workloads.

Weak processors are not suitable for complex applications. Even though they have a lower TDP, they show no energy cost advantage.


Implications Cont.

Xeon generally has better processing capacity, accompanied by high energy consumption, especially for some light scale-out applications.

Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications.

Tilera shows an energy advantage when processing I/O-intensive applications.
