Post on 24-Feb-2016
Performance and Energy Efficiency Evaluation of Big Data Systems
Presented by Yingjie Shi
Institute of Computing Technology, CAS
2013-10-31
BPOE 2013 | HPCChina 2013
Goals of Big Data Systems
Larger
Greener
Faster
Performance VS Energy Efficiency
Performance: Faster & More Powerful
• More servers
• Bigger clusters
• Powerful processors
• Sophisticated processing algorithms
• …
Energy Efficiency: Greener & Cheaper
• Lightweight servers
• Efficient processors
• Simpler processing algorithms
• …
Tradeoff Evaluation
Evaluation of Performance & Energy Efficiency Tradeoff
How to measure?
AxPUE: Application Level Metrics for Power Usage Effectiveness in Big Data Systems
How to get balance?
The Implications from Benchmarking Three Big Data Systems
Motivation
If you cannot measure it, you cannot improve it. – Lord Kelvin
PUE (Power Usage Effectiveness): a measure of how efficiently a computer data center uses its power; specifically, how much of the power is actually used by the information technology equipment.
PUE & Its Variants
Metric | Time | Organization | Computing Formula
PUE | 2007 | GreenGrid | $\frac{\text{Total Facility Energy}}{\text{IT Equipment Energy}}$
DCiE | 2008 | GreenGrid | $\frac{\text{IT Equipment Energy}}{\text{Total Facility Energy}} \times 100\%$
DCeP | 2008 | GreenGrid | $\frac{\text{Useful Work Produced}}{\text{Total Quantity of Resource Consumed Producing this Work}}$
pPUE | 2012 | GreenGrid | $\frac{\text{Total Facility Energy inside the Boundary}}{\text{IT Equipment Energy inside the Boundary}}$
PUE Scalability | 2013 | GreenGrid | $\frac{m_{\text{Actual}}}{m_{\text{PUE}}} \times 100\%$
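The PUE and DCiE formulas above can be tried out with a few lines of Python. This is only a minimal sketch: the function names and the sample energy readings are illustrative assumptions, not measurements from the talk.

```python
def pue(total_facility_energy: float, it_equipment_energy: float) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy (same unit for both)."""
    if it_equipment_energy <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy / it_equipment_energy


def dcie(total_facility_energy: float, it_equipment_energy: float) -> float:
    """DCiE = IT Equipment Energy / Total Facility Energy * 100%."""
    if total_facility_energy <= 0:
        raise ValueError("total facility energy must be positive")
    return it_equipment_energy / total_facility_energy * 100.0


# Hypothetical readings in kWh, chosen only for illustration.
total_kwh, it_kwh = 1500.0, 1000.0
print(f"PUE  = {pue(total_kwh, it_kwh):.2f}")
print(f"DCiE = {dcie(total_kwh, it_kwh):.1f}%")
```

Both metrics depend only on the two energy totals, so a change in the application software that leaves both totals roughly unchanged would not move either value.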
Motivation
• Scenario 1
Data Management Researcher
An Improved Data Classification Algorithm
Does it contribute to greening the data centers?
Run the Algorithms on Data Center
Compare the PUEs
No Obvious Variations!
PUE cannot measure the effectiveness of any changes made upon the data center infrastructure!
Motivation
• Scenario 2
Data Center Administrators
Give a budget plan of the data center energy consumption in the next year
Estimate the data volume based on the business development
How to estimate the energy increase?
PUE provides little reference information for data center planning according to data scale and application complexity
Calculation Framework
PUE
AxPUE
Definition - ApPUE
• ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment, specifically, how much of the power entering IT equipment is used to improve the application performance.
• Computation Formula:
$$\mathrm{ApPUE} = \frac{\text{Application Performance}}{\text{IT Equipment Power}}$$
– Application Performance: data processing performance of applications
– IT Equipment Power: the average rate of IT Equipment Energy consumed
Definition - AoPUE
• AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system, specifically, how much of the total facility power is used to improve the application performance.
• Computation Formulas:
$$\mathrm{AoPUE} = \frac{\text{Application Performance}}{\text{Total Facility Power}}$$
– Total Facility Power: the average rate of Total Facility Energy used
$$\mathrm{AoPUE} = \frac{\mathrm{ApPUE}}{\mathrm{PUE}}$$
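The two AxPUE definitions, and the relation between AoPUE, ApPUE, and PUE, can be checked numerically. The sketch below is a hedged illustration: the `Measurement` record, its field names, and the sample run are assumptions made for the example, and application performance is taken to be data volume processed per unit time (the metric the talk lists for data analysis applications).

```python
import math
from dataclasses import dataclass


@dataclass
class Measurement:
    """One benchmark run; all field names are illustrative assumptions."""
    data_processed_mb: float   # application-level work done, in MB
    duration_s: float          # wall-clock time of the run, in seconds
    it_energy_j: float         # energy consumed by IT equipment, in joules
    facility_energy_j: float   # energy consumed by the whole facility, in joules


def application_performance(m: Measurement) -> float:
    """Volume of data processed per unit time (MB/s)."""
    return m.data_processed_mb / m.duration_s


def ap_pue(m: Measurement) -> float:
    """ApPUE = Application Performance / IT Equipment Power."""
    it_power_w = m.it_energy_j / m.duration_s  # average rate of IT energy use
    return application_performance(m) / it_power_w


def ao_pue(m: Measurement) -> float:
    """AoPUE = Application Performance / Total Facility Power."""
    facility_power_w = m.facility_energy_j / m.duration_s
    return application_performance(m) / facility_power_w


def pue(m: Measurement) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy."""
    return m.facility_energy_j / m.it_energy_j


# Hypothetical run: 10 GB processed in 200 s.
run = Measurement(10240.0, 200.0, 400_000.0, 600_000.0)
print(f"ApPUE = {ap_pue(run):.4f} MB/J")
print(f"AoPUE = {ao_pue(run):.4f} MB/J")
# AoPUE should equal ApPUE / PUE up to floating-point rounding.
print(math.isclose(ao_pue(run), ap_pue(run) / pue(run)))
```

Because the duration cancels out, both metrics reduce to work done per unit of energy; only the energy boundary (IT equipment versus whole facility) differs.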
Acquisition – Application Performance
Application Category | Examples | Metric
Service Application | Search engine, Ad-hoc queries | Number of requests answered in unit time
Data Analysis Application | Data mining, Reporting, Decision support, Log analysis | Volume of data processed in unit time
Interactive Real-time Application | E-commerce, Profile data management | Number of transactions completed in unit time
High Performance Computing | Scientific Computing | Number of floating-point operations in unit time
Acquisition – Benchmark
• Requirements of Benchmarks
– Provide representative workloads for big data applications
– Provide a scalable data generation tool
• BigDataBench
– A big data benchmark suite open-sourced recently and publicly available
– All the requirements are well fulfilled
Experiment Overview
• Testbed
– Data center of 18 racks, 362 servers
– Sample 8 servers
• Workloads
• Two experiments
– Different Applications
– Different Implementation Algorithms
Experiments on Different Applications
[Figure: PUE, ApPUE, and AoPUE per workload. Axis labels: BigDataBench, SVM, Sort, Grep, Linpack. Data labels shown on the chart: 17.2, 11.5, 269.9, 179.7]
Experiments on Different Algorithms
• Two Implementations for Sort
– Several reducers with random sampling partitioning
– One reducer without partitioning
[Figure: PUE(Sort1), ApPUE(Sort1), PUE(Sort2), and ApPUE(Sort2) against Data Size (10G, 25G, 50G, 100G)]
Conclusions
• We analyze the requirements of application-level energy effectiveness metrics AxPUE in data centers.
• We propose two novel application-level metrics ApPUE and AoPUE to measure the energy consumed to improve the application performance.
• The experiment results show that AxPUE could provide meaningful guidance to data center design and optimization.
Evaluation of Performance & Energy Efficiency Tradeoff
How to measure?
AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers
How to get balance?
The Implications from Benchmarking Three Big Data Systems
New Solutions
……
Experimental Platforms
Basic Information
• Xeon (Common processor)
• Atom (Low power processor)
• Tilera (Many core processor)
Brief Comparison
Item | Xeon | Atom | Tilera
CPU Type | Intel Xeon E5310 | Intel Atom D510 | Tilera TilePro36
CPU Core | 4 cores @ 1.6GHz | 2 cores @ 1.66GHz | 36 cores @ 500MHz
L1 I/D Cache | 32KB | 24KB | 16KB/8KB
L2 Cache | 4096KB | 512KB | 64KB
Benchmark Selection
BigDataBench
• A big data benchmark suite from big data applications
• Representative applications
• An innovative data generation tool
Application | Time Complexity | Characteristics
Sort | $O(n \log_2 n)$ | Integer comparison
WordCount | $O(n)$ | Integer comparison and calculation
Grep | $O(n)$ | String comparison
Naïve Bayes | $O(m \cdot n)$ | Floating-point computation
SVM | $O(n^3)$ | Floating-point computation
Metrics
Performance: Data processed per second (DPS)
Energy Efficiency: Application Performance Power Usage Effectiveness (ApPUE), measured as data processed per joule (DPJ)
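A minimal sketch of how the two metrics might be computed from one run, assuming DPJ stands for data processed per joule and that energy is obtained by integrating evenly spaced power-meter samples. The helper names, the sampling interval, and the sample values are illustrative assumptions, not the measurement procedure used in the talk.

```python
import math


def energy_joules(power_samples_w, interval_s):
    """Trapezoidal integration of evenly spaced power samples (W) into joules."""
    if len(power_samples_w) < 2:
        raise ValueError("need at least two power samples")
    total = 0.0
    for a, b in zip(power_samples_w, power_samples_w[1:]):
        total += (a + b) / 2.0 * interval_s
    return total


def dps(data_mb, duration_s):
    """Data processed per second (MB/s)."""
    return data_mb / duration_s


def dpj(data_mb, energy_j):
    """Data processed per joule (MB/J)."""
    return data_mb / energy_j


# Hypothetical run: 1024 MB processed while a meter reports power every 10 s.
samples_w = [100.0, 120.0, 120.0, 100.0]
interval_s = 10.0
duration_s = interval_s * (len(samples_w) - 1)   # 30 s covered by the samples
energy_j = energy_joules(samples_w, interval_s)

print(f"DPS = {dps(1024.0, duration_s):.2f} MB/s")
print(f"DPJ = {dpj(1024.0, energy_j):.4f} MB/J")

# DPJ equals DPS divided by average power, which is the ApPUE form
# (application performance over IT equipment power).
avg_power_w = energy_j / duration_s
print(math.isclose(dpj(1024.0, energy_j), dps(1024.0, duration_s) / avg_power_w))
```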
General Observations
[Figure: DPS and DPJ panels for Xeon, Atom, and Tilera]
General Observations
Data scale has a significant impact on the performance and energy efficiency of big data systems.
The performance and energy efficiency trends of different applications are diverse.
[Figure: panels for Xeon, Atom, and Tilera]
Xeon VS Atom – DPS
Xeon VS Atom – DPJ
Xeon VS Atom – DPS & DPJ
Application | Metric | 500MB | 1GB | 10GB | 25GB | 50GB | 100GB
Sort | DPS | 3.67 | 4.51 | 1.89 | 1.54 | 1.36 | 1.40
Sort | DPJ | 0.87 | 1.08 | 0.45 | 0.36 | 0.32 | 0.33
Wordcount | DPS | 2.27 | 2.38 | 2.74 | 2.84 | 2.82 | 2.79
Wordcount | DPJ | 0.55 | 0.58 | 0.61 | 0.61 | 0.62 | 0.60
Grep | DPS | 1.83 | 1.82 | 2.30 | 2.79 | 2.87 | 2.89
Grep | DPJ | 0.48 | 0.46 | 0.54 | 0.62 | 0.63 | 0.64
Naïve Bayes | DPS | 3.83 | 3.89 | 4.52 | 4.64 | 4.54 | 4.58
Naïve Bayes | DPJ | 0.89 | 0.87 | 1.01 | 0.99 | 0.97 | 0.90
SVM (four data points only): DPS 3.19, 3.06, 3.17, 3.14; DPJ 0.69, 0.64, 0.66, 0.67
Xeon is more powerful than Atom in processing capacity.
Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.
Xeon VS Atom -- Summary
Xeon is more powerful than Atom on processing capacity.
Atom is more energy-efficient than Xeon when dealing with applications with simple computation logic.
Atom shows no energy advantage when dealing with complex applications.
Xeon VS Tilera – DPS
Xeon VS Tilera – DPJ
Xeon VS Tilera – DPS & DPJ
Application | Metric | 500MB | 1GB | 10GB | 25GB
Sort | DPS | 3.67 | 3.39 | 2.41 | 2.60
Sort | DPJ | 0.48 | 0.45 | 0.31 | 0.34
Wordcount | DPS | 5.19 | 5.04 | 7.35 | 7.78
Wordcount | DPJ | 0.67 | 0.65 | 0.87 | 0.92
Grep | DPS | 3.60 | 3.52 | 7.45 | 9.93
Grep | DPJ | 0.51 | 0.48 | 0.94 | 1.21
Naïve Bayes | DPS | 5.91 | 5.78 | 7.59 | 7.94
Naïve Bayes | DPJ | 0.75 | 0.70 | 0.89 | 0.92
Xeon is more powerful than Tilera in processing capacity.
Tilera is more energy-saving than Xeon when dealing with simple computation logic and I/O intensive applications.
Tilera shows no energy advantage when dealing with complex applications.
Xeon VS Tilera
[Figure panels: The DPS of Xeon, The DPS of Atom, The DPS of Tilera]
Xeon VS Tilera
The DPS of Tilera
Tilera is more suitable to process I/O intensive applications
Xeon VS Tilera -- Summary
Xeon is more powerful than Tilera in processing capacity.
Tilera is more energy-efficient than Xeon when dealing with simple computation logic and I/O intensive applications.
Tilera shows no energy advantage when dealing with complex applications.
Tilera is more suitable for processing I/O intensive applications.
Implications
The performance of a big data system depends not only on the hardware itself, but also on the application type and data volume of the workloads.
Weak processors are not suitable for complex applications. Even though they have a lower TDP, they show no energy cost advantage.
Implications Cont.
Xeon generally has better processing capacity, accompanied by high energy consumption, especially for some light scale-out applications.
Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications.
Tilera shows an energy advantage when processing I/O intensive applications.