Some research works on energy efficiency
Laurent Lefèvre
ANF2014, Cargèse, October 2014
INRIA AVALON / LIP
École Normale Supérieure de Lyon
Some messages from our planet
Ice loss 2011-2014: 500 Gt (Greenland 375 Gt / Antarctic 125 Gt), two to three times the 2003-2009 average
Sea level rising > 1 m by 2100
Temperature increasing: 2°C by 2100, with a 50% chance of 4°C by 2100
No more oil in 50 years…
IT -> electricity -> CO2 -> impact
So we should change the way IT uses energy: chasing watts, chasing overprovisioning and useless services…
● Energy consumption is growing: Top500: Nov 2010: 127 MW; Nov 2013: 205 MW (not all referenced); Green500: 550 MW (Nov 2013, all referenced)
● Only usage! Not the full life cycle, which is bad: planned obsolescence, rebound effect, design (rare minerals), difficult recycling…
● How to build future exascale/datacenter platforms and make them (more) energy sustainable/responsible? Multi-dimensional approaches: hardware, software, usage
EE : More4less !
Energy : 1st limiting factor for large scale systems
((hpc)datacenter, clouds, internet)?
Towards Energy Aware Large Scale Systems
How to decrease the energy consumption without impacting performance?
Green-IT Leverages
Shutdown : reducing the amount of powered unused resources
Slowdown : adapting the speed of resources to real usage
Optimizing : improving hardware and software for energy
reduction (i.e. energy-aware libraries). Adapt software to
green hardware.
Coordinating : using large scale approaches to enhance green
leverages
Example : Slowdown on servers : DVFS
(Dynamic Voltage and Frequency Scaling)
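The power benefit of DVFS follows the classic CMOS dynamic-power model P = C·V²·f: lowering the frequency usually allows a lower voltage as well, so power drops faster than linearly. A minimal sketch, with hypothetical capacitance and operating points:

```python
def dynamic_power(capacitance, voltage, frequency):
    """Classic CMOS dynamic power model: P = C * V^2 * f (watts)."""
    return capacitance * voltage ** 2 * frequency

# Hypothetical operating points: stepping the frequency down also
# allows a lower supply voltage, so power drops more than linearly.
p_high = dynamic_power(1e-9, 1.2, 2.6e9)  # 2.6 GHz at 1.2 V
p_low = dynamic_power(1e-9, 0.9, 1.2e9)   # 1.2 GHz at 0.9 V
```

Here halving the frequency saves more power than a purely linear (frequency-only) scaling would, which is the whole point of scaling voltage together with frequency.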
Various ways of « Greening » in networks
Bolla 2011 : no Green, shutdown, slowdown, shutdown+slowdown
GreenTouch - STAR
Various ways of « Greening »
An ecosystem of improvements : but a lot of them remain in labs !
« On the Road to Energy-Efficient Computing and Network: Where Exactly are We? » - Anne-Cécile Orgerie, Marcos Dias de
Asuncao, Laurent Lefevre, ACM Computing Surveys - December 2014
The slowdown approach
→ Adjust the link rate to the actual traffic: ALR (Adaptive Link Rate)
How to define the thresholds? → live adaptation?
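One common answer to the threshold question is a dual-threshold (hysteresis) policy, which avoids rate oscillation when traffic hovers around a single cut-off. A sketch with hypothetical rates and thresholds:

```python
def alr_next_rate(current_rate, utilization, low=0.1, high=0.8,
                  rates=(100, 1000, 10000)):  # Mbit/s steps (hypothetical)
    """Dual-threshold (hysteresis) Adaptive Link Rate policy:
    step the link rate down when utilization is low, up when high.
    Two distinct thresholds prevent flapping around one cut-off."""
    i = rates.index(current_rate)
    if utilization > high and i < len(rates) - 1:
        return rates[i + 1]  # traffic is heavy: step up
    if utilization < low and i > 0:
        return rates[i - 1]  # link nearly idle: step down
    return current_rate      # inside the hysteresis band: keep rate
```

"Live adaptation" would then amount to tuning `low` and `high` from observed traffic rather than fixing them statically.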
Green Capabilities in Access Networks
A success : Low Power Idle == 802.3az
Power down Ethernet transceivers (PHYs) in periods with low data rate
Key features are:
• Allow powering down the transmitters and three of the four receivers
• Include a refresh cycle
• Definition of an alert signal to rapidly wake up
Energy Efficient Ethernet
Shutdown ! : switching off unused equipment
On/Off model? → optimization
Switching off and on is difficult and complex at large
scale without good prediction
Avoiding on-demand & overprovisioning
Need for scheduling and planning -> need for
reservation-based systems
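The on/off decision reduces to a break-even computation: switching a node off only pays if the energy saved while it would sit idle exceeds the fixed energy cost of one off/on cycle, which is why good prediction or reservations are needed. A sketch with hypothetical figures:

```python
def break_even_seconds(p_idle_w, p_off_w, e_cycle_j):
    """Minimum idle period (s) for which switching off saves energy.
    p_idle_w: idle power, p_off_w: power when off (e.g. BMC only),
    e_cycle_j: fixed energy cost of one shutdown + boot cycle."""
    return e_cycle_j / (p_idle_w - p_off_w)

def worth_switching_off(idle_seconds, p_idle_w, p_off_w, e_cycle_j):
    """True if the predicted idle period justifies an off/on cycle."""
    saved = (p_idle_w - p_off_w) * idle_seconds
    return saved > e_cycle_j

# Hypothetical server: 100 W idle, 5 W off, 19 kJ per off/on cycle.
threshold = break_even_seconds(100, 5, 19000)  # 200 s break-even
```

With a reservation-based system the scheduler knows the gap between two reservations in advance, so this test can be applied safely instead of guessing.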
FSN XLCLOUD Project (2012-2015)
Partners : Bull SAS, Serviware, Institut Telecom,
HPC-Project, CEA List, EISTI, ATEME, OW2,
Inria
Target : HPC as a service : supporting HPC
applications with interactive remote visualization
in an energy-efficient cloud : GPUs,
InfiniBand, etc.
Climate / Blazar project : capacity leasing in
Openstack (Inria, Bull, Mirantis)
http://xlcloud.org/
Reservation-based OpenStack clouds
Shutdown ! : more services in the datacenters
for less power-hungry equipment outside
What is GreenTouch?
GreenTouch Mission By 2015, our goal is to deliver the architecture, specifications and
roadmap — and demonstrate key components — needed to increase
network energy efficiency by a factor of 1000 from current levels.
[Figure: target efficiency-improvement factors (2000X, 1600X, 400X, 20X) across the segments: Services, Applications & Trends; Mobile Communications; Wireline Access Networks; Core Optical Networking & Transmission; Core Switching & Routing]
Standard Box replaced by low consumption box
and services are virtualized
CPE
VHGW : Virtual Home Gateway
- Home Gateway : multiple services (data, voice, multimedia, tv, games) => 15 to 30W (always
on)
- Virtualizing home gateway services to reduce energy consumption at the last mile
- Combining with quasi passive CPE
- Taking care of Quality of Service
- Evaluating energy usage reduction
- Studying consolidation effects
Power consumption of idle vHGWs
Scenario : deploy one more vHGW (i.e. an LXC container)
every 20 seconds.
Experiment duration : 6 hours.
Tool : external wattmeter (not IPMI)
Conclusion : if we ignore the clouds of points (due to the
deployment phase) above the high-density line, the power
consumption is not really affected. However, we can notice
that once all the vHGWs are deployed (after the 1000th),
the average consumption increases slightly.
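The consolidation argument can be quantified with a back-of-the-envelope model: replace N always-on home gateways by one server hosting N vHGW containers plus a quasi-passive CPE per home. All figures below are hypothetical, chosen only to match the 15-30 W per-gateway range quoted above:

```python
def consolidation_savings_w(n_gateways, gw_power_w, server_power_w,
                            residual_cpe_w):
    """Power saved (W) by virtualizing n always-on home gateways
    onto one server, leaving a quasi-passive CPE in each home."""
    before = n_gateways * gw_power_w              # n full gateways
    after = server_power_w + n_gateways * residual_cpe_w
    return before - after

# Hypothetical: 1000 homes, 20 W per gateway, 250 W server,
# 2 W residual quasi-passive CPE per home.
savings = consolidation_savings_w(1000, 20, 250, 2)
```

The model makes the consolidation effect visible: the server's power is paid once, while the per-home saving is multiplied by N.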
Bandwidth fairness between 100 vHGWs
Here bandwidth sharing is good
(but only 600 Mbit/s is used
globally…)
Bandwidth sharing is correct
(but TCP by nature has
difficulty providing stability).
Low throughput of 300 vHGWs (2 Mbps each) Five TCP streams
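Fairness between vHGWs, as shown in these measurements, can be quantified with Jain's fairness index, where 1.0 means perfectly equal sharing:

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).
    Returns 1.0 for perfectly fair sharing, 1/n in the worst case."""
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))

# 300 vHGWs each getting 2 Mbit/s is perfectly fair:
fair = jain_fairness([2.0] * 300)
```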
Datacenters in the exascale era:
too big, too complicated to be
optimized by the user => need
for software environments
Towards Exascale infrastructures
Avoiding the 100 MW wall → 20 MW
Currently 4 Gflops per watt for the best in the Green500
Less than 3 Gflops per watt for petaflops machines
For an exaflops target : 50 Gflops/W
Following the Green500 path to obtain energy efficiency:
- using energy-efficient accelerators (i.e. GPUs)
- aggregating many low-power processors (e.g. the Mont-Blanc project)
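The 50 Gflops/W figure follows directly from the 20 MW power envelope by simple arithmetic:

```python
def required_gflops_per_watt(target_flops, power_budget_w):
    """Efficiency needed to reach target_flops within power_budget_w."""
    return target_flops / 1e9 / power_budget_w

# One exaflops (1e18 flop/s) inside the 20 MW wall:
need = required_gflops_per_watt(1e18, 20e6)   # 50 GFLOPS/W
gap = need / 4                                # vs. ~4 GF/W Green500 best
```

So the best current machines must still improve by more than an order of magnitude.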
Applications on exascale ?
Focusing on mandatory services for exascale (fault tolerance, restart) and explore
green version
Knowing the system or not ?
Exploring two different approaches :
With knowledge of the application and services : enable the user to choose the
least-consuming implementation of a service ==> estimate the energy
consumption of the different implementations (protocols) of each service.
Without knowledge : allow some intelligence to reduce the energy usage ===>
autonomically estimate the energy consumption of the HPC system in order to
apply green levers
Energy in fault tolerance protocols -
Message logging
Pushing service to the limit : make measurement possible
Sender process logs all the messages that are sent to other processes. Ex. log of
100,000 messages of 100 KBytes => 10 GBytes on RAM or HDD
Extra power cost due to message logging
EE-FT/ Conclusions on Message logging
More energy for logging on HDD (2500 to 3600 Joules/GBytes) compared to RAM
(120 to 150 joules/GByte).
From a power-capping point of view ==> promote HDD logging
RAM logging is more energy efficient due to the shorter logging time:
• on HDD : more than 140 seconds for 10 GBytes
• in RAM : 7 seconds for 10 GBytes
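Turning these figures into energy and average power explains the apparent paradox: RAM logging uses less total energy, but draws more extra power while it runs, so a power-capping policy favors HDD logging:

```python
def logging_energy_j(gbytes, joules_per_gbyte):
    """Total extra energy (J) spent logging a given volume."""
    return gbytes * joules_per_gbyte

def logging_power_w(energy_j, seconds):
    """Average extra power (W) drawn during the logging phase."""
    return energy_j / seconds

# Figures from the measurements above, for 10 GB of logged messages:
e_hdd = logging_energy_j(10, 2500)   # HDD, lower bound: 2500 J/GB
e_ram = logging_energy_j(10, 150)    # RAM, upper bound: 150 J/GB
p_hdd = logging_power_w(e_hdd, 140)  # HDD takes > 140 s
p_ram = logging_power_w(e_ram, 7)    # RAM takes ~ 7 s
```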
ECOFIT Framework Examples : exascale services : resilience & data
broadcasting
4 steps: Service analysis, Measurements, Calibration,
Estimation
Helping users make the right choices depending on context
and parameters
M. Diouri, Olivier Glück, Laurent Lefevre, and Franck Cappello. "ECOFIT: A Framework to Estimate Energy
Consumption of Fault Tolerance Protocols during HPC executions", CCGrid2013, the 13th IEEE/ACM
International Symposium on Cluster, Cloud and Grid Computing, Delft, the Netherlands, May 13-16, 2013
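The four-step estimation (service analysis, measurements, calibration, estimation) can be sketched as a calibration table multiplied by operation counts. The coefficients and protocol descriptions below are entirely hypothetical, for illustration only:

```python
def estimate_energy_j(calibration, counts):
    """ECOFIT-style estimation: total energy = sum over operations of
    (calibrated cost per operation) * (number of operations),
    where the costs come from the measurement/calibration steps."""
    return sum(calibration[op] * n for op, n in counts.items())

# Hypothetical calibration (J per unit) and protocol descriptions:
cal = {"checkpoint_gb": 300.0, "msg_log_gb": 150.0, "coord_round": 5.0}
protocol_a = {"checkpoint_gb": 10, "coord_round": 100}  # coordinated
protocol_b = {"checkpoint_gb": 10, "msg_log_gb": 10}    # message logging
```

Given such estimates, the user can pick the cheaper protocol for their context before running, which is exactly the "right choice" support described above.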
Without knowledge of applications and
services ?
HPC applications keep growing in complexity : too many bugs are already
present in HPC applications; adding energy management and considerations won't help :-)
Are HPC programmers ready for eco-design of applications ?
HPC applications can share the same infrastructure : optimizations made for saving energy
considering some applications are likely to impact the performance of others
Instead of looking at applications and services => focusing on the infrastructure
• Detect and characterize system’s runtime behaviours/phases
• Optimize each subsystem (storage, memory, interconnect, CPU) accordingly
Helping users to find the best service
Online analysis without knowledge of the applications
Landry Tsafack, Laurent Lefevre, Jean-Marc Pierson, Patricia Stolf, and
Georges Da Costa. "A runtime framework for energy efficient HPC
systems without a priori knowledge of applications", ICPADS 2012 : 18th
International Conference on Parallel and Distributed Systems, Singapore,
December 2012
• Irregular usage of resources
• Phase detection,
characterisation
• Power saving modes
deployment
• MREEF framework
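A minimal version of the phase-detection step: flag a phase change whenever the sampled resource-usage vector (CPU, memory, I/O, network counters) jumps by more than a threshold. The MREEF framework is far more elaborate; this is only an illustrative sketch:

```python
def detect_phase_changes(samples, threshold):
    """Return indices where the resource-usage vector moves by more
    than `threshold` (Euclidean distance) from the previous sample,
    i.e. candidate boundaries between runtime phases."""
    changes = []
    for i in range(1, len(samples)):
        dist = sum((a - b) ** 2
                   for a, b in zip(samples[i], samples[i - 1])) ** 0.5
        if dist > threshold:
            changes.append(i)
    return changes

# Two samples of (cpu, io) usage per second: compute-heavy, then I/O-heavy.
trace = [(0.9, 0.1), (0.9, 0.1), (0.1, 0.9), (0.1, 0.9)]
```

Once a phase is characterized (compute-, memory- or I/O-dominated), the matching power-saving mode can be deployed on each subsystem.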
Energy proportionality : the « grail »
of energy efficiency ?
Energy proportionality
Luiz André Barroso and Urs Hölzle, « The case for Energy-Proportional Computing », IEEE
Computer, 2007
At the server level :
Idle power consumption
Inefficient region depending on load
At the network level :
Even less proportional
Switches' energy consumption is almost constant
Energy consumption and energy efficiency of a server according to its load
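The inefficient region comes directly from the idle floor: with a simple linear power model, efficiency (useful work per watt) collapses at low load. A sketch:

```python
def server_power_w(load, p_idle, p_peak):
    """Linear power model: idle floor plus a load-proportional part.
    load is in [0, 1]."""
    return p_idle + (p_peak - p_idle) * load

def efficiency(load, p_idle, p_peak):
    """Useful work per watt, normalized so that full load gives 1.0."""
    return load * p_peak / server_power_w(load, p_idle, p_peak)

# Hypothetical server: 100 W idle, 200 W at peak.
eff_low = efficiency(0.1, 100, 200)   # poor: idle floor dominates
eff_full = efficiency(1.0, 100, 200)  # 1.0 by construction
```

A perfectly energy-proportional server would have `p_idle = 0`, making efficiency constant at every load level, which is the "grail" discussed here.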
The grail of energy proportionality in
networks
Static / dynamic part of power
Even after large reductions, the static part can remain important:
from GOS : 240 W idle / 260 W peak (92%) to
a recent one : 90 W idle / 190 W peak (47%)
First LHF (low-hanging fruit) : switch off unused resources :
delete the static part !
Address the dynamic part with green levers :
adapt resources to the needs of applications
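The static share is simply the idle-to-peak power ratio, which reproduces the percentages above:

```python
def static_share(p_idle_w, p_peak_w):
    """Fraction of peak power drawn even when the server is idle,
    i.e. the static part that only shutdown can remove."""
    return p_idle_w / p_peak_w

old_share = static_share(240, 260)     # older server: ~92% static
recent_share = static_share(90, 190)   # recent server: ~47% static
```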
What about missing parts of the curve ?
• Specific conditions of workload
• Gaps between bursts
• Exploiting heterogeneity of
processors (flops, watts, flops
per watt) to fill the missing
parts
[Roadmap figure, 2011-2017: efficiency milestones 0.2, 3.5, 7 and 20 GF/W; prototypes of 256 nodes / 512 GFLOPS / 1.7 kW and 1024 nodes / 152 TFLOPS / 20 kW, scaling to 50 PFLOPS / 7 MW and 200 PFLOPS / 10 MW]
Mont-Blanc: new energy-efficient Exascale design
Build a fully functional prototype using commercially
available low-power embedded technology
Design an Exascale machine to overcome limitations
identified in the prototype
Develop Exascale applications to run on this new
generation of HPC systems
Slide from Josep Subirats (BSC)
Heterogeneous multicore processors
ARM big.LITTLE
2 processors (4 cores each) :
● LITTLE (Cortex A7)
● big (Cortex A15)
Interconnected by a Cache Coherence system
Some utilization modes :
● Cluster migration ( 4 / 4 )
● Global Task Scheduling ( 8 )
big.LITTLE « Cluster migration »
GOAL → extend the battery lifetime of
mobile devices, which are idle
most of the time
Heterogeneous architectures
At the scale of a datacenter → ARM may not be enough
We may need real performance to absorb load peaks
Exploring a combination of :
Low-power processors for low load
and
high performance processors for heavy load
→ reduces static costs
→ use classical servers only at their most energy efficient load level
+ other classical levers : DVFS, switch off/on,… to improve consumption proportionality
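The combination can be sketched as a power-aware platform choice: serve the load on the lowest-power platform that can absorb it. Capacities and powers below are hypothetical placeholders:

```python
def pick_platform(load_ops, platforms):
    """Pick the lowest-power platform able to absorb the load.
    Each platform is (name, max_ops_per_s, power_w); hypothetical figures."""
    feasible = [p for p in platforms if p[1] >= load_ops]
    if not feasible:
        return None  # load exceeds every platform: needs scale-out
    return min(feasible, key=lambda p: p[2])[0]

# Hypothetical: a low-power ARM node and a high-performance x86 server.
PLATFORMS = [("arm", 100, 10), ("x86", 1000, 150)]
```

Low load thus stays on the ARM node (cutting the static cost of the big server), and the x86 server is woken only for peaks, where it runs near its most energy-efficient load level.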
Technical challenges
● Different architectures : ARM and x86
How to combine them and be able to go from one architecture to the other ?
- live migration without impact on the moving application
- migration as fast as possible
→ First idea Classical cloud approach : Virtual machines
2 physical architectures → 2 choices for virtual machine architecture
When the VM is not on the right physical architecture, we use emulation with QEMU
→ What is the cost of emulation ?
→ Which architecture to choose for the VM ?
Comparison of VM architecture – First results
● ARM VM:
Native on ARM processor
Emulated on x86 processor
● x86 VM :
Native on x86 processor
Emulated on ARM processor
ARM : Samsung Chromebook (2 processors ARM
Cortex-A15)
x86 : Dell PowerEdge R720 (2 processors Intel Xeon
6 cores)
Benchmark nbench : integer/float
Comparison of VM performances – First results
Comparison of native performances – First results
Still some work to do to reach a nice energy proportional curve
The risks for datacenters !
- Risk « I need to use a constant power within my
datacenter, my network »
Ex : power usage in France 3/9/14
-> negotiate with your provider; combine
reservation/prediction
- Risk « my DC needs to consume a minimum
amount of power » -> renegotiate your contract
- Risk : needing overprovisioning in my networks in
case of -> add autonomic and dynamic fault
tolerance & resilience solutions
- Risk : when my equipment (re)boots I face too
many risks -> negotiate with your equipment
provider
Exascale : a big energy
consumer => consume better
With an x MW customer: an exchange framework is
mandatory (SESAMES)
M. Diouri, O. Glück, and L. Lefèvre. "Towards a novel Smart and Energy-Aware
Service-Oriented Manager for Extreme-Scale Applications", First Workshop
for Power Grid-Friendly Computing (PGFC'12), San Jose, USA, June 2012
Energy proportionality in networks : the needs are there !
Networks are part of the problem/usage
But GreenTouch addresses this issue !
Backbone networks
• High speed core networks
• Relatively small number of nodes
• High speed ports and small port density per node
• The network must not be disconnected
• IP over WDM
Going inside networks for green goals
Finding energy efficiency through traffic engineering
Re-routing traffic -> possibility to switch off parts of the network
Multi Layers Networks
Van Heddeghem, Ward, Filip Idzikowski, Willem Vereecken, Didier Colle, Mario Pickavet, and Piet Demeester. 2012. "Power Consumption Modeling in Optical Multilayer Networks." Photonic Network Communications 24 (2): 86-102
IP Layer
Optical Layer
Two ways of fighting overprovisioning
On Physical layer
• Switching off optical fibers
• Allows to switch off amplifiers
• 17 > 1870 W
• Need to reconfigure optical paths
• Switching amplifiers ON is slow
On IP layer
• Switching off IP links
• Allows to switch off router ports
• 2 > 5588 W
• Need to re-route in IP layer
• Switching ports ON is faster than switching
amplifiers ON
Traffic aware energy efficient engineering
On demand On / Off
Choose the best links to be switched off
• May be easy (inefficient), up to NP-hard
• Using link utilization (inefficient), or flow matrices, which:
• in most cases are difficult to obtain
• can be estimated, but this is error-prone and expensive
Find the links to be switched on
• Can be even more complex
• Better energy efficiency comes with complex computations
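A common heuristic (the exact problem being NP-hard, as noted above) is to greedily switch off the least-utilized links while preserving connectivity. A sketch on an abstract graph, using only link utilization, i.e. the "easy but inefficient" input:

```python
def greedy_switch_off(nodes, links, utilization):
    """Greedily remove links in increasing order of utilization,
    keeping only removals that leave the network connected."""
    def connected(active):
        # Depth-first search over the active links.
        adj = {n: set() for n in nodes}
        for a, b in active:
            adj[a].add(b)
            adj[b].add(a)
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(adj[n] - seen)
        return seen == set(nodes)

    active = set(links)
    for link in sorted(links, key=lambda l: utilization[l]):
        trial = active - {link}
        if connected(trial):
            active = trial  # safe to power this link down
    return active

# Triangle network: the least-used link (1,3) can be switched off.
nodes = {1, 2, 3}
links = {(1, 2), (2, 3), (1, 3)}
util = {(1, 2): 0.5, (2, 3): 0.4, (1, 3): 0.1}
```

The dual problem of deciding which links to switch back on when traffic grows is harder, as the slide notes, since it must anticipate demand rather than react to idleness.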
Centralized vs Distributed
• Centralized: SDN
• Centralized: master election
• Distributed: global decisions
• Distributed: local decisions
Thanks to : R. Carpa, G. DaCosta, M. Dias de Asuncao, M.
Diouri, J.-P. Gelas, O. Gluck, J.C. Mignot, A.-C. Orgerie, G.
Tsafack, J.-M. Pierson, F. Rossigneux, P. Stolf, V.
Villebonnet