Regularities Considered Harmful: Forcing Randomness to Memory Accesses to Reduce Row Buffer Conflicts for Multi-Core, Multi-Bank Systems

Embedded Lab. Park Yeongseong

ACM ASPLOS’13
Heekwon Park, Computer Science Department, University of Pittsburgh


Post on 02-Jan-2016



Contents

Introduction
Background
Regularity Considered Harmful
Design and Implementation
Performance Evaluation
Conclusions
Q&A

Introduction

Recent computer architecture (multi-core)
A vast amount of main memory
Need to re-examine
◦ Internal policies, mechanisms
Rethinking the memory allocation issue

Background Problem
◦ Row buffer conflict

Approach
◦ Memory container
◦ Randomize memory access

< Conceptual memory organization >

Background

Row-buffer conflict
◦ Precharging
◦ Activating operation
Delay
Energy consumption

< Row-buffer hit and conflict overhead >
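The difference between a row-buffer hit and a conflict can be illustrated with a toy model of one bank. This is a minimal sketch: the latency constants are hypothetical placeholders in arbitrary units, not measurements from the slides, and the `Bank` class and `total_latency` helper are names made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical latencies in arbitrary time units (assumed, not from the slides).
T_HIT = 1        # column access when the row is already in the row buffer
T_ACTIVATE = 2   # activating operation: load a row into the row buffer
T_PRECHARGE = 2  # precharging: close the currently open row


@dataclass
class Bank:
    open_row: Optional[int] = None
    hits: int = 0
    conflicts: int = 0

    def access(self, row: int) -> int:
        """Access `row` and return the simulated latency."""
        if self.open_row == row:
            self.hits += 1
            return T_HIT
        if self.open_row is None:
            # Bank is idle: activate only, no precharge needed.
            self.open_row = row
            return T_ACTIVATE + T_HIT
        # A different row is open: precharge, then activate (row-buffer conflict).
        self.conflicts += 1
        self.open_row = row
        return T_PRECHARGE + T_ACTIVATE + T_HIT


def total_latency(rows):
    """Replay a sequence of row numbers against one bank."""
    bank = Bank()
    return sum(bank.access(r) for r in rows), bank
```

Replaying `[5, 5, 5, 5]` pays the activation once and then only hits, while `[5, 9, 5, 9]` pays a precharge and an activation on almost every access, which is where the extra delay and energy come from.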

Background

< Conflict does not occur > < Conflict occurs >

Memory Organization Analysis

Kernel-level memory allocator
◦ Mapping between virtual pages and physical page frames
Memory controller
◦ Banks
CPU cache mode
◦ Uncacheable
Access the variables numerous times
The two variables are mutually dependent
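The analysis hinges on how the memory controller maps a physical address to a bank and a row. The sketch below assumes a purely hypothetical bit layout (13 offset bits, then 3 bank bits, then row bits); the real layout is machine-specific and is what such an analysis tries to uncover. `decode` and `conflicts` are illustrative names, not functions from the paper.

```python
# Hypothetical address layout, lowest bits first (an assumption for illustration):
#   [ offset: 13 bits ][ bank: 3 bits ][ row: remaining bits ]
OFFSET_BITS = 13
BANK_BITS = 3


def decode(addr):
    """Split a physical address into (bank, row) under the assumed layout."""
    bank = (addr >> OFFSET_BITS) & ((1 << BANK_BITS) - 1)
    row = addr >> (OFFSET_BITS + BANK_BITS)
    return bank, row


def conflicts(addr_a, addr_b):
    """True if alternating accesses to the two addresses would keep
    evicting each other's row: same bank, different row."""
    bank_a, row_a = decode(addr_a)
    bank_b, row_b = decode(addr_b)
    return bank_a == bank_b and row_a != row_b
```

Under this layout, two variables 64 KB apart land in the same bank but different rows, so accessing them alternately with caching disabled would show the slow, conflicting case.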

Memory Organization Analysis

Figure (d) ranges from 0 to 2,000,000 (roughly 128 MB in size)
Figure (c) zooms in on the 590,000 ~ 640,000 portion of Figure (d)
Figure (b) zooms in on a portion of the iterations of Figure (c)
Figure (a) zooms in on a portion of the iterations of Figure (b)

< Analysis result >

Regularity Considered Harmful

< Sequential access pattern >

Modified algorithm
◦ Place the two variables in the same cache line
◦ Different starting physical address
Average elapsed time
◦ 2052 μsec

Regularity Considered Harmful

< Random access pattern >

Average elapsed time
◦ 1925 μsec

With random placement, the chance that two accesses target the same bank is “1/total number of banks”.
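The “1/total number of banks” figure can be checked by exact enumeration. The sketch below assumes a simple interleaved mapping in which page frame f lives in bank f mod B, and that two frames are picked independently and uniformly; both are assumptions for illustration. It counts same-bank pairs, which is an upper bound on conflicts, since two accesses to the same row of a bank would hit rather than conflict.

```python
from fractions import Fraction
from itertools import product


def same_bank_probability(num_banks, frames_per_bank):
    """Exact probability that two independently, uniformly chosen
    page frames fall in the same bank."""
    total_frames = num_banks * frames_per_bank
    # Assumed mapping: frame f is stored in bank f % num_banks.
    same = sum(
        1
        for a, b in product(range(total_frames), repeat=2)
        if a % num_banks == b % num_banks
    )
    return Fraction(same, total_frames ** 2)
```

For example, `same_bank_probability(8, 4)` evaluates to `Fraction(1, 8)`, independent of how many frames each bank holds.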

Design and Implementation

< Memory container design >

The minimum memory unit of page frames
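The deck's conclusions describe dedicating multiple banks to a core. One simple way to picture that is a static split of bank indices among cores. This is only a sketch under that reading: `assign_banks` is a hypothetical helper, and the contiguous split is an assumption, not the container layout from the paper.

```python
def assign_banks(num_banks, num_cores):
    """Split bank indices into contiguous groups, one group per core.

    Leftover banks (when the division is uneven) go to the lowest-numbered cores.
    """
    if num_cores <= 0 or num_banks < num_cores:
        raise ValueError("need at least one bank per core")
    base, extra = divmod(num_banks, num_cores)
    groups = {}
    start = 0
    for core in range(num_cores):
        size = base + (1 if core < extra else 0)
        groups[core] = list(range(start, start + size))
        start += size
    return groups
```

With 8 banks and 4 cores, each core gets two banks of its own, so cores do not evict each other's open rows while each core can still spread its accesses over more than one bank.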

Design and Implementation

< Comparison between buddy and randomized algorithm >

Individual page frame management
Downward search
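Managing page frames individually, rather than in buddy-style power-of-two blocks, makes it easy to hand them out in random order. The class below is a minimal user-space sketch of that idea only. `RandomFrameAllocator` is a hypothetical name, it is not the kernel implementation from the paper, and it does not model the downward search, whose details the transcript does not give.

```python
import random


class RandomFrameAllocator:
    """Tracks free page frames one by one and allocates them in random order."""

    def __init__(self, frames, seed=None):
        self._free = list(frames)
        self._rng = random.Random(seed)  # seed only for reproducible demos

    def alloc(self):
        """Return a uniformly random free frame number."""
        if not self._free:
            raise MemoryError("no free page frames")
        i = self._rng.randrange(len(self._free))
        # Swap-remove: move the chosen frame to the end and pop it in O(1).
        self._free[i], self._free[-1] = self._free[-1], self._free[i]
        return self._free.pop()

    def free(self, frame):
        """Return a frame to the pool."""
        self._free.append(frame)

    def free_count(self):
        return len(self._free)
```

A sequential allocator would tend to give neighbouring frames to one request stream; drawing frames at random breaks that regularity, which is the effect the randomized algorithm is after.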

Performance Evaluation

Experiment environment
◦ IBM x3650 M2 server
◦ Intel Xeon X5570 quad-core processors
◦ 32 GB DDR3 memory
◦ 450 GB SAS disk 8
◦ Linux kernel version 2.6.32

Performance Evaluation

Benchmark category
◦ Group 1: Memory-intensive benchmarks
  Stream, Sysbench-memory, Ramspeed
◦ Group 2: CPU- or I/O-intensive benchmarks
  Kernel Compile, Dbench, Unixbench
◦ Group 3: To represent diverse application domains
  PARSEC

Performance Evaluation

< Memory intensive benchmark results >

< CPU or I/O intensive benchmark results >

Performance Evaluation

< PARSEC benchmark result >

Conclusions

Kernel-level memory allocator
◦ Multi-core, multi-bank systems
Dedicate multiple banks to a core
◦ Maximize memory parallelism
Reduce accesses to the same bank
Memory container
Randomizing memory allocation algorithm

References

http://people.cs.pitt.edu/~parkhk/publications.html
Performance Analysis by Memory Reference Pattern in Multi-Core, Multi-Bank Systems. Master's thesis, Lee Sang-yeop (이상엽)

Q&A