Embedded System Lab.
Kilmo Choi (최길모)
[email protected]
A Software Memory Partition Approach for Eliminating Bank-level Interference in Multi-core Systems
Lei Liu, Zehan Cui, Mingjie Xing, Yungang Bao, Mingyu Chen, Chengyong Wu
Contents
Background and Motivation
Bank-Level Partition Mechanism(BPM)
Results
Conclusion
Reference
Background and Motivation
Memory bank
A set of memory that shares the same access speed
Multicore platform
Background and Motivation
Bank-Level Parallelism(BLP) and Bank Sharing
Multiple banks can serve memory requests concurrently and independently
Memory systems usually employ a bank-interleaved address mapping scheme
Memory interference on multicore platform
Causes performance degradation (throughput slowdown and unfairness)
e.g., row buffer hit rate decreases from over 60% on 1 core to 35% on 16 cores
[Figure: cores and memory controllers (MC) accessing a shared bank, causing row buffer conflicts]
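The effect of bank sharing on the row buffer can be illustrated with a small model. This is a minimal sketch, not taken from the slides: the bit positions (`BANK_SHIFT`, `BANK_BITS`, `ROW_SHIFT`) are hypothetical, and each bank is assumed to keep exactly one open row.

```python
# Hypothetical address layout (an assumption, not from the slides):
# bits 13..15 select one of 8 banks, bits 16 and up select the row.
BANK_SHIFT = 13
BANK_BITS = 3
ROW_SHIFT = 16


def bank_of(addr):
    """Bank index selected by a physical address."""
    return (addr >> BANK_SHIFT) & ((1 << BANK_BITS) - 1)


def row_of(addr):
    """Row index selected by a physical address."""
    return addr >> ROW_SHIFT


def row_buffer_hit_rate(accesses):
    """Fraction of accesses that find their row already open in the bank."""
    open_row = {}  # bank -> currently open row
    hits = 0
    for addr in accesses:
        bank, row = bank_of(addr), row_of(addr)
        if open_row.get(bank) == row:
            hits += 1
        open_row[bank] = row  # a miss or conflict opens the new row
    return hits / len(accesses)


# Two threads stream through different rows of the same bank (bank 0).
thread_a = [i * 64 for i in range(8)]              # row 0
thread_b = [(1 << ROW_SHIFT) + i * 64 for i in range(8)]  # row 1
interleaved = [addr for pair in zip(thread_a, thread_b) for addr in pair]

alone_rate = row_buffer_hit_rate(thread_a)        # 0.875
shared_rate = row_buffer_hit_rate(interleaved)    # 0.0
```

Running alone, thread A misses only on its first access. When the two threads alternate on the same bank, every access closes the other thread's row, so no access hits the row buffer.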
Background and Motivation
Numerous new memory scheduling algorithms have been proposed
to address the interference problem
However, these algorithms usually employ complex scheduling logic and need
hardware modification to memory controllers
Bank-level conflicts can be fully eliminated by exclusively mapping a
thread’s data to specific banks
How much does the number of available banks affect a thread's performance?
Bank-Level Partition Mechanism(BPM)
Overview of BPM
OS memory management system uses a page-coloring mechanism to partition
banks into several groups and maps each thread (process) to a specific bank group
Address mapping policy
Advantages
Row buffer conflicts ↓, row buffer hits ↑
BPM is an entirely software approach, so it is flexible
It is easier for the OS than for hardware to monitor a thread's behavior
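The page-coloring idea in the overview can be sketched as a toy allocator. This is an illustration under stated assumptions, not the kernel implementation: 4 KiB pages, bank bits at physical address bits 13..15, and the names `bank_color` and `ColoredAllocator` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical layout (an assumption): 4 KiB pages, bank bits at 13..15.
PAGE_SHIFT = 12
BANK_SHIFT = 13
BANK_BITS = 3


def bank_color(pfn):
    """Bank index (the page's 'color') derived from a page frame number."""
    return (pfn >> (BANK_SHIFT - PAGE_SHIFT)) & ((1 << BANK_BITS) - 1)


class ColoredAllocator:
    """Toy page allocator that keeps one free list per bank color."""

    def __init__(self, num_frames):
        self.free = defaultdict(list)
        for pfn in range(num_frames):
            self.free[bank_color(pfn)].append(pfn)
        self.colors_of = {}  # thread id -> list of bank colors it may use

    def assign(self, thread_id, colors):
        """Map a thread to a bank group (a set of colors)."""
        self.colors_of[thread_id] = list(colors)

    def alloc_page(self, thread_id):
        """Return a free frame from one of the thread's own banks."""
        for color in self.colors_of[thread_id]:
            if self.free[color]:
                return self.free[color].pop()
        raise MemoryError("no free frame left in this thread's bank group")


allocator = ColoredAllocator(num_frames=64)
allocator.assign(0, [0, 1, 2, 3])   # thread 0 gets banks 0..3
allocator.assign(1, [4, 5, 6, 7])   # thread 1 gets banks 4..7
banks_0 = {bank_color(allocator.alloc_page(0)) for _ in range(10)}
banks_1 = {bank_color(allocator.alloc_page(1)) for _ in range(10)}
```

Because the two bank groups are disjoint, pages handed to thread 0 and thread 1 never land in the same bank, which is what removes row buffer conflicts between them.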
Bank-Level Partition Mechanism(BPM)
Discover bank bits by a software method
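The slide does not spell out the detection procedure. One plausible timing-based approach is sketched below against a simulated DRAM: alternating accesses to two addresses are slow only when both fall in the same bank but in different rows. Everything here is an assumption for illustration: the hidden layout, the latency values, and the names `pair_latency` and `find_bank_bits`. On real hardware the latency would have to be measured with uncached accesses, and XOR-based bank mappings would need a more elaborate search.

```python
# Simulated machine (hypothetical): the detector never reads these constants.
HIDDEN_BANK_BITS = (13, 14, 15)
ROW_SHIFT = 16
ADDR_BITS = 24


def _bank(addr):
    return tuple((addr >> bit) & 1 for bit in HIDDEN_BANK_BITS)


def _row(addr):
    return addr >> ROW_SHIFT


def pair_latency(a, b):
    """Simulated latency (made-up ns values) of alternating accesses to a and b."""
    if _bank(a) == _bank(b) and _row(a) != _row(b):
        return 60  # same bank, different rows: row buffer conflict
    return 20      # different banks, or same row: no conflict


def find_bank_bits(latency, addr_bits, threshold=40):
    """Infer bank bits from pairwise latencies only."""
    base = 0
    # Step 1: find a row bit, i.e. a single-bit flip that causes a conflict.
    row_bit = next(
        i for i in reversed(range(addr_bits))
        if latency(base, base ^ (1 << i)) > threshold
    )
    # Step 2: flip the row bit together with each candidate bit. If the
    # conflict disappears, the candidate moved the access to another bank.
    bank_bits = []
    for i in range(addr_bits):
        if i == row_bit:
            continue
        other = base ^ (1 << row_bit) ^ (1 << i)
        if latency(base, other) <= threshold:
            bank_bits.append(i)
    return bank_bits


detected = find_bank_bits(pair_latency, ADDR_BITS)
```

Flipping a column bit or another row bit alongside the known row bit still leaves the two addresses in the same bank and in different rows, so the conflict latency remains; only a bank bit makes it go away.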
Results
Environments
4-core 2.8 GHz Intel Core i7-860 processor, 8 GB DDR3 main memory
CentOS Linux 5.4 with kernel 2.6.32.15
SPEC CPU2006
Results
Overall system performance
Results
Page-Policy and Power
Results
BPM vs. Cache-Partition-Only
The correlation between BPM improvements and Per-core bandwidth
Reference
J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, and P. Sadayappan. Gaining Insights into Multicore Cache Partitioning: Bridging the Gap between Simulation and Real Systems. In HPCA-14, 2008.
D. Kaseridis, J. Stuecheli, and L. K. John. Minimalist Open-page: A DRAM Page-mode Scheduling Policy for the Many-core Era. In MICRO-44, 2011.