
Korea Univ

B-Fetch: Branch Prediction Directed Prefetching for In-Order Processors

Department of Computer and Radio Communications Engineering, 2015020802

Choi Byeong-jun (최병준)

Computer Engineering and Systems Group (CESG), Department of Electrical and Computer Engineering, Texas A&M University

-> Reena Panda, Paul V. Gratz

Department of Computer Science, The University of Texas at San Antonio

-> Daniel A. Jiménez


1. Introduction

• Modern computer architecture is beset by two conflicting trends: while technology scaling and deep pipelining have led to high processor frequencies, memory access speed has not scaled accordingly.

• Meanwhile, power and energy considerations have revived interest in in-order processors.


1. Introduction

• In this paper we present a novel data cache prefetch scheme that leverages both execution path speculation and effective address speculation to efficiently improve the performance of in-order processors.

• Much prior work focuses on reducing the impact of the memory wall on processor performance.


1. Introduction

• In this paper we propose a light-weight prefetcher, 'B-Fetch', a combined control-flow and effective address speculating prefetching scheme.

• B-Fetch leverages the high prediction accuracies of current-generation branch predictors, combined with novel effective address speculation.


2. Background

• To be effective at masking such high latencies, a prefetcher must anticipate misses and issue prefetches significantly ahead of actual execution.

• This requires accurate prediction of:

1. the likely memory instructions to be executed
2. the likely effective addresses of these instructions
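The second prediction can be pictured with a small sketch. It assumes, purely for illustration, that each likely memory instruction is described by a base register and an immediate offset, and that the prefetcher guesses the address from the register value it can see now. The record type, the 64-byte line size, and the register values are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 64  # assumed line size, not stated in the slides


@dataclass(frozen=True)
class MemoryInstruction:
    """A load or store expected on the predicted path (hypothetical record)."""
    base_register: int  # index of the register holding the base address
    offset: int         # immediate displacement encoded in the instruction


def speculate_effective_address(instr, register_file):
    """Guess the address as the current base-register value plus the offset."""
    return register_file[instr.base_register] + instr.offset


def cache_line_of(address):
    """Align an address down to the start of its cache line."""
    return address - (address % CACHE_LINE_BYTES)


# Made-up register contents and instructions, for illustration only.
registers = {1: 0x1000, 2: 0x2040}
likely_instructions = [MemoryInstruction(1, 8), MemoryInstruction(2, 100)]
prefetch_lines = [
    cache_line_of(speculate_effective_address(instr, registers))
    for instr in likely_instructions
]
```

The guess is only as good as the register value at prediction time: if the base register is overwritten before the instruction actually executes, the speculated address will be wrong.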


2. Background

• The program execution path is determined by the direction taken by the component control instructions.

• The memory access behavior can therefore be linked to the prior control flow behavior.



3. Proposed Design

• Pipeline Overview

1. Branch Lookahead
2. Register-Table Lookup
3. Prefetch Issue
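The three steps can be sketched as three small functions chained together. The slides give only the step names, so the table layouts, the lookahead depth, and all addresses here are assumptions made for illustration, not the paper's hardware design.

```python
def branch_lookahead(start_pc, predict_taken, next_branch, depth):
    """Step 1: follow predicted directions to list upcoming (branch, direction) pairs."""
    path = []
    pc = start_pc
    for _ in range(depth):
        taken = predict_taken(pc)
        path.append((pc, taken))
        pc = next_branch.get((pc, taken))
        if pc is None:  # no recorded successor, stop looking ahead
            break
    return path


def register_table_lookup(path, register_table):
    """Step 2: collect the (base register, offset) pairs recorded for each path step."""
    accesses = []
    for step in path:
        accesses.extend(register_table.get(step, []))
    return accesses


def prefetch_issue(accesses, registers, line_bytes=64):
    """Step 3: form addresses from current register values, return unique cache lines."""
    lines = []
    for base_reg, offset in accesses:
        address = registers[base_reg] + offset
        line = address - address % line_bytes
        if line not in lines:
            lines.append(line)
    return lines


# Hypothetical tables: predicted directions, next branch on each outcome,
# and memory accesses seen after each (branch, direction) pair.
predictions = {0x400: True, 0x480: False, 0x4C0: True}
next_branch = {(0x400, True): 0x480, (0x480, False): 0x4C0}
register_table = {(0x400, True): [(1, 0), (1, 72)], (0x480, False): [(2, 16)]}
registers = {1: 0x8000, 2: 0x9000}

path = branch_lookahead(0x400, lambda pc: predictions[pc], next_branch, depth=2)
accesses = register_table_lookup(path, register_table)
lines = prefetch_issue(accesses, registers)
```

In this toy run the lookahead covers two branches, and the accesses recorded for them map to three distinct cache lines to prefetch.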


3. Proposed Design

• System Components: Path Confidence Estimator, Branch Trace Cache, Branch-Register Table, Prefetch Filtering
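The slides name these components without describing their internals. As a rough illustration of what the first and last might do, the sketch below assumes that path confidence is the running product of per-branch confidences, cut off at a threshold, and that prefetch filtering suppresses requests for recently requested lines. Both behaviors, the threshold, and the filter capacity are assumptions, not details from the source.

```python
from collections import OrderedDict


def confident_prefix(branch_confidences, threshold=0.75):
    """Count lookahead steps whose cumulative path confidence stays >= threshold."""
    cumulative = 1.0
    steps = 0
    for confidence in branch_confidences:
        cumulative *= confidence
        if cumulative < threshold:
            break
        steps += 1
    return steps


class RecentLineFilter:
    """Drop prefetches for cache lines requested recently (hypothetical filter)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.recent = OrderedDict()

    def should_issue(self, line):
        if line in self.recent:
            self.recent.move_to_end(line)  # refresh recency, suppress duplicate
            return False
        self.recent[line] = True
        if len(self.recent) > self.capacity:
            self.recent.popitem(last=False)  # evict the oldest entry
        return True


# Made-up confidences: the third branch drags the path below the threshold.
depth = confident_prefix([0.95, 0.9, 0.8, 0.99])

line_filter = RecentLineFilter()
decisions = [line_filter.should_issue(line) for line in (0x8000, 0x8040, 0x8000)]
```

Here the lookahead would stop after two branches, and the repeated request for line 0x8000 is filtered out.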


3. Proposed Design

• Hardware Cost

The table shows that B-Fetch requires ∼33% of the table state required by SMS.


4. Evaluation

• Methodology

We evaluate our prefetcher in a simulation environment based on the M5 simulator. The simulator is used to model a 1-wide, 5-stage in-order pipeline.

The test workload consists of the 18 SPEC CPU2006 benchmarks that our simulation infrastructure supports, compiled for the Alpha ISA.


4. Evaluation

• Prefetcher Performance

Figure 5 shows the IPC for the simple Stride, SMS, and B-Fetch prefetchers, normalized against the baseline.

B-Fetch provides performance benefits across a range of applications, both integer and floating point.


4. Evaluation

• Prefetcher Performance

The results show that the B-Fetch prefetcher provides a mean speedup of 39% across all benchmarks (62% across the prefetch-sensitive benchmarks).

Compared to a stride prefetcher, B-Fetch improves performance by over 25% (39.4%).

Compared against SMS, B-Fetch improves performance by 2.2% (3.4%), with ∼1/3 of the storage overhead.
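For readers unfamiliar with how such percentages are derived from IPC, here is a minimal sketch. It assumes the mean is a geometric mean of per-benchmark normalized IPC; the slides do not say which mean was used. The benchmark names and IPC values are invented placeholders, not results from the paper.

```python
import math


def normalized_ipc(ipc, baseline_ipc):
    """IPC with the prefetcher divided by IPC of the no-prefetch baseline."""
    return ipc / baseline_ipc


def geometric_mean(values):
    """Geometric mean, computed via logs (assumed averaging method)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def speedup_percent(normalized):
    """Convert a normalized IPC such as 1.39 into a speedup such as 39%."""
    return (normalized - 1.0) * 100.0


# Placeholder IPC values for two made-up benchmarks.
baseline = {"bench_a": 0.5, "bench_b": 0.4}
with_prefetcher = {"bench_a": 1.0, "bench_b": 0.4}

per_benchmark = [
    normalized_ipc(with_prefetcher[name], baseline[name]) for name in baseline
]
mean_speedup = speedup_percent(geometric_mean(per_benchmark))
```

With these placeholders one benchmark doubles its IPC and the other is unchanged, so the geometric mean lands well below the arithmetic mean of the two speedups.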


5. Conclusions

• B-Fetch not only predicts effective addresses that display regular access patterns, but can also take advantage of the dynamic values of the registers at runtime to predict irregular and isolated data accesses.

• While the focus of this paper has been improving in-order processor performance, the B-Fetch scheme should perform comparably on superscalar processors; we plan to explore this in future work.


The End
