Post on 31-Jan-2016

Page 1: Power and Frequency Analysis for Data and Control Independence in Embedded Processors

Farzad Samie, Sharif University of Technology
Amirali Baniasadi, University of Victoria

This Work

Goal
• Power and frequency analysis for control-independent and data-independent instructions in embedded processors

Motivation
• Embedded processors are becoming more complex
• Modern embedded processors use speculation
• Mis-speculation incurs performance and power penalties
• Power is a major concern in embedded processors
• Save power and gain performance

This Work (cont.)

Our Approach
• Reduce the energy and time wasted on mispredictions.

How?
• Identify and bypass Control Independent (CI) and Data Independent (DI) instructions.
• CIs: instructions that execute regardless of the branch outcome.
• CI-DI: CI instructions that execute with the same operands.

Key Result
• Up to 12% processor energy reduction.

Background

Branch Prediction

[Figure: the branch predictor takes the branch history and the program counter as inputs and produces the predicted direction and the predicted target address.]

Background (cont.)

[Figure: instructions I1, I2, I3, I4 precede a branch instruction. Not-taken side (wrong path, squashed at misprediction detection): I5, I6, I7, I8, I9. Taken side (right path): I10, I11, I12, I7, I8, I9. I7, I8 and I9 are Control Independent instructions (CIs).]
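
The idea in the diagram above can be sketched in a few lines of Python. This is only an illustration, not the authors' mechanism: the function name `control_independent` is hypothetical, instructions are plain labels, and the first instruction shared by both paths is assumed to be the reconvergence point.

```python
def control_independent(wrong_path, right_path):
    """Return the wrong-path instructions that the right path also executes."""
    on_right_path = set(right_path)
    for idx, ins in enumerate(wrong_path):
        if ins in on_right_path:
            # Assumption: the first shared instruction is the reconvergence point.
            return wrong_path[idx:]
    return []  # the two paths never reconverge in this window


wrong = ["I5", "I6", "I7", "I8", "I9"]           # squashed after the misprediction
right = ["I10", "I11", "I12", "I7", "I8", "I9"]  # fetched after recovery
print(control_independent(wrong, right))  # ['I7', 'I8', 'I9']
```

A real processor has to find the reconvergence point in hardware and cannot compare complete instruction lists like this.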

Background (cont.)

[Figure: code example. R1←R1+R2 and R4←R1 are followed by the branch If (R4=0), which splits into a not-taken and a taken path. Instructions shown: R2←R4-R1, R5←R2-R3, R3←0, R5←R4+1, R1←R1-1, R3←0, R4←R6+R4, R1←R4+R1, R5←R5-2, R3←R3-R4. Control independent instructions are labeled either Data Independent (CI-DI) or Data Dependent (CI-DD).]
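
The CI-DI / CI-DD split can be illustrated with a small register-level sketch. It rests on several assumptions: `Instr` and `classify_ci` are hypothetical names, only register operands are tracked (memory is ignored), and any register written on either path is conservatively treated as possibly different. The instructions are loosely borrowed from the slide, but their assignment to paths here is made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str     # destination register
    srcs: tuple   # source registers


def classify_ci(wrong_path, right_path, ci_instrs):
    """Label each control independent instruction as CI-DI or CI-DD."""
    # Registers whose value may differ between the wrong and the right path.
    differing = {i.dest for i in wrong_path} | {i.dest for i in right_path}
    labels = []
    for ins in ci_instrs:
        if any(src in differing for src in ins.srcs):
            labels.append("CI-DD")
            differing.add(ins.dest)      # its result may differ too
        else:
            labels.append("CI-DI")
            differing.discard(ins.dest)  # same operands, same result
    return labels


wrong_path = [Instr("R2", ("R4", "R1"))]   # R2←R4-R1 (assumed wrong path)
right_path = [Instr("R4", ("R6", "R4"))]   # R4←R6+R4 (assumed right path)
ci_instrs = [
    Instr("R5", ("R2", "R3")),  # R5←R2-R3 reads R2
    Instr("R1", ("R1",)),       # R1←R1-1
    Instr("R5", ("R5",)),       # R5←R5-2 reads the CI-DD result above
    Instr("R7", ("R1",)),       # hypothetical instruction, not on the slide
]
print(classify_ci(wrong_path, right_path, ci_instrs))
# ['CI-DD', 'CI-DI', 'CI-DD', 'CI-DI']
```

Data dependence propagates: once a CI instruction is CI-DD, anything that reads its result is CI-DD as well.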

CI-DI vs. CI-DD

• Bypassing CI-DIs saves more energy
  • No need to read operands or execute again
• Bypassing CI-DIs provides higher performance
  • No time is wasted reading operands or executing

[Figure: pipeline stages (Fetch, Issue, Dispatch, Execute, WriteBack) marked for CI-DD and CI-DI instructions.]
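
A toy cost model shows why bypassing CI-DIs saves more than bypassing CI-DDs. Everything numeric here is an assumption: the slide text does not say which stages each class skips, so the `SKIPPED` sets are guesses based on the bullets (a CI-DI need not read operands or execute again), and every stage is given the same placeholder cost of 1.0. The names `STAGES`, `SKIPPED` and `work_avoided` are hypothetical.

```python
STAGES = ["Fetch", "Issue", "Dispatch", "Execute", "WriteBack"]

# Assumed: stages a bypassed instruction does not have to repeat.
SKIPPED = {
    "CI-DD": {"Fetch"},                      # must re-read operands and re-execute
    "CI-DI": {"Fetch", "Issue", "Execute"},  # operands and result are reused
}


def work_avoided(kind, stage_cost=None):
    """Fraction of per-instruction stage cost avoided by bypassing `kind`."""
    # Placeholder costs: one unit per stage unless the caller supplies values.
    cost = stage_cost or {stage: 1.0 for stage in STAGES}
    saved = sum(cost[stage] for stage in SKIPPED[kind])
    return saved / sum(cost.values())


for kind in ("CI-DD", "CI-DI"):
    print(kind, work_avoided(kind))
```

With measured per-stage energy (for example from a power simulator) passed in as `stage_cost`, the same function gives a weighted estimate instead of a stage count.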

Page 8: Power and Frequency Analysis for  Data and Control Independence in  Embedded Processors

Methodology

• Modified SimpleScalar

• Wattch for power measurement

• MiBench: Embedded Benchmark Suite

Distribution

Wrong Path: 12%, CI: 5%, CI-DI: 2%

CI Power Reduction in Different Units

Maximum reduction: branch predictor unit. Minimum reduction: instruction cache.

CI Power Reduction in Stages

Rijndael: low misprediction rate, so few wrong-path instructions and few CIs

Power Sensitivity to RUU size

[Figure: two panels, CI and CI-DI]

Higher power dissipation for bigger RUU sizes

Power Sensitivity to Execution Bandwidth

[Figure: two panels, CI and CI-DI]

Higher power dissipation for wider execution bandwidth

Power Sensitivity to Branch Predictor Size

Little sensitivity to branch predictor size

Related Work

• Rotenberg et al.: studied control independence in superscalar processors, HPCA 1999.

• Collins et al.: suggested a mechanism to predict the re-convergent point, MICRO 2004.

• Lam and Wilson: studied the impact of CIs on instruction-level parallelism, ISCA 1992.

• Gandhi et al.: recovery from selected branch mispredictions, HPCA 2004.

Conclusion

• Categorized CIs into CI-DI and CI-DD

• Potential power savings of up to 12% from bypassing CI and CI-DI instructions

• High sensitivity to RUU size

• High sensitivity to execution bandwidth

• Little sensitivity to branch predictor size

Questions?

Thank you
