chapter 7...
DESCRIPTION
Chapter 7: Pipeline and Superscalar Processor. Contents: theory and principles of pipelining; problems that arise with the pipelining technique; theory and principles of superscalar processing; example processors that use the pipelining technique; example processors that use the superscalar technique.
TRANSCRIPT
240-208 Fundamental of Computer Architecture Chapter 7 – Pipeline and Superscalar processor 2
Chapter 7: Pipeline and Superscalar Processor
Contents
Theory and principles of pipelining
Problems that arise with the pipelining technique
Theory and principles of superscalar processing
Example processors that use the pipelining technique
Example processors that use the superscalar technique
How out-of-order execution works
Out-of-order execution with the pipelining technique
Out-of-order execution with the superscalar technique
Out-of-order execution in various microprocessor families
Pipeline and the laundry shop
The laundry process has 4 steps:
1. Wash: 15 minutes
2. Spin-dry: 15 minutes
3. Iron: 15 minutes
4. Deliver: 15 minutes
Pipeline and the laundry shop
[Figure: two loads of laundry, each passing through Wash, Spin-dry, Iron and Deliver at 15 minutes per step.]
Pipeline and the laundry shop
[Figure: five loads of laundry, each passing through Wash, Spin-dry, Iron and Deliver, drawn against a time axis of eight 15-minute slots.]
Advantages and disadvantages of pipelining
Advantage: work finishes sooner
Disadvantage: more resources are needed in the system
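As a rough check on the laundry analogy, the sketch below compares the total time with and without overlapping the steps. It assumes every step takes the same 15 minutes and that a new load can start as soon as the previous one leaves the first step; the function names are illustrative and do not come from the slides.

```python
STEP_MINUTES = 15   # each laundry step takes 15 minutes (from the slides)
STEPS = 4           # wash, spin-dry, iron, deliver


def sequential_minutes(loads: int) -> int:
    """One load must finish all steps before the next load starts."""
    return loads * STEPS * STEP_MINUTES


def pipelined_minutes(loads: int) -> int:
    """A new load enters every step time; the last load still needs STEPS slots."""
    if loads == 0:
        return 0
    return (STEPS + loads - 1) * STEP_MINUTES


if __name__ == "__main__":
    for loads in (1, 2, 5):
        print(loads, sequential_minutes(loads), pipelined_minutes(loads))
```

Under these assumptions five loads take 300 minutes without overlap and 120 minutes with it, that is, eight 15-minute slots.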
Pipelining and the processor
The processor we studied in Chapter 3 takes several clocks to execute 1 instruction.
The pipelining technique is used to reduce this to 1 clock per instruction (ideally, one instruction completes every clock once the pipeline is full).
Instruction Execution
Step  Action
1     PCout, MARin, Read, Select4, Add, Zin
2     Zout, PCin, Yin, WMFC
3     MDRout, IRin
4     Offset-field-of-IRout, Add, Zin, if N=0 then End
5     Zout, PCin, End
5 steps require 5 clock cycles to complete the operation
Processor Execution
Divide the operations of the processor into 4 steps:
1. Fetch
2. Decode
3. Execute
4. WriteBack
[Figure: two instructions executed one after the other, each taking Fetch, Decode, Execute and Write at 1 clock per step.]
4-stage pipeline processor
[Figure: five instructions, each passing through Fetch, Decode, Execute and Write, drawn against a time axis of eight 1-clock slots.]
Flow of Instructions through pipeline
Clock  Fetch      Decode1    Decode2    Execute    WriteBack
1      INC A
2      ADD A,R4   INC A
3      MOV R1,A   ADD A,R4   INC A
4      INC R0     MOV R1,A   ADD A,R4   INC A
5      SUBB A,R0  INC R0     MOV R1,A   ADD A,R4   INC A
6      RLC A      SUBB A,R0  INC R0     MOV R1,A   ADD A,R4
7      MOV R7,A   RLC A      SUBB A,R0  INC R0     MOV R1,A
Each row is 1 clock cycle.
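The instruction flow on this slide can be reproduced with a small simulation. The sketch below is illustrative only: it assumes every instruction spends exactly one clock in each of the five stages and that no stalls occur; `simulate` and its return format are names chosen for this example.

```python
STAGES = ["Fetch", "Decode1", "Decode2", "Execute", "WriteBack"]

PROGRAM = ["INC A", "ADD A,R4", "MOV R1,A", "INC R0",
           "SUBB A,R0", "RLC A", "MOV R7,A"]


def simulate(program, clocks):
    """Return one row per clock; each row lists the instruction in every stage."""
    pipeline = [None] * len(STAGES)      # index 0 is Fetch, last is WriteBack
    rows = []
    for clock in range(clocks):
        fetched = program[clock] if clock < len(program) else None
        # Every instruction moves one stage forward; the new one enters Fetch.
        pipeline = [fetched] + pipeline[:-1]
        rows.append(list(pipeline))
    return rows


if __name__ == "__main__":
    for number, row in enumerate(simulate(PROGRAM, 7), start=1):
        cells = [(instr or "-").ljust(10) for instr in row]
        print(str(number).ljust(3), "".join(cells))
```

From the fifth clock onward every stage is busy, so one instruction leaves WriteBack per clock.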
Number of pipeline stages
[Figure: timing of a 4-stage pipeline (Fetch, Decode, Execute, Write; five instructions over eight 1-clock slots) next to a 2-stage pipeline (Fetch/Decode, Execute/Write; seven instructions over eight 1-clock slots).]
4-stage pipeline
[Figure: D flip-flops (inputs D, SET, CLR; output Q) grouped into Stage1, Stage2, Stage3 and Stage4, with a CLK input.]
2-stage pipeline
[Figure: D flip-flops (inputs D, SET, CLR; output Q) grouped into Stage1 and Stage2, with a CLK input.]
Problems of the pipelining technique
Pipeline hazards:
Structural hazard
Data hazard
Structural hazard
Memory does not allow program and data to be read simultaneously
Solution: use a Harvard architecture
[Figure: five instructions passing through Fetch, Decode, Execute and Write, drawn against a time axis of eight 1-clock slots.]
Harvard architecture
[Figure: a processor connected to separate Program and Data memories, with Address, Data, Read and Write lines.]
Data hazard
[Figure: the five instructions below passing through Fetch, Decode, Execute and Write, drawn against a time axis of eight 1-clock slots.]
LOAD R1, R2
ADD R3, R1
STORE [R4], R0
LOAD R5, R1
XOR R5, R2
Data hazard: operand forwarding
[Figure: the five instructions below passing through Fetch, Decode, Execute and Write, drawn against a time axis of eight 1-clock slots, with R1 forwarding marked.]
LOAD R1, R2
ADD R3, R1
STORE [R4], R0
LOAD R5, R1
XOR R5, R2
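To make the hazard concrete, the sketch below scans the five-instruction example for read-after-write dependences between nearby instructions. The slides do not define the instruction set, so it assumes that the first operand is the destination, that ADD and XOR also read their destination, and that STORE writes only memory. The `window` of 2 instructions is likewise an assumption about how far apart producer and consumer must be in a 4-stage pipeline without forwarding; all names are illustrative.

```python
# (text, registers written, registers read); operand roles are assumptions.
PROGRAM = [
    ("LOAD R1, R2",    {"R1"}, {"R2"}),
    ("ADD R3, R1",     {"R3"}, {"R3", "R1"}),
    ("STORE [R4], R0", set(),  {"R4", "R0"}),
    ("LOAD R5, R1",    {"R5"}, {"R1"}),
    ("XOR R5, R2",     {"R5"}, {"R5", "R2"}),
]


def raw_hazards(program, window=2):
    """List (producer, consumer, register) triples at most `window` apart."""
    hazards = []
    for i, (_, writes, _) in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            reads = program[j][2]
            for register in sorted(writes & reads):
                # Skip if an instruction in between overwrites the register.
                if not any(register in program[k][1] for k in range(i + 1, j)):
                    hazards.append((i, j, register))
    return hazards


if __name__ == "__main__":
    for producer, consumer, register in raw_hazards(PROGRAM):
        print(f"{PROGRAM[consumer][0]!r} needs {register} "
              f"from {PROGRAM[producer][0]!r}")
```

With these assumptions the scan reports R1 between the first two instructions, which is the value the slide forwards, and R5 between the last two.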
Pentium 4 pipeline: 20 stages
Datapath of the pipelined 8051
[Figure: datapath with the stages FETCH, DECODE1, DECODE2, EXECUTE and WRITEBACK; labelled blocks include FETCH OPCODE, RN_DEC, RI_DEC, BIT_DEC, DIRECT_DEC, RAM, SFR, ACC, INS, S1, S2 and WRITE DEST.]
Flow of DIV AB instruction through pipeline
Clock  Fetch      Decode1    Decode2    Execute    WriteBack
1      INC A
2      DIV AB     INC A
3      INC R1     DIV AB     INC A
4      ADD A,B    INC R1     DIV AB     INC A
5      NOP        ADD A,B    INC R1     DIV AB     INC A
6-12   NOP        ADD A,B    INC R1     DIV AB     NOP
13     RL A       ADD A,B    INC R1     DIV AB     NOP
14     MOV R2,A   RL A       ADD A,B    INC R1     DIV AB
15     MUL AB     MOV R2,A   RL A       ADD A,B    INC R1
Superscalar
Superscalar
[Figure: block diagram with an instruction fetch unit, a dispatch unit, execution unit 1, execution unit 2 and write results.]
Pentium architecture
Superscalar vs. VLIW
Superscalar:
Hardware detects potential parallelism between instructions
Hardware tries to issue as many instructions as possible in parallel
The more execution units, the more speed, without recompilation
Hardware dispatch unit is very complex
VLIW:
Instruction-level parallelism is specified in the instruction word by the compiler
The more execution units, the more speed, but recompilation is needed
Very simple hardware dispatch unit
What is out-of-order execution
OOO stands for Out-Of-Order execution
OOO lets the processor execute instructions in any order that does not change the result of the program
OOO can give much better performance than in-order execution
Execution time of in-order execution
From the paper titled “Cheap Out-of-Order Execution using Delayed Issue”, J.P. Grossman
Execution of program in a pipelined CPU
I1: FMUL t1,y,z
I2: FADD w,x,t1
I3: FADD t2,b,c
I4: FMUL a,t2,d
[Figure: timing chart of I1 to I4 through F, D, E and W over nine 1-clock slots; NOPs occupy the E and W stages in some slots, and a new instruction enters D and E behind I4.]
9 clock cycles
4 instructions
Assume the CPU can fetch multiple instructions at a time
I1: FMUL t1,y,z
I2: FADD w,x,t1
I3: FADD t2,b,c
I4: FMUL a,t2,d
[Figure: timing chart over nine 1-clock slots in which I1 to I4 each have F, D, E and W, with NOPs inserted ahead of some E stages.]
Out-of-order execution in a pipelined CPU
I1: FMUL t1,y,z
I2: FADD w,x,t1
I3: FADD t2,b,c
I4: FMUL a,t2,d
[Figure: timing chart in which I1 to I4 each have F, D, E and W and no NOPs are inserted.]
7 clock cycles
4 instructions
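The 9-cycle and 7-cycle totals can be reproduced with a toy scheduling model. The sketch below is not taken from the slides. It assumes that one instruction enters Execute per clock, that a dependent instruction can execute no earlier than 2 clocks after its producer (after the producer's Write stage, with no forwarding), and that Fetch and Decode add 2 clocks before the first Execute slot while Write adds 1 after the last. Only read-after-write dependences are modelled, and all names are illustrative.

```python
# (text, destination, sources) for the four-instruction example.
PROGRAM = [
    ("FMUL t1,y,z", "t1", ("y", "z")),
    ("FADD w,x,t1", "w", ("x", "t1")),
    ("FADD t2,b,c", "t2", ("b", "c")),
    ("FMUL a,t2,d", "a", ("t2", "d")),
]

RESULT_DELAY = 2   # assumed clocks between a producer's and a consumer's Execute
FRONT_CLOCKS = 2   # Fetch and Decode before the first Execute slot
BACK_CLOCKS = 1    # Write after the last Execute slot


def producers(program):
    """For each instruction, the indices of earlier instructions it reads from."""
    last_writer, result = {}, []
    for index, (_, dest, sources) in enumerate(program):
        result.append([last_writer[s] for s in sources if s in last_writer])
        last_writer[dest] = index
    return result


def execute_slots(program, out_of_order):
    """Pick an Execute slot (1-based) for every instruction."""
    deps = producers(program)
    slots, pending, slot = {}, list(range(len(program))), 0
    while pending:
        slot += 1
        # In-order issue may only consider the oldest waiting instruction.
        candidates = pending if out_of_order else pending[:1]
        for index in candidates:
            if all(slots.get(p, slot) + RESULT_DELAY <= slot for p in deps[index]):
                slots[index] = slot
                pending.remove(index)
                break
    return [slots[i] for i in range(len(program))]


def total_clocks(program, out_of_order):
    """Total clocks from the first Fetch to the last Write."""
    return FRONT_CLOCKS + max(execute_slots(program, out_of_order)) + BACK_CLOCKS
```

Under these assumptions the model gives 9 clocks in order and 7 clocks out of order, because I3 does not depend on I1 and can use the slot in which I2 is waiting for t1.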
OOO in a superscalar architecture
OOO can also improve performance in a superscalar processor by reducing the time instructions wait to be issued
2-way superscalar architecture
[Figure: twelve instructions, each passing through F, D, E and W, drawn against clocks 1 to 9.]
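For the ideal case with no stalls, the cycle count of such a machine follows directly from the issue width. The sketch below assumes every instruction takes one clock per stage and that a full group of `width` instructions can start in each clock; `ideal_cycles` is an illustrative name, not something defined in the slides.

```python
import math


def ideal_cycles(instructions: int, width: int, stages: int = 4) -> int:
    """Clocks needed when `width` instructions enter the pipeline per clock."""
    if instructions == 0:
        return 0
    groups = math.ceil(instructions / width)   # issue groups, one per clock
    return groups + stages - 1                 # the last group needs all stages


if __name__ == "__main__":
    print(ideal_cycles(12, 2))   # 2-way, 4 stages, 12 instructions
    print(ideal_cycles(10, 2))   # 2-way, 4 stages, 10 instructions
```

With 12 instructions on a 2-way, 4-stage machine this gives 9 clocks, and with 10 instructions it gives 8 clocks. Any dependence that forces a NOP pushes a real program above this bound.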
Example program
LD R5, R7
LD R3, #25
LD R1, #0
LD R2, R1
LD R4, [R1]
ADD R2, R4
INC R2
DEC R3
ADD R5, R3
LD R7, R3
In-order execution in superscalar architecture
[Figure: timing chart of the ten-instruction example program issued in program order over clocks 1 to 9, with NOPs inserted.]
9 clock cycles
10 instructions
Reorder Instruction
[Figure: the ten-instruction example program listed twice.]
Out-of-order execution
[Figure: timing chart of the ten-instruction example program over clocks 1 to 8, with the instructions rearranged and no NOPs.]
8 clock cycles
10 instructions
OOO in P6 (Pentium Pro, II, III)
Picture from http://www.byxtreme.com/Article/ooo.html
OOO in P6 (Pentium Pro, II, III)
Re-order buffer holds up to 40 instructions
Instructions are reordered by the reservation station
Note: information from http://www.byxtreme.com/Article/ooo.html
OOO in P6
Picture from http://folk.uio.no/botnen/intel/vt/reference/p6.gif
OOO in K7 (Athlon, Duron)
Picture from http://www.byxtreme.com/Article/ooo.html
OOO in K7 (Athlon)
Re-order buffer holds up to 72 instructions
Station 1 is used for scheduling floating-point instructions
Station 1 holds up to 36 instructions
Station 2 is used for scheduling integer instructions
Station 2 holds up to 15 instructions
Note: information from http://www.byxtreme.com/Article/ooo.html
OOO of Athlon
OOO in P4
OOO in PowerPC G4
OOO in Pentium 4 Prescott
Picture from http://www.tomshardware.com/cpu/20040201/prescott-04.html
End of Chapter 7