cs4100: 計算機結構 designing a multicycle processor 國立清華大學資訊工程學系...

79
CS4100: CS4100: 計計計計計 計計計計計 Designing a Multicycle Designing a Multicycle Processor Processor 計計計計計計計計計計計計 計計計計計計計 計計 Adapted from class notes of D. Patterson and W. Dally Copyright 1998, 2000 UCB

Upload: jeffry-ray

Post on 01-Jan-2016

243 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

CS4100: CS4100: 計算機結構計算機結構

Designing a Multicycle Designing a Multicycle ProcessorProcessor

國立清華大學資訊工程學系九十六學年度第一學期

Adapted from class notes of D. Patterson and W. Dally

Copyright 1998, 2000 UCB

Page 2: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-2 Computer Architecture

Outline Designing a processor Building the datapath A single-cycle implementation A multicycle implementation Microprogramming: simplifying control

(Appendix C.4) Exceptions

Page 3: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-3 Computer Architecture

Recap: A Single-Cycle Processor

CPI=1

PC

Instructionmemory

Readaddress

Instruction

16 32

Add ALUresult

Mux

Registers

Writeregister

Writedata

Readdata 1

Readdata 2

Readregister 1Readregister 2

Shiftleft 2

4

Mux

ALU operation3

RegWrite

MemRead

MemWrite

PCSrc

ALUSrc

MemtoReg

ALUresult

ZeroALU

Datamemory

Address

Writedata

Readdata M

ux

Signextend

Add

Page 4: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-4 Computer Architecture

PC Inst Memory mux ALU Data Mem mux

PC Reg FileInst Memory mux ALU mux

PC Inst Memory mux ALU Data Mem

PC Inst Memory cmp mux

Reg File

Reg File

Reg File

Arithmetic & Logical

Load

Store

Branch

Critical Path

setup

setup

What’s Wrong with Single-cycle?

Long cycle time All instructions take same time as the

slowest Real memory is not so ideal

cannot always get job done in one (short) cycle A FU can only be used once => higher cost

Page 5: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-5 Computer Architecture

Outline Designing a processor Building the datapath A single-cycle implementation A multicycle implementation

Multicycle datapath Multicycle execution steps Multicycle control (Appendix C.3)

Microprogramming: simplifying control (Appendix C.4)

Exceptions

Page 6: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-6 Computer Architecture

Multicycle Implementation

storage element

Acyclic CombinationalLogic

storage element

storage element

Acyclic CombinationalLogic (A)

storage element

storage element

Acyclic CombinationalLogic (B)

=>

Reduce cycle time Diff. Inst. take diff. cycles Share functional units

Page 7: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-7 Computer Architecture

Break up the instructions into steps, each step takes a cycle balance the amount of work to be done restrict each cycle to use only one major

functional unit At the end of a cycle

store values for use in later cycles (easiest thing to do)

introduce additional internal registers

Multicycle Approach

Page 8: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-8 Computer Architecture

Partition Single-Cycle Datapath

PC

Instructionmemory

Readaddress

Instruction

16 32

Add ALUresult

Mux

Registers

Writeregister

Writedata

Readdata 1

Readdata 2

Readregister 1Readregister 2

Shiftleft 2

4

Mux

ALU operation3

RegWrite

MemRead

MemWrite

PCSrc

ALUSrc

MemtoReg

ALUresult

ZeroALU

Datamemory

Address

Writedata

Readdata M

ux

Signextend

Add

Ins. fetchRF accessALU operationmemory access

Add registers between smallest steps

Page 9: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-9 Computer Architecture

Multicycle Datapath 1 memory (instr. & data), 1 ALU (addr, PC+4,

add,…), registers (IR, MDR, A, B, ALUOut) Storage for subsequent inst. (arch.-visible) vs.

storage for same inst. but in a subsequent cycle

PC

Memory

Address

Instructionor data

Data

Instructionregister

Registers

Register #

Data

Register #

Register #

ALU

Memorydata

register

A

B

ALUOut

Fig. 5.25

Page 10: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-10 Computer Architecture

Multicycle Datapath for Basic Instr.

IR and PC need write control, but others don’t MUX to select 2 sources to memory; memory needs read signal PC and A to one ALU input; four sources to another input

Shiftleft 2

MemtoReg

IorD MemRead MemWrite

PC

Memory

MemData

Writedata

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

ALUOpALUSrcB

RegDst RegWrite

Instruction[15– 0]

Instruction [5– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

1 Mux

0

3

2

ALUcontrol

Mux

0

1ALU

resultALU

ALUSrcA

ZeroA

B

ALUOut

IRWrite

Address

Memorydata

register

Fig. 5.27

Page 11: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-11 Computer Architecture

Shiftleft 2

PCMux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

Instruction[15– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

ALUcontrol

ALUresult

ALUZero

Memorydata

register

A

B

IorD

MemRead

MemWrite

MemtoReg

PCWriteCond

PCWrite

IRWrite

ALUOp

ALUSrcB

ALUSrcA

RegDst

PCSource

RegWrite

Control

Outputs

Op[5– 0]

Instruction[31-26]

Instruction [5– 0]

Mux

0

2

Jumpaddress [31-0]Instruction [25– 0] 26 28

Shiftleft 2

PC [31-28]

1

1 Mux

0

3

2

Mux

0

1ALUOut

Memory

MemData

Writedata

Address

Adding Branch/Jump

Three sources to PC Two PC write signals

Fig. 5.28

Page 12: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-12 Computer Architecture

Outline Designing a processor Building the datapath A single-cycle implementation A multicycle implementation

Multicycle datapath Multicycle execution steps Multicycle control (Appendix C.3)

Microprogramming: simplifying control (Appendix C.4)

Exceptions

Page 13: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-13 Computer Architecture

Instruction Fetch Instruction Decode and Register Fetch Execution, Memory Address Computation,

or Branch Completion Memory Access or R-type Instruction

Completion Memory Read Completion (Write-back)

INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!

Five Execution Steps

Page 14: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-14 Computer Architecture

Step 1: Instruction Fetch Use PC to get instruction and put it in the

Instruction Register (IR) Increment the PC by 4 and put the result

back in the PC Can be described succinctly using RTL

(Register-Transfer Language)

IR = Memory[PC];PC = PC + 4;

Can we figure out the values of the control signals?

What is the advantage of updating the PC now?

Page 15: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-15 Computer Architecture

Step 2: Instruction Decode and Register Fetch

Read registers rs and rt in case needed Compute the branch address in case the

instruction is a branch RTL:

A = Reg[IR[25-21]];B = Reg[IR[20-16]];ALUOut=PC+(sign-ext(IR[15-0])<<2);

We aren't setting any control lines based on the instruction type yet(we are busy "decoding" it in control logic)

Page 16: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-16 Computer Architecture

ALU is performing one of three functions, based on instruction type:

Memory Reference:ALUOut = A + sign-extend(IR[15-0]);

R-type:ALUOut = A op B;

Branch:if (A==B) PC = ALUOut;

Step 3: Execution

Page 17: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-17 Computer Architecture

Loads and stores access memory

MDR = Memory[ALUOut];or

Memory[ALUOut] = B;

R-type instructions finish

Reg[IR[15-11]] = ALUOut;

The write actually takes place at the end of the cycle on the edge

Step 4: R-type or Memory-access

Page 18: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-18 Computer Architecture

Loads write to register

Reg[IR[20-16]]= MDR;

What about all the other instructions?

Step 5: Write-back

Page 19: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-19 Computer Architecture

Summary of the Steps

Step nameAction for R-type

instructionsAction for memory-reference

instructionsAction for branches

Action for jumps

Instruction fetch IR = Memory[PC]PC = PC + 4

Instruction A = Reg [IR[25-21]]decode/register fetch B = Reg [IR[20-16]]

ALUOut = PC + (sign-extend (IR[15-0]) << 2)

Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] IIcomputation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)jump completion

Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]completion ALUOut or

Store: Memory [ALUOut] = B

Memory read completion Load: Reg[IR[20-16]] = MDR

Fig. 5.30

Page 20: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-20 Computer Architecture

Shiftleft 2

MemtoReg

IorD MemRead MemWrite

PC

Memory

MemData

Writedata

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

ALUOpALUSrcB

RegDst RegWrite

Instruction[15– 0]

Instruction [5– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

1 Mux

0

3

2

ALUcontrol

Mux

0

1ALU

resultALU

ALUSrcA

ZeroA

B

ALUOut

IRWrite

Address

Memorydata

register

Cycle 1 of add

IR = Memory[PC]; PC = PC + 4;

Page 21: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-21 Computer Architecture

Cycle 2 of add

Shiftleft 2

MemtoReg

IorD MemRead MemWrite

PC

Memory

MemData

Writedata

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

ALUOpALUSrcB

RegDst RegWrite

Instruction[15– 0]

Instruction [5– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

1 Mux

0

3

2

ALUcontrol

Mux

0

1ALU

resultALU

ALUSrcA

ZeroA

B

ALUOut

IRWrite

Address

Memorydata

register

A=Reg[IR[25-21]]; B=Reg[IR[20-16]];ALUOut=PC+(sign-ext(IR[15-0])<<2);

Page 22: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-22 Computer Architecture

Cycle 3 of add

Shiftleft 2

MemtoReg

IorD MemRead MemWrite

PC

Memory

MemData

Writedata

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

ALUOpALUSrcB

RegDst RegWrite

Instruction[15– 0]

Instruction [5– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

1 Mux

0

3

2

ALUcontrol

Mux

0

1ALU

resultALU

ALUSrcA

ZeroA

B

ALUOut

IRWrite

Address

Memorydata

register

ALUOut = A op B;

Page 23: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-23 Computer Architecture

Cycle 4 of add

Shiftleft 2

MemtoReg

IorD MemRead MemWrite

PC

Memory

MemData

Writedata

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

ALUOpALUSrcB

RegDst RegWrite

Instruction[15– 0]

Instruction [5– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

1 Mux

0

3

2

ALUcontrol

Mux

0

1ALU

resultALU

ALUSrcA

ZeroA

B

ALUOut

IRWrite

Address

Memorydata

register

Reg[IR[15-11]] = ALUOut;

Page 24: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-24 Computer Architecture

How many cycles will it take to execute this code?

lw $t2, 0($t3)lw $t3, 4($t3)beq $t2, $t3, Labeladd $t5, $t2, $t3sw $t5, 8($t3)

Label: ...

What is going on during the 8th cycle of execution?

In what cycle does the actual addition of $t2 and $t3 takes place?

Simple Question

assume not

Page 25: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-25 Computer Architecture

Outline Designing a processor Building the datapath A single-cycle implementation A multicycle implementation

Multicycle datapath Multicycle execution steps Multicycle control (Appendix C.3)

Microprogramming: simplifying control (Appendix C.4)

Exceptions

Page 26: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-26 Computer Architecture

Implementing the Control Value of control signals is dependent upon:

what instruction is being executed which step is being performed

Control must specify both the signals to be set in any step and the next step in the sequence

Control specification Use a finite state machine (graphically) Use microprogramming

Implementation can be derived from the specification and use gates, ROM, or PLA

Page 27: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-27 Computer Architecture

Several possible initial representations, sequence control and logic representation, and control implementation => all may be determined indep.

Initial Rep. Finite State Diagram Microprogram

Sequencing Explicit Next State MicroprogramControl Function Counter +

Dispatch ROMs

Logic Rep. Logic Equations Truth Tables

Implementation PLA ROM

Controller Design: An Overview

“hardwired control” “microprogrammed control”

Page 28: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-28 Computer Architecture

Finite state machines: a set of states and next state (set by current state and input) output (set by current state and possibly

input)

We will use a Moore Machine (output based only on the current state)

Review: Finite State Machines

Next-statefunction

Current state

Clock

Outputfunction

Nextstate

Outputs

Inputs

Page 29: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-29 Computer Architecture

Control State

Next StateLogic

Output Logic

inputs (conditions)

outputs (control points)

State X

Register TransferControl Points

Depends on Input

State specifies control points for RT Transfer at exiting state (same falling edge) One state takes one cycle

Our Control Model

Page 30: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-30 Computer Architecture

Summary of the Steps

Step nameAction for R-type

instructionsAction for memory-reference

instructionsAction for branches

Action for jumps

Instruction fetch IR = Memory[PC]PC = PC + 4

Instruction A = Reg [IR[25-21]]decode/register fetch B = Reg [IR[20-16]]

ALUOut = PC + (sign-extend (IR[15-0]) << 2)

Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] IIcomputation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)jump completion

Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]completion ALUOut or

Store: Memory [ALUOut] = B

Memory read completion Load: Reg[IR[20-16]] = MDR

Fig. 5.30

Page 31: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Control Specification for Multicycle

IR = MEM[PC]PC = PC + 4

A = R[rs]B = R[rt]S = PC+sx(Imm16)||00

PC = IR ...

R-type

S = A op B

R[rd] = S

S = A + sx(Imm16)

R[rt] = M

M = MEM[S]

lw/sw

If zeroPC = S

jumpbeq

Instruction fetch

Decode/register fetch

Execute

Memory access

Memory read

MEM[S] = B

lwsw

Page 32: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-32 Computer Architecture

Shiftleft 2

PCMux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

Instruction[15– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

ALUcontrol

ALUresult

ALUZero

Memorydata

register

A

B

IorD

MemRead

MemWrite

MemtoReg

PCWriteCond

PCWrite

IRWrite

ALUOp

ALUSrcB

ALUSrcA

RegDst

PCSource

RegWrite

Control

Outputs

Op[5– 0]

Instruction[31-26]

Instruction [5– 0]

Mux

0

2

Jumpaddress [31-0]Instruction [25– 0] 26 28

Shiftleft 2

PC [31-28]

1

1 Mux

0

3

2

Mux

0

1ALUOut

Memory

MemData

Writedata

Address

Organization of Multicycle Processor

Fig. 5.28

Page 33: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-33 Computer Architecture

Control SignalsSignal name Effect when deasserted Effect when assertedALUSrcA 1st ALU operand = PC 1st ALU operand = Reg[rs]RegWrite None Reg file is written MemtoReg Reg. data input = ALU Reg. write data input = MDR RegDst Reg. write dest. no. = rt Reg. write dest. no. = rdMemRead None Memory at address is readMemWrite None Memory at address is written IorD Memory address = PC Memory address = ALUoutIRWrite None IR = MemoryPCWrite None PC = PCSourcePCWriteCond None If zero then PC = PCSourceSignal name Value Effect ALUOp 00 ALU adds 01 ALU subtracts 10 ALU operates according to func codeALUSrcB 00 2nd ALU input = B 01 2nd ALU input = 4 10 2nd ALU input = sign extended IR[15-0] 11 2nd ALU input = sign ext., shift left 2 IR[15-0]PCSource 00 PC = ALU (PC + 4) 01 PC = ALUout (branch target address) 10 PC = PC+4[31-28] : IR[25-0] << 2

Sin

gle

Bit

Con

trol

Mu

ltip

le B

it C

ontr

ol

Fig. 5.29

Page 34: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-34 Computer Architecture

Mapping RT to Control Signals Instruction fetch and decode portion of every

instruction is identical:

ALUSrcA = 0ALUSrcB = 11ALUOp = 00

MemReadALUSrcA = 0

IorD = 0IRWrite

ALUSrcB = 01ALUOp = 00

PCWritePCSource = 00

Instruction fetchInstruction decode/

Register fetch

(Op =

'JM

P')

01

Start

Memory reference FSM(Figure 5.38)

R-type FSM(Figure 5.39)

Branch FSM(Figure 5.40)

Jump FSM(Figure 5.41)

Fig. 5.32

A = R[rs]B = R[rt]S = PC+ sx(Imm16)||00

IR = MEM[PC]PC = PC + 4

Page 35: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-35 Computer Architecture

Mapping RT to Control Signals FSM for controlling

memory referenceinstructions:

MemWriteIorD = 1

MemReadIorD = 1

ALUSrcA = 1ALUSrcB = 10ALUOp = 00

RegWriteMemtoReg = 1

RegDst = 0

Memory address computation

(Op = 'LW') or (Op = 'SW')

Memoryaccess

Write-back step

(Op

= 'L

W')

4

2

53

From state 1

To state 0(Figure 5.37)

Memoryaccess

Fig. 5.33

Page 36: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

PCWritePCSource = 10

ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond

PCSource = 01

ALUSrcA =1ALUSrcB = 00ALUOp= 10

RegDst = 1RegWrite

MemtoReg = 0

MemWriteIorD = 1

MemReadIorD = 1

ALUSrcA = 1ALUSrcB = 10ALUOp = 00

RegDst = 0RegWrite

MemtoReg=1

ALUSrcA = 0ALUSrcB = 11ALUOp = 00

MemReadALUSrcA = 0

IorD = 0IRWrite

ALUSrcB = 01ALUOp = 00

PCWritePCSource = 00

Instruction fetchInstruction decode/

register fetch

Jumpcompletion

BranchcompletionExecution

Memory addresscomputation

Memoryaccess

Memoryaccess R-type completion

Write-back step

(Op = 'J

')

(Op =

'LW

')

4

01

9862

753

Start

Complete FSM

Fig. 5.38

•State number assignment

Page 37: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-37 Computer Architecture

input output op S Datapath control NS000000 0000 100101000000100 0001000000 0001 000000000011000 0110000000 0010 000000000010100 xxxx

000000 1010

000000 1111000001 0000

000010 0000 100101000000100 0001000010 0001 000000000011000 1001000000 0010 000000000010100 xxxx

Truth Table

Page 38: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-38 Computer Architecture

From FSM to Truth Table Please reference the logic equations in Fig.

C.3.3 and the truth table in Fig. C.3.6

Output EquationPCWrite state0 + state9PCWriteCond state8IorD state3 + state5… …NextState0 state4 + state5 + state7 + state8

+state9NextState1 state0NextState2 state1 ((op = ‘lw’) + (op = ‘sw’))NextState3 state2 (op = ‘lw’)… ...

Output Current states0 1 2 3 4 5 6 7 8 9

PCWrite 1 0 0 0 0 0 0 0 0 1PCWriteCond 0 0 0 0 0 0 0 0 1 0IorD 0 0 0 1 0 1 0 0 0 0… …

Page 39: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-39 Computer Architecture

state

6

4

11nextstate

op

zero

control points

state op condnextstate control points

Truth Table

datapath state

Control signals

Designing FSM Controller

Page 40: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-40 Computer Architecture

The Control UnitPCWrite

PCWriteCond

IorD

MemtoReg

PCSource

ALUOp

ALUSrcB

ALUSrcA

RegWrite

RegDst

NS3NS2NS1NS0

Op5

Op4

Op3

Op2

Op1

Op0

S3

S2

S1

S0

State register

IRWrite

MemRead

MemWrite

Instruction registeropcode field

Outputs

Control logic

Inputs

Fig. C.3.2

SequenceControl

Page 41: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-41 Computer Architecture

PLA Implementatio

n

Op5

Op4

Op3

Op2

Op1

Op0

S3

S2

S1

S0

IorD

IRWrite

MemReadMemWrite

PCWritePCWriteCond

MemtoRegPCSource1

ALUOp1

ALUSrcB0ALUSrcARegWriteRegDstNS3NS2NS1NS0

ALUSrcB1ALUOp0

PCSource0

Fig. C.3.9

Sequence Control

Page 42: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-43 Computer Architecture

Address ROM content op S Datapath control NS000000 0000 100101000000100 0001000000 0001 000000000011000 0110000000 0010 000000000010100 xxxx

000000 1010

000000 1111000001 0000

000010 0000 100101000000100 0001000010 0001 000000000011000 1001000000 0010 000000000010100 xxxx

ROM Implementation(Truth Table)

• Rather wasteful, since for lots of entries, outputs are same or don’t-care

• Could break up into two smaller ROMs (Fig. C.3.7, C.3.8)

• Control table: Input: 4-bit stateOutput:16 controls

• Next state table:Input: 4-bit state and 6-bit op-codeOutput: 4-bit state

Page 43: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-44 Computer Architecture

ROM vs PLA ROM: use two smaller ROMs (Fig. C.3.7, C.3.8)

4 state bits give the 16 outputs, 24x16 bits of ROM 10 bits (op + state) give 4 next state bits, 210x 4

bits of ROM Total = 4.3K bits of ROM (compared to 210x 20 bits

of single ROM implementation) PLA is much smaller

can share product terms only need entries that produce an active output can take into account don't-cares Size is (#inputs #product-terms) + (#outputs

#product-terms) For this example = (10x17)+(20x17) = 460 PLA cells

PLA cells usually about the size of a ROM cell (slightly bigger)

Page 44: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

PCWritePCSource = 10

ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond

PCSource = 01

ALUSrcA =1ALUSrcB = 00ALUOp= 10

RegDst = 1RegWrite

MemtoReg = 0

MemWriteIorD = 1

MemReadIorD = 1

ALUSrcA = 1ALUSrcB = 10ALUOp = 00

RegDst = 0RegWrite

MemtoReg=1

ALUSrcA = 0ALUSrcB = 11ALUOp = 00

MemReadALUSrcA = 0

IorD = 0IRWrite

ALUSrcB = 01ALUOp = 00

PCWritePCSource = 00

Instruction fetchInstruction decode/

register fetch

Jumpcompletion

BranchcompletionExecution

Memory addresscomputation

Memoryaccess

Memoryaccess R-type completion

Write-back step

(Op = 'J

')

(Op =

'LW

')

4

01

9862

753

Start

Complete FSM

Fig. 5.38

•State number assignment

Page 45: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-46 Computer Architecture

Use Counter for Sequence Control

AddrCtl

Outputs

PLA or ROM

State

Address select logicO

p[5

–0]

Adder

Instruction registeropcode field

1

Control unit

Input

PCWritePCWriteCondIorD

MemtoRegPCSourceALUOpALUSrcBALUSrcARegWriteRegDst

IRWrite

MemReadMemWrite

BWrite

Fig. C.4.1

No Next-State Lines

Page 46: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-47 Computer Architecture

Address Select Unit

State

Op

Adder

1

PLA or ROM

Mux3 2 1 0

Dispatch ROM 1Dispatch ROM 2

0

AddrCtl

Address select logic

Instruction registeropcode field

AddrCtl Value:0 Set state to 01 Dispatch ROM 12 Dispatch ROM 23 Use incremented

state

Fig. C.4.2

Page 47: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Control ContentsDispatch ROM 1 Dispatch ROM 2

Op Opcode name Value Op Opcode name Value000000 R-format 0110 100011 lw 0011000010 jmp 1001 101011 sw 0101000100 beq 1000100011 lw 0010101011 sw 0010

State number Address-control action Value of AddrCtl

0 Use incremented state 31 Use dispatch ROM 1 12 Use dispatch ROM 2 23 Use incremented state 34 Replace state number by 0 05 Replace state number by 0 06 Use incremented state 37 Replace state number by 0 08 Replace state number by 0 09 Replace state number by 0 0

Fig. C.4.3, C.4.4

Page 48: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-49 Computer Architecture

Outline Designing a processor Building the datapath A single-cycle implementation A multicycle implementation

Multicycle datapath Multicycle execution steps Multicycle control

Microprogramming: simplifying control (Appendix C.4)

Exceptions

Page 49: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-50 Computer Architecture

Several possible initial representations, sequence control and logic representation, and control implementation => all may be determined indep.

Initial Rep. Finite State Diagram Microprogram

Sequencing Explicit Next State MicroprogramControl Function Counter +

Dispatch ROMs

Logic Rep. Logic Equations Truth Tables

Implementation PLA ROM

Controller Design: An Overview

“hardwired control” “microprogrammed control”

Page 50: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-51 Computer Architecture

Control is the hard part of processor design Datapath is fairly regular and well-organized Memory is highly regular Control is irregular and global

But, the state diagrams that define the controller for an instruction set processor are highly structured Use this structure to

construct a simple“microsequencer”

Control reduces toprogramming thissimple device

=> microprogramming

sequencercontrol

datapath control

micro-PCsequencer

microinstruction

Microprogram

…..controlsignals

Page 51: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-52 Computer Architecture

Microinstruction Control signals :

Think of the set of control signals that must be asserted in a state as an instruction

Executing a microinstruction has the effect of asserting the control signal specified by the microinstruction

Sequencing What microinstruction should be executed next ?

Execute sequentially (next state unconditionally)

Branch (next state also depends on inputs)

A microprogram is a sequence of microinstructions executing a program flow chart (finite state machine)

Page 52: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-53 Computer Architecture

1) Start with a list of control signals2) Group signals together that make sense (vs.

random): called fields3) Places fields in some logical order

(e.g., ALU operation & ALU operands first and microinstruction sequencing last)

4) Create a symbolic legend for the microinstruction format, showing name of field values and how they set control signals Use computers to design computers

5) To minimize the width, encode operations that will never be used at the same time

Designing a Microinstruction Set

Page 53: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

1-3) Control Signals and FieldsSignal name Effect when deasserted Effect when asserted

ALUSrcA 1st ALU operand = PC 1st ALU operand = Reg[rs]RegWrite None Reg file is written MemtoReg Reg. write data input = ALU Reg. write data input = MDR RegDst

Reg. write dest. no. = rt Reg. write dest. no. = rdMemRead None Memory at address is readMemWrite None Memory at address is written IorD Memory address = PC Memory address = ALUoutIRWrite None IR = MemoryPCWrite None PC = PCSourcePCWriteCond None If zero then PC = PCSourceS

ingl

e B

it C

ontr

ol

Signal name Value Effect ALUOp 00 ALU adds 01 ALU subtracts 10 ALU operates according to func codeALUSrcB 00 2nd ALU input = B 01 2nd ALU input = 4 10 2nd ALU input = sign extended IR[15-0] 11 2nd ALU input = sign extended, shift left 2 IR[15-0]PCSource 00 PC = ALU (PC + 4) 01 PC = ALUout (branch target address) 10 PC = PC+4[31-28] : IR[25-0] << 2M

ult

iple

Bit

Con

trol

Page 54: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

4) Fields and LegendField Name Values for Field Function of Field with Specific ValueALU control Add ALU adds

Subt. ALU subtractsFunc code ALU does function code

SRC1 PC 1st ALU input = PCA 1st ALU input = A (Reg[rs])

SRC2 B 2nd ALU input = B (Reg[rt])4 2nd ALU input = 4Extend 2nd ALU input = sign ext. IR[15-0]Extshft 2nd ALU input = sign ex., sl IR[15-0]

Register control Read A = Reg[rs]; B = Reg[rt];Write ALU Reg[rd] = ALUoutWrite MDR Reg[rt] = MDR

Memory Read PC IR (MDR) = mem[PC]Read ALU MDR = mem[ALUout]Write ALU mem[ALUout] = B

PC write ALU PC = ALU outputALUout-cond. IF ALU zero then PC = ALUoutjump addr. PC = PCSource

Sequencing Seq Go to sequential microinstructionFetch Go to the first microinstructionDispatch 1 Dispatch using ROM1Dispatch 2 Dispatch using ROM2

Page 55: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Control SignalsField name Value Signals active Comment

Add ALUOp = 00 Cause the ALU to add.ALU control Subt ALUOp = 01 Cause the ALU to subtract; this implements the compare for

branches.Func code ALUOp = 10 Use the instruction's function code to determine ALU control.

SRC1 PC ALUSrcA = 0 Use the PC as the first ALU input.A ALUSrcA = 1 Register A is the first ALU input.B ALUSrcB = 00 Register B is the second ALU input.

SRC2 4 ALUSrcB = 01 Use 4 as the second ALU input.Extend ALUSrcB = 10 Use output of the sign extension unit as the second ALU input.Extshft ALUSrcB = 11 Use the output of the shift-by-two unit as the second ALU input.Read Read two registers using the rs and rt fields of the IR as the register

numbers and putting the data into registers A and B.Write ALU RegWrite, Write a register using the rd field of the IR as the register number and

Register RegDst = 1, the contents of the ALUOut as the data.control MemtoReg = 0

Write MDR RegWrite, Write a register using the rt field of the IR as the register number andRegDst = 0, the contents of the MDR as the data.MemtoReg = 1

Read PC MemRead, Read memory using the PC as address; write result into IR (and lorD = 0 the MDR).

Memory Read ALU MemRead, Read memory using the ALUOut as address; write result into MDR.lorD = 1

Write ALU MemWrite, Write memory using the ALUOut as address, contents of B as thelorD = 1 data.

ALU PCSource = 00 Write the output of the ALU into the PC.PCWrite

PC write control ALUOut-cond PCSource = 01, If the Zero output of the ALU is active, write the PC with the contentsPCWriteCond of the register ALUOut.

jump address PCSource = 10, Write the PC with the jump address from the instruction.PCWrite

Seq AddrCtl = 11 Choose the next microinstruction sequentially.Sequencing Fetch AddrCtl = 00 Go to the first microinstruction to begin a new instruction.

Dispatch 1 AddrCtl = 01 Dispatch using the ROM 1.Dispatch 2 AddrCtl = 10 Dispatch using the ROM 2.

Page 56: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-57 Computer Architecture

The Microprogram

LabelALU

control SRC1 SRC2Register control Memory

PCWrite control Sequencing

Fetch Add PC 4 Read PC ALU SeqAdd PC Extshft Read Dispatch 1

Mem1 Add A Extend Dispatch 2LW2 Read ALU Seq

Write MDR FetchSW2 Write ALU FetchRformat1 Func code A B Seq

Write ALU FetchBEQ1 Subt A B ALUOut-cond FetchJUMP1 Jump address Fetch

Page 57: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-58 Computer Architecture

The Controller

PCWritePCWriteCondIorD

MemtoRegPCSourceALUOpALUSrcBALUSrcARegWrite

AddrCtl

Outputs

Microcode memory

IRWrite

MemReadMemWrite

RegDst

Control unit

Input

Microprogram counter

Address select logic

Op[

5–

0]

Adder

1

Datapath

Instruction registeropcode field

BWrite

Fig. C.4.6

Page 58: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-59 Computer Architecture

The Dispatch ROMs

Fig. C.5.2

Rformat1

JUMP1

BEQ1

Mem1

Mem1

LW2

SW2

Page 59: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-60 Computer Architecture

Our Plan: Using ROM

Shiftleft 2

PCMux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

Instruction[15– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

ALUcontrol

ALUresult

ALUZero

Memorydata

register

A

B

IorD

MemRead

MemWrite

MemtoReg

PCWriteCond

PCWrite

IRWrite

ALUOp

ALUSrcB

ALUSrcA

RegDst

PCSource

RegWrite

Control

Outputs

Op[5– 0]

Instruction[31-26]

Instruction [5– 0]

Mux

0

2

Jumpaddress [31-0]Instruction [25– 0] 26 28

Shiftleft 2

PC [31-28]

1

1 Mux

0

3

2

Mux

0

1ALUOut

Memory

MemData

Writedata

Address

-instructionROM

PC

Page 60: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-61 Computer Architecture

MainMemory

executionunit

controlmemory

CPU

ADDSUBAND

DATA

.

.

.

User program plus Data

this can change!

AND microsequence

e.g., Fetch Calc Operand Addr Fetch Operand(s) Calculate Save Answer(s)

one of these ismapped into oneof these

Microinstruction Interpretation

Page 61: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-62 Computer Architecture

Microprogramming Using ROM : Pros and Cons

Ease of design Flexibility

Each to adapt to changes in organization, timing, technology

Can make changes late in design cycle, or even in the field

Generality Implement multiple inst. sets on same machine Can tailor instruction set to application Can implement very powerful instruction sets

(just more control memory) Compatibility

Many organizations, same instruction set Costly to implement and Slow

Page 62: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-63 Computer Architecture

5) Microinstruction Encoding

State number Control bits 17- 2 Control bits 1- 00 1001010000001000 111 0000000000011000 012 0000000000010100 103 0011000000000000 114 0000001000000010 005 0010100000000000 006 0000000001000100 117 0000000000000011 008 0100000010100100 009 1000000100000000 00

Fig. C.4.5

• Bits 7-13 can be encoded to 3 bits because there are only 7 patterns of the control word

Page 63: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-64 Computer Architecture

Minimal vs. Maximal Encoding Minimal (Horizontal ):+ more control over the potential parallelism of

operations in the datapath- uses up lots of control store

Maximal (Vertical):+ uses less number of control store- extra level of decoding may slow the machine

down

Page 64: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-65 Computer Architecture

Control is specified by a finite state diagram Specialized state-diagrams easily captured

by microsequencer simple increment and “branch” fields datapath control fields

Control can also be specified by microprogramming

Control is more complicated with: complex instruction sets restricted datapaths

Simple instruction set and powerful datapath => simple control could reduce hardware Or go for speed => many instructions at once!

Summary of Control

Page 65: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-66 Computer Architecture

Outline Designing a processor Building the datapath A single-cycle implementation A multicycle implementation

Multicycle datapath Multicycle execution steps Multicycle control

Microprogramming: simplifying control Exceptions

Page 66: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-67 Computer Architecture

Exceptions

User ProgramSystemExceptionHandlerException

return fromexception

Normal control flow: sequential, jumps, branches, calls, returns

Exception = unprogrammed control transfer system takes action to handle the exception

must record address of the offending instruction should know cause and transfer to proper handler if returns to user, must save & restore user state

Page 67: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-68 Computer Architecture

By providing two modes of execution (user/system), computer may manage itself OS is a special program that runs in the

privileged system mode and has access to all of the resources of the computer

Presents “virtual resources” to each user that are more convenient than the physical resources

files vs. disk sectors virtual memory vs. physical memory

protects each user program from others Exceptions allow the system to take action in

response to events that occur while user program is executing OS begins at the handler

User/System Modes

Page 68: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-69 Computer Architecture

Two Types of Exceptions Interrupts:

caused by external events and asynchronous to execution=> may be handled between instructions

simply suspend and resume user program Exceptions:

caused by internal events and synchronous to execution, e.g., exceptional conditions (overflow), errors (parity), faults

instruction may be retried or simulated and program continued or program may be aborted

Page 69: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-70 Computer Architecture

MIPS Convention of ExceptionsMIPS convention:

exception means any unexpected change in control flow, without distinguishing internal or external

use interrupt only when the event is externally caused

Type of event From where? MIPS terminologyI/O device request External InterruptInvoke OS from user program Internal ExceptionHardware malfunctions Either Exception or

InterruptArithmetic overflow Internal ExceptionUsing an undefined inst. Internal Exception

Page 70: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-71 Computer Architecture

Precise Interrupts Precise: machine state is preserved as if

program executed upto the offending inst. Same system code will work on different

implementations of the architecture Position clearly established by IBM, and taken

by MIPS Difficult in the presence of pipelining, out-ot-

order execution, ... Imprecise: system software has to figure out

what is where and put it all back together Performance goals often lead designers to

forsake precise interrupts system software developers, user, markets

etc., usually wish they had not done this

Page 71: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-72 Computer Architecture

Handling Exceptions in Our Design

Consider two types of exceptions:undefined instruction & arithmetic overflow

Basic actions on exception: Save state: save the address of the offending

instruction in the exception program counter (EPC)

Transfer control to OS at some specified address=> need to know the cause for the exception=> then know the address of exception handler

After service, OS can terminate the program or continue its execution, using EPC to return

Page 72: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-73 Computer Architecture

Push it onto the stack Vax, 68k, 80x86

Save it in special registers MIPS EPC, BadVaddr, Status, Cause

Shadow Registers M88k Save state in a shadow of the internal pipeline

registers

Saving State: General Approaches

Page 73: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-74 Computer Architecture

iv_basecause

handlercode

iv_basecause

handler entry code

Traditional approach: interrupt vector The cause of exception is a vector

giving the address of the handler PC <- MEM[ IV_base + cause || 00] 68000, Vax, 80x86, . . .

RISC Handler Table PC <- IV_base + cause || 0000 Saves state and jumps Sparc, PA, M88K, . . .

MIPS approach: fixed entry use a status register (cause register)

to hold a field to indicate the cause PC <- EXC_addr

Addressing the Exception Handler

Page 74: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-75 Computer Architecture

Datapath with Exception Handling

Shiftleft 2

Memory

MemData

Writedata

Mux

0

1

Instruction[15– 11]

Mux

0

1

4

Instruction[15– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

ALUcontrol

ALUresult

ALUZero

Memorydata

register

A

B

IorD

MemRead

MemWrite

MemtoReg

PCWriteCond

PCWrite

IRWrite

Control

Outputs

Op[5– 0]

Instruction[31-26]

Instruction [5– 0]

Mux

0

2

Jumpaddress [31-0]Instruction [25– 0] 26 28

Shiftleft 2

PC [31-28]

1

Address

EPC

CO 00 00 00 3

Cause

ALUOp

ALUSrcB

ALUSrcA

RegDst

PCSource

RegWrite

EPCWriteIntCauseCauseWrite

1

0

1 Mux

0

3

2

Mux

0

1

Mux

0

1

PC

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

ALUOut

Fig. 5.39

Page 75: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-76 Computer Architecture

Additions for Our Design EPC: reg. to hold address of affected inst. Cause: reg. to record cause of exception

Assume LSB encodes the two possible exception sources: undefined instruction=0 and arithmetic overflow=1

Two control signals to write EPC (EPCWrite) and Cause (CauseWrite), and one control signal (IntCause) to set LSB of Cause register

Be able to write exception address into PC, assuming at C000 0000hex

=> needs a 4-way MUX to PC May undo PC = PC + 4 (PC = PC - 4), since

want EPC to point to offending inst. (not its successor)

Page 76: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-77 Computer Architecture

Undefined instruction: detected when no next state is defined from state 1 for the op value Handle this by defining the next state value for

all op values other than lw, sw, 0 (R-type), jmp, and beq as a new state, “other”

Arithmetic overflow: detected with the Overflow signal out of the ALU This signal is used in the modified FSM to

specify an additional possible next stateNote: challenge in designing control of a real machine

is to handle different interactions between instructions and other exception-causing events such that control logic remains small and fast Complex interactions makes the control unit

the most challenging aspect of hardware design

Exception Detection

Page 77: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

FSM with Exception Handling

ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond

PCSource = 01

ALUSrcA = 1ALUSrcB = 00ALUOp = 10

RegDst = 1RegWrite

MemtoReg = 0

MemWriteIorD = 1

MemReadIorD = 1

ALUSrcA = 1ALUSrcB = 00ALUOp = 00

RegWriteMemtoReg = 1

RegDst = 0

ALUSrcA = 0ALUSrcB = 11ALUOp = 00

MemReadALUSrcA = 0

IorD = 0IRWrite

ALUSrcB = 01ALUOp = 00

PCWritePCSource = 00

Instruction fetchInstruction decode/

Register fetch

Jumpcompletion

BranchcompletionExecution

Memory addresscomputation

Memoryaccess

Memoryaccess R-type completion

Write-back step

(Op =

'J')

(Op

= 'L

W')

4

01

9862

7 11 1053

Start

Overflow

Overflow

ALUSrcA = 0ALUSrcB = 01ALUOp = 01

EPCWritePCWrite

PCSource = 11

IntCause = 0CauseWrite

ALUSrcA = 0ALUSrcB = 01ALUOp = 01

EPCWritePCWrite

PCSource = 11

IntCause = 1CauseWrite

PCWritePCSource = 10

Fig. 5.40

Page 78: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-79 Computer Architecture

Specialize state diagrams easily captured by microsequencer simple increment and branch fields datapath control fields

Control design reduces to microprogramming Exceptions are the hard part of control

Need to find convenient place to detect exceptions and to branch to state or microinstruction that saves PC and invokes OS

Harder with pipelined CPUs that support page faults on memory accesses, i.e., the instruction cannot complete AND you must restart program at exactly the instruction with the exception

Summary

Page 79: CS4100: 計算機結構 Designing a Multicycle Processor 國立清華大學資訊工程學系 九十六學年度第一學期 Adapted from class notes of D. Patterson and W. Dally Copyright

Multicycle Design-80 Computer Architecture

Datapath with Exception Handling

Fig. 5.39