http://www.cs.nctu.edu.tw/~ldvan/
Key Arithmetic Units
Lan-Da Van (范倫達), Ph. D.
Department of Computer Science
National Chiao Tung University
Taiwan, R.O.C.
Spring, 2017
Source: Prof. M. B. Lin, Digital System Designs and Practices, Wiley, 2008.
Adopted from the Chapter 1 slides of this book.
Digital System Design
Lecture 1
Content Relationships
[Figure: relationships among Digital System Design, DSP System (Adder, Mul, Filter), and Computer Architecture]
Outline
Describe both addition and subtraction modules
Understand the principles of the carry-look-ahead (CLA) adder
Describe the operations of 2’s complement multiplication
— Baugh-Wooley multiplication
— Booth multiplication
Describe the design of an arithmetic-logic unit (ALU)
Addition and Subtraction
The bottleneck of a conventional n-bit ripple-carry adder is the generation
of the carry needed at each stage: stage i cannot finish until the carry
from stage i-1 arrives.
To overcome this, many schemes have been
proposed, including
— Carry-look-ahead (CLA) adder
— Carry skip adder (skip here)
— Carry save adder (skip here)
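To make the bottleneck concrete, here is a minimal sketch of an n-bit ripple-carry adder. The module name rca_adder, the parameter N, and the block label fa_stage are illustrative assumptions, not taken from the slides. Each c[i+1] depends on c[i], so the worst-case delay grows linearly with N.

```verilog
// Hypothetical n-bit ripple-carry adder (all names are illustrative).
module rca_adder #(parameter N = 4) (
    input  [N-1:0] x, y,
    input          cin,
    output [N-1:0] sum,
    output         cout
);
    wire [N:0] c;            // c[i] is the carry into stage i
    assign c[0] = cin;

    genvar i;
    generate
        for (i = 0; i < N; i = i + 1) begin : fa_stage
            // one full adder per bit; c[i+1] must wait for c[i]
            assign sum[i] = x[i] ^ y[i] ^ c[i];
            assign c[i+1] = (x[i] & y[i]) | (c[i] & (x[i] ^ y[i]));
        end
    endgenerate

    assign cout = c[N];
endmodule
```

The CLA adder on the following slides removes this dependency by computing every carry directly from the inputs and c0.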
A CLA adder
Define two new functions:
— carry generate (gi): gi = xi · yi
— carry propagate (pi): pi = xi ⊕ yi
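[Figure: one adder stage with inputs xi, yi, ci, internal signals pi, gi, and outputs si, ci+1]
\[ s_i = p_i \oplus c_i \]
\[ c_{i+1} = g_i + p_i c_i \]
\[ c_1 = g_0 + p_0 c_0 \]
\[ c_2 = g_1 + p_1 c_1 = g_1 + p_1 (g_0 + p_0 c_0) = g_1 + p_1 g_0 + p_1 p_0 c_0 \]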
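Continuing the same substitution gives the remaining carries of a 4-bit lookahead generator. This is a worked expansion of the recurrence c_{i+1} = g_i + p_i c_i, where + denotes OR and juxtaposition denotes AND. The result agrees with the c3 and c4 expressions in the 4-bit Verilog module later in these slides.

```latex
\begin{align*}
c_3 &= g_2 + p_2 c_2
     = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0 \\
c_4 &= g_3 + p_3 c_3
     = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0
\end{align*}
```

Every carry is now a two-level function of the p, g, and c0 signals, so no carry has to wait for the previous stage.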
A Carry-Lookahead Generator
[Figure: carry-lookahead generator logic. Inputs: p0 to p3, g0 to g3, and c0. Outputs: c1 to c4.]
A CLA Adder
[Figure: block diagram of a 4-bit CLA adder. The pg generator takes x0 to x3 and y0 to y3 and produces p0 to p3 and g0 to g3; the CLA generator takes c0 and produces c1 to c4; the sum generator produces s0 to s3.]
A CLA Adder
// a 4-bit CLA adder using assign statements
module cla_adder_4bits(x, y, cin, sum, cout);
// inputs and outputs
input [3:0] x, y;
input cin;
output [3:0] sum;
output cout;
// internal wires
wire p0,g0, p1,g1, p2,g2, p3,g3;
wire c4, c3, c2, c1;
// compute the p for each stage
assign p0 = x[0] ^ y[0], p1 = x[1] ^ y[1],
p2 = x[2] ^ y[2], p3 = x[3] ^ y[3];
// compute the g for each stage
assign g0 = x[0] & y[0], g1 = x[1] & y[1],
g2 = x[2] & y[2], g3 = x[3] & y[3];
// compute the carry for each stage
assign c1 = g0 | (p0 & cin),
c2 = g1 | (p1 & g0) | (p1 & p0 & cin),
c3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & cin),
c4 = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0) |
(p3 & p2 & p1 & p0 & cin);
// compute Sum
assign sum[0] = p0 ^ cin, sum[1] = p1 ^ c1,
sum[2] = p2 ^ c2, sum[3] = p3 ^ c3;
// assign carry output
assign cout = c4;
endmodule
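One way to gain confidence in the module above is an exhaustive check, since a 4-bit adder with carry-in has only 512 input combinations. The testbench below is a sketch, not part of the original slides; the testbench name and the variables i and errors are assumptions, and it must be compiled together with cla_adder_4bits.

```verilog
// Hypothetical self-checking testbench; compile together with cla_adder_4bits.
module cla_adder_4bits_tb;
    reg  [3:0] x, y;
    reg        cin;
    wire [3:0] sum;
    wire       cout;
    integer    i, errors;

    // positional connection follows the port order (x, y, cin, sum, cout)
    cla_adder_4bits dut (x, y, cin, sum, cout);

    initial begin
        errors = 0;
        // 2^9 = 512 combinations of {x, y, cin}
        for (i = 0; i < 512; i = i + 1) begin
            {x, y, cin} = i[8:0];
            #1; // let the combinational logic settle
            if ({cout, sum} !== (x + y + cin)) begin
                errors = errors + 1;
                $display("mismatch: x=%d y=%d cin=%b", x, y, cin);
            end
        end
        if (errors == 0) $display("all 512 cases passed");
        $finish;
    end
endmodule
```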
A CLA Adder -- Using Generate Statements
// an n-bit CLA adder using generate loops
module cla_adder_generate(x, y, cin, sum, cout);
// inputs and outputs
parameter N = 4; //define the default size
input [N-1:0] x, y;
input cin;
output [N-1:0] sum;
output cout;
// internal wires
wire [N-1:0] p, g;
wire [N:0] c;
// assign input carry
assign c[0] = cin;
// Synthesis results (Virtex 2 XC2V250 FG456 -6):
//   n         4      8      16     32
//   f (MHz)   104.3  78.9   53.0   32.0
//   LUTs      8      16     32     64
genvar i;
generate for (i = 0; i <N; i = i + 1) begin: pq_cla
assign p[i] = x[i] ^ y[i];
assign g[i] = x[i] & y[i];
end endgenerate // compute generate and propagation
generate for (i = 1; i < N+1; i = i + 1) begin: carry_cla
assign c[i] = g[i-1] | (p[i-1] & c[i-1]);
end endgenerate // compute carry for each stage
generate for (i = 0; i < N; i = i + 1) begin: sum_cla
assign sum[i] = p[i] ^ c[i];
end endgenerate // compute sum
assign cout = c[N]; // assign final carry
endmodule // end of cla_adder_generate module
A CLA Adder --- Using Generate Statements
[Figure: synthesized schematic of cla_adder_generate for N = 4. Inputs: x[3:0], y[3:0], cin. Internal signals: p[3:0], g[3:0], and the carry chain c[1] to c[4]. Outputs: sum[3:0] and cout.]
Shift-and-Add Multiplication
Rules for a multiple-bit multiplicand times a 1-bit
multiplier:
1. The partial product is the same as the multiplicand if
the multiplier is 1; otherwise,
2. The partial product is 0.
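[Figure: shift-and-add multiplier datapath. The m-bit multiplicand register M feeds an m-bit 2-to-1 MUX selected by Q[0]; the MUX output and A enter an m-bit adder whose (m+1)-bit result returns to the register pair A:Q, which holds the multiplier and the partial product.]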
Shift-and-Add Multiplication
Algorithm: Shift-and-add multiplication
Input: An m-bit multiplicand and an n-bit multiplier.
Output: The (m + n)-bit product.
Begin
1. Load multiplicand and multiplier into registers M and Q, respectively;
   clear register A and set loop count CNT equal to n.
2. repeat
   2.1 if (Q[0] == 1) then A = A + M;
   2.2 Right shift register pair A : Q one bit;
   2.3 CNT = CNT - 1;
   until (CNT == 0);
End
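The algorithm above can be sketched as a small sequential circuit. This is one possible mapping, not the book's implementation: the module name, the clk/reset/start/done handshake, the busy flag, and the choice m = n = W are all assumptions made for illustration. The sketch also assumes 2 <= W < 256.

```verilog
// Hypothetical sequential shift-and-add multiplier (unsigned, m = n = W).
// M, A, Q, and CNT follow the algorithm; everything else is assumed.
module shift_add_mul #(parameter W = 4) (
    input              clk, reset, start,
    input  [W-1:0]     multiplicand, multiplier,
    output [2*W-1:0]   product,     // valid when done is 1
    output reg         done
);
    reg [W-1:0] M, Q, A;
    reg [7:0]   CNT;                // loop count, assumes W < 256
    reg         busy;

    // step 2.1: A + M if Q[0] is 1, else A; W+1 bits keep the carry-out
    wire [W:0] sum_am = Q[0] ? ({1'b0, A} + {1'b0, M}) : {1'b0, A};

    assign product = {A, Q};

    always @(posedge clk) begin
        if (reset) begin
            busy <= 1'b0;
            done <= 1'b0;
        end else if (start && !busy) begin
            // step 1: load M and Q, clear A, set CNT = n
            M    <= multiplicand;
            Q    <= multiplier;
            A    <= {W{1'b0}};
            CNT  <= W;
            busy <= 1'b1;
            done <= 1'b0;
        end else if (busy) begin
            // step 2.2: shift the pair A:Q right one bit (carry enters at the top)
            {A, Q} <= {sum_am, Q[W-1:1]};
            // step 2.3: decrement the count and stop after n iterations
            CNT <= CNT - 1'b1;
            if (CNT == 8'd1) begin
                busy <= 1'b0;
                done <= 1'b1;
            end
        end
    end
endmodule
```

With multiplicand 0111 (7) and multiplier 1011 (11), four iterations leave A:Q = 0100 1101, which is 77.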
A Basic Array Multiplier
A multiple-cycle structure can also be implemented
by using an iterative logic structure.
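                           x3    x2    x1    x0    = X (multiplicand)
                        ×  y3    y2    y1    y0    = Y (multiplier)
   ------------------------------------------------
                           x3y0  x2y0  x1y0  x0y0    Partial products
                     x3y1  x2y1  x1y1  x0y1
               x3y2  x2y2  x1y2  x0y2
+        x3y3  x2y3  x1y3  x0y3
   ------------------------------------------------
   P7    P6    P5    P4    P3    P2    P1    P0      Product

\[
P = X \times Y
  = \left(\sum_{i=0}^{m-1} x_i 2^i\right)\left(\sum_{j=0}^{n-1} y_j 2^j\right)
  = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} (x_i y_j)\, 2^{i+j}
  = \sum_{k=0}^{m+n-1} P_k 2^k
\]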
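The double sum of x_i y_j 2^(i+j) can be written directly as a chain of row additions. The following is a behavioral sketch under assumed names (array_mul_unsigned, acc, pp, row); it describes the same arithmetic as the array but leaves the choice of adder cells to the synthesis tool, so it is not the full-adder-level structure drawn on the slides.

```verilog
// Hypothetical combinational unsigned multiplier: sums the partial-product rows.
module array_mul_unsigned #(parameter M = 4, N = 4) (
    input  [M-1:0]   x,     // multiplicand
    input  [N-1:0]   y,     // multiplier
    output [M+N-1:0] p
);
    // acc[j] holds the sum of partial-product rows 0 .. j-1
    wire [M+N-1:0] acc [0:N];
    assign acc[0] = {(M+N){1'b0}};

    genvar j;
    generate
        for (j = 0; j < N; j = j + 1) begin : row
            // row j is (x AND y[j]) weighted by 2^j
            wire [M+N-1:0] pp = {{N{1'b0}}, x & {M{y[j]}}} << j;
            assign acc[j+1] = acc[j] + pp;
        end
    endgenerate

    assign p = acc[N];
endmodule
```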
A Basic Unsigned Array Multiplier
[Figure: 4x4 unsigned array multiplier with inputs x0 to x3, y0 to y3 and outputs P0 to P7. Each cell is a full adder (FA) with inputs x, y, Cin and outputs S, Cout; the array dimensions are labeled m and n. The critical path is marked. Annotation: "These two rows may be combined into one row." Worked example shown in the figure: X = 0111 (7), Y = 1011 (11).]
An Unsigned CSA Array Multiplier
[Figure: 4x4 unsigned carry-save (CSA) array multiplier with inputs x0 to x3, y0 to y3 and outputs P0 to P7. Each cell is a full adder (FA) with inputs x, y, Cin and outputs S, Cout; the last row is a ripple-carry adder or carry-look-ahead adder. The array dimensions are labeled m and n, and the critical path is marked. Annotation: "These two rows may be combined into one row."]
Signed Array Multiplication
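Let X and Y be two two’s complement numbers:
\[ X = -x_{m-1} 2^{m-1} + \sum_{i=0}^{m-2} x_i 2^i \]
\[ Y = -y_{n-1} 2^{n-1} + \sum_{j=0}^{n-2} y_j 2^j \]
\[
P = XY = \left(-x_{m-1} 2^{m-1} + \sum_{i=0}^{m-2} x_i 2^i\right)
         \left(-y_{n-1} 2^{n-1} + \sum_{j=0}^{n-2} y_j 2^j\right)
\]
\[
  = \sum_{i=0}^{m-2}\sum_{j=0}^{n-2} x_i y_j 2^{i+j}
  + x_{m-1} y_{n-1} 2^{m+n-2}
  - \left(\sum_{i=0}^{m-2} x_i y_{n-1} 2^{i+n-1}
  + \sum_{j=0}^{n-2} x_{m-1} y_j 2^{j+m-1}\right)
\]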
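The expanded product can be checked term by term in simulation. The sketch below is a behavioral model, not the array structure on the following slides; the module name signed_mul_expanded and the helper registers pos and neg are assumptions. It accumulates the two positively weighted groups of terms and the two negatively weighted groups separately, then subtracts. It assumes M + N <= 32 because the weights are built from 32-bit integer constants.

```verilog
// Hypothetical behavioral model of the expanded two's-complement product.
module signed_mul_expanded #(parameter M = 4, N = 4) (
    input  [M-1:0]       x,
    input  [N-1:0]       y,
    output reg [M+N-1:0] p
);
    integer i, j;
    reg [M+N-1:0] pos, neg;
    always @* begin
        pos = 0;
        neg = 0;
        // sum over i <= m-2, j <= n-2 of x_i y_j 2^(i+j)
        for (i = 0; i < M-1; i = i + 1)
            for (j = 0; j < N-1; j = j + 1)
                if (x[i] & y[j]) pos = pos + (1 << (i + j));
        // x_{m-1} y_{n-1} 2^(m+n-2)
        if (x[M-1] & y[N-1]) pos = pos + (1 << (M + N - 2));
        // terms that enter with a minus sign
        for (i = 0; i < M-1; i = i + 1)
            if (x[i] & y[N-1]) neg = neg + (1 << (i + N - 1));
        for (j = 0; j < N-1; j = j + 1)
            if (x[M-1] & y[j]) neg = neg + (1 << (j + M - 1));
        // the subtraction wraps modulo 2^(m+n), which is the two's-complement result
        p = pos - neg;
    end
endmodule
```

For example, x = 1101 (-3) and y = 0101 (5) give pos = 25 and neg = 40, so p = -15, or 1111 0001 in 8 bits.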
Signed Array Multiplication
[Figure: partial-product array for 4x4 two's complement multiplication, with X = x3 x2 x1 x0 (multiplicand) and Y = y3 y2 y1 y0 (multiplier). The rows contain the terms x_i y_j for i, j = 0 to 3, two constant 1s are added, and the sum is the product P7 to P0.]
Signed Array Multiplication
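[Figure: 4x4 signed array multiplier with inputs x0 to x3, y0 to y3 and outputs P0 to P7. Each cell is a full adder (FA) with inputs x, y, Cin and outputs S, Cout; the last row is a ripple-carry adder or carry-look-ahead adder. The constant inputs to the array include two 1s; the remaining constants are 0.]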
Baugh-Wooley Multiplier
[Figure: 8x8 Baugh-Wooley multiplier array with inputs x0 to x7, y0 to y7 and outputs P0 to P15. The cells are labeled A, AHA, AFA, NFA, ND, and FA; the array also contains an inverter and a constant 1 input.]
NFA Gate on Left Side and
AFA Gate on Right Side
[Figure: the NFA cell (left) and the AFA cell (right). Each contains a full adder (FA) with inputs S_in, C_in and outputs S_out, C_out, and a gate that combines x_i and y_j.]
Booth Encoding
Instead of adding 3Y, add -Y and increment the next partial product, which contributes the missing 4Y.
Similarly, instead of adding 2Y, add -2Y and let the next partial product contribute 4Y.
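A small worked example may help. Assume the 4-bit multiplier B = 0111 (7) and the radix-4 digit rule d_i = b_{2i-1} + b_{2i} - 2 b_{2i+1} with b_{-1} = 0, which is the recoding used in the Booth multiplication equations of this lecture. The symbol d_i is introduced here only for illustration.

```latex
\begin{align*}
d_0 &= b_{-1} + b_0 - 2 b_1 = 0 + 1 - 2 = -1 \\
d_1 &= b_1 + b_2 - 2 b_3 = 1 + 1 - 0 = 2 \\
B   &= d_0 + 4\, d_1 = -1 + 8 = 7 \\
B \cdot Y &= -Y + 4\,(2Y) = 7Y
\end{align*}
```

The low bit pair 11 would have required 3Y. It is replaced by -Y, and the next digit is raised from 1 to 2 to supply the extra 4Y.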
Booth Multiplication
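\[ P = A \times B = \sum_{i=0}^{2n-1} P_i 2^i \quad ---(1) \]
where B can be written as
\[ B = \sum_{i=0}^{(n-2)/2} (b_{2i-1} + b_{2i} - 2 b_{2i+1})\, 2^{2i}, \quad b_{-1} = 0 \quad ---(2) \]
\[
P = A \times B
  = \sum_{i=0}^{(n-2)/2} (b_{2i-1} + b_{2i} - 2 b_{2i+1})\, A\, 2^{2i}
  = \sum_{i=0}^{(n-2)/2} S_i \quad ---(3)
\]
where S_i can be denoted as
\[
S_i = (b_{2i-1} + b_{2i} - 2 b_{2i+1})\, A\, 2^{2i}
    = -S_{i,n-1}\, 2^{2i+n-1} + S_{i,n-2}\, 2^{2i+n-2} + \cdots + S_{i,0}\, 2^{2i} \quad ---(4)
\]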
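The recoding rule can be exercised with a short behavioral model. This is a sketch for n = 8 under assumed names (booth_mul_8x8, bx, s); it forms each recoded digit b_{2i-1} + b_{2i} - 2 b_{2i+1}, multiplies A by it, and accumulates the result with weight 2^(2i). It does not model the partial-product array or the sign-extension scheme on the following slides.

```verilog
// Hypothetical behavioral radix-4 (modified) Booth multiplier, signed 8x8.
module booth_mul_8x8 (
    input  signed [7:0]      a,   // multiplicand A
    input  signed [7:0]      b,   // multiplier B
    output reg signed [15:0] p
);
    integer i;
    reg [8:0]         bx;   // {B, b_{-1}} with b_{-1} = 0, so bx[k+1] = b_k
    reg signed [15:0] s;    // A times the recoded digit, sign-extended
    always @* begin
        bx = {b, 1'b0};
        p  = 16'sd0;
        for (i = 0; i < 4; i = i + 1) begin
            // {b_{2i+1}, b_{2i}, b_{2i-1}} selects a digit in {-2,-1,0,1,2}
            case ({bx[2*i+2], bx[2*i+1], bx[2*i]})
                3'b001, 3'b010: s =  a;            // +1
                3'b011:         s =  a <<< 1;      // +2
                3'b100:         s = -(a <<< 1);    // -2
                3'b101, 3'b110: s = -a;            // -1
                default:        s = 16'sd0;        // 000 and 111 give 0
            endcase
            p = p + (s <<< (2*i));                 // weight 2^(2i)
        end
    end
endmodule
```

Only four partial products are accumulated for an 8-bit multiplier, half the number a bit-by-bit scheme would need.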
Sign-Generate Sign Extension Scheme
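\[
S = 2^{0}\Big(\sum_{j=8}^{15} S_{0,7}\, 2^{j}\Big)
  + 2^{2}\Big(\sum_{j=8}^{13} S_{1,7}\, 2^{j}\Big)
  + 2^{4}\Big(\sum_{j=8}^{11} S_{2,7}\, 2^{j}\Big)
  + 2^{6}\Big(\sum_{j=8}^{9} S_{3,7}\, 2^{j}\Big)
\]
\[
  = (\overline{S_{0,7}}\, 2^{8} + 2^{9})
  + (\overline{S_{1,7}}\, 2^{10} + 2^{11})
  + (\overline{S_{2,7}}\, 2^{12} + 2^{13})
  + (\overline{S_{3,7}}\, 2^{14} + 2^{15})
  + 2^{8} \quad ---(5)
\]
where S is the final sign result; the second equality holds modulo 2^16,
the width of the product. Eqs. (3) and (5) can be mapped into the partial
product diagram and modified Booth multiplier structure as shown in the
following slide.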
Modified Booth Partial-Product Diagram for an 8x8 Multiplier
[Figure: modified Booth partial-product diagram for an 8x8 multiplier. Each of the four rows i = 0 to 3 holds the bits S_{i,7} to S_{i,0}, preceded on the left by a constant 1 and a second entry indexed i,7; an additional constant 1 appears below the first row. The diagram also includes Ctrl0[2] to Ctrl3[2], row labels w = 0, w = 1, up to w = n-1, column-count labels, and the LP and MP regions.]
An 8x8 Modified Booth Multiplier
[Figure: an 8x8 modified Booth multiplier. Four Booth encoders take {B[1:0], 0}, B[3:1], B[5:3], and B[7:5] and produce Ctrl0[2:0] to Ctrl3[2:0]. Each control word drives a row of selectors (sel) over a7 to a0; Ctrl0[2] to Ctrl3[2] and constant 1s also enter the array. The selected rows are summed by an array of half adders (HA) and full adders (FA) to give P15 to P0.]
Arithmetic-Logic Units
An arithmetic and logic unit (ALU) is often the major component for the datapath of many applications, especially for central processing unit (CPU) design. An ALU contains two portions:
Arithmetic unit: addition, subtraction, multiplication, division
Logical unit: AND, OR, NOT
Shift Operations
Types of shift operations:
— Logical shift
The vacancy bits are filled with 0s.
— Arithmetic shift
The vacancy bits are filled with 0s (left shift) or the sign bit (right shift).
Logical Shift Operations
Logical shift operations:
— Logical left shift
The input is shifted left a specified number of bit positions.
All vacancy bits are filled with 0s.
— Logical right shift
The input is shifted right a specified number of bit positions.
All vacancy bits are filled with 0s.
Arithmetic Shift Operations
Arithmetic shift operations:
— Arithmetic left shift
The input is shifted left a specified number of bit positions.
All vacancy bits are filled with 0s.
Indeed, this is exactly the same as logical left shift.
— Arithmetic right shift
The input is shifted right a specified number of bit positions.
All vacancy bits are filled with the sign bit.
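The shift types map directly onto Verilog operators. The module below is a minimal sketch with assumed names (shift_demo, d, amt, lsl, lsr, asr) and an assumed 8-bit width. Note that >>> fills with the sign bit only when its left operand is signed, which is why the input is wrapped in $signed.

```verilog
// Hypothetical 8-bit shifter showing the distinct shift behaviors.
module shift_demo (
    input  [7:0] d,
    input  [2:0] amt,
    output [7:0] lsl,   // logical left (arithmetic left gives the same result)
    output [7:0] lsr,   // logical right: vacated bits filled with 0s
    output [7:0] asr    // arithmetic right: vacated bits filled with the sign bit
);
    assign lsl = d << amt;
    assign lsr = d >> amt;
    assign asr = $signed(d) >>> amt;
endmodule
```

For d = 1001_0000 and amt = 2, the outputs are lsl = 0100_0000, lsr = 0010_0100, and asr = 1110_0100.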
Arithmetic-Logic Units
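[Figure: ALU block diagram. Operands A and B (x, y), Cin, and ALU_mode enter the ALU; its Sum output passes through a shifter controlled by Shifter_mode and Shift_amount to produce F. The flags N, Z, V, and C are generated from the result; the Overflow logic takes Cn and Cn-1, and Fn is also used.]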
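A reduced version of such an ALU can be sketched as follows. The module name alu_sketch, the 3-bit mode encoding, and the port names are assumptions made for illustration; the shifter stage, multiplication, and division are omitted. Overflow is computed here from the operand and result signs, which for two's complement addition is equivalent to Cn XOR Cn-1.

```verilog
// Hypothetical ALU sketch; the mode encoding and port names are assumed.
module alu_sketch #(parameter W = 8) (
    input  [W-1:0]     a, b,
    input              cin,
    input  [2:0]       mode,
    output reg [W-1:0] f,
    output             n, z,      // negative and zero flags
    output reg         v, c       // overflow and carry flags
);
    reg [W:0] t;   // W+1 bits so that the carry-out is visible
    always @* begin
        t = {(W+1){1'b0}};
        v = 1'b0;
        case (mode)
            3'b000: begin                        // add: a + b + cin
                t = a + b + cin;
                v = (a[W-1] == b[W-1]) && (t[W-1] != a[W-1]);
            end
            3'b001: begin                        // subtract: a + ~b + 1
                t = a + {1'b0, ~b} + 1'b1;
                v = (a[W-1] != b[W-1]) && (t[W-1] != a[W-1]);
            end
            3'b010:  t = {1'b0, a & b};          // AND
            3'b011:  t = {1'b0, a | b};          // OR
            3'b100:  t = {1'b0, ~a};             // NOT
            default: t = {1'b0, a};              // pass a through
        endcase
        f = t[W-1:0];
        c = t[W];
    end
    assign n = f[W-1];
    assign z = (f == {W{1'b0}});
endmodule
```

For subtraction in this sketch, c = 1 means that no borrow occurred.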
Summary
You have learned the following items:
— Addition and subtraction modules
— Carry-look-ahead (CLA) adder
— Multipliers: Shift-and-Add multiplier, Baugh-Wooley
multiplier, Booth multipliers
— Arithmetic-logic unit (ALU)