http://www.cs.nctu.edu.tw/~ldvan/
Key Arithmetic Units
Lan-Da Van (范倫達), Ph. D.
Department of Computer Science National Chiao Tung University
Taiwan, R.O.C. Spring, 2011
Source: Prof. M. B. Lin, Digital System Designs and Practices, Wiley, 2008.
Adapted from the Chapter 1 slides of this book.
Digital Systems Design
Lecture 1
Outline
Describe both addition and subtraction modules
Understand the principles of carry-look-ahead
(CLA) adder
Describe the operations of multiplication
Describe the operations of division
Describe the designs of arithmetic-logic unit (ALU)
Addition and Subtraction
The bottleneck of a conventional n-bit ripple-carry
adder is the generation of the carry needed by each
stage: every stage must wait for the carry from the stage before it.
To overcome this, many schemes have been
proposed, including
— carry-look-ahead (CLA) adder
— parallel-prefix adders:
Kogge-Stone adder
Brent-Kung adder
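To make the bottleneck concrete, the following is a minimal sketch of an n-bit ripple-carry adder. The module name `rca_adder`, the parameter `N`, and the block label `fa_stage` are illustrative and do not come from the source. Each carry `c[i+1]` depends on `c[i]`, so the worst-case delay grows linearly with `N`.

```verilog
// Illustrative n-bit ripple-carry adder (names are assumptions, not from the slides).
module rca_adder #(parameter N = 4) (
    input  [N-1:0] x, y,
    input          cin,
    output [N-1:0] sum,
    output         cout
);
    wire [N:0] c;              // c[i] is the carry into stage i
    assign c[0] = cin;

    genvar i;
    generate
        for (i = 0; i < N; i = i + 1) begin: fa_stage
            // One full adder per bit; c[i+1] cannot settle before c[i] does.
            assign sum[i] = x[i] ^ y[i] ^ c[i];
            assign c[i+1] = (x[i] & y[i]) | (c[i] & (x[i] ^ y[i]));
        end
    endgenerate

    assign cout = c[N];
endmodule
```

The schemes listed above all aim to shorten this carry chain by computing the carries in parallel rather than one after another.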
A CLA adder
Define two new functions:
— carry generate (gi): gi = xi · yi
— carry propagate (pi): pi = xi ⊕ yi
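[Figure: one CLA bit slice with inputs xi, yi, ci and outputs pi, gi, si, ci+1]
Sum and carry in terms of p_i and g_i:
$$s_i = p_i \oplus c_i$$
$$c_{i+1} = g_i + p_i c_i$$
Expanding the carry recursively:
$$c_1 = g_0 + p_0 c_0$$
$$c_2 = g_1 + p_1 c_1 = g_1 + p_1 (g_0 + p_0 c_0) = g_1 + p_1 g_0 + p_1 p_0 c_0$$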
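Applying $c_{i+1} = g_i + p_i c_i$ repeatedly gives the remaining carries of a 4-bit adder. Each one is a two-level function of the $g_i$, the $p_i$, and $c_0$ only, so no carry has to wait for another:

$$c_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$$
$$c_4 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$$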
A Carry-Lookahead Generator
[Figure: carry-lookahead generator producing c1, c2, c3, and c4 from p0 to p3, g0 to g3, and c0]
A CLA Adder
[Figure: 4-bit CLA adder with three blocks: a pg generator (x0 to x3 and y0 to y3 in, p0 to p3 and g0 to g3 out), a CLA generator (c0 in, c1 to c4 out), and a sum generator (s0 to s3 out)]
// a 4-bit CLA adder using assign statements
module cla_adder_4bits(x, y, cin, sum, cout);
// inputs and outputs
input [3:0] x, y;
input cin;
output [3:0] sum;
output cout;
// internal wires
wire p0,g0, p1,g1, p2,g2, p3,g3;
wire c4, c3, c2, c1;
// compute the p for each stage
assign p0 = x[0] ^ y[0], p1 = x[1] ^ y[1],
p2 = x[2] ^ y[2], p3 = x[3] ^ y[3];
// compute the g for each stage
assign g0 = x[0] & y[0], g1 = x[1] & y[1],
g2 = x[2] & y[2], g3 = x[3] & y[3];
// compute the carry for each stage
assign c1 = g0 | (p0 & cin),
c2 = g1 | (p1 & g0) | (p1 & p0 & cin),
c3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & cin),
c4 = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0) |
(p3 & p2 & p1 & p0 & cin);
// compute Sum
assign sum[0] = p0 ^ cin, sum[1] = p1 ^ c1,
sum[2] = p2 ^ c2, sum[3] = p3 ^ c3;
// assign carry output
assign cout = c4;
endmodule
A CLA Adder -- Using Generate Statements
// an n-bit CLA adder using generate loops
module cla_adder_generate(x, y, cin, sum, cout);
// inputs and outputs
parameter N = 4; //define the default size
input [N-1:0] x, y;
input cin;
output [N-1:0] sum;
output cout;
// internal wires
wire [N-1:0] p, g;
wire [N:0] c;
// assign input carry
assign c[0] = cin;
// Synthesis results on a Virtex 2 XC2V250 FG456 -6:
//   n          4      8     16     32
//   f (MHz)  104.3   78.9   53.0   32.0
//   LUTs       8     16     32     64
genvar i;
generate for (i = 0; i <N; i = i + 1) begin: pq_cla
assign p[i] = x[i] ^ y[i];
assign g[i] = x[i] & y[i];
end endgenerate // compute generate and propagation
generate for (i = 1; i < N+1; i = i + 1) begin: carry_cla
assign c[i] = g[i-1] | (p[i-1] & c[i-1]);
end endgenerate // compute carry for each stage
generate for (i = 0; i < N; i = i + 1) begin: sum_cla
assign sum[i] = p[i] ^ c[i];
end endgenerate // compute sum
assign cout = c[N]; // assign final carry
endmodule // end of cla_adder_generate module
[Figure: synthesized schematic of cla_adder_generate for N = 4, with inputs x[3:0], y[3:0], and cin; internal signals p[0] to p[3] and g[0] to g[3]; carry stages carry_cla[1] to carry_cla[4] producing c[1] to c[4]; outputs sum[3:0] and cout]
Shift-and-Add Multiplication
Rules for multiplying a multiple-bit multiplicand by a 1-bit
multiplier:
1. If the multiplier bit is 1, the partial product is the multiplicand.
2. Otherwise, the partial product is 0.
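[Figure: shift-and-add multiplier datapath. The multiplicand register M (m bits) feeds an m-bit 2-to-1 MUX that selects 0 or M under control of Q[0]; an m-bit adder adds the MUX output to A and produces an (m + 1)-bit result; the register pair A : Q holds the partial product and the multiplier.]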
Shift-and-Add Multiplication
Algorithm: Shift-and-add multiplication
Input: An m-bit multiplicand and an n-bit multiplier.
Output: The (m + n)-bit product.
Begin
1. Load multiplicand and multiplier into registers M and Q,
   respectively;
   clear register A and set loop count CNT equal to n.
2. repeat
   2.1 if (Q[0] == 1) then A = A + M;
   2.2 Right shift register pair A : Q one bit;
   2.3 CNT = CNT - 1;
   until (CNT == 0);
End
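The algorithm maps naturally onto a small sequential circuit. The following is a minimal sketch, not the book's implementation: the module name `shift_add_mult`, the ports, the `busy` and `done` handshake, and the parameters `MW` and `NW` (both assumed to be at least 2) are illustrative. The registers `M`, `Q`, `A`, and `CNT` follow the algorithm above.

```verilog
// Illustrative sequential shift-and-add multiplier for unsigned operands.
// Port names and the busy/done handshake are assumptions, not from the slides.
module shift_add_mult #(parameter MW = 4, NW = 4) (
    input                   clk, reset, start,
    input      [MW-1:0]     multiplicand,
    input      [NW-1:0]     multiplier,
    output     [MW+NW-1:0]  product,
    output reg              done
);
    reg [MW-1:0] M;        // multiplicand register
    reg [NW-1:0] Q;        // multiplier, gradually replaced by product bits
    reg [MW-1:0] A;        // accumulator (upper half of the partial product)
    reg [31:0]   CNT;      // loop counter
    reg          busy;

    // Step 2.1: conditionally add M; keep the carry-out as an extra bit.
    wire [MW:0] sum = Q[0] ? ({1'b0, A} + {1'b0, M}) : {1'b0, A};

    assign product = {A, Q};

    always @(posedge clk) begin
        if (reset) begin
            busy <= 1'b0;
            done <= 1'b0;
        end else if (start && !busy) begin
            // Step 1: load the registers, clear A, set CNT to n.
            M <= multiplicand;
            Q <= multiplier;
            A <= 0;
            CNT <= NW;
            busy <= 1'b1;
            done <= 1'b0;
        end else if (busy) begin
            // Step 2.2: shift A:Q right; the adder carry enters the MSB of A.
            {A, Q} <= {sum, Q[NW-1:1]};
            // Step 2.3: count down and stop after n iterations.
            CNT <= CNT - 1;
            if (CNT == 1) begin
                busy <= 1'b0;
                done <= 1'b1;
            end
        end
    end
endmodule
```

Folding the add and the shift into one clock cycle means an n-bit multiplier takes n cycles after the load; the (m + 1)-bit `sum` is what keeps the carry out of A + M from being lost before the shift.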
A Basic Array Multiplier
A multiple-cycle structure can also be implemented
by using an iterative logic structure.
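                        x3    x2    x1    x0     = X (multiplicand)
                   ×    y3    y2    y1    y0     = Y (multiplier)
                   ------------------------
                        x3y0  x2y0  x1y0  x0y0
                  x3y1  x2y1  x1y1  x0y1
            x3y2  x2y2  x1y2  x0y2               Partial products
 +    x3y3  x2y3  x1y3  x0y3
 ------------------------------------------
 P7   P6    P5    P4    P3    P2    P1    P0     Product

$$P = X \times Y = \left(\sum_{i=0}^{m-1} x_i 2^i\right)\left(\sum_{j=0}^{n-1} y_j 2^j\right) = \sum_{i=0}^{m-1}\sum_{j=0}^{n-1} (x_i y_j)\, 2^{i+j} = \sum_{k=0}^{m+n-1} P_k 2^k$$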
A Basic Unsigned Array Multiplier
[Figure: 4 × 4 unsigned array multiplier built from full-adder (FA) cells with inputs x, y, Cin and outputs S, Cout, producing P0 to P7. The array dimensions are labeled m and n, the critical path is marked, and the last two rows may be combined into one row. Worked example in the figure: X = 0111 (7), Y = 1011 (11).]
An Unsigned CSA Array Multiplier
[Figure: 4 × 4 unsigned carry-save-adder (CSA) array multiplier built from FA cells with inputs x, y, Cin and outputs S, Cout. The final row is a ripple-carry adder or a carry-look-ahead adder, producing P0 to P7. The array dimensions are labeled m and n, the critical path is marked, and the last two rows may be combined into one row.]
Signed Array Multiplication
Let X and Y be two two's complement numbers:
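$$X = -x_{m-1} 2^{m-1} + \sum_{i=0}^{m-2} x_i 2^i$$
$$Y = -y_{n-1} 2^{n-1} + \sum_{j=0}^{n-2} y_j 2^j$$
$$
\begin{aligned}
P = XY &= \left(-x_{m-1} 2^{m-1} + \sum_{i=0}^{m-2} x_i 2^i\right)\left(-y_{n-1} 2^{n-1} + \sum_{j=0}^{n-2} y_j 2^j\right) \\
&= \sum_{i=0}^{m-2}\sum_{j=0}^{n-2} x_i y_j 2^{i+j} + x_{m-1} y_{n-1} 2^{m+n-2} - \left(\sum_{i=0}^{m-2} x_i y_{n-1} 2^{i+n-1} + \sum_{j=0}^{n-2} x_{m-1} y_j 2^{j+m-1}\right)
\end{aligned}
$$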
Signed Array Multiplication
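$$
\begin{array}{ccccccccl}
 & & & & x_3 & x_2 & x_1 & x_0 & = X \text{ (multiplicand)} \\
 & & & & y_3 & y_2 & y_1 & y_0 & = Y \text{ (multiplier)} \\ \hline
 & & & & \overline{y_3 x_0} & x_2 y_0 & x_1 y_0 & x_0 y_0 & \\
 & & & \overline{y_3 x_1} & x_2 y_1 & x_1 y_1 & x_0 y_1 & & \\
 & & \overline{y_3 x_2} & x_2 y_2 & x_1 y_2 & x_0 y_2 & & & \text{Partial products} \\
 & y_3 x_3 & \overline{y_2 x_3} & \overline{y_1 x_3} & \overline{y_0 x_3} & & & & \\
1 & & & 1 & & & & & \\ \hline
P_7 & P_6 & P_5 & P_4 & P_3 & P_2 & P_1 & P_0 & \text{Product}
\end{array}
$$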
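The signed partial-product arrangement can be checked with a small combinational model. This is a sketch under a stated assumption: it uses the usual modified Baugh-Wooley form, in which each product term containing exactly one sign bit (x3 or y3) is complemented and constant 1s are added at bit positions 4 and 7. The module name `signed_array_mult4` and the `row*` wires are illustrative.

```verilog
// Illustrative 4 x 4 two's-complement multiplier summing the signed
// partial-product rows. Assumption: modified Baugh-Wooley form, with the
// single-sign-bit terms complemented and 1s added at bit positions 4 and 7.
module signed_array_mult4 (
    input  [3:0] x, y,     // two's-complement operands
    output [7:0] p         // two's-complement product
);
    // Row 0 occupies bit positions 3..0.
    wire [7:0] row0 = {4'b0000, ~(x[0] & y[3]), x[2] & y[0], x[1] & y[0], x[0] & y[0]};
    // Row 1 occupies bit positions 4..1.
    wire [7:0] row1 = {3'b000, ~(x[1] & y[3]), x[2] & y[1], x[1] & y[1], x[0] & y[1], 1'b0};
    // Row 2 occupies bit positions 5..2.
    wire [7:0] row2 = {2'b00, ~(x[2] & y[3]), x[2] & y[2], x[1] & y[2], x[0] & y[2], 2'b00};
    // Row 3 occupies bit positions 6..3 and holds the x3 terms.
    wire [7:0] row3 = {1'b0, x[3] & y[3], ~(x[3] & y[2]), ~(x[3] & y[1]), ~(x[3] & y[0]), 3'b000};
    // Correction constants: 2^7 + 2^4.
    wire [7:0] fix  = 8'b1001_0000;

    // The sum is taken modulo 2^8, which is exactly the 8-bit product.
    assign p = row0 + row1 + row2 + row3 + fix;
endmodule
```

For example, with x = y = 1111 (both -1) the rows contribute 7, 14, 28, and 64, and the constants add 144; the total 257 wraps to 1 modulo 256, which is the expected product.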
Signed Array Multiplication
[Figure: 4 × 4 signed array multiplier built from FA cells with inputs x, y, Cin and outputs S, Cout, including constant 1 inputs for the sign correction. The final row is a ripple-carry adder or a carry-look-ahead adder, producing P0 to P7.]
Baugh-Wooley Multiplier
[Figure: 8 × 8 Baugh-Wooley multiplier array with inputs x7 to x0 and y7 to y0 and outputs P15 to P0. Cell types: A, AHA, AFA, NFA, FA, and ND, plus an inverter and a constant 1 input.]
NFA Gate on Left Side and
AFA Gate on Right Side
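[Figure: NFA cell (left) and AFA cell (right). Each cell combines the inputs xi and yj with a full adder (FA) that takes Sin and Cin and produces Sout and Cout.]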
Booth Encoding
Instead of adding 3Y, add -Y and increment the next partial
product, which adds 4Y: 3Y = 4Y - Y.
Similarly, instead of adding 2Y, add -2Y and add 4Y in the next partial
product: 2Y = 4Y - 2Y.
Booth Multiplication
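$$P = A \times B = \sum_{i=0}^{2n-1} P_i 2^i \quad (1)$$
where B can be written as
$$B = \sum_{i=0}^{(n-2)/2} (b_{2i-1} + b_{2i} - 2 b_{2i+1})\, 2^{2i}, \quad b_{-1} = 0 \quad (2)$$
$$P = A \times B = \sum_{i=0}^{(n-2)/2} (b_{2i-1} + b_{2i} - 2 b_{2i+1})\, A\, 2^{2i} = \sum_{i=0}^{(n-2)/2} S_i \quad (3)$$
where S_i can be denoted as
$$
\begin{aligned}
S_i &= (b_{2i-1} + b_{2i} - 2 b_{2i+1})\, A\, 2^{2i} \\
    &= -S_{i,n-1} 2^{n-1+2i} + S_{i,n-2} 2^{n-2+2i} + \cdots + S_{i,0} 2^{2i} \quad (4)
\end{aligned}
$$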
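The recoding in Eq. (2) can be sketched as a behavioral model: each group of three multiplier bits selects one of 0, ±A, or ±2A, and the selected value is added at weight 2^{2i}. This is an illustrative model, not the array structure from the slides; the module name `booth_radix4_mult`, the helper registers `bx` and `s`, and the parameter `N` (assumed even) are assumptions.

```verilog
// Illustrative radix-4 (modified) Booth multiplier for N-bit two's-complement
// operands, N even. Names are assumptions; b[-1] is taken as 0.
module booth_radix4_mult #(parameter N = 8) (
    input  signed [N-1:0]       a,   // multiplicand A
    input  signed [N-1:0]       b,   // multiplier B
    output reg signed [2*N-1:0] p    // product P
);
    reg        [N:0]   bx;   // {B, b[-1]}, so bx[k+1] holds b[k]
    reg signed [N+1:0] s;    // selected multiple: -2A, -A, 0, A, or 2A
    integer i;

    always @* begin
        bx = {b, 1'b0};
        p  = 0;
        for (i = 0; i < N/2; i = i + 1) begin
            // Digit value is b[2i-1] + b[2i] - 2*b[2i+1].
            case ({bx[2*i+2], bx[2*i+1], bx[2*i]})   // {b[2i+1], b[2i], b[2i-1]}
                3'b001, 3'b010: s =  a;
                3'b011:         s =  2 * a;
                3'b100:         s = -2 * a;
                3'b101, 3'b110: s = -a;
                default:        s =  0;              // 000 and 111
            endcase
            // Add the selected multiple at weight 2^(2i); s is sign-extended
            // to the width of p before the shift.
            p = p + (s <<< (2 * i));
        end
    end
endmodule
```

Only N/2 partial products are summed, half as many as in the bit-by-bit array, at the cost of the selection logic for the five possible multiples.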
Sign-Generate Sign Extension Scheme
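$$
\begin{aligned}
S &= \sum_{j=8}^{15} (S_{0,7} 2^j)\, 2^0 + \sum_{j=8}^{13} (S_{1,7} 2^j)\, 2^2 + \sum_{j=8}^{11} (S_{2,7} 2^j)\, 2^4 + \sum_{j=8}^{9} (S_{3,7} 2^j)\, 2^6 \\
  &= (\overline{S_{0,7}}\, 2^8 + 2^9) + (\overline{S_{1,7}}\, 2^{10} + 2^{11}) + (\overline{S_{2,7}}\, 2^{12} + 2^{13}) + (\overline{S_{3,7}}\, 2^{14} + 2^{15}) + 2^8 \quad (5)
\end{aligned}
$$
where S is the final sign result and the second equality holds modulo 2^16.
Eqs. (3) and (5) can be mapped into the partial product diagram and modified
Booth multiplier structure as shown in the following slide.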
Modified Booth Partial-Product Diagram for an 8x8 Multiplier
[Figure: partial-product diagram for the 8 × 8 modified Booth multiplier. Row k (k = 0, ..., 3) is shifted left by 2k bits and holds a leading 1, the complemented sign bit, and the bits S_{k,7}, S_{k,6}, ..., S_{k,0}, with the bit Ctrlk[2] added in the same row; row 0 carries one additional 1. The columns are divided into a most significant part (MP) and a least significant part (LP), with column labels w = 0, w = 1, ..., w = n-1.]
An 8x8 Modified Booth Multiplier
[Figure: 8 × 8 modified Booth multiplier. Four Booth encoders take {B[1:0], 0}, B[3:1], B[5:3], and B[7:5] and produce Ctrl0[2:0] to Ctrl3[2:0]. Each control word drives a row of selectors (sel) over a7 to a0, with Ctrl0[2] to Ctrl3[2] and constant 1s injected into the array. The selected partial products are summed by rows of half adders (HA) and full adders (FA) to give P15 to P0.]
An Unsigned Nonrestoring Division Algorithm
Algorithm: Unsigned nonrestoring division
Input: An n-bit dividend and an m-bit divisor.
Output: The quotient and remainder.
Begin
1. Load divisor and dividend into registers M and D, respectively;
   clear partial-remainder register R and set loop count CNT equal to n - 1.
2. Left shift register pair R : D one bit.
3. Compute R = R - M;
4. repeat
   4.1 if (R < 0) begin
          D[0] = 0; left shift R : D one bit; R = R + M;
       end
       else begin
          D[0] = 1; left shift R : D one bit; R = R - M;
       end
   4.2 CNT = CNT - 1;
   until (CNT == 0)
5. if (R < 0) begin D[0] = 0; R = R + M; end
   else D[0] = 1;
End
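The steps above can be sketched as a behavioral Verilog model that follows the algorithm line by line. It is an illustration rather than the sequential or array hardware shown on the other slides; the module name `nonrestoring_div`, the ports, and the parameters `N` and `M_W` are assumptions, and the divisor is assumed to be nonzero. R is given two extra bits so that its sign can be tested after every add or subtract.

```verilog
// Illustrative behavioral model of unsigned nonrestoring division.
// Names are assumptions; the divisor is assumed to be nonzero.
module nonrestoring_div #(parameter N = 8, M_W = 4) (
    input      [N-1:0]   dividend,
    input      [M_W-1:0] divisor,
    output reg [N-1:0]   quotient,
    output reg [M_W-1:0] remainder
);
    reg signed [M_W+1:0] R, M;   // partial remainder and divisor, with sign room
    reg        [N-1:0]   D;      // dividend, gradually replaced by quotient bits
    integer cnt;

    always @* begin
        // Step 1: load the registers and clear R.
        M = {2'b00, divisor};
        R = 0;
        D = dividend;
        // Steps 2 and 3: first shift, then the first trial subtraction.
        {R, D} = {R, D} << 1;
        R = R - M;
        // Step 4: n - 1 iterations; add back or subtract according to the sign of R.
        for (cnt = N - 1; cnt > 0; cnt = cnt - 1) begin
            if (R < 0) begin
                D[0] = 1'b0;
                {R, D} = {R, D} << 1;
                R = R + M;
            end else begin
                D[0] = 1'b1;
                {R, D} = {R, D} << 1;
                R = R - M;
            end
        end
        // Step 5: last quotient bit, with remainder correction if R is negative.
        if (R < 0) begin
            D[0] = 1'b0;
            R = R + M;
        end else begin
            D[0] = 1'b1;
        end
        quotient  = D;
        remainder = R[M_W-1:0];
    end
endmodule
```

With `dividend` = 85 and `divisor` = 6 the model yields quotient 14 and remainder 1. Unlike restoring division, a negative partial remainder is never restored inside the loop; the next iteration adds M instead of subtracting it, and only the final remainder needs a correction.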
An Unsigned Nonrestoring Division Example
Dividend (D) = 85 (decimal) = 01010101
Divisor (M) = 6 (decimal) = 0110
2's complement of 6 = 1010
Each step yields one quotient bit Q:
1. D - M: result < 0, Q = 0; right shift M, D + M
2. result < 0, Q = 0; right shift M, D + M
3. result < 0, Q = 0; right shift M, D + M
4. result < 0, Q = 0; right shift M, D + M
5. result >= 0, Q = 1; right shift M, D - M
6. result >= 0, Q = 1; right shift M, D - M
7. result >= 0, Q = 1; right shift M, D - M
8. result < 0, Q = 0; D + M (remainder correction)
Hence quotient = 00001110
remainder = 0001
A Sequential Unsigned Nonrestoring Division
A sequential implementation of unsigned
nonrestoring division.
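[Figure: sequential unsigned nonrestoring divider. The divisor register M (m bits) feeds a true/complement generator controlled by Sub/add; an n-bit adder combines its output with the partial remainder; the register pair R : D holds the remainder and the dividend/quotient. Signals shown: Cout and D[0].]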
An Unsigned Array Nonrestoring Divider
[Figure: 4 × 4 unsigned array nonrestoring divider with divisor bits M3 to M0, dividend bits D0 to D3, and quotient bits Q3 to Q0. Four rows of four CAS cells (CAS: controlled adder and subtractor) are followed by a row of full adders (FA) that performs the remainder correction and produces R3 to R0. The figure is annotated with a worked numeric example.]
Arithmetic-Logic Units
An arithmetic and logic unit (ALU) is often the major component for the datapath of many applications, especially for central processing unit (CPU) design. An ALU contains two portions:
Arithmetic unit: — addition
— subtraction
— multiplication
— division
Logical unit: — AND
— OR
— NOT
Shift Operations
Types of shift operations:
— Logical shift
The vacancy bits are filled with 0s.
— Arithmetic shift
The vacancy bits are filled with 0s (left shift) or the sign bit (right shift).
Logical Shift Operations
Logical shift operations:
— Logical left shift
The input is shifted left a specified number of bit positions.
All vacancy bits are filled with 0s.
— Logical right shift
The input is shifted right a specified number of bit positions.
All vacancy bits are filled with 0s.
Arithmetic Shift Operations
Arithmetic shift operations:
— Arithmetic left shift
The input is shifted left a specified number of bit positions.
All vacancy bits are filled with 0s.
Indeed, this is exactly the same as logical left shift.
— Arithmetic right shift
The input is shifted right a specified number of bit positions.
All vacancy bits are filled with the sign bit.
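Verilog-2001 provides one operator for each of these shifts. The following minimal sketch contrasts them on an 8-bit value; the module name `shift_demo` and its ports are illustrative. Note that `>>>` fills with the sign bit only when its left operand is signed, which is why `a` is declared `signed` here.

```verilog
// Illustrative module contrasting the four shift operations on an 8-bit value.
// Names are assumptions, not from the slides.
module shift_demo (
    input  signed [7:0] a,
    input         [2:0] amt,
    output        [7:0] lsl, lsr, asl, asr
);
    assign lsl = a <<  amt;   // logical left: vacated bits filled with 0s
    assign lsr = a >>  amt;   // logical right: vacated bits filled with 0s
    assign asl = a <<< amt;   // arithmetic left: same result as logical left
    assign asr = a >>> amt;   // arithmetic right: vacated bits filled with the sign bit
endmodule
```

For `a` = 10010110 and `amt` = 2, both left shifts give 01011000, the logical right shift gives 00100101, and the arithmetic right shift gives 11100101.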
Arithmetic-Logic Units
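[Figure: an ALU followed by a shifter. Signals shown: A, B, x, y, Sum, F, Cin, ALU_mode, Shifter_mode, Shift_amount, and the flags N, Z, V, C. The overflow flag is derived from the carries Cn and Cn-1, and Fn is also fed to the flag logic.]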
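The flag logic can be sketched as follows. This is a minimal illustration and not the ALU from the book: the module name `alu_flags`, the port names, and the `alu_mode` encoding (00 add, 01 subtract, 10 AND, 11 OR) are assumptions, the shifter is omitted, and `W` is assumed to be at least 2. The overflow flag uses the standard two's-complement rule V = Cn XOR Cn-1.

```verilog
// Illustrative ALU with N, Z, V, C flags. The module name, ports, and mode
// encoding are assumptions; the slides only name the signals.
module alu_flags #(parameter W = 8) (
    input      [W-1:0] a, b,
    input              cin,
    input      [1:0]   alu_mode,   // hypothetical: 00 add, 01 sub, 10 AND, 11 OR
    output reg [W-1:0] f,
    output             n, z,
    output reg         v, c
);
    wire         sub = (alu_mode == 2'b01);
    wire [W-1:0] bb  = sub ? ~b : b;           // subtraction adds the complement of b
    wire         c0  = sub ? 1'b1 : cin;       // plus 1 completes the two's complement
    // Add the low W-1 bits first so the carry into the MSB (Cn-1) is visible.
    wire [W-1:0] lo  = {1'b0, a[W-2:0]} + {1'b0, bb[W-2:0]} + c0;   // lo[W-1] = Cn-1
    wire [1:0]   hi  = a[W-1] + bb[W-1] + lo[W-1];                  // hi[1]   = Cn

    always @* begin
        v = 1'b0;
        c = 1'b0;
        case (alu_mode)
            2'b00, 2'b01: begin
                f = {hi[0], lo[W-2:0]};
                c = hi[1];                 // carry out of the MSB
                v = hi[1] ^ lo[W-1];       // overflow: Cn XOR Cn-1
            end
            2'b10:   f = a & b;
            default: f = a | b;
        endcase
    end

    assign n = f[W-1];   // negative: sign bit of the result
    assign z = ~|f;      // zero: no result bit is set
endmodule
```

For example, adding 0x7F and 0x01 gives 0x80 with N = 1, V = 1, and C = 0, since the result overflows the signed range without producing a carry out.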
Summary
You have learned the following items:
— Addition and subtraction modules
— Carry-look-ahead (CLA) adder
— Multipliers: Shift-and-Add multiplier, Baugh-Wooley
multiplier, Booth multipliers
— Divider
— Arithmetic-logic unit (ALU)