Chapter 3 Research Background: In this chapter, we introduce Baugh-Wooley multiplication …


Upload: hahanh

Post on 27-May-2018


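  • This chapter introduces Baugh-Wooley multiplication, then the array, Wallace, and Dadda multipliers, and finally the pipelined signed Dadda multiplier with a Brent-Kung adder.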

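    3.1 Baugh-Wooley Multiplication

    DSP applications frequently operate on signed numbers, so a DSP multiplier must handle signed operands. The Baugh-Wooley algorithm [22] multiplies two's-complement numbers directly.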

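    Figure 21 shows an 8 × 8-bit unsigned multiplication. The operand X (X0, X1, …, X7) has 8 bits and the operand Y (Y0, Y1, …, Y7) has 8 bits. S denotes a partial product, with $S_{i,j} = X_i \text{ AND } Y_j$. Summing the 8 × 8 = 64 partial products gives the product P (P0, P1, …, P15), which has 16 bits; P15 is the carry out of P14.

    Figure 22 shows an 8 × 8-bit Baugh-Wooley multiplication. NS denotes an inverted partial product, with $NS_{i,j} = \text{NOT}(X_i \text{ AND } Y_j)$. The 8 × 8-bit Baugh-Wooley multiplication inverts the 14 partial products S7,0, S7,1, …, S7,6 and S0,7, S1,7, …, S6,7, adds a 1 at P8, and takes P15 as the complement of the carry out of P14. In this way Baugh-Wooley multiplication needs no sign extension.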

    Figure 21. 8 × 8-bit unsigned multiplication

    Figure 22. 8 × 8-bit Baugh-Wooley signed multiplication
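
The Baugh-Wooley rules of this section can be checked with a small bit-level model. The following is a minimal Python sketch, not the circuit of Figure 22: the function name `baugh_wooley_multiply` and the parameter `n` are hypothetical, and the partial products are summed with ordinary integer addition instead of an adder array.

```python
def baugh_wooley_multiply(x: int, y: int, n: int = 8) -> int:
    """Multiply two n-bit two's-complement integers with Baugh-Wooley partial products."""
    xb = [(x >> i) & 1 for i in range(n)]   # X0 .. X(n-1)
    yb = [(y >> j) & 1 for j in range(n)]   # Y0 .. Y(n-1)
    total = 0
    for i in range(n):
        for j in range(n):
            s = xb[i] & yb[j]               # S(i,j) = Xi AND Yj
            # Partial products that involve exactly one sign bit are inverted (NS).
            if (i == n - 1) != (j == n - 1):
                s ^= 1
            total += s << (i + j)
    total += 1 << n                         # constant 1 at column n (P8 when n = 8)
    total += 1 << (2 * n - 1)               # constant 1 at column 2n-1 (P15 when n = 8)
    total &= (1 << (2 * n)) - 1             # keep the 2n product bits
    if total >> (2 * n - 1):                # reinterpret as a signed 2n-bit value
        total -= 1 << (2 * n)
    return total


print(baugh_wooley_multiply(-3, 5))
```

Adding the constant 1 at column 2n-1 has the same effect as inverting the carry out of column 2n-2, which is how the text describes P15.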


  • 3.2 Multipliers

    This section introduces the array multiplier, the Wallace multiplier, and the Dadda multiplier.

    3.2.1 Array Multiplier

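    Figure 23 shows an 8 × 8-bit array multiplier, which is built from half adders and full adders. A half adder has 2 inputs, A and B, and produces a sum S and a carry C:

    $S = A \oplus B$

    $C = A \cdot B$

    A full adder has 3 inputs, A, B, and Cin, and produces S and C:

    $S = A \oplus B \oplus C_{in}$  (9)

    $C = A \cdot B + B \cdot C_{in} + A \cdot C_{in}$  (10)

    where S is the sum and C is the carry.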
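
The half-adder and full-adder equations can be written out directly. This is a minimal Python sketch on single bits (0 or 1); the function names `half_adder` and `full_adder` are hypothetical.

```python
def half_adder(a: int, b: int):
    """Return (S, C) with S = A xor B and C = A and B."""
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int):
    """Return (S, C) following equations (9) and (10)."""
    s = a ^ b ^ cin
    c = (a & b) | (b & cin) | (a & cin)
    return s, c


# Print the full-adder truth table as rows of (A, B, Cin, S, C).
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, *full_adder(a, b, cin))
```

In every row the outputs satisfy A + B + Cin = S + 2C, which is what makes the full adder a 3-bit to 2-bit counter.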

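    In Figure 23, the 8 × 8 = 64 partial products are generated in Stage 1. Each of Stage 2 through Stage 7 uses 7 adders, 42 adders in total. For an n × n-bit array multiplier, Stage 1 has n rows, Stage 2 has n - 1 rows, Stage 3 has n - 2 rows, and so on, each stage having one row fewer than the stage before it. For the 8 × 8-bit array multiplier, the stage heights are 2, 3, 4, 5, 6, 7, 8.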

    Figure 23. 8 × 8-bit array multiplier
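
The stage counts quoted for the array multiplier can be reproduced with a few lines of arithmetic. This is a minimal Python sketch; the function name `array_multiplier_profile` is hypothetical, and the rule of n - 1 adders per reduction stage is an assumption that generalises the 8 × 8-bit figures in the text.

```python
def array_multiplier_profile(n: int) -> dict:
    """Stage heights, adder count, and CPA width of an n x n array multiplier.

    Assumption: each reduction stage removes one row and uses n - 1 adders.
    """
    heights = list(range(n, 1, -1))        # n, n-1, ..., 2
    reduction_stages = len(heights) - 1    # stages after Stage 1
    return {
        "heights": heights,
        "adders": reduction_stages * (n - 1),
        "cpa_bits": n - 1,
    }


print(array_multiplier_profile(8))
```

For n = 8 this gives heights 8 down to 2, 42 adders, and a 7-bit CPA.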


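  • Figure 24 shows the 8 × 8-bit array multiplier of Figure 23 in detail. F denotes a full adder and H a half adder, each producing a sum and a carry. S denotes a partial product, with $S_{i,j} = X_i \text{ AND } Y_j$.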

    Figure 24. 8 × 8-bit array multiplier

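    After Stage 7, a carry-propagate adder (CPA) adds the remaining 2 rows; the CPA word length is the number of bit positions that still hold 2 bits. For the 8 × 8-bit array multiplier the CPA is 7 bits wide. Two common CPAs are the ripple-carry adder (RCA) and the carry look-ahead adder (CLA) [23].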

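    Figure 25 shows a 4-bit ripple-carry adder. The inputs A(3:0) and B(3:0) are 4 bits each and the output S(4:0) is 5 bits. Each FA computes S and C from equations (9) and (10). Cin is 0, and C(3) becomes S(4). The carry C propagates from each FA to the next.

    Figure 25. 4-bit ripple-carry adder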
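
The ripple-carry adder can be modelled by chaining full adders. This is a minimal Python sketch, with hypothetical function names, that follows the Figure 25 description: Cin is 0 and the last carry becomes the top sum bit.

```python
def full_adder(a: int, b: int, cin: int):
    """Return (S, C) following equations (9) and (10)."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)


def ripple_carry_add(a: int, b: int, width: int = 4) -> int:
    """Add two unsigned width-bit values one full adder at a time."""
    carry = 0                                # Cin = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i                     # S(i)
    return result | (carry << width)         # C(width-1) becomes S(width)


print(ripple_carry_add(0b1011, 0b0110))
```

Each loop iteration needs the carry of the previous one, which is the serial dependency that a carry look-ahead adder removes.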

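    Another CPA is the carry look-ahead adder; Figure 26 shows a 4-bit version. The inputs A(3:0) and B(3:0) are 4 bits each and the output S(4:0) is 5 bits. Instead of waiting for the carry from the previous FA, each carry C(i) is computed from a generate signal G(i) and a propagate signal P(i):

    $G(i) = A(i) \cdot B(i)$

    $P(i) = A(i) \oplus B(i)$

    $C(i) = G(i) + P(i) \cdot C(i-1)$  (11)

    For C(i+1):

    $C(i+1) = G(i+1) + P(i+1) \cdot C(i)$  (12)

    Substituting (11) into (12) gives

    $C(i+1) = G(i+1) + P(i+1) \cdot G(i) + P(i+1) \cdot P(i) \cdot C(i-1)$

    so every C(i) can be written in terms of G, P, and Cin alone. For the 4-bit adder of Figure 26 the carries are:

    $C(0) = G(0) + P(0) \cdot C_{in}$

    $C(1) = G(1) + P(1) \cdot C(0) = G(1) + P(1) \cdot G(0) + P(1) \cdot P(0) \cdot C_{in}$

    $C(2) = G(2) + P(2) \cdot C(1) = G(2) + P(2) \cdot G(1) + P(2) \cdot P(1) \cdot G(0) + P(2) \cdot P(1) \cdot P(0) \cdot C_{in}$

    $C(3) = G(3) + P(3) \cdot C(2) = G(3) + P(3) \cdot G(2) + P(3) \cdot P(2) \cdot G(1) + P(3) \cdot P(2) \cdot P(1) \cdot G(0) + P(3) \cdot P(2) \cdot P(1) \cdot P(0) \cdot C_{in}$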


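  • Figure 26. 4-bit carry look-ahead adder

    In Figure 26, Cin is 0 and C(3) becomes S(4). Because every carry is computed directly from Cin and the G and P signals, the carry look-ahead adder does not wait for the carry to ripple through, as the ripple-carry adder does.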
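
The expanded carry equations of the 4-bit carry look-ahead adder can be evaluated term by term. This is a minimal Python sketch; the function name `cla_add_4bit` is hypothetical, and the sum bits are formed as P(i) XOR the incoming carry, which assumes P(i) is the XOR of A(i) and B(i).

```python
def cla_add_4bit(a: int, b: int, cin: int = 0) -> int:
    """4-bit carry look-ahead addition; returns the 5-bit result S(4:0)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]   # generate G(i)
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]   # propagate P(i)

    # Every carry is written directly in terms of G, P and Cin.
    c0 = g[0] | (p[0] & cin)
    c1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin)
    c2 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & cin)
    c3 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & cin))

    carry_in = [cin, c0, c1, c2]             # carry entering each bit position
    s = 0
    for i in range(4):
        s |= (p[i] ^ carry_in[i]) << i       # S(i)
    return s | (c3 << 4)                     # C(3) becomes S(4)


print(cla_add_4bit(0b1011, 0b0110))
```

None of c0 to c3 refers to another carry, so in hardware all four can be evaluated in parallel.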

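    3.2.2 Wallace Multiplier

    The Wallace multiplier reduces the partial products in the forward direction [24]. It uses the 3:2 compressor, also called a (3,2) counter, which takes 3 bits in and produces 2 bits out, and the 2:2 compressor, also called a (2,2) counter, which takes 2 bits in and produces 2 bits out.

    For an n × n-bit Wallace multiplier, $R_0 = n$, and the number of rows $R_{j+1}$ at Stage j+1 is

    $R_{j+1} = 2 \lfloor R_j / 3 \rfloor + (R_j \bmod 3)$

    where $\lfloor R_j / 3 \rfloor$ is the number of groups of 3 rows, each of which is reduced to 2 rows. For the 8 × 8-bit Wallace multiplier, $R_0 = 8$, and Stage 1 through Stage 4 give $R_1$ to $R_4$:

    $R_1 = 2 \lfloor 8/3 \rfloor + (8 \bmod 3) = 2 \cdot 2 + 2 = 6$

    $R_2 = 2 \lfloor 6/3 \rfloor + (6 \bmod 3) = 2 \cdot 2 + 0 = 4$

    $R_3 = 2 \lfloor 4/3 \rfloor + (4 \bmod 3) = 2 \cdot 1 + 1 = 3$

    $R_4 = 2 \lfloor 3/3 \rfloor + (3 \bmod 3) = 2 \cdot 1 + 0 = 2$

    The stage heights of the 8 × 8-bit Wallace multiplier are therefore 2, 3, 4, 6. Figure 27 shows the 8 × 8-bit Wallace multiplier, which uses 38 full adders and 15 half adders and needs an 11-bit CPA.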

    Figure 27. 8 × 8-bit Wallace multiplier
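
The Wallace row-count recurrence can be iterated until two rows remain. This is a minimal Python sketch; the function name `wallace_heights` is hypothetical.

```python
def wallace_heights(n: int) -> list:
    """Row counts R0, R1, ... of an n x n Wallace reduction, ending at 2 rows."""
    heights = [n]                                # R0 = n
    while heights[-1] > 2:
        r = heights[-1]
        heights.append(2 * (r // 3) + r % 3)     # R(j+1) = 2*floor(Rj/3) + Rj mod 3
    return heights


print(wallace_heights(8))
```

The number of reduction stages is `len(wallace_heights(n)) - 1`, which is 4 for n = 8.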


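  • 3.2.3 Dadda Multiplier

    The Dadda multiplier was proposed by L. Dadda in 1965 [25] as a refinement of the Wallace multiplier. Unlike the Wallace multiplier, it plans the reduction in the backward direction. Figure 28 shows the 8 × 8-bit Dadda multiplier. Working back from the final height of 2, each earlier stage may be at most 3/2 times as high as the stage after it, rounded down, which gives the height sequence 2, 3, 4, 6, 9, 13, 19, 28, 42, 63, …. For the 8 × 8-bit Dadda multiplier the stage heights are 2, 3, 4, 6.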

    Figure 28. 8 × 8-bit Dadda multiplier
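
The Dadda height sequence can be generated by multiplying by 3/2 and rounding down. This is a minimal Python sketch; the function name `dadda_heights` is hypothetical.

```python
def dadda_heights(n: int) -> list:
    """Dadda stage heights smaller than n, smallest first: 2, 3, 4, 6, 9, ..."""
    heights = []
    d = 2
    while d < n:
        heights.append(d)
        d = d * 3 // 2          # next height = floor(3/2 * current height)
    return heights


print(dadda_heights(8))
print(dadda_heights(16))
```

The length of the returned list is the number of reduction stages: 4 for an 8 × 8-bit multiplier and 6 for a 16 × 16-bit multiplier.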

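    Figure 29 shows the Dadda multiplier of Figure 28 in detail. H denotes a half adder and F a full adder; each adds bits of the partial products and produces a sum and a carry. The sum stays in the same column of the next stage and the carry moves to the next higher column. For example, the sum of H6 in stage 1 goes to column 6 of stage 2, where F6 is, and its carry goes to column 7 of stage 2, where F7 is.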

    Figure 29. 8 × 8-bit Dadda multiplier


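  • The 8 × 8-bit Dadda multiplier uses 35 full adders and 7 half adders and needs a 14-bit CPA. Like the Wallace multiplier, it has 4 stages.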

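    3.2.4 Comparison of Multipliers

    Table 6 compares the multipliers at 8 × 8, 16 × 16, 32 × 32, and 64 × 64 bits. Table 6 shows the following.

    1. Delay. The Dadda and Wallace multipliers are faster than the Array multiplier, although both need a wider CPA than the Array multiplier; the CPA can be a carry look-ahead adder (CLA). The delays of the Dadda and Wallace multipliers, counted in gate delays, are given in [26]. The number of stages in the Dadda and Wallace multipliers grows with log2 of the word length, whereas in the Array multiplier it grows linearly with the word length. The Dadda and Wallace multipliers have the same number of stages, and the Wallace multiplier has a narrower CPA than the Dadda multiplier. With a CLA as the CPA, the worst-case delays of the Wallace and Dadda multipliers were simulated with Timemill in [27].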

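    2. Area. The Array multiplier is compared with the Wallace and Dadda multipliers. After place and route with Cadence, for word lengths from 8 to 64 bits, the areas of the Wallace and Dadda multipliers differ by 4% to 7% [27].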


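  • 3. Pipelining. The Array multiplier is compared with the Dadda and Wallace multipliers when pipeline registers are inserted. Because the Array multiplier has more stages than the Dadda multiplier, its latency is longer. This work therefore uses the Dadda multiplier rather than the Array multiplier, with a Brent-Kung carry look-ahead adder (CLA) as the CPA and with pipeline registers inserted.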

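    3.3 Pipelined Signed Dadda Multiplier

    This section describes the pipelined signed Dadda multiplier. The CPA of the Dadda multiplier is a Brent-Kung adder [28], so the Brent-Kung CPA is introduced first.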


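  • 3.3.1 Brent-Kung CLA

    The Brent-Kung CLA [29] is used as the Brent-Kung CPA. The Brent-Kung adder is defined as follows. Let A(i) and B(i) be the inputs, S(i) the sum, and C(i) the carry at bit i of an n-bit addition, for i = 0, 1, 2, …, n-1:

    $C(-1) = 0$

    $C(i) = A(i) \cdot B(i) + A(i) \cdot C(i-1) + B(i) \cdot C(i-1)$

    $S(i) = A(i) \oplus B(i) \oplus C(i-1)$

    $S(n) = C(n-1)$

    where $\cdot$ is AND, $+$ is OR, and $\oplus$ is XOR. Defining the generate signal G(i) and the propagate signal P(i) as

    $G(i) = A(i) \cdot B(i)$  (13)

    $P(i) = A(i) \oplus B(i)$  (14)

    gives

    $C(i) = G(i) + P(i) \cdot C(i-1)$

    $S(i) = P(i) \oplus C(i-1)$  (15)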


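  • Brent and Kung define the operator o as

    $[G(i), P(i)] \; o \; [G(i-1), P(i-1)] = \{ G(i) + [P(i) \cdot G(i-1)],\; P(i) \cdot P(i-1) \}$

    Figure 30 shows the operator o implemented as a carry look-ahead generator cell (CLG).

    Figure 30. Carry look-ahead generator cell (CLG)

    Define [g(i), p(i)] as

    $[g(i), p(i)] = [G(i), P(i)]$ if $i = 0$,

    $[g(i), p(i)] = [G(i), P(i)] \; o \; [g(i-1), p(i-1)]$ if $1 \le i \le n-1$.

    Then

    $[g(i), p(i)] = [G(i), P(i)] \; o \; [G(i-1), P(i-1)] \; o \; \cdots \; o \; [G(0), P(0)]$  (16)

    $C(i) = g(i)$ for $i = 0, 1, \ldots, n-1$  (17)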
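
The operator o and equations (16) and (17) can be exercised on small inputs. This is a minimal Python sketch with hypothetical function names; it applies the operator serially from bit 0 upward, which gives the same carries as any other grouping because the operator is associative.

```python
def o(hi, lo):
    """[G_hi, P_hi] o [G_lo, P_lo] = [G_hi + P_hi*G_lo, P_hi*P_lo]."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def prefix_carries(a: int, b: int, n: int) -> list:
    """Carries C(0) .. C(n-1) with C(i) = g(i), assuming no carry-in."""
    gp = []
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        gp.append((ai & bi, ai ^ bi))        # [G(i), P(i)]
    acc = gp[0]                              # [g(0), p(0)] = [G(0), P(0)]
    carries = [acc[0]]
    for i in range(1, n):
        acc = o(gp[i], acc)                  # [g(i), p(i)] = [G(i), P(i)] o [g(i-1), p(i-1)]
        carries.append(acc[0])
    return carries


print(prefix_carries(0b1011, 0b0110, 4))
```

Associativity is the property that allows a tree of CLG cells to compute the same g(i) values in fewer levels than this serial loop.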


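  • Figure 31 shows the 16-bit Brent-Kung adder. The inputs A and B are 16 bits each and the output S is 17 bits. The adder is divided into Block A, Block B, and Block C, and Block B has 4 stages. Block A, Block B, and Block C are described below.

    Figure 31. 16-bit Brent-Kung adder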


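  • Block A: takes A and B and uses equations (13) and (14) to produce G and P.

    Block B: takes G and P and uses equations (16) and (17), implemented with CLGs, to produce C. Each CLG in Figure 31 is a 2-bit CLG: it takes G(i-1), P(i-1), G(i), and P(i), produces g(i) and p(i) by equation (16), and gives C(i) by equation (17). The CLGs of Block B are arranged in 4 stages.

    Block C: takes P and C and uses equation (15) to produce S. P comes from Block A, S(0) = P(0), and S(16) = C(15).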
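
The three blocks can be put together in a behavioural model. This is a minimal Python sketch with a hypothetical function name. The wiring of Figure 31 is not recoverable from the text, so Block B here uses a recursive-doubling (Kogge-Stone style) arrangement as a stand-in; that is an assumption, chosen only because it also needs 4 CLG stages for 16 bits, and it may differ from the actual Brent-Kung wiring.

```python
def three_block_add(a: int, b: int, n: int = 16):
    """Return (S, stages): the (n+1)-bit sum and the number of CLG stages used."""
    # Block A: equations (13) and (14).
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]

    # Block B: apply the operator o in stages, doubling the span each time
    # (assumed arrangement, not taken from Figure 31).
    gg, pp = g[:], p[:]
    span, stages = 1, 0
    while span < n:
        new_g, new_p = gg[:], pp[:]
        for i in range(span, n):
            new_g[i] = gg[i] | (pp[i] & gg[i - span])
            new_p[i] = pp[i] & pp[i - span]
        gg, pp = new_g, new_p
        span *= 2
        stages += 1
    c = gg                                   # equation (17): C(i) = g(i)

    # Block C: equation (15), with S(0) = P(0) and S(n) = C(n-1).
    s = p[0]
    for i in range(1, n):
        s |= (p[i] ^ c[i - 1]) << i
    s |= c[n - 1] << n
    return s, stages


print(three_block_add(0xABCD, 0x1234))
```

With n = 16 the loop runs 4 times, and with n = 30 it runs 5 times.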

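    3.3.2 Multiplier Architecture

    Applying signed multiplication to the Dadda multiplier and inserting pipeline registers gives the 16 × 16-bit pipelined signed Dadda multiplier. Figure 32 shows a 1-bit register, a D flip-flop: the value at data_in appears at data_out after one clock cycle.

    Figure 32. 1-bit register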


  • VHDL code:

    library ieee;
    use ieee.std_logic_1164.all;

    entity reg is
      port( data_in  : in  std_logic;
            clk      : in  std_logic;
            data_out : out std_logic);
    end reg;

    architecture beha of reg is
    begin
      -- D flip-flop: data_in is captured on the rising edge of clk
      process(clk)
      begin
        if clk'event and clk='1' then
          data_out <= data_in;
        end if;
      end process;
    end beha;

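  • Figure 33. 16 × 16-bit Dadda multiplier

    Figure 33 shows the 16 × 16-bit Dadda multiplier. Its stage heights are 2, 3, 4, 6, 9, 13, so it has 6 stages. For two's-complement operands, following the Baugh-Wooley method of Section 3.1, a 1 is added at P16. In Stage 2 of Figure 33, the 30 partial products S15,0, S15,1, …, S15,14 and S0,15, S1,15, …, S14,15 are inverted, and P31 is the complement of the carry out of P30, which gives the 16 × 16-bit signed multiplication.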

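    Figure 34 shows the 16 × 16-bit pipelined signed Dadda multiplier, with inputs X_in and Y_in and output Q. The 6 stages of the Dadda multiplier and the CPA are divided into 6 pipeline stages. The 1st pipeline stage contains the AND gates and Dadda stage 1, the 2nd contains Dadda stages 2 and 3, and the 3rd contains Dadda stages 4 and 5. The CPA of the 16 × 16-bit Dadda multiplier is 30 bits wide, so a 30-bit Brent-Kung adder is used; the CLGs of its Block B are arranged in 5 stages, each adding the gate delay of one CLG. The 4th pipeline stage contains Dadda stage 6 and Block A of the Brent-Kung adder, the 5th contains CLG stages 1 and 2 of Block B, and the 6th contains CLG stages 3 and 4 of Block B. The 16 × 16-bit Dadda multiplier is thus divided into 6 pipeline stages, which gives the 16 × 16-bit pipelined signed Dadda multiplier.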
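
The timing effect of the 6 pipeline stages can be illustrated with a behavioural model. This is a minimal Python sketch under stated assumptions: the class name `PipelinedMultiplier` and the method `clock` are hypothetical, one register is assumed at the end of each pipeline stage, and the product is computed in one step instead of being split across the Dadda stages and the CPA.

```python
from collections import deque


class PipelinedMultiplier:
    """Signed multiply followed by a fixed-depth chain of pipeline registers."""

    def __init__(self, stages: int = 6):
        # One register per pipeline stage; None marks "no valid data yet".
        self.regs = deque([None] * stages, maxlen=stages)

    def clock(self, x_in: int, y_in: int):
        """Advance one clock cycle and return the value now visible at Q."""
        self.regs.appendleft(x_in * y_in)    # the oldest entry is dropped
        return self.regs[-1]


mult = PipelinedMultiplier()
inputs = [(3, -4), (5, 6), (-7, -8), (0, 9), (1, 1), (2, 2), (10, -10)]
for cycle, (x, y) in enumerate(inputs, start=1):
    print(cycle, mult.clock(x, y))
```

In this model the first product reaches Q on the 6th clock, and after that one new product appears on every clock.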


  • Figure 34. 16 × 16-bit pipelined signed Dadda multiplier
