8/13/2019 Unit5 Memory EE577A Nazarian Spring12
1/56
EE577A
VLSI System Design
Memory Design
University of Southern California
Viterbi School of Engineering
Shahin Nazarian Spring 2012
References: syllabus textbooks, Slides and notes from
Professor Pedram, online resources
-
Digital Memory Types
Memory Arrays
• Random Access Memory
  – Read/Write Memory (RAM) (Volatile)
    • Static RAM (SRAM)
    • Dynamic RAM (DRAM)
  – Read Only Memory (ROM) (Nonvolatile)
    • Mask ROM
    • Programmable ROM (PROM)
    • Erasable Programmable ROM (EPROM)
    • Electrically Erasable Programmable ROM (EEPROM)
    • Flash ROM
• Serial Access Memory
  – Shift Registers
    • Serial In Parallel Out (SIPO)
    • Parallel In Serial Out (PISO)
  – Queues
    • First In First Out (FIFO)
    • Last In First Out (LIFO)
• Content Addressable Memory (CAM)
-
Optional: Random Access Technology
• As technology evolves, random access plays a more important role: we want data to be randomly accessible
• We also want the access time not to depend on the location of the data in memory
• Memory can be classified into Random Access Memory (RAM) and non-RAM memories
• Random access memories can be further classified into ROMs and Read/Write (R/W) memories
• In RAM technology, access time is the same regardless of the location of the data
• R/W memory is also commonly called RAM for historical reasons
• R/W memories (RAMs) come in two main types: Dynamic RAMs (DRAMs) and Static RAMs (SRAMs)
-
Optional: Random Access: SRAM vs. DRAM
• Random Access:
  • DRAM: Dynamic Random Access Memory
    – High density, cheap, slow
    – Dynamic: needs to be “refreshed” regularly
  • SRAM: Static Random Access Memory
    – Low density, expensive, fast
    – Static: content lasts “forever” (until power is lost)
    – Typically lower power consumption at moderate and low frequencies, with nearly negligible power when idle; at high frequencies and bandwidths, however, it can be as power-hungry as dynamic RAM
• “Not-so-random” Access Technology:
  • Access time varies from location to location and from time to time
  • Data is randomly accessible, but not always in the same amount of time
  • Examples: disk, CD-ROM
-
Optional: DRAM
• DRAM has high density compared to SRAM; however, the information in a DRAM cell degrades due to junction leakage current at the storage node, so cell data must be read and rewritten periodically (refresh operation). Due to its low cost and high density, DRAM is widely used for the main memory in personal and mainframe computers and workstations
• Example: a 1T (one-transistor) DRAM cell consists of a capacitor that stores binary 1 (high voltage) or 0 (low voltage) and a transistor that accesses the capacitor
[Figure: 1T DRAM cell]
-
Optional: SRAM
• An SRAM cell consists of a latch, so the cell data is kept as long as power is turned on and no refresh operation is required
• SRAM is mainly used for the cache memory in microprocessors, mainframe computers, and engineering workstations, and for memory in hand-held devices, due to its high speed and low power consumption
• Example: 6T SRAM
[Figure: 6T SRAM cell]
-
Optional: Non Volatile Memory (NVM)
• A memory that can hold its data even when not powered is referred to as NVM
• Examples are the different types of ROM such as Flash memory, magnetic memories such as magnetic tapes and hard disks, optical discs, and even punch cards!
-
Optional: ROM
• ROM allows only retrieval of previously stored data. No modification is permitted. ROMs are nonvolatile memories, i.e., the stored data is not lost even when the power supply is off, and no refresh operation is required
• ROM is classified into Mask ROM and PROM
  • In Mask (Fuse) ROM, data is written during chip manufacturing by using a photo mask
  • In PROM, the data is written electrically after the chip is fabricated
[Figure: Mask (Fuse) ROM]
-
Optional: ROM (Cont.)
• PROM is classified into EPROM and EEPROM
• In Fuse ROM, data written by blowing the fuse electrically cannot be erased or modified
• Data in EPROM and EEPROM can be rewritten, but the number of subsequent rewrites is limited to 10^4 to 10^5
• In EPROM, ultraviolet rays that can penetrate through the crystal glass on the package are used to erase all data in the chip simultaneously. Programming is done with higher than normal voltages
• In EEPROM, a higher than normal voltage is used to program/erase data in 8-bit units
• EEPROM drawback: slower write speed, on the order of microseconds
[Figure: EPROM, EEPROM]
-
Optional: ROM (Cont.)
• ROMs are generally used for permanent (look-up) memory in printers, fax machines, game machines, and ID cards, due to their lower cost than RAM
• Ferroelectric RAM (FRAM) utilizes the hysteresis characteristics of a ferroelectric capacitor to overcome the slow write operation of other EEPROMs
• Flash ROM is similar to EEPROM and EPROM in using an array of floating gates (also referred to as cells). A single-level cell can store one bit of information, whereas a multi-level cell can store more than one bit by varying the number of electrons placed on the floating gate of the cell. As in EEPROM, higher than normal voltages are used to program/erase the cells
-
Optional: Memory Design Goals
• The goal is to design memories that are larger, denser (more bits per area), faster (faster write and read operations), more reliable, consume less power, and have less design complexity
• However, some of these goals are contradictory, so compromises have to be made
• Paradoxes of memory design
  – Denser and faster
  – Larger capacity and low power
  – Reduced complexity and high reliability
-
Optional: Memory Design Goals (Cont.)
• As we increase the memory capacity, access also becomes slower. To mitigate this, some architectural techniques are used, e.g., memory partitioning with divided word lines, bit lines, etc.
• Similarly, higher-capacity and denser designs result in higher power consumption (more specifically, leakage); to alleviate this, architectures such as 6T are used to reduce the power
• Last, but not least, lower-voltage operation results in reliability issues, which are addressed by adding more transistors, using architectural-level techniques, and using error detecting and correcting codes (e.g., parity bits, ECC)
-
Optional: Feature Comparison Between Memory Types
[Table: feature comparison between memory types]
* FN Tunneling: Fowler-Nordheim tunneling
* HCI: Hot-Carrier Injection
-
Optional: Memory Feature Comparison (Cont.)
• Flash memories are the slowest, but compared to SRAMs, even DRAMs are considered slow
• Also due to their technology, flash memories support only a limited number of reads and writes
• However, flash memories do not have the refresh circuitry and some other overheads of DRAM, so they are denser
• DRAM has the most volatile data retention, because of leakage and possibly destructive reads
• In addition, DRAM has poor scalability: increasing the array size makes the bit lines longer, so the issue of charge sharing becomes more prominent
-
Optional: Memory Hierarchy of a Modern Computer System
• Memory hierarchy has been a very successful concept in computer architecture design. It exploits the principles of (temporal and spatial) locality
• Present the user with as much memory as is available in the cheapest technology
• Provide access at the speed offered by the fastest technology
[Figure: memory hierarchy, fastest and smallest level first]
Level | Speed (ns) | Size (bytes)
Processor (control, datapath, registers, on-chip cache) | ones | 100's
Second-level cache (SRAM) | tens | K's
Main memory (DRAM) | hundreds | M's
Secondary storage (disk) | 10,000,000's (10s ms) | G's
Tertiary storage (tape) | 10,000,000,000's (10s sec) | T's
-
Optional: How is the hierarchy managed?
• Registers ↔ Memory
  – by the compiler (programmer?)
• Cache ↔ Memory
  – by the hardware
• Memory ↔ Disks
  – by the hardware and operating system (virtual memory)
  – by the programmer (files)
-
Static Read-Write Memory (SRAM)
-
SRAM vs. DRAM Summary
• SRAM
  • Faster because the bit lines are actively driven by the D-latch
  • Faster, simpler interface due to lack of refresh
  • Larger area for each cell, which means less memory per chip
  • Used for cache memories (and also register files), where speed/latency is key
• DRAM
  • Slower because a passive value (charge on a capacitor) drives the bit line
  • Slower due to refresh cycles
  • Small cell area means a much greater density of cells and thus larger memories
  • Used for main memories, where density is key
-
Typical SRAM Array
-
Some Issues in Determining the Memory Array Organization
• Typically we want an aspect ratio that is nearly one
• How should the address be divided between row and column decoding?
• Consider an 8K x 32 SRAM = 256 Kb = 2^18 bits, organized as 2^9 rows x 2^9 columns, as an example
  – The row decoder is a 9-to-512 decoder. Every 32 (2^5) columns form a ‘word’, and we only need to decode words. So the column decoder needs to select one of 16 words, that is, we only need a 4-to-16 column decoder
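The 8K x 32 example can be checked with a short script. This is only a sketch under stated assumptions: the function name `array_organization` is hypothetical, sizes are assumed to be powers of two, and the array is made as square as possible (rows take the extra address bit when the exponent is odd).

```python
def array_organization(num_words, word_bits):
    """Split a num_words x word_bits memory into a near-square cell array."""
    total_bits = num_words * word_bits
    # This sketch only handles power-of-two sizes.
    assert total_bits & (total_bits - 1) == 0, "total size must be a power of two"
    assert word_bits & (word_bits - 1) == 0, "word size must be a power of two"

    total_exp = total_bits.bit_length() - 1       # log2(total_bits)
    row_addr_bits = (total_exp + 1) // 2          # rows get the extra bit if odd
    rows = 1 << row_addr_bits
    columns = total_bits // rows
    words_per_row = columns // word_bits          # words selectable in one row
    col_addr_bits = words_per_row.bit_length() - 1

    return {
        "rows": rows,
        "columns": columns,
        "row_addr_bits": row_addr_bits,
        "words_per_row": words_per_row,
        "col_addr_bits": col_addr_bits,
    }


org = array_organization(8 * 1024, 32)
print(org)
```

The row and column address bits add up to the 13 bits needed to address 8K words.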
-
Some Issues in Determining the Memory Array Organization (Cont.)
• Assertion of a word line accesses all cells in a row
  – Not all bits that are read from a row may be used
  – Loading on the word line is high!
• Bit lines connect all cells in a column; only one cell in a column can ever be ON at a time
• We would like to keep the bit line swing low to save power
-
SRAM Cell
-
Full CMOS (6-T) SRAM Cell
• Very low standby power consumption, large noise margins, and operation at low supply voltage
• Basic requirements for setting the (W/L) ratios:
  – The data-write operation must be capable of modifying the stored data in the SRAM cell
  – The data-read operation must not modify the stored data
[Figure: 6T SRAM cell schematic with transistors M1 to M6]
-
Layout of the CMOS SRAM Cell
[Figure: 6T SRAM cell layout and a different layout, each with transistors M1 to M6 labeled]
-
Static Bit Line Biasing
-
Static Bit Line Biasing with Clamps
-
SRAM Cell w/ Static Bitline Pull-ups
• When the word line is not selected, RS = 0. M3 and M4 are OFF
• If RS = 0 for ALL rows, the bit line capacitances C and NOT-C are charged up to VDD by the pull-ups MP1 and MP2
• Depending on the application, MP1 and MP2 are turned OFF or are kept ON during the read operation
-
CMOS SRAM Cell Design Strategy
• Consider data-read operation with “0” stored in cell
-
Data-Read Operation
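• Conservative design constraint: V1,max ≤ VT,2 to keep M2 OFF during the read operation. M3 will be in saturation whereas M1 operates in the linear region:
$$\frac{k_{n,3}}{2}\left(V_{DD}-V_1-V_{T,n}\right)^2 = \frac{k_{n,1}}{2}\left(2\left(V_{DD}-V_{T,n}\right)V_1 - V_1^2\right) \quad \text{at } V_1 = V_{T,n}$$
$$\frac{k_{n,3}}{k_{n,1}} = \frac{(W/L)_3}{(W/L)_1} \le \frac{2\left(V_{DD}-1.5\,V_{T,n}\right)V_{T,n}}{\left(V_{DD}-2\,V_{T,n}\right)^2}$$
• A symmetrical condition also dictates the aspect ratios of M2 and M4
With V_DD = 2.5 V, V_T,n = 0.4 V:
$$\frac{(W/L)_3}{(W/L)_1} \le \frac{2(1.9)(0.4)}{(1.7)^2} \approx 0.5 \quad\Rightarrow\quad \left(\frac{W}{L}\right)_3 \le 0.5\left(\frac{W}{L}\right)_1$$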
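The read constraint can be evaluated numerically. The sketch below only restates the bound on (W/L)_3/(W/L)_1 for the case V1 = VT,n; the function name `read_ratio_limit` is hypothetical and the values are the ones used on this slide.

```python
def read_ratio_limit(vdd, vtn):
    """Upper bound on (W/L)_3 / (W/L)_1 that keeps V1 <= VT,n during a read.

    Obtained by equating the saturation current of access transistor M3
    with the linear-region current of pull-down M1 at V1 = VT,n.
    """
    return 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd - 2.0 * vtn) ** 2


limit = read_ratio_limit(vdd=2.5, vtn=0.4)
print(f"(W/L)_3 / (W/L)_1 <= {limit:.2f}")
```

A smaller ratio means a weaker access transistor relative to the pull-down, which keeps the “0” node closer to ground during the read.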
-
Read Operation (Cont.)
-
SRAM Column Read
• Large signal sensing
-
Data-Write Operation
• Consider the write “0” operation assuming a logic
“1” is already stored in the SRAM cell
-
Data-Write Operation (Cont.)
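• Design constraint: V1,max ≤ VT,2 so M2 turns OFF when V1 = VT,2. M3 is in the linear region whereas M5 operates in saturation:
$$\frac{k_{n,3}}{2}\left(2\left(V_{DD}-V_{T,n}\right)V_1 - V_1^2\right) = \frac{k_{p,5}}{2}\left(0 - V_{DD} - V_{T,p}\right)^2 \quad \text{at } V_1 = V_{T,n}$$
$$\frac{k_{n,3}}{k_{p,5}} \ge \frac{\left(V_{DD}+V_{T,p}\right)^2}{2\left(V_{DD}-1.5\,V_{T,n}\right)V_{T,n}}$$
• A symmetrical condition also dictates the aspect ratios of M6 and M4
$$\frac{(W/L)_3}{(W/L)_5} \ge \frac{\mu_p}{\mu_n}\,\frac{\left(V_{DD}+V_{T,p}\right)^2}{2\left(V_{DD}-1.5\,V_{T,n}\right)V_{T,n}}$$
With V_DD = 2.5 V, V_T,n = −V_T,p = 0.4 V, μn/μp = 2.25:
$$\frac{(W/L)_3}{(W/L)_5} \ge \frac{1}{2.25}\,\frac{(2.1)^2}{2(1.9)(0.4)} \approx 1.3 \quad\Rightarrow\quad \left(\frac{W}{L}\right)_3 \ge 1.3\left(\frac{W}{L}\right)_5$$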
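The write constraint can be checked the same way. This is a sketch, not a design tool: the function name `write_ratio_limit` is hypothetical, V_T,p is taken as a negative number, and the mobility ratio μn/μp = 2.25 is the value used on this slide.

```python
def write_ratio_limit(vdd, vtn, vtp, mu_n_over_mu_p):
    """Lower bound on (W/L)_3 / (W/L)_5 so the "1" node can be pulled to VT,n.

    Obtained by equating the linear-region current of access transistor M3
    with the saturation current of pMOS pull-up M5 at V1 = VT,n.
    vtp is the (negative) pMOS threshold voltage.
    """
    k_ratio = (vdd + vtp) ** 2 / (2.0 * (vdd - 1.5 * vtn) * vtn)  # k_n3 / k_p5
    return k_ratio / mu_n_over_mu_p


limit = write_ratio_limit(vdd=2.5, vtn=0.4, vtp=-0.4, mu_n_over_mu_p=2.25)
print(f"(W/L)_3 / (W/L)_5 >= {limit:.2f}")
```

A larger ratio means a stronger access transistor relative to the pull-up, which is what lets the bit line overpower the cell during a write.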
-
Write Operation (Cont.)
-
SRAM Column Write
-
SRAM Sizing Summary
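• The bit lines are high during a read and must not overpower the inverters, so the nMOS pull-down transistors should be strong enough to pull them down
• During a write, however, the bit lines need to overpower the cell, so we make the pMOS pull-up transistors weak
[Figure: 6T cell between bit and bit_b with word line "word" and internal nodes A and A_b; access transistors marked "med", pMOS pull-ups "weak", nMOS pull-downs "strong"]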
-
Typical SRAM Transistor Sizes
• Transistors may be sized as follows:
  • nMOS pull-down: M1, M2: 6:2
  • pMOS pull-up: M5, M6: 4:3
  • Access transistors: M3, M4: 4:2
• All boundaries are shared
• Reduces the write delay
• One may also use equal-size transistors in the SRAM cell (e.g., 4:2 for all); however, this should be checked carefully, as this sizing is not conservative and may not work in all scenarios
[Figure: yet a different layout, with WL, BitLine, Bit_barLine, and the cell transistors labeled]
-
Example Design of a 256Kbit SRAM Array
• 2 macro-blocks (aka 2 banks), each with 256 rows and 512 columns (total 2^18 bits)
• Want to access a double-word (2^6 = 64 bits) at a time
• Need 12 address lines A0,…,A11
• The 4 LSBs (A0,…,A3) are used for column addressing while the other 8 MSBs (A4,…,A11) are used for row addressing
• Need an 8:256 row decoder, a 4:16 column decoder, and 16:1 read and write multiplexers
• Use 64 sense amplifiers
[Figure: block diagram of the array: control circuit (inputs clk, Addr, Read_write; outputs precharge, wl, wldummy, Write_en, Sense_en), row decoder between SRAM Cells Macro #1 and Macro #2 (256 rows x 512 columns each), column decoder (4 to 16), write circuit and write multiplexer, read multiplexer (1024 bit lines to 64), sense amplifier, DFF, and output buffer]
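The address split can be illustrated with a small sketch. The 4-bit/8-bit split follows the slide; the helper names and the interleaved column layout (each output bit owning 16 adjacent columns behind its own 16:1 mux) are assumptions made only for illustration.

```python
ROW_BITS = 8     # A4..A11 select one of 256 rows
COL_BITS = 4     # A0..A3 select one of 16 words in the row
WORD_BITS = 64   # one double-word is read or written per access


def split_address(addr):
    """Split a 12-bit word address into (row, column-select)."""
    assert 0 <= addr < (1 << (ROW_BITS + COL_BITS))
    col = addr & ((1 << COL_BITS) - 1)   # 4 LSBs
    row = addr >> COL_BITS               # 8 MSBs
    return row, col


def selected_columns(col):
    """Physical columns connected to the 64 sense amplifiers.

    Hypothetical layout: output bit b owns columns 16*b .. 16*b+15,
    and its 16:1 mux picks column 16*b + col.
    """
    return [b * (1 << COL_BITS) + col for b in range(WORD_BITS)]


row, col = split_address(0xABC)
columns = selected_columns(col)
print(row, col, columns[:4])
```

With this layout the 64 selected columns are spread across the full 1024-column width of the two macros.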
-
Sense Amplifier
• The bit line capacitance is significant for large arrays
• If each cell contributes 2fF, with 256 cells per column we get 512fF plus wire capacitance. The pull-down resistance is about 15kΩ. The RC delay will be 5.3ns (with ΔV = VDD)?
• We cannot easily change R, C, or VDD, but we can change ΔV!
• It is possible to reliably sense ΔV's as small as 50mV
• With margin for noise, most SRAMs sense bit-line swings between 100 and 300mV
• For writes, we still need to drive the bit line to full swing
  • Only one driver needs to be this big
• Use the SPICE sweep function to optimize transistor sizes (typically, Q0, Q5, Q6 are minimum-size transistors)
[Figure: sense amplifier with isolation transistors and a regenerative amplifier (transistors Q0 to Q6), inputs bit and bit_bar, outputs out and out_bar, controlled by sense_en]
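As a quick check of the 5.3ns figure, assuming it is the usual 50% point of a single-pole RC discharge (the 0.69 factor is an assumption, not stated on the slide) and ignoring wire capacitance:

$$t \approx 0.69\,R\,C = 0.69 \times 15\,\text{k}\Omega \times 512\,\text{fF} \approx 5.3\,\text{ns}$$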
-
Sense Amplifier Waveforms
[Figure: sense amplifier waveforms for bit, bit_bar, out, out_bar, and sense_en]
-
Comments about the Sense Amp Design
• Isolation transistors must be pMOS
  • Bit lines are within 0.2V of VDD (not enough to turn an nMOS transistor ON)
• The loads on the outputs of the regenerative amplifier must be equal
• Need to precharge the sense amp before opening the isolation transistors to avoid discharging the bit lines
• Both outputs go high during precharge
  – Usually the regenerative amplifier is followed by a cross-coupled NAND latch
• Requires 3 timing phases
  – Typically self-timed
[Figure: cross-coupled NAND latch with inputs Out and Out_bar and outputs Data and Data_bar]
-
Precharge and Write Circuitry
• Recall that M3 and M5 denote the nMOS access transistor and the pMOS pull-up transistor inside the SRAM cell (on the bit line side)
• For a successful write operation, R3+R9+R7 should be < ½R5
• Let R* denote the resistance of a 2:2 nMOS transistor, and μn/μp = 2
• If M3 = 4:2 and M5 = 4:3, then ½R5 = ¾R* and R3 = ½R*; therefore, R9+R7 should be at most ¼R*
• M9 and M7 should be 16:2 each
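The sizing arithmetic above can be reproduced with exact fractions. This is a sketch under the slide's assumptions (resistance proportional to L/W, a pMOS twice as resistive as an equal-size nMOS); the helper name `on_resistance` and the equal split of the budget between M7 and M9 are illustrative choices.

```python
from fractions import Fraction

MU_RATIO = 2  # mu_n / mu_p, as assumed on the slide


def on_resistance(width, length, pmos=False):
    """On-resistance in units of R*, the resistance of a 2:2 nMOS transistor."""
    r = Fraction(length, width)          # a 2:2 nMOS gives exactly 1
    return r * MU_RATIO if pmos else r


r3 = on_resistance(4, 2)                 # access transistor M3 (4:2)
r5 = on_resistance(4, 3, pmos=True)      # pull-up M5 (4:3)

budget = r5 / 2 - r3                     # what is left for R7 + R9
r_each = budget / 2                      # split equally between M7 and M9
width_for_length_2 = Fraction(2) / r_each

print(r3, r5, budget, width_for_length_2)
```

With a channel length of 2, each write-path transistor needs a width of 16, which matches the 16:2 sizing.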
-
Row Decoder
• Another example of pre-decoding addresses: decode the address in octal digits
• One-level decoding of a 9-bit address (A8 A7 … A0) requires 512 nine-input AND gates
• Predecode (A2 A1 A0), (A5 A4 A3), and (A8 A7 A6) by using 3*2^3 = 24 three-input NAND gates, followed by 8^3 = 512 three-input NOR gates
[Figure: two implementations of a 4:16 decoder]
• One requires 16 four-input AND gates
• The other requires 8+16 = 24 two-input NAND/NOR gates
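A behavioral sketch of the 9-bit predecoded row decoder is given below. It models logic values only (active-high one-hot lines) rather than the NAND/NOR gate polarities, and the function names are hypothetical.

```python
def predecode(addr):
    """Return three one-hot groups of 8 lines, one per octal digit of a 9-bit address."""
    groups = []
    for g in range(3):
        digit = (addr >> (3 * g)) & 0b111              # (A2 A1 A0), (A5 A4 A3), (A8 A7 A6)
        groups.append([digit == value for value in range(8)])
    return groups


def word_lines(addr):
    """Combine one line from each group to drive the 512 word lines."""
    low, mid, high = predecode(addr)
    return [low[w & 7] and mid[(w >> 3) & 7] and high[(w >> 6) & 7]
            for w in range(512)]


PREDECODE_GATES = 3 * 2 ** 3   # 24 three-input gates
FINAL_GATES = 8 ** 3           # 512 three-input gates

lines = word_lines(0o175)
print(lines.index(True), PREDECODE_GATES, FINAL_GATES)
```

Each final gate has three inputs instead of nine, at the cost of 24 extra predecode gates shared by all rows.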
-
4-to-1 Tree Multiplexer for Read Circuitry
[Figure: 4-to-1 tree multiplexer (pMOS) on bit lines BL0, BL1, BL2, BL3]
-
Column Mux
[Figure: one bit line column with precharge, SRAM cell, write mux (Write-en, data), and read mux feeding the sense amplifier (Sense-en); an 8:256 row decoder on A4 to A11 and a 4:16 column decoder on A0 to A3; labels "15 Write_mux transistors" and "15 Read_mux transistors"; transistor widths annotated in λ]
• We have 16 read_mux and 16 write_mux transistors in parallel
• During the read operation, one of the read muxes is selected according to the values of A0,…,A3; the selected column feeds the sense amplifier, and the value of the corresponding SRAM cell is read at the output of the sense amplifier
• During the write operation, the desired SRAM cell is selected and the data is written into the corresponding SRAM cell
• Need to replicate the drawing for the bit_bar side
• Need a total of 64 similar structures, which makes a 16:1, 64-bit-wide column MUX
-
SRAM Array Floor plan
[Figure: floor plan with a central row decoder between SRAM Cells Macro 1 and SRAM Cells Macro 2 (each r rows x c columns), precharge above the arrays, and below them the 16:1 column mux (64 outputs per macro), sense amplifier, output buffers 1 and 2, output flip-flop, and the decoder control block. All units are in λ]
-
Example Read Delay Calculation for an SRAM Array
• Consider a 256×512 SRAM core. Bit lines are precharged to VDD = 2.5V before each read operation. A read operation is complete when the bit line has discharged by 0.25V. A memory cell can provide 0.25mA of pull-down current to discharge the bit line. Assume the word line resistance is 2Ω per memory cell, the word line capacitance is 20fF per memory cell, and the bit line capacitance is 12fF per cell. Ignore the bit-line resistance and the read-mux transistor. Calculate the worst-case read delay for this SRAM. Assume row decoding takes 3ns while the sense amplifier and output buffer take 1ns.
• Solution: Each word line drives N = 512 SRAM cells. The RC delay for driving the furthest cell is:
$$t_{row} = 0.69\sum_{k=1}^{N}\left(\sum_{j=1}^{k} R_{cell}\right)C_{cell} = 0.69\,R_{cell}\,C_{cell}\,\frac{N(N+1)}{2} = 0.69 \times 256 \times 513 \times 2\,\Omega \times 20\,\text{fF} \approx 3.62\,\text{ns}$$
• The time needed to discharge the bit or bit_bar line by 250mV is:
$$t_{col} = \frac{C_{col}\,\Delta V}{I_{dis}} = \frac{256 \times 12\,\text{fF} \times 0.25\,\text{V}}{0.25\,\text{mA}} \approx 3.1\,\text{ns}$$
$$t_{access} = t_{dec} + t_{row} + t_{col} + t_{sen,buf} = 3 + 3.62 + 3.1 + 1 \approx 10.7\,\text{ns}$$
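The read delay example can be reproduced numerically. The sketch below sums the distributed-RC (Elmore) terms of the word line directly instead of using the closed form; the function names are hypothetical and the values are those given in the problem statement. Evaluated this way, the word line term comes out at roughly 3.6ns and the bit line term at roughly 3.1ns.

```python
def word_line_delay(n_cells, r_cell, c_cell):
    """0.69 x Elmore delay to the far end of a distributed RC word line."""
    # The capacitor at cell k is charged through k series resistances.
    elmore = sum(k * r_cell * c_cell for k in range(1, n_cells + 1))
    return 0.69 * elmore


def bit_line_delay(n_rows, c_cell, delta_v, i_cell):
    """Time for a constant cell current to discharge the bit line by delta_v."""
    return n_rows * c_cell * delta_v / i_cell


t_dec = 3e-9        # row decoding
t_sen_buf = 1e-9    # sense amplifier plus output buffer
t_row = word_line_delay(n_cells=512, r_cell=2.0, c_cell=20e-15)
t_col = bit_line_delay(n_rows=256, c_cell=12e-15, delta_v=0.25, i_cell=0.25e-3)
t_access = t_dec + t_row + t_col + t_sen_buf

print(f"t_row = {t_row * 1e9:.2f} ns")
print(f"t_col = {t_col * 1e9:.2f} ns")
print(f"t_access = {t_access * 1e9:.2f} ns")
```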
-
SRAM Scaling Challenges
• For cell stability, separate power rails for the cell array and the word line driver may be needed (bad for leakage)
• Reduced read and write margins as we scale voltages
• Increased transistor leakage (high-k gate dielectric)
• Introduction of various power management modes:
  • Reduced VDD
  • Raised VSS
• Soft error immunity
• Low standby power
-
Optional: Leakage Currents in the SRAM Cell
[Figure: 6T cell (M1 to M6) with node and line values “0” and “Vdd” annotated, showing the leakage currents I_sub2, I_sub3, I_sub5, and I_gate1]
$$I_{leak} = I_{gate1} + I_{sub3} + I_{sub2} + I_{sub5} \approx I_{sub3} + \left(I_{sub2} + I_{sub5}\right) = I_{leak,bitline} + I_{leak,cell}$$
• Note that I_leak is dominated by the drain-source leakage in 90nm CMOS technology (i.e., we may ignore gate leakage and other leakage mechanisms, which are small compared to the sub-threshold conduction currents)
-
Optional: Bitcell Stability Failures
• Read Access Failure
  • The WL activation period is too short for a pre-specified ΔV to develop between the bit line and the bit_bar line, so the sense amplifier is not triggered correctly during a read
    – This may occur due to an increase in Vt of the pass-gate or pull-down transistors
• Read Stability Failure
  • The cell may flip when the “0” storage node rises above the trip voltage of the other inverter during a read
    – SNM is the most commonly used metric to quantify the bitcell's robustness against this failure
    – Notice that a read stability failure can occur any time the WL is enabled, even if the bitcell is not accessed for a read or write operation
    – SNM-related failures are the limiter for VDD scaling, especially after accounting for device degradation due to hot electron effects and negative bias temperature instability
-
Optional: Read Stability Failure in the SRAM Cell
• A read stability (a.k.a. “hold”) failure occurs when stored data flips during the memory standby mode (while WL is enabled)
  • A cell's VTC is composed of the two inverters' VTCs, which enclose two regions
  • The cell's hold stability is characterized by the static noise margin (SNM), which is measured by the diagonal length of the largest square fitted in the enclosed region (the derivation is omitted)
• The SNM butterfly curves must be analyzed for different process corners (FS: fast NMOS, slow PMOS; SF: slow NMOS, fast PMOS) and different temperatures
[Figure: butterfly plot of the ideal and actual VTCs with the SNM marked; axes Vout,L / Vin,R and Vout,R / Vin,L, both from 0 to 1]
-
Optional: Bitcell Stability Failures (Cont.)
• Write Stability (or write-ability) Failure
  • The internal “1” storage node may not be pulled below the trip point of the other inverter during the WL activation period
  • One way to quantify a cell's write stability is the write trip voltage or write margin (WM), which is the maximum bit line voltage at which the bitcell flips state (assuming that the bit line is pulled to GND by the line driver)
• Data Retention Failure
  • When VDD is reduced to the Data Retention Voltage, all six transistors in the SRAM cell operate in the subthreshold region and hence show strong sensitivity to variations
  • The PMOS transistor must provide enough current to compensate for leakage in the NMOS pull-down and access transistors
  • Due to L and VT variations, the data retention current may not be sufficient to compensate for the leakage current
-
Optional: Minimum Voltage Needed to Preserve Data
• The Data Retention Voltage (DRV) is defined as the minimum VDD under which the data in an SRAM cell is still preserved
• When VDD is reduced to DRV, all transistors are in the sub-threshold region, thus SRAM data retention strongly depends on the sub-VT current conduction behavior (i.e., leakage)
• Cell leakage is greatly reduced at DRV
• This provides a highly effective leakage suppression scheme for standby mode
  – Maximum leakage saving and minimum design overhead
[Figure: distribution of DRV in a 0.13µm CMOS process with 3σ variations in VT and L]
[Figure: measured SRAM leakage current]
-
Optional: DRV of SRAM (Cont.)
• When VDD scales down to DRV, the Voltage Transfer Curves (VTCs) of the internal inverters degrade to such a level that the Static Noise Margin (SNM) of the SRAM cell is reduced to zero
• The temperature coefficient of DRV is 0.169mV/°C, which implies an increase of 12.3mV in DRV when the temperature rises from 27°C to 100°C
DRV condition:
$$\left.\frac{\partial V_1}{\partial V_2}\right|_{\text{Left inverter}} = \left.\frac{\partial V_1}{\partial V_2}\right|_{\text{Right inverter}}, \quad \text{when } V_{DD} = DRV$$
[Figure: VTCs of the SRAM cell inverters (VTC1, VTC2) at VDD = 0.4V and VDD = 0.18V; horizontal axis V1 (V), 0 to 0.4]
[Figure: 6T cell (M1 to M6) with internal nodes V1 and V2, showing the leakage current paths]
-
Optional: Soft Error Rate for the SRAM Cell
• A high-energy alpha particle or an atmospheric neutron striking a capacitive node
  • Deposits charge, leading to a time-varying current injection at the node
• In the case of atmospheric neutrons:
$$I(Q,t) = \frac{2}{\sqrt{\pi}}\,\frac{Q}{T_s}\,\sqrt{\frac{t}{T_s}}\,\exp\left(-\frac{t}{T_s}\right)$$
[Figure: I(Q,t) (µA) versus time (ps), 0 to 200ps]
• If the collected charge Q_s exceeds some critical charge level Q_crit, it will upset the bit value and cause a soft error
• Soft Error Rate (SER) in SRAM:
$$SER \propto \exp\left(-\frac{Q_{crit}}{Q_s}\right)$$
• Q_crit is 10fC in a 65nm CMOS process
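The neutron-induced current pulse can be explored numerically. The sketch below evaluates the pulse shape (2/√π)(Q/τ)√(t/τ)·exp(−t/τ) and integrates it with the trapezoidal rule to confirm that the total injected charge equals Q. The time constant `TAU = 20 ps` and the use of Q = 10fC are illustrative assumptions, not values taken from the plot.

```python
import math


def neutron_current(q, tau, t):
    """Current pulse (2/sqrt(pi)) * (q/tau) * sqrt(t/tau) * exp(-t/tau)."""
    x = t / tau
    return (2.0 / math.sqrt(math.pi)) * (q / tau) * math.sqrt(x) * math.exp(-x)


Q = 10e-15           # injected charge in coulombs (assumed, equal to the 10 fC Q_crit)
TAU = 20e-12         # pulse time constant in seconds (assumed)
STEPS_PER_TAU = 1000

dt = TAU / STEPS_PER_TAU
times = [i * dt for i in range(40 * STEPS_PER_TAU + 1)]   # 0 .. 40 * TAU
currents = [neutron_current(Q, TAU, t) for t in times]

# Trapezoidal integration of the pulse gives the total injected charge.
collected = sum(0.5 * (currents[i] + currents[i + 1]) * dt
                for i in range(len(times) - 1))
t_peak = times[currents.index(max(currents))]

print(f"collected charge = {collected * 1e15:.3f} fC")
print(f"peak at t = {t_peak * 1e12:.1f} ps")
```

The pulse peaks at half the time constant, so a shorter time constant gives a taller, narrower current spike for the same charge.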
-
Optional: How to Mitigate the SER Fail Rate
• To mitigate soft errors, several radiation-hardening techniques can be implemented
  • Process technology changes (e.g., SOI technology)
  • Circuit design (e.g., adding capacitance, using larger transistors, interleaving memory words)
  • Architecture (e.g., parity, error correction codes)
• Good news: SER per bit tends to decrease with scaling