Phase Change Memory Aware Data Management and Application
Jiangtao Wang
Outline
• Introduction
• Integrating PCM into the Memory Hierarchy
  − PCM for main memory
  − PCM for auxiliary memory
• Conclusion
Phase change memory
• An emerging memory technology
• Like memory (DRAM)
  − Read/write speeds and byte-addressability
  − Lower idle power
• Like storage (SSD & HDD)
  − Non-volatile
  − High capacity (high density)

                      DRAM        PCM          NAND Flash
  Page size           64B         64B          2KB
  Page read latency   20-50ns     ~50ns        ~25us
  Page write latency  20-50ns     ~1us         ~500us
  Endurance           ∞           10^6-10^8    10^4-10^5
  Idle power          ~100mW/GB   ~1mW/GB      1-10mW/GB
  Density             1x          2-4x         4x
Phase change memory
• Cons
  − Asymmetric read/write latency
  − Limited write endurance
Phase change memory
[Figure: read and write operation latencies of DRAM, PCM, FLASH, and HDD on a log-scale axis from 10ns to 10ms; PCM reads are close to DRAM, while PCM writes fall between DRAM and FLASH]
Outline
• Introduction
• Integrating PCM into the Memory Hierarchy
  − PCM for main memory
  − PCM for auxiliary memory
• Conclusion
Integrating PCM into the Memory Hierarchy
• PCM for main memory
  – Replacing DRAM with PCM to achieve larger main memory capacity
• PCM for auxiliary memory
  – PCM as a write buffer for HDD/SSD disk: buffering dirty pages to minimize disk write I/Os
  – PCM as secondary storage: storing log records
PCM for main memory
(a) PCM-only memory: CPU with L1/L2 cache; the memory controller manages phase change memory in front of the HDD/SSD disk
(b) DRAM as a cache memory: a DRAM cache sits between the L1/L2 cache and the PCM main memory
(c) DRAM as a write buffer: a DRAM write buffer absorbs writes alongside the PCM main memory
[ISCA’09][ICCD’11][DAC’09][CIDR’11]
PCM for main memory: Challenges with PCM
• Major disadvantage: writes
  Compared to reads, PCM writes incur higher energy consumption, higher latency, and limited endurance

  Read latency   20~50ns      Write latency   ~1us
  Read energy    1 J/GB       Write energy    6 J/GB
  Endurance      10^6~10^8
Reducing PCM writes is an important goal of data management on PCM !
PCM for main memory: Optimization on PCM write
• Optimization: data comparison write
• Goal: write only modified bits rather than the entire cache line
• Approach: read-compare-write (read the current PCM line, compare it with the CPU cache line, and write only the differing bits)
[Figure: bit-level comparison of a CPU cache line against the PCM line; only the bits that differ are programmed]
[ISCAS'07][ISCA'09][MICRO'09]
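A minimal sketch of the read-compare-write idea, assuming lines are modeled as Python lists of bits (the function name and representation are mine, not from the cited papers):

```python
def data_comparison_write(pcm_line, cache_line):
    """Flush a cache line to PCM, programming only the bits that differ.

    Models data comparison write: read the current PCM line, compare it
    bit by bit with the evicted cache line, and write only changed bits.
    Returns the resulting line and the number of bit writes performed.
    """
    assert len(pcm_line) == len(cache_line)
    new_line = list(pcm_line)              # the "read" step
    bits_written = 0
    for i, (old, new) in enumerate(zip(pcm_line, cache_line)):
        if old != new:                     # the "compare" step
            new_line[i] = new              # the "write" step, per differing bit
            bits_written += 1
    return new_line, bits_written

# Using the 16-bit lines from the slide: only 4 bits differ,
# so 4 bits are programmed instead of 16.
pcm   = [0,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0]
cache = [0,1,0,1,1,0,0,0,0,1,1,0,1,0,1,1]
line, writes = data_comparison_write(pcm, cache)
print(writes)  # 4
```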
PCM for main memory: PCM-friendly algorithms
• Motivation: choosing PCM-friendly database algorithms and data structures to reduce the number of writes
Rethinking Database Algorithms for Phase Change Memory (CIDR 2011)
PCM for main memory: PCM-friendly DB algorithms
• Prior design goals for DRAM
  − Low computational complexity
  − Good CPU cache performance
  − Power efficiency (more recently)
• New goals for PCM
  − Minimizing PCM writes
  − Low wear, energy, and latency
  − Finer-grained access granularity: bits, words, cache lines
• Two core database techniques
  − B+-Tree index
  − Hash joins
PCM-friendly DB algorithms: B+-Tree Index
• B+-Tree
  – Records at leaf nodes
  – High fanout
  – Suitable for file systems
• For PCM
  – Insertion/deletion incurs many write operations
  – Inserting into a sorted node with K keys and K pointers moves half of each on average, plus the count field: 2(K/2)+1 = K+1 writes
[Figure: inserting key 3 into a sorted leaf (num=5; keys 2 4 7 8 9) shifts keys and pointers to give (num=6; keys 2 3 4 7 8 9), incurring 11 writes]
• PCM-friendly B+-Tree variants
  – Unsorted: all non-leaf and leaf nodes unsorted
  – Unsorted leaf: sorted non-leaf nodes, unsorted leaf nodes
  – Unsorted leaf with bitmap: sorted non-leaf nodes, unsorted leaf nodes with a validity bitmap
PCM-friendly DB algorithms: B+-Tree Index
• Unsorted leaf
  – Insert/delete incurs 3 writes
[Figure: unsorted leaf node (num=5; keys 8 2 9 4 7); deleting 2 moves the last key into its slot, giving (num=4; keys 8 7 9 4)]
• Unsorted leaf with bitmap
  – Insert incurs 3 writes; delete incurs 1 write
[Figure: leaf with bitmap 10111010 and keys 8 2 9 4 7; deleting 2 only clears its bit in the bitmap, giving 10011010]
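The write counts above can be checked with a small sketch (my own accounting, not the paper's code): each modified key, pointer, bitmap, or counter slot counts as one PCM write.

```python
def sorted_leaf_insert(keys, key):
    """Sorted leaf: shift larger keys (and their pointers) right."""
    pos = 0
    while pos < len(keys) and keys[pos] < key:
        pos += 1
    shifted = len(keys) - pos
    keys.insert(pos, key)
    # shifted keys + shifted pointers + new key + new pointer + num field
    return 2 * shifted + 2 + 1

def unsorted_leaf_insert(keys, key):
    """Unsorted leaf: append at the end, no shifting."""
    keys.append(key)
    return 3                      # new key + new pointer + num field

def bitmap_leaf_delete(keys, bitmap, key):
    """Unsorted leaf with bitmap: just clear the key's valid bit."""
    bitmap[keys.index(key)] = 0
    return 1                      # a single bitmap write

print(sorted_leaf_insert([2, 4, 7, 8, 9], 3))          # 11, as on the slide
print(unsorted_leaf_insert([8, 2, 9, 4, 7], 3))        # 3
print(bitmap_leaf_delete([8, 2, 9, 4, 7], [1]*5, 2))   # 1
```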
Experimental evaluation: B+-Tree Index
• Simulation platform
  – Cycle-accurate x86-64 simulator: PTLSim
  – Extended with PCM support
  – Modeled data comparison write
  – CPU cache: 8MB; B+-Tree: 50 million entries, 75% full, 1GB; node size: 8 cache lines
• Three workloads
  – Inserting 500K random keys
  – Deleting 500K random keys
  – Searching 500K random keys
Experimental evaluation: B+-Tree Index
[Figure: for the insert, delete, and search workloads: total wear (number of bits modified), energy (mJ), and execution time (cycles)]
Unsorted schemes achieve the best performance
• For insert-intensive workloads: unsorted leaf
• For insert & delete intensive workloads: unsorted leaf with bitmap
PCM-friendly DB algorithms: Hash Joins
• Two representative algorithms
  – Simple Hash Join
  – Cache Partitioning
• Simple Hash Join
[Figure: the build phase hashes relation R into an in-memory hash table; the probe phase scans relation S against it]
  – Problem: too many cache misses
    • The hash table built and probed exceeds the CPU cache size
    • Small record size
PCM-friendly DB algorithms: Hash Joins
• Cache Partitioning
[Figure: the partition phase splits R into R1..R4 and S into S1..S4 so that each pair fits in cache; the join phase joins each Ri with Si]
  – Problem: too many writes!
PCM-friendly DB algorithms: Hash Joins
• Virtual Partitioning
[Figure: the partition phase virtually partitions R (into R'1..R'4) and S by storing record IDs instead of copying records]
PCM-friendly DB algorithms: Hash Joins
• Virtual Partitioning: join phase
[Figure: for each pair of virtual partitions, build a hash table from R'1's record IDs and probe it with S'1, fetching the actual records from R and S]
• Good CPU cache performance
• Reduced writes
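A sketch of the virtual-partitioning idea under my own assumptions (records as Python dicts, `hash()` as the partition function): the partition phase writes only record IDs, and each join-phase hash table is small enough to stay cache-resident.

```python
def virtual_partition(rel, key, n):
    """Assign each record ID (not the record itself) to a partition."""
    parts = [[] for _ in range(n)]
    for rid, rec in enumerate(rel):
        parts[hash(rec[key]) % n].append(rid)   # IDs only: few PCM writes
    return parts

def virtual_partition_join(R, S, key, n=4):
    out = []
    for r_ids, s_ids in zip(virtual_partition(R, key, n),
                            virtual_partition(S, key, n)):
        table = {}                              # small, cache-resident
        for rid in r_ids:                       # build from R's record IDs
            table.setdefault(R[rid][key], []).append(rid)
        for sid in s_ids:                       # probe with S's record IDs
            for rid in table.get(S[sid][key], []):
                out.append((R[rid], S[sid]))
    return out

# Toy relations mirroring the experiment's "2 matches per R record" setup.
R = [{"k": i % 3, "a": i} for i in range(6)]
S = [{"k": i % 3, "b": i} for i in range(6)]
print(len(virtual_partition_join(R, S, "k")))   # 12: 2 matches per R record
```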
Experimental evaluation: Hash Join
• Relations R and S reside in main memory (PCM)
• R (50MB) joins S (100MB), with 2 matches per R record
• Record size varied from 20B to 100B
[Figure: total wear, PCM energy (mJ), and execution time (cycles) as the record size varies from 20B to 100B]
PCM for auxiliary memory
• PCM as a write buffer for HDD/SSD disk
[Figure: CPU with L1/L2 cache; the memory controller manages DRAM plus a PCM write buffer in front of the HDD/SSD disk]
[DAC'09][CIKM'11][TCDE'10][VLDB'11]
• PCM as secondary storage
[Figure: CPU with L1/L2 cache and DRAM main memory; PCM sits alongside the SSD/HDD as secondary storage]
PCM for auxiliary memory
• PCM as a write buffer for HDD/SSD disk
  – PCMLogging: Reducing Transaction Logging Overhead with PCM (CIKM 2011)
• PCM as secondary storage
  – Accelerating In-Page Logging with Non-Volatile Memory (TCDE 2010)
  – IPL-P: In-Page Logging with PCRAM (VLDB 2011 demo)
• Motivation: buffering dirty pages and transaction logs to minimize disk I/Os
PCM for auxiliary memory
• PCM as a write buffer for HDD/SSD disk
  PCMLogging: Reducing Transaction Logging Overhead with PCM (CIKM 2011)
• Two schemes
  – PCMBasic
  – PCMLogging
• PCMBasic
[Figure: DRAM holds a buffer pool and a log pool; dirty pages and write logs are staged in PCM before reaching the disk]
  – Cons
    − Data redundancy
    − Space management on PCM
PCM for auxiliary memory
• PCMLogging
  – Eliminates explicit logs (REDO and UNDO logs)
  – Integrates implicit logs into the buffered updates (shadow pages)
[Figure: dirty pages flow from DRAM through PCM, each carrying metadata, before reaching the disk]
• Overview
  – DRAM
    • Mapping Table (MT): maps logical pages to physical pages
  – PCM
    • Page format: page content plus metadata (XID, PID)
    • FreePageBitmap
    • ActiveTxList
PCMLogging
• PCMLogging operations
  Two additional data structures in main memory support undo:
  – Transaction Table (TT): records all in-progress transactions and their dirty pages in DRAM and PCM
  – Dirty Page Table (DPT): keeps track of the previous version of each PCM page "overwritten" by an in-progress transaction
PCMLogging
• Flushing dirty pages to PCM
  – Add the XID to the ActiveTxList before writing a dirty page to PCM
  – If page P already exists in PCM, do not overwrite it; create an out-of-place copy P'
[Figure: transaction T3 updates page P5]
PCMLogging
• Commit
  – Flush all of the transaction's dirty pages
  – Modify metadata
• Abort
  – Discard the transaction's dirty pages and restore the previous data
  – Modify metadata
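The flush/commit/abort protocol can be modeled as a toy class (the structure follows the slides; slot management and data layout are simplified assumptions of mine, and pages first written by an aborting transaction are not reclaimed here):

```python
class PCMLoggingBuffer:
    """Toy model of PCMLogging: out-of-place page writes act as implicit logs."""

    def __init__(self):
        self.pcm = {}           # physical slot -> (XID, PID, content)
        self.mapping = {}       # DRAM Mapping Table: PID -> slot
        self.active_tx = set()  # ActiveTxList (kept in PCM in the real scheme)
        self.dpt = {}           # Dirty Page Table: (XID, PID) -> previous slot
        self.next_slot = 0

    def flush(self, xid, pid, content):
        self.active_tx.add(xid)                # record XID before the page write
        slot, self.next_slot = self.next_slot, self.next_slot + 1
        self.pcm[slot] = (xid, pid, content)   # out-of-place: never overwrite
        if pid in self.mapping:
            self.dpt[(xid, pid)] = self.mapping[pid]  # remember old version
        self.mapping[pid] = slot

    def commit(self, xid):
        self.active_tx.discard(xid)            # leaving the list commits implicitly
        for k in [k for k in self.dpt if k[0] == xid]:
            self.pcm.pop(self.dpt.pop(k), None)       # free superseded versions

    def abort(self, xid):
        for (_, pid) in [k for k in self.dpt if k[0] == xid]:
            self.pcm.pop(self.mapping[pid], None)     # drop the aborted copy
            self.mapping[pid] = self.dpt.pop((xid, pid))  # restore old version
        self.active_tx.discard(xid)

buf = PCMLoggingBuffer()
buf.flush(1, "P5", "v1"); buf.commit(1)
buf.flush(3, "P5", "v2"); buf.abort(3)      # T3's update of P5 is rolled back
print(buf.pcm[buf.mapping["P5"]][2])        # v1
```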
PCMLogging
• Tuple-based buffering
  – In the PCM
    • Buffer slots are managed in units of tuples
    • Free space is managed with a slotted directory instead of a bitmap
  – In the DRAM
    • The Mapping Table still tracks dirty pages, but maintains mappings for the buffered tuples of each dirty page
  – Tuples are merged with the corresponding disk page
    • On a read/write request
    • When committed tuples are moved from PCM to the external disk
Experimental evaluation
• Simulator based on DiskSim
• TPC-C benchmark
• DRAM: 64MB
• Tuple-based scheme (PL = PCMLogging)
PCM for auxiliary memory
• PCM as secondary storage
  – Accelerating In-Page Logging with Non-Volatile Memory (TCDE 2010)
  – IPL-P: In-Page Logging with PCRAM (VLDB 2011 demo)
• Motivation: the IPL scheme with PCRAM can improve the performance of flash memory database systems by storing frequent log records in PCRAM
Design of Flash-Based DBMS: An In-Page Logging Approach (SIGMOD 2007)
In-Page Logging
• Introduction
  – Updating a single record may invalidate the entire current page
  – Sequential logging approaches incur expensive merge operations
  – IPL co-locates a data page and its log records in the same physical block
In-Page Logging
[Figure: a 128KB physical flash block holds 15 data pages (8KB each) plus an 8KB log region of 16 sectors (512B each); updates to an in-memory data page (8KB) in the database buffer are appended to an in-memory log sector (512B), which is flushed to the block's log region]
[Figure: when a block's log region fills up, its data pages and log records are merged into a new block]
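A toy sketch of the IPL mechanism (the block geometry follows the slides: 15 data pages and a 16-sector log region per block; the record format and merge policy are simplified assumptions of mine):

```python
class IPLBlock:
    """One physical block: data pages plus a co-located log region."""
    LOG_CAPACITY = 16                      # log sectors per block

    def __init__(self, n_pages=15):
        self.pages = [{} for _ in range(n_pages)]   # 15 data pages per block
        self.log = []                               # co-located log region

    def update(self, page_no, key, value):
        """Record an update as a log entry instead of rewriting the page."""
        self.log.append((page_no, key, value))
        if len(self.log) >= self.LOG_CAPACITY:      # log region full
            self.merge()

    def merge(self):
        """Apply pending log records and start a fresh (empty) log region."""
        for page_no, key, value in self.log:
            self.pages[page_no][key] = value
        self.log = []

    def read(self, page_no, key):
        """Reads must consult the page's pending log records first."""
        for p, k, v in reversed(self.log):
            if p == page_no and k == key:
                return v
        return self.pages[page_no].get(key)

blk = IPLBlock()
blk.update(0, "a", 1)
print(blk.read(0, "a"))     # 1 (served from the log; the page is not rewritten)
```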
In-Page Logging
• Cons (with NAND flash only)
  – The unit for writing log records is a sector (512B)
  – Only SLC-type NAND flash supports partial programming
  – The amount of log records for a page is usually small
In-Page Logging with PCRAM
• Pros
  – Log records can be flushed at a finer granularity
  – Low latency when flushing log records
  – PCRAM is faster than flash memory for small reads
  – Either SLC or MLC flash memory can be used with the IPL policy
Experimental evaluation
• A trace-driven simulation
• An IPL module implemented in the B+-tree based Berkeley DB
• A million key-value records inserted/searched
• In-memory log sector: 128B/512B
Accelerating In-Page Logging with Non-Volatile Memory (TCDE 2010)

Experimental evaluation
• Hardware platform
  – PCRAM (512MB; write granularity 128B)
  – Intel X25-M SSD (USB interface)
• Workload
  – A million key-value records inserted/searched/updated
  – B+-tree based Berkeley DB
  – Page size: 8KB
IPL-P: In-Page Logging with PCRAM (VLDB 2011 demo)
Outline
• Introduction
• Integrating PCM into the Memory Hierarchy
  − PCM for main memory
  − PCM for auxiliary memory
• Conclusion
Conclusion
• PCM is expected to play an important role in the memory hierarchy
• It is important to consider the read/write asymmetry of PCM when designing PCM-friendly algorithms
• Integrating PCM into a hybrid memory hierarchy may be more practical
• If PCM is used as main memory, some system applications (e.g., main memory database systems) must be revised to address PCM-specific challenges
Thank You!