Understanding Intrinsic Characteristics and System Implications of Flash Memory
based Solid State Drives
Embedded Lab. Kim Sewoog
Feng Chen, David A. Koufaty, and Xiaodong Zhang
2009 ACM SIGMETRICS/Performance
Solid State Drive (SSD): a “pivotal technology”
http://www.youtube.com/watch?v=96dWOEa4Djs http://www.youtube.com/watch?v=pJMGAdpCLVg&feature=fvw
Motivation
HDD SSD
SSD Internals
Array of flash memory packages
Flash Translation Layer (FTL) in the SSD controller
Hybrid mapping, over-provisioning, interleaving, DMA for data processing, low power consumption, etc.
Uses the original host interface (SATA) for compatibility
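A minimal sketch of what an FTL does, assuming a plain page-level mapping rather than the hybrid mapping named on the slide. The class name TinyFTL and its fields are hypothetical. It only illustrates that flash pages cannot be overwritten in place, so each update goes to a fresh physical page and leaves an invalid page behind.

```python
class TinyFTL:
    """Toy page-level FTL (an assumption, not the hybrid scheme on the slide)."""

    def __init__(self, num_physical_pages):
        self.num_physical_pages = num_physical_pages
        self.mapping = {}      # logical page -> current physical page
        self.invalid = set()   # physical pages that hold stale data
        self.next_free = 0     # next never-written physical page

    def write(self, logical_page):
        """Write out of place and return the physical page that was used."""
        if self.next_free >= self.num_physical_pages:
            raise RuntimeError("no free pages; a real FTL would clean here")
        old = self.mapping.get(logical_page)
        if old is not None:
            self.invalid.add(old)  # the old copy becomes an invalid page
        self.mapping[logical_page] = self.next_free
        self.next_free += 1
        return self.mapping[logical_page]

    def read(self, logical_page):
        """Return the physical page that currently holds the logical page."""
        return self.mapping[logical_page]
```

Rewriting the same logical page twice consumes two physical pages, which is why cleaning and over-provisioning appear later in the deck.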
SSD Block Diagram
< SSD Block Diagram >
The common belief: SSD access performance is uncorrelated with access patterns!
Betrayal: unexpected performance issues, uncertain behavior
SSD is not just another ‘faster’ disk!
We need to understand: intrinsic limits, unexpected performance behavior
Belief and betrayal of SSD
7 questions
1. Access patterns of workloads
2. Random writes
3. Caching for optimizing performance
4. Interference between read and write operations
5. Background operation effects
6. Internal fragmentation
7. System implications
Experiments and analysis
Solid State Drives
Experiment system: Dell™ PowerEdge™ 1900 server
Benchmarks
Intel® Open Storage Toolkit : generates various types of I/O workloads
blktrace / blkparse : trace and parse I/O activities (completion events)
Measurement environment
Bandwidths
4 distinct workloads : random/sequential read/write
Random workload : 4KB request size, 1024MB storage space
Sequential workload : 256KB request size
32 parallel jobs, direct I/O, 30 seconds
Comparison with a hard disk (WD1600JS)
General Tests
Various combinations of factors
3 access patterns : sequential / random / stride
10-second runs, one job, synchronous I/O
Full utilization for initialization (using 256KB sequential writes)
Micro-benchmark Workloads
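The three access patterns can be sketched as offset generators. This is an illustration only, not the Intel Open Storage Toolkit's implementation. The function name, the seed parameter, and the reading of "stride" as the gap between the end of one request and the start of the next are all assumptions.

```python
import random

def offsets(pattern, count, request_size, span, stride=0, seed=0):
    """Return `count` byte offsets inside [0, span) for one access pattern.

    Assumes `span` and `stride` are multiples of `request_size`, so every
    offset is aligned and the request never runs past the end of the span.
    """
    slots = span // request_size
    if pattern == "sequential":
        return [(i % slots) * request_size for i in range(count)]
    if pattern == "random":
        rng = random.Random(seed)  # seeded so that a run is repeatable
        return [rng.randrange(slots) * request_size for _ in range(count)]
    if pattern == "stride":
        step = request_size + stride  # assumed meaning of the stride step
        return [(i * step) % (slots * request_size) for i in range(count)]
    raise ValueError(f"unknown pattern: {pattern}")
```

For example, offsets("stride", 3, 4096, 1 << 20, stride=4096) skips one 4KB slot between consecutive requests.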
Read operations on the SSD
SSD-L : uniform distribution of latencies
SSD-M/H : non-uniform distribution of latencies
Reason : ① specification
② a readahead mechanism
③ multi-plane operations
④ interleaving
Distribution of access latencies
65% 65%
Write operations on the SSD
SSD-L : non-uniform distribution of latencies
SSD-M/H : uniform distribution of latencies
Independent of workload access patterns
Distribution of access latencies
88%
over 90%
Sequential vs. non-sequential writes on SSD-L
SSD-L (seems to) use a small buffer
Sequential write : striped
Non-sequential write : not striped
Distribution of access latencies
64 requests
from host to buffer
initiate the prog. process & write data into flash memory in parallel
from buffer to register
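A rough sketch of what striping a sequential write could look like, assuming a simple round-robin placement over a hypothetical number of flash packages (4 here). The slides do not describe SSD-L's actual placement policy or buffer logic, so this only illustrates why striped pages can be programmed in parallel.

```python
import math

def stripe_sequential(first_page, num_pages, num_packages=4):
    """Assign consecutive logical pages to packages in round-robin order."""
    layout = {pkg: [] for pkg in range(num_packages)}
    for page in range(first_page, first_page + num_pages):
        layout[page % num_packages].append(page)
    return layout

def program_rounds(num_pages, num_packages=4):
    """Program rounds needed if each package programs one page per round."""
    return math.ceil(num_pages / num_packages)
```

With these assumptions, eight buffered pages need two program rounds on four packages but eight rounds on a single package.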
Large RAM cache (disk cache)
Using the hdparm tool to enable and disable the disk cache
Disk cache off : latencies increase on both SSD-M and SSD-H
Performance comparison between SSD-M and SSD-H without the disk cache
SSD-H shows good performance -> SLC
Disk cache effect of SSDs
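A small sketch of toggling the disk cache with hdparm from a script. It assumes the toggle meant on the slide is hdparm's write-caching option (-W0 to disable, -W1 to enable). The device path /dev/sdX is a placeholder and the helper names are hypothetical. By default the helper only returns the command, because running it needs root and changes drive state.

```python
import subprocess

def write_cache_command(device, enable):
    """Build the hdparm argv that turns the drive's write cache on or off."""
    return ["hdparm", "-W1" if enable else "-W0", device]

def set_write_cache(device, enable, dry_run=True):
    """Return the command; execute it only when dry_run is False."""
    argv = write_cache_command(device, enable)
    if not dry_run:
        subprocess.run(argv, check=True)  # requires root privileges
    return argv
```

Calling set_write_cache("/dev/sdX", False) returns the command for the cache-off runs without touching any device.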
Writes : high-cost internal operations
Cleaning and asynchronous write-back of dirty data from the disk cache
Negatively affect foreground read operations
Reads : competition for buffer space with writes
Break sequential patterns
4 workload patterns
1) Read(n) + Write(n)
2) Write(n) + Read(n)
3) Read(n) + Write(n+1)
4) Read(n) + Write(n+4MB)
Interference between read/write operations
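The four read/write patterns can be sketched as interleaved request streams. This is one reading of the slide notation, not the authors' tool. It assumes n is the byte offset of the n-th request in a sequential scan, that "n+1" means the next request-sized slot, and a 4KB request size. The function and pattern names are hypothetical.

```python
MB = 1024 * 1024

def interleaved(pattern, count, request_size=4096):
    """Yield (op, byte_offset) pairs for one of the four slide patterns."""
    # Distance between each read and its paired write (assumed units).
    shifts = {
        "R(n)+W(n)": 0,
        "W(n)+R(n)": 0,
        "R(n)+W(n+1)": request_size,
        "R(n)+W(n+4MB)": 4 * MB,
    }
    if pattern not in shifts:
        raise ValueError(f"unknown pattern: {pattern}")
    write_first = pattern.startswith("W")
    for i in range(count):
        n = i * request_size
        read = ("read", n)
        write = ("write", n + shifts[pattern])
        yield from ((write, read) if write_first else (read, write))
```

In the last two patterns the reads and the writes each advance sequentially on their own, but the stream the drive sees alternates between two locations.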
only non-sequential pattern simultaneously, sequential pattern
SSD-L : substantial degradation
SSD-L optimizes performance for sequential writes
Interference between read/write operations
Non-shared buffer
Non-sequential write
SSD-M/H
Interference between read/write operations
readahead effect
random read latency
asynchronous write-back
Writes trigger most background operations
Sequential workload with a 4KB request size
Request type : random (50% write requests)
Interval time : 10ms
Background operation effects
disk cache
Background operations are completed during the idle periods !
Randomness effects (only SSD-L)
Random write : random range from 1GB to 30GB
Stride write : stride step from 4KB to 128MB
Request size : 4KB
Workload randomness effects
① metadata synchronization
② log block merging
16MB: individual mapping unit
Internal fragmentation : invalid pages in flash memory blocks
Cleaning efficiency : cleaning costs (number of blocks × valid pages per block) page reads and writes, plus (number of blocks) erases
Non-contiguous physical pages : the readahead mechanism is not effective
Internal fragmentation
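The cleaning-cost arithmetic on this slide can be worked through in a few lines. The latency defaults below (25 µs read, 200 µs program, 1500 µs erase) are illustrative assumptions, not figures from the slides, and the function name is hypothetical.

```python
def cleaning_cost(num_blocks, valid_pages_per_block,
                  read_us=25.0, program_us=200.0, erase_us=1500.0):
    """Estimate the work needed to reclaim `num_blocks` flash blocks.

    Each valid page is read and rewritten once, then each block is erased.
    The latency defaults are assumed values used only for illustration.
    """
    copies = num_blocks * valid_pages_per_block
    return {
        "page_reads": copies,
        "page_writes": copies,
        "erases": num_blocks,
        "time_us": copies * (read_us + program_us) + num_blocks * erase_us,
    }
```

Under these assumed latencies, copying valid pages outweighs the erases once a block holds more than a handful of valid pages, so fragmented blocks with many valid pages are expensive to clean.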
No readahead effect
Over-provisioning(25% of the SSD capacity)
Many well understood features of SSDs
Many unexpected performance issues
1. Access patterns of workloads
2. Random writes
3. Caching for optimizing performance
4. Interference between read and write operations
5. Background operation effects
6. Internal fragmentation
7. System implications
Conclusion
Reference
N. Agrawal, V. Prabhakaran, T. Wobber, J. D. Davis, M. Manasse, and R. Panigrahy. “Design tradeoffs for SSD performance”, In Proc. of USENIX’08, 2008.
Blktrace. http://linux.die.net/man/8/blktrace.
S. Lee, D. Park, T. Chung, D. Lee, S. Park, and H. Song. “A log buffer based flash translation layer using fully associative sector translation”. In ACM Trans. on Embedded Computing Systems, 2007.
M. Mesnier. Intel open storage toolkit. http://www.sourceforge.org/projects/intel-iscsi.
V. Prabhakaran, T. L. Rodeheffer, and L. Zhou. “Transactional flash”. In Proc. of OSDI’08, 2008.
THANK YOU !