smartdedup: optimizing deduplication for resource ......virtualized infrastructures, systems, &...
TRANSCRIPT
-
Virtualized Infrastructures, Systems, & Applications
SmartDedup: Optimizing Deduplication for Resource-constrained Devices
Qirui Yang Runyu Jin Ming [email protected], [email protected], [email protected]
Arizona State University
http://visa.lab.asu.edu
-
Resource Management on Edge and IoT
Data-intensive workloadso Multitude of sensorso Data-rich applicationso Multitasking systems
Limited on-device storageo Limited I/O performanceo Limited capacityo Limited endurance
2
• • •
• • •
-
Deduplication to the Rescue?
Well-known benefitso Eliminate redundant I/Os à improve performanceo Reduce flash writes à improve enduranceo Remove redundant data à improve utilization
Widely used in the cloudo Backup storage, primary storage, cache storage, …
But o Is there enough data duplication in device workloads?
o How to exploit it using limited resources on the device?
3SmartDedup
-
Smartphone Trace Analysis
Real-world device workloads are indeed I/O intensive
All workloads have a good level of duplicates
4SmartDedup
ID Daily I/Os(GB) Daily write
(GB)# of
months Duplication
ratio (%)
1 12.1 1.4 6 21.9
2 9.1 2.1 3 45.3
3 5.9 1 2.5 23.2
4 16.5 2.4 5 47.5
5 12.2 2.8 3 41.6
6 10.6 2.4 2.1 28.3
File system level traces collected from users from different countries
Trace: http://visa.lab.asu.edu/traces
-
More In-depth Analysis
5
Top 3 duplicate contributors for the traces
Trace1Trace2Trace3
Trace4Trace5Trace6
Five categories:
Resource files (res)
Database files (db)
Executables (exe)
Temporary files (tmp)
Multimedia (media)
-
More In-depth Analysis
Duplicate sources differ by devices
80% of duplicates are from files that are not entirely duplicate
System-level deduplication is necessary
6
Top 3 duplicate contributors for the traces
Trace1Trace2Trace3
Trace4Trace5Trace6
-
Challenges to Device Deduplication
Limited memoryo Deduplication relies on the in-memory fingerprint store for quickly
detecting duplicates
Limited storage performance and enduranceo Deduplication incurs additional metadata operations
Limited power and energy capacityo Fingerprinting is CPU intensive and draws quite a bit of power
7SmartDedup
-
SmartDedup—A Smart Deduplication Solution for Smart Devices
Cohesively designed two-level fingerprint storeso On-disk store complements the small in-memory store
Synergistically integrated hybrid deduplicationo Out-of-line deduplicates data skipped by in-line
Dynamically adapted deduplication o According to resource availability
and workload characteristics
8SmartDedup
-
In-memory Fingerprint Store
Store only important fingerprints
Organized by a prefix treeo Index groups of
fingerprints — small memory footprint
o Support the two-level fingerprint store
9SmartDedup
In-memoryFingerprint Store
First 6 bits0 1 2 63…
Second 6 bits
0 1 2 63…Second 6 bits
0 2 63…
leaf
prefix=0 prefix=129
…
…
FPa
FPb
PBNa
PBNbFPg PBNg
In-memoryfingerprint group
FPm
FPn
PBNm
PBNn…
FPq PBNq
In-memoryfingerprint group
…
leaf
1
-
On-disk Fingerprint Store
Maintain fingerprints that are not in the in-memory store
Share the same index with in-memory storeo Save memoryo Facilitate the fingerprint
migration
10SmartDedup
On-disk Fingerprint Store
On-diskfingerprint groupPBNx
FPh
FPi
PBNh
PBNi…
PBNyFPm
FPn
PBNm
PBNn…
FPs PBNs
…
In-memoryFingerprint Store
First 6 bits0 1 2 63…
Second 6 bits
0 1 2 63…Second 6 bits
0 2 63…
leaf
prefix=0 prefix=129
…
…
FPa
FPb
PBNa
PBNbFPg PBNg
In-memoryfingerprint group
FPm
FPn
PBNm
PBNn…
FPq PBNq
In-memoryfingerprint group
…
PBNx
FPk PBNk
leaf
1
-
Fingerprints Migration
Fingerprints migrate between memory and disk on demand
In-memory store keeps the most recently used fingerprints
Fingerprints evicted by groups to save I/Os
11SmartDedup
On-disk Fingerprint Store
On-diskfingerprint groupPBNx
FPh
FPi
PBNh
PBNi…
PBNyFPm
FPn
PBNm
PBNn…
FPs PBNs
…
In-memoryFingerprint Store
First 6 bits0 1 2 63…
Second 6 bits
0 1 2 63…Second 6 bits
0 2 63…
leaf
prefix=0 prefix=129
…
…
In-memoryfingerprint group
FPn
PBNm
PBNn…
FPq PBNq
FPm
In-memoryfingerprint group
…
PBNx
FPk PBNk
leaf
1
FPb
PBNa
PBNb…
FPt PBNt
FPa
-
Fingerprints Migration
Fingerprints migrate between memory and disk on demand
In-memory store keeps the most recently used fingerprints
Fingerprints evicted by groups to save I/Os
12SmartDedup
On-disk Fingerprint Store
On-diskfingerprint groupPBNx
FPh
FPi
PBNh
PBNi…
PBNyFPm
FPn
PBNm
PBNn…
FPs PBNs
…
In-memoryFingerprint Store
First 6 bits0 1 2 63…
Second 6 bits
0 1 2 63…Second 6 bits
0 2 63…
leaf
prefix=0 prefix=129
…
…
In-memoryfingerprint group
FPn
PBNm
PBNn…
FPq PBNq
FPm
In-memoryfingerprint group
…
PBNx
Free Space
FPk PBNk
leaf
1
FPb
PBNa
PBNb…
FPt PBNt
FPa
-
Fingerprints Migration
Fingerprints migrate between memory and disk on demand
In-memory store keeps the most recently used fingerprints
Fingerprints evicted by groups to save I/Os
13SmartDedup
On-disk Fingerprint Store
On-diskfingerprint groupPBNx
FPh
FPi
PBNh
PBNi…
PBNyFPm
FPn
PBNm
PBNn…
FPs PBNs
…
In-memoryFingerprint Store
First 6 bits0 1 2 63…
Second 6 bits
0 1 2 63…Second 6 bits
0 2 63…
leaf
prefix=0 prefix=129
…
…
In-memoryfingerprint group
FPn
PBNm
PBNn…
FPq PBNq
FPm
In-memoryfingerprint group
…
PBNx
Free SpaceFPk PBNk
leaf
1
FPb
PBNa
PBNb…
FPt PBNt
FPa
-
Hybrid Deduplication
In-line Deduplicationo Work in the I/O patho Immediately eliminate duplicate writes o Limited by the available resources on the device
Out-of-line Deduplicationo Work in backgroundo Slowly and thoroughly eliminate duplicate data
o Limited improvement on performance and endurance
14SmartDedup
-
Shared Fingerprint Index
Write Path
15SmartDedup
write
Skipped Buffer
In-memory FingerprintStore
Grp #1
Grp #2
On-disk FingerprintStore
Grp #1
Grp #2
…3
1 Fingerprinting2
Search in-memory store
Miss; pass it to out-of-line
4Search on-disk fingerprint store
4Search on-disk store
5Promote one fingerprint to in-memory store
If in-memory store full, evict a group of fingerprints to disk
6
-
Read Path
Native read path cannot utilize duplicate data in the page cacheo Page cache is indexed by logical block numbers (LBNs)o Cannot avoid reads to different LBNs if the same data is already in the page
cache
16
ReadLBNx
ReadLBNy
DATAxPBNx
DATAxLBNx
DATAxLBNy
DiskPage CacheMiss
Miss
PBNx
LBNx LBNyDuplicate Read
-
Optimized Read Path
Page Cache Indexo Map from PBNs to the corresponding pages in page cacheo Use this index to find data for reads with different LBNs
17SmartDedup
ReadLBNx
ReadLBNy
DATAxPBNx
DATAxLBNx
DiskPage CacheMiss
PBN-based IndexPBNx->DATAx
PBNx
LBNx LBNySave Read
-
Adaptive Deduplication
Based on resource availabilityo Keep CPU and disk utilization under 100%o Proportional to available battery lifeo Completely disable deduplication in low battery state
Based on the current duplication levelo Gradually reduce the rate if the duplication level is droppingo Quickly increase the rate if the duplication level is growing
18SmartDedup
-
Evaluation
19
Prototypeso EXT4 and F2FS (4KB fixed-size chunking)
Testing deviceso Nexus 5X and Raspberry Pi 3
Benchmarkso FIO—intensive I/O benchmarko DEDISbench—
workloads from real-world device imageso Real-world device traces
Baselineso Native EXT4 and F2FSo Dmdedup: block-level deduplication, flexible metadata backends [OLS 14’]o CAFTL: deduplication for resource-constrained FTL layer [FAST 11’]
Nexus 5X Raspberry Pi 3
CPU Qualcomm Snapdragon 808
Broadcom BCM2837
RAM 2GB 1GB
Storage 32GB eMMC 16GB SDHC UHS-1
OS Android Nougat Raspbian Stretch Lite
Kernel Linux 3.10 Linux 4.4
File System
EXT4 F2FS
-
FIO (on Nexus)
20SmartDedup
FIO Write FIO Read
0 20 40 60 80
100
0 25 50 75 100Thr
ough
put (
MB/
s)
Percentage of Duplicates
EXT4SmartDedup
DmdedupCAFTL
0 30 60 90
0 25 50 75 100Thro
ughp
ut (M
B/s)
Percentage of Duplicates
200 300
E.g., SmartDedup achieves 16.9% write speedup and 38.2% read speedup vs EXT4 with 25% of duplicates
-
FIO (Resource Usage)
21
Small CPU overhead (3.3%) vs. EXT4
Save battery (49.2%) vs. F2FS
Small memory footprint: 3.5MB
CPU Load Battery usage
0 200 400 600 800
1000 1200 1400
Mix 1 Mix 2Ba
ttery
Us
age(
W *
s) 0 5
10 15 20 25 30 35 40
Mix 1 Mix 2
CPU
Load
(%) EXT4
EXT4 (SmartDedup)F2FS
F2FS (SmartDedup)
I/O(GB)
Read/write ratio
Duplication ratio (%)
Mix 1 4 1 25
Mix 2 6 2 25
-
Trace Replay (on Nexus)
22
0%
20%
40%
60%
80%
Segment 1 Segment 2 Segment 3
Deduplication ratioWrite speedupStorage savingWrite reduction
ID Trace src
Write(GB)
Duplication ratio (%)
Read/write ratio
1 4 17.2 75.8 1.5
2 6 12.4 47.9 2.2
3 2 9.1 26.4 6.8
SmartDedup achieveso Up to 51.1% write speedup
o Up to 70.9% write reduction
-
Adaptive Deduplication (on Nexus)
23
Adapting to duplication level of current workload
0
200
400
600
800
0 40 80 120 160
Po
we
r C
on
sum
ptio
n (
mW
)
Time (s)
EXT4SmartDedup (basic)
SmartDedup (adaptive)
SmartDedup (adaptive)o Saves 14% power
overheado With only 8% loss
in deduplication ratio
-
Conclusions
Deduplication is important to smart devices—performance, utilization, endurance
SmartDedup achieves significant performance and endurance improvement with low resource cost
Trace: http://visa.lab.asu.edu/traces
24SmartDedup
Acknowledgementso Colleagues @ the ASU VISA Lab o National Science Foundation
Welcome to SmartDedup’sposter tonight 6:30pm