Large-scale HDFS & ErasureCoding #yjdsw3
TRANSCRIPT
-
15/12/02
HDFS & ErasureCoding
-
P2
2004 2008
2013 Hadoop 2015 OSS (Hadoop)
2015/10
-
P3 Agenda
Hadoop HDFS
-
P4 Agenda
Hadoop HDFS
-
P5
Copyright (C) 2015 Yahoo Japan Corporation. All Rights Reserved.
y = 3.284 e^{0.0021x}
HDFS 2.1
-
P6
y = 19.859 e^{0.0033x}
CPU 3.3
1
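The two fitted curves on P5 and P6 can be checked against the factors printed next to them (2.1 for HDFS, 3.3 for CPU). The sketch below assumes that x is measured in days and that the printed factors are per-year growth; neither unit survives in the transcript, so both are assumptions. The function name annual_factor is illustrative.

```python
import math


def annual_factor(rate_per_day, days=365):
    """Growth factor of y = a * e^(rate * x) over `days` units of x.

    The leading coefficient a cancels out, so only the exponent matters.
    Assumption: x is in days (the slide does not state the unit).
    """
    return math.exp(rate_per_day * days)


hdfs_growth = annual_factor(0.0021)  # fit from P5: y = 3.284 e^{0.0021x}
cpu_growth = annual_factor(0.0033)   # fit from P6: y = 19.859 e^{0.0033x}

print(f"HDFS usage grows about {hdfs_growth:.1f}x per year")
print(f"CPU usage grows about {cpu_growth:.1f}x per year")
```

Under that assumption the exponents give roughly 2.1 to 2.2 and 3.3 to 3.4, which is consistent with the numbers on the slides.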
-
P7
Hadoop
2010 2011 2012 2013 2014
6,000
3,000
-
P8 Agenda
Hadoop HDFS
-
P9 HDFS
[Diagram: HDFS architecture]
NameNode (active): fsimage
NameNode (standby): fsimage
JournalNode: editlog
client: ls, create, mv, rm
DataNode: blocks, BlockReport
-
P10 HDFS
[Diagram: HDFS architecture at scale]
NameNode (active): fsimage
NameNode (standby): fsimage
JournalNode
client: create, rm
DataNode: blocks, BlockReport
3000 connection
1.6
2015/11: 1.3 files and directories, 1.6 blocks
3000
/
:1/360PB->20PB
two
-
P11 Agenda
Hadoop HDFS
BlockReport()Storage
-
P12 BlockReport
[Diagram: BlockReport]
DataNode: blocks, BlockReport
NameNode: fsimage, BlocksMap
BlocksMap (block -> dn):
blockId -> dn1, dn2, dn3
blockId -> dn4, dn5, dn6
blockId -> dn7, dn8, dn9
-
P13
BlockReport
diff
under/over Replication
pendingReplications
corruptReplicas
///
blockDN
firstBlockReport
BlockReport
HDFS-7980
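The BlocksMap on P12 and the diff step listed on this slide can be illustrated with a toy model. This is a minimal sketch, not NameNode code: the class, its method names, and the block ids are all hypothetical, and real HDFS tracks much more state (pendingReplications, corruptReplicas, and so on). It only shows why a first report can skip the diff while a later full report cannot.

```python
from collections import defaultdict


class BlocksMap:
    """Toy model of a block -> DataNodes map (all names are illustrative)."""

    def __init__(self):
        self.locations = defaultdict(set)  # block id -> DataNode names

    def first_block_report(self, dn, reported):
        """Fast path: nothing is known about dn yet, so add every block."""
        for block in reported:
            self.locations[block].add(dn)

    def full_block_report(self, dn, reported):
        """Diff path: compare the report with what the map already holds."""
        reported = set(reported)
        known = {b for b, dns in self.locations.items() if dn in dns}
        added, removed = reported - known, known - reported
        for block in added:
            self.locations[block].add(dn)
        for block in removed:
            self.locations[block].discard(dn)
        return added, removed

    def replication_state(self, expected=3):
        """Return (under-replicated, over-replicated) block ids."""
        under = sorted(b for b, d in self.locations.items() if len(d) < expected)
        over = sorted(b for b, d in self.locations.items() if len(d) > expected)
        return under, over


bm = BlocksMap()
for dn in ("dn1", "dn2", "dn3"):
    bm.first_block_report(dn, [100, 101])

# dn3 later reports that it lost block 101 and gained block 102.
added, removed = bm.full_block_report("dn3", [100, 102])
under, over = bm.replication_state()
print(added, removed, under, over)
```

The diff path has to scan what is already known for the DataNode, which is the expensive part when every DataNode reports at once after a NameNode restart.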
-
P14 firstBlockReport
NameNode DataNode
full
increment
BlockReport fullBlockReport(1.6)
24 incrementBlockReport
1
NameNode DataNode
full
increment
busy
-
P15 BlockReport
(incrementBlockReport) DataNode stop NameNode/ DataNode start
10
-
P16 BlockReport
(HDFS-7980) fullBlockReport Hadoop 2.6.1
NameNode DataNode
full
increment
busy
-
P17 Agenda
Hadoop HDFS
BlockReport()Storage
-
P18 HDFS Overhead - 200%
Replication(3)
File: Block1, Block2, Block3
Block1: rep1, rep2, rep3 on dn1, dn2, dn3
Durability=2, Overhead=200%
-
P19 ErasureCoding
ReedSolomon(6,3)
File: Block1, Block2, ... BlockN
Block1: stripe1, stripe2, ... StripeN
DataBlock: dn1 (s1, s7), dn2 (s2, s8), ... dn6 (s6, s12)
ParityBlock: dn7 (p1, p4), dn8 (p2, p5), dn9 (p3, p6)
Durability=3, Overhead=50%
HDFS-7285
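The durability and overhead figures for Replication(3) and ReedSolomon(6,3) follow from simple arithmetic, sketched below. The function names are illustrative, and the cell-to-DataNode mapping is only an assumption inferred from the diagram (s1 and s7 on dn1, p1 and p4 on dn7); it is not taken from the HDFS-7285 code.

```python
def replication(replicas):
    """n-way replication: one copy of the data plus (replicas - 1) extra copies."""
    return {
        "tolerated_failures": replicas - 1,
        "overhead_pct": (replicas - 1) * 100,
    }


def reed_solomon(data_units, parity_units):
    """RS(data, parity): any `parity_units` lost units can be rebuilt."""
    return {
        "tolerated_failures": parity_units,
        "overhead_pct": parity_units * 100 / data_units,
    }


def data_cell_datanode(k, data_units=6):
    """DataNode holding data cell s<k> in a round-robin striped layout (assumed)."""
    return f"dn{(k - 1) % data_units + 1}"


def parity_cell_datanode(k, data_units=6, parity_units=3):
    """DataNode holding parity cell p<k>, placed after the data DataNodes (assumed)."""
    return f"dn{data_units + (k - 1) % parity_units + 1}"


print(replication(3))        # Durability=2, Overhead=200%
print(reed_solomon(6, 3))    # Durability=3, Overhead=50%
print(data_cell_datanode(7), parity_cell_datanode(4))
```

With six data units and three parity units, the same nine DataNodes tolerate one more failure than triple replication while storing a quarter of the redundant bytes.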
-
P20
RESOLVED:
56
-
P21 Replication vs ErasureCoding
Ac
-
P22
HDFS-8425 TestDFSIO
DataNode 20
CPU Xeon E5-2630L 2.00GHz/2CPU
RAM 64G
Disk SATA 3TB x4
Network 1G
2015
-
P23 -Read
TestDFSIO -nrFiles (10,20,30,40,50,60,70,80,90,100) -size 384MB
-
P24 -Write
TestDFSIO -nrFiles (10,20,30,40,50,60,70,80,90,100) -size 384MB
ErasureCoding
-
P25 -CPU
110~200 files write: DataNode CPU
replication, ErasureCoding
5.5%
-
P26 ErasureCoding
Network, CPU, Storage
-
P27
HDFS-6584 Storage Policy
Storage Policy Data
HOT
WARM
COLD
:log
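The three policies named above differ in which storage type each replica is placed on. The sketch below models the usual definitions (HOT on DISK, WARM with one replica on DISK and the rest on ARCHIVE, COLD on ARCHIVE) as a small lookup. It is an illustration only: the names POLICIES and storage_types are hypothetical, and fallback storage types are left out.

```python
# Preferred storage types per policy; the last entry repeats for extra replicas.
POLICIES = {
    "HOT": ["DISK"],
    "WARM": ["DISK", "ARCHIVE"],
    "COLD": ["ARCHIVE"],
}


def storage_types(policy, replicas=3):
    """Return the storage type chosen for each replica under `policy`."""
    types = POLICIES[policy]
    return [types[min(i, len(types) - 1)] for i in range(replicas)]


for name in ("HOT", "WARM", "COLD"):
    print(name, storage_types(name))
```

Data such as old logs that is rarely read is a natural candidate for COLD, since it moves every replica off the DISK tier.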