HDFS Erasure Coding Explained with Diagrams
Who am I
佐々木 海(Kai Sasaki)
Software Engineer at Treasure Data Inc. http://www.treasuredata.com
Hadoop, Spark, DL4J
Erasure Coding
[Figure: blocks stored in HDFS]
Capacity overhead: x1.5
Redundancy: 3
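The x1.5 and 3 figures can be reproduced with simple arithmetic. The sketch below assumes a layout of 6 data units plus 3 parity units (the slide does not state the layout; this is an assumption chosen because it yields exactly these numbers) and compares it with 3-way replication. The class and method names are hypothetical and are not part of Hadoop.

```java
// Hypothetical helper, not a Hadoop class: compares the storage cost of
// replication with that of an erasure-coded layout.
final class StorageOverhead {

    /** Raw bytes stored per logical byte with n-way replication. */
    static double replicationOverhead(int replicas) {
        return replicas;
    }

    /** Replication survives the loss of all copies but one. */
    static int replicationTolerance(int replicas) {
        return replicas - 1;
    }

    /** Raw bytes stored per logical byte with d data units and p parity units. */
    static double erasureCodingOverhead(int dataUnits, int parityUnits) {
        return (double) (dataUnits + parityUnits) / dataUnits;
    }

    /** A Reed-Solomon style code can rebuild the data after losing up to p units. */
    static int erasureCodingTolerance(int parityUnits) {
        return parityUnits;
    }

    public static void main(String[] args) {
        // Assumed layout: 6 data units + 3 parity units.
        System.out.println("EC overhead:          x" + erasureCodingOverhead(6, 3));
        System.out.println("EC tolerated losses:  " + erasureCodingTolerance(3));
        // Baseline: 3-way replication.
        System.out.println("Rep overhead:         x" + replicationOverhead(3));
        System.out.println("Rep tolerated losses: " + replicationTolerance(3));
    }
}
```

Under these assumptions the erasure-coded layout stores half as many raw bytes as 3-way replication while tolerating one more lost unit.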
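[Figure: a BlockId is a 64-bit long (bits 0 to 64). Labels: BlockInfo, …, BlockGroup. A BlockGroup contains several Blocks (index 0, index 1, index 2); the ID is split into a 60-bit GroupId and a 4-bit index.]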
Saving memory
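The memory saving comes from the ID layout: if every block in a group shares the same 60-bit GroupId and differs only in a 4-bit index, the namespace can keep one entry per group and derive the member block IDs on demand. A minimal sketch of that packing follows. It assumes the index occupies the low 4 bits of the long (the slide gives the field widths but the bit order is an assumption here), and the class and method names are hypothetical rather than Hadoop's own.

```java
// Hypothetical helper, not a Hadoop class: packs a block group ID and a
// block index into a single 64-bit long.
final class StripedBlockId {

    static final int INDEX_BITS = 4;                        // width of the index field
    static final long INDEX_MASK = (1L << INDEX_BITS) - 1;  // 0xF, low 4 bits

    /**
     * Builds the ID of one block inside a group.
     * groupId must have its low 4 bits cleared so the index fits there.
     */
    static long compose(long groupId, int index) {
        if (index < 0 || index > INDEX_MASK) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        if ((groupId & INDEX_MASK) != 0) {
            throw new IllegalArgumentException("low 4 bits of groupId must be zero");
        }
        return groupId | index;
    }

    /** Recovers the group ID by clearing the index bits. */
    static long groupIdOf(long blockId) {
        return blockId & ~INDEX_MASK;
    }

    /** Recovers the position of the block inside its group. */
    static int indexOf(long blockId) {
        return (int) (blockId & INDEX_MASK);
    }

    public static void main(String[] args) {
        long groupId = 0x1230L; // example value with the low 4 bits clear
        for (int i = 0; i < 3; i++) {
            long id = compose(groupId, i);
            System.out.println("index " + indexOf(id) + " -> blockId 0x" + Long.toHexString(id));
        }
    }
}
```

With this layout, a lookup for any member block can mask off the low 4 bits and find the single group entry, so per-block entries are not needed.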
Summary
• Namespace -> saving memory (BlockInfoStriped, BlockIdManager)
• Writing side -> saving disk space (INodeFile)
• Reading side -> saving reading time (DFSStripedInputStream)