wide-area cooperative storage with cfs. overview cfs = cooperative file system peer-to-peer...

26
Wide-area cooperative storage with CFS

Upload: reginald-lyons

Post on 21-Dec-2015

221 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Wide-area cooperative storage with CFS

Page 2: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Overview

• CFS = Cooperative File System

• Peer-to-peer read-only file system

• Distributed hash table for block storage

• Lookup performed by Chord

Page 3: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Design Overview

• CFS clients contain 3 layers:– A file system client– Dhash storage layer– Chord lookup layer

• CFS servers contain 2 layers:– Dhash storage layer– Chord lookup layer

Page 4: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Overview con’t

• Disk blocks:DHash blocks::Disk Addresses:Block Identifiers

• CFS file systems are read-only as far as clients are concerned– can be modified by its publisher

Page 5: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

File System Layout

• Insert file system blocks into the CFS system using a content hash as its identifier

• Then signs the root block with its private key

• Inserts the root block into CFS using the corresponding public key as its identifier

Page 6: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Publisher Updates

• Updating the file system’s root block to point to the new data

• Authentication by checking to make sure that the same key signed both old and new block– Timestamps prevent replays of old data– File systems are updated without changing the

root blocks identifier

Page 7: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

CFS properties

• Decentralized control

• Scalability

• Availability

• Load balance

• Persistence

• Quotas

• Efficiency

Page 8: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Chord Layer

• Same Chord protocol as mentioned earlier with a few modifications– Server Selection– Node ID authentication

Page 9: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Quick Chord Overview

• Consistent Hashing– Node joins/leaves

• Successor lists– O(N)

• finger tables– O(log N)

Page 10: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Server Selection

• Chooses next node to contact from finger table– Gets the closest node to the destination

– What about network latency?

• Introduced measuring and storing latency in the finger tables– Calculated when acquiring finger table entries

– Reasoning: RPCs to different nodes will incur varying latency so you want to choose on that minimizes this

Page 11: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Node ID authentication

• Idea: network security• All chord IDs must be in the form h(x)

– H = SHA-1 has function

– x = the nodes IP + virtual node index

• When a new node joins the system– Existing node will send a message to the new node

– The ID must match the claimed IP + virtual node index hash to be accepted

Page 12: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

DHash Layer

• Handles– Storing and retrieving blocks

– Distribution

– Replication

– Caching of blocks

• Uses Chord to locate blocks• Key CFS design: split each file system into blocks

and distribute those across many servers

Page 13: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Replication

• DHash replicates each block on k servers immediately after the blocks sucessor– Why? …even if the block’s successor fails,

the block is still available– Server independence guaranteed because

location on the ring is determined by hash of IP not by physical location

Page 14: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Replication con’t

• Could save space by storing coded pieces of blocks but….storage space is not expected to be a highly-constrained resource

• Placement of replicas allows a client to select the replica with the fastest download– Result from Chord lookup will be the

immediate predecessor to the node with X– This node’s successor table contains entries for

the latencies of the nearest Y nodes

Page 15: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Caching

• Caching blocks prevents overloading servers with popular data

• Using Chord…– Clients contact server closer and closer to the desired

location

– Once source is found or an intermediate cached copy is found all servers just contacted receive file to cache

• Replaces cached blocks in least-recently-used order

Page 16: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Load Balance

• Virtual servers…1 real server acting as several “virtual” servers– Administrator can configure the number of based on

server’s storage and network capabilities

• Possible Side-effect: creating more hops in Chord algorithm– More nodes = more hops

– Solution: allow virtual server’s to look at each other’s tables

Page 17: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Quotas

• Control amount of data a publisher can inject– Based on reliable ID of publishers won’t work because

it requires centralized administration

– CFS uses quotas based on IP address of publishers

– Each server imposes a 0.1% limit…so as the capacity grows the total data amount grows

• Not easy to subvert this system because publishers must respond to initial confirmation requests

Page 18: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Updates and Deletion

• Only allows the publisher to modify data– 2 conditions for acceptance

• Marked as a content-hash block supplied key = SHA-1 hash of the blocks content

• Marked as a signed block signed by public key = SHA-1 hash is the block’s CFS key

• No explicit delete– Publishers must refresh blocks if they want them stored

– CFS deletes blocks that have not been refreshed recently

Page 19: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Experimental Results – Real Life

• 12 machines over the internet – US, the Netherlands, Sweden and South Korea

Page 20: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Lookup

• Range of servers• 10,000 blocks• 10,000 lookups for

random blocks• Distribution is

roughly linear on log plot

…so O(log N)

Page 21: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Load Balance

• Theoretical:– 64 physical servers

– 1,6 and 24 virtual servers each

• Actual:– 10,000 blocks

– 64 actual

– 6 virtual each

Page 22: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Caching

• Single block (1)• 1,000 server system

• Average without:5.7• Average: 3.2

– (10 look-ups)

Page 23: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Storage Space Control

• Varying number of virtual servers

• 7 physical servers• 1, 2, 4, 8, 16, 32, 64

and 128 virtual• 10,000 blocks

Page 24: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Effect of Failure

• 1,000 blocks• 1,000 server system• Each block has 6

replicas• Fraction of servers fail

before the stabilization algorithm is run

Page 25: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Effect of Failure con’t

• Same set-up as before• X% of servers fail

Page 26: Wide-area cooperative storage with CFS. Overview CFS = Cooperative File System Peer-to-peer read-only file system Distributed hash table for block storage

Conclusion

• Highly scalable, available, secure read-only file system

• Uses peer-to-peer Chord protocol for lookup

• Uses replication and caching to achieve availability and load balance

• Simple but effective protection against inserting large amount of malicious data