differentiated admission for peer-to-peer systems joint work with prof. ht kung, harvard university,...

37
Differentiated Admission for Peer-to-Peer Systems Joint Work with Prof. HT Kung, Harvard University, USA 吳吳吳 吳吳吳吳 吳吳吳吳吳吳 吳吳吳吳 ( 吳吳吳吳吳 吳吳吳吳吳吳吳 吳吳吳吳吳吳 ) [email protected] November 17, 2004

Post on 18-Dec-2015

223 views

Category:

Documents


2 download

TRANSCRIPT

Differentiated Admission forPeer-to-Peer Systems

Joint Work with Prof. HT Kung, Harvard University, USA

吳俊興

高雄大學 資訊工程學系 助理教授( 中央研究院 資訊科學研究所 合聘助研究員 )

[email protected]

November 17, 2004

2

Outline

I. Introduction to P2P

II. Motivation and Main Ideas– Freeloader Problem and Differentiated Admission

III. System Design1. Reputation-based, differentiated admission control2. Eigenvector-based reputation system3. Adaptation system of willingness-to-serve4. Sampling technique5. Distributed trust-enhancement scheme

IV. Summary and Concluding Remarks

3

Peer-to-peer (P2P) Model

“Peer-to-Peer (P2P) is a way of structuring distributed applications such that the individual nodes have symmetric roles. Rather than being divided into clients and servers each with quite distinct roles, in P2P applications a node may act as both a client and a server.”

Excerpt from the Charter of Peer-to-Peer Research Group, IETF/IRTF, June 24, 2003

http://www.irtf.org/charters/p2prg.html

I. Introduction to P2P

Heterogeneous and Autonomous

4

Client-server Model

U11

U12

U13

U21

U22

U31

S

U32

The server and the network become the bottlenecks and points of failure

•DDoS•Flash Crowd

Clients ServerRequest

Service

Clients and servers each with distinct roles

I. Introduction to P2P

5

Content Distribution Networks

Hosting + Hierarchical Proxies+ DNS Request Routing (Akamai, CacheFlow, etc.)

U11

U12

U13

U21

U22

U31

S

U32

CR1

CR2

CR3

CRb

CRa

CRContent Router

or Peer Node

Overlay Network

I. Introduction to P2P

Name: www2.microsoft.akadns.net (www.microsoft.com.nsatc.net)Addresses: 207.46.249.221, 207.46.134.221, 207.46.249.189, 207.46.134.189, …Aliases: www.microsoft.com, www.microsoft.akadns.net

Name: www.yahoo.akadns.netAddresses: 66.218.71.93, 66.218.70.50, 66.218.71.91, 66.218.71.86, 66.218.71.89, …Aliases: www.yahoo.com

Name: www.google.akadns.netAddresses: 66.102.11.104, 66.102.11.99Aliases: www.google.com

6

Content Distribution Networks (Cont.)

ICPs

Content NetworkProviders (CNPs)

“Content Foundry”(Semiconductor Foundry)

ISPs

ICPs

ICPs need contents, servers, equipments, IT experts, etc.

ISPs

“Serverless ICP”(Fabless IC Design House)

Now, ICPs need only contents and trusted CNPs.

ParadigmShift

I.C. =

Internet Contents

I. Introduction to P2P

7

NapsterU11

U12

U13

U21

U22

DS

U32

U13 P2P Users

Content SearchContent TransferU31

sharing MP3 files

Any two P2P nodes can exchange contents directly+ Running P2P software is much easier than maintaining servers+ Reduce the load of server, even without servers

I. Introduction to P2P

8

Paradigm Shift of Computing System Models

1980~Terminal-Mainframe(Super-computing)

1990~

Client-Server(Micro-computing

/Personal Computer)

2000~

Peer-to-Peer(Macro-computing)

RS-232 Dialup/10M Ethernet ADSL/100M+ Ethernet

VT100/DOS Windows 31/95 Linux/Windows 2K

I. Introduction to P2P

9I. Introduction to P2P

P2P Applications• P2P File Swapping (Sharing)

– Napster, FreeNet, Gnutella, KaZaA, eDonkey, EZPeer, Kuro• P2P Communication

– NetNews (NNTP), Instant Messaging (IM), Skype• P2P Lookup Services and Their Applications

(Distributed Hash Tables and Global Repositories)– IRIS, Chord/CFS, Tapestry/OceanStore, Pastry/PAST, CAN

• P2P Overlay Networking– (Inter-domain Routing – BGP), RON, PDF, Detour, LRR

• P2P Multimedia Streaming (Application-layer Multicast)– CoopNet, Zigzag, Narada, P2Cast, Streamlla

• Proxies and Content Distribution Networks– Squid, Akamai, DigitIsland, Amplicast

• Overlay Testbed– PlanetLab, NetBed/EmuLab

• Other Areas– P2P Gaming, Grid Computing

10

Most Popular Titles in Windows

•From CNet’s download.com, Week Ending Jan 25, 2004•Kazaa is the Top 1 title in the history, more than 2M downloads a week.•Top downloads in SourceForge: eMule, BitTorrent•Taiwan: Kuro, EzPeer, eDonkey, …

# Titles WeeksDownloads

This Week

Total

Downloads1 Kazaa Media Desktop 91 2,376,379 320,322,6272 ICQ Lite 68 614,061 39,306,3703 Ad-aware 62 550,018 27,483,9354 iMesh 196 462,495 65,806,6185 WinZip 380 428,195 110,442,4236 ICQ Pro 2003b 331 337,363 241,257,4367 Spybot - Search & Destroy 48 325,234 10,777,5508 AOL Instant Messenger (AIM) 68 288,657 29,551,9289 Morpheus 143 209,759 119,261,633

10 LimeWire 27 161,299 18,307,602

I. Introduction to P2P - Status

11

Example of P2P File Swapping: KaZaA

Users Online 3,824,411

Total Number of MP3 Files 294,468,383

Total Size of Movies Files 894,435

Total Number of Files Shared 2,936,033,461

Total Size of Shared Files 8,283,375GB

As of July 2003

• 77% of surveyed companies had at least one installation of file-swapping software (AssetMetrix, July 16, 2003)

• Today more KaZaA traffic than Web traffic!I. Introduction to P2P - Status

12

Skype – P2P VoIPBackground

– Created by Niklas Zennstrom and Janus Friis, founders of KaZaA

– Beta launched August 29, 2003– As of 16 Nov 2004: 34,642,017 downloads and

2,551,163,700 Minutes served (1,771,641 days; 4853 years)– Uses 3-16 KB/s while calling and 0-0.5 KB/s while idle

Features– High-quality Internet conference calls– Firewall and NAT traversal– Global decentralized user directory– Intelligent routing for encrypted calls through effective path– Encrypting all calls and instant messages end-to-end

13

Example of Centralized P2P Systems:Napster

• Announced in January 1999 by Shawn Fanning for sharing MP3 files and pulled plug in July 2001

• Centralized server for search, direct file transfer among peer nodes

• Proprietary client-server protocol and client-client protocol

• Relying on the user to choose a ‘best’ source

• Disruptive, proof of concepts• IPR and firewall issues

P P

Directory

I. Introduction to P2P – Basic Models

14

Example of Decentralized P2P Systems: Gnutella

• Open source

• 3/14/2000: Released by NullSoft/AOL, almost immediately withdrawn, and became open source

• Message flooding: serverless, decentralized search by message broadcast, direct file transfer using HTTP

• Limited-scope query

C S

1

2

2

1

11

22 2

3

3Client

Server

Servent (=Serv er + Cli ent)

I. Introduction to P2P – Basic Models

15

Example of Unstructured P2P Systems: Freenet

• Ian Clarke, Scotland, 2000• Distributed depth-first

search, Exhaustive search• File hash key,

lexicographically closest match

• Store-and-forward file transfer

• Anonymity• Open source

C S321 4

56

I. Introduction to P2P – Basic Models

16

Example of Hybrid P2P Systems: FastTrack / KaZaA

• Proprietary software developed by FastTrack in Amsterdam and licensed to many companies

• Summer 2001, Sharman networks, founded in Vanuatu, acquires FastTrack

• Hierarchical supernodes (Ultra-peers)

• Dedicated authentication server and supernode list server

• From user’s perspective, it’s like Google.

• Encrypted files and control data transported using HTTP

• Parallel download• Automatically switch to new server

Supernodes inoverlay mesh

C

S

I. Introduction to P2P – Basic Models

17

Example of Structured P2P Systems:Chord

• Frans Kaashoek, et. al., MIT, 2001

• IRIS: Infrastructure for Resilient Internet Systems, 2003

• Distributed Hashing Table

• Scalable lookup service

• Hyper-cubic structure

2

10

14

4

6

12

8

0 1

3

5

79

11

13

15

I. Introduction to P2P – Basic Models

18

Challenges to P2P Systems

In an ideal P2P system– Each node contributes as he can and consumes as he need– It cooperates to perform much more stably, securely and

efficiently than any single one computer

In a real open P2P system– There are many nodes that consume many more resources

than they contribute– Some nodes or groups of nodes who may attack or cheat

the other normal nodes of the system

These may probably make the system malfunctioned, unfair

or inefficient

II. Freeloader Problem

19

Fair Contributions of Resourcesby Participating Nodes

8-node transaction exampleP1

P2

P3

P4

P5

P6

P7

P8

Fairness is a well-known concern in P2P communities

• P5 is likely a freeloader because he consumes much more resources than he contributes

• P6 is very nice

II. Freeloader Problem

20

Peer Selection IssuesIn an open P2P, it happens often that

A

B

C

D

I want X

I have X

I have XI have XX

A

B

C S

D

I want X

I want P&Q

I want ZI want Y

Server Selectiona requesting peer (client) needs to decide which servers it should request service from

Client Selectiona supplying peer (server) needs to decide which clients it should grant requests first

so many requests

21

Related Works

• Access Control List– White list and black list

• Payment-based Approaches– Mojo Notion

• Reputation-based Approaches– Free Haven, Reputella, NodeRank, PeerTrust,

EigenTrust, PeerRank

II. Freeloader Problem

22

Service Reputation Usage Reputation

P17 4

P25 4

P34 3

P42 2

P58 1

P61 4

P73 4

P85 4

PeerRank - Reputation RankingP1

P2

P3

P4

P5

P6

P7

P8

00021100

00031200

00042000

00000002

00040000

00031000

00021100

00001100

P8

P7

P6

P5

P4

P3

P2

P1

P8P7P6P5P4P3P2P1

Service credit matrix Computed reputation rankings

For illustration, we assume that a node will receive 2 credits for providing content and 1 credit for transporting content. Assume there are 9 transactions:P5 ← P6 P5 ← P6 ← P7

P5 ← P6 ← P7 ← P8

P5 ← P4

P5 ← P4 ← P3

P5 ← P4 ← P3 ← P2

P4 ← P3 ← P2 ← P1 ← P8 ← P7 ← P6

P3 ← P2 ← P1 ← P8 ← P7 and P1 ← P5

II. Freeloader Problem – Main Ideas

23

Admission System: An OverviewWhen node X receives a request from node Y for content, X triggers a series of steps: 1. X grants the request with a probability based on

X’s current willingness-to-serve parameter 2. If granted, X determines whether Y should be

“admitted” or “denied” by figuring out Y’s service and usage reputation ranking from a set of sampling nodes

3. If admitted, X sends Y the requested content and uses a third-party node to record X’s credits for trust-enhancement purposes

An Approach to Solving the Freeloader’s Problem

II. Freeloader Problem – Main Ideas

24

Admission System: Main Ideas1. A reputation-based, differentiated admission control

that allows a node to receive a level of service based on its service and usage reputations in the past

2. An eigenvector-based method that derives service and usage reputations of nodes by computing the largest eigenvalue / eigenvector pairs of the credit matrix associated with past transactions

3. An adaptation system that allows a node to take care of its own interest by contributing resources at a level just sufficient for its desired level of service

4. A sampling technique that uses top service and usage nodes, as well as benchmark nodes, to reduce the cost of computing service and usage reputations of nodes

5. A distributed trust-enhancement scheme that uses third-party nodes to manage and store credits required by reputation computations

II. Freeloader Problem – Main Ideas

25

Eigen-systems

r1 r2

r4 r3

a12r2

a21r1

a24r4

a31r1a41r1

a42r2

a32r2

a34r4

a14r4a23r3a13r3

a43r3

r1’=a11r1+a12r2+a13r3+a14r4

r1

r2

r3

r4

a11 a12 a13 a14

a21 a22 a23 a24

a31 a32 a33 a34

a41 a42 a43 a44

r1’

r2 ’

r3 ’

r4 ’

=

Arnewrold

x’=Ax

Ax=xx: eigen-vector: eigen-value

II. Freeloader Problem – Main Ideas

26

Computing Reputations with EigenvectorsNotations

S: service credit matrix vector s: service reputationsU = ST: usage credit matrix vector u: usage reputations

An iterative method of computing s and u: Node X’s service reputation

s(i+1) = Su(i) and

u(i) = Us(i) That is,

s(i+1) = SUs(i) or s(i+1) = SSTs(i)

–Note that we can view the latter iteration as the power method of computing the largest eigenvalue/eigenvector pair of SST

–This implies that s is the eigenvector of SST corresponding to the largest eigenvalue of SST. Similarly, u is the eigenvector of STS corresponding to the largest eigenvalue of STS

The matrix formulism here parallels to that used in ranking web pages (ref: Kleinberg HITS algorithm)

Idea 1: Eigenvector-based reputation system

n

y

n

y

ysyxxu

yuyxxs

1

1

)(*),(U)(

)(*),(S)(

III. System Design

27

Reputation-based Admission• Transactions → Credit Matrices → Reputations → Rankings

– After an allowed transaction is complete, the participating nodes will update their credits to reflect their roles in the transaction. Thus, the service and usage credit matrices will change accordingly. These changes will in turn affect future reputation rankings of the nodes

• The reputation ranking of nodes is comparative– We use s or u to denote both reputation and ranking depending on the context– Increasing the ranking of a node implies decreasing the rankings of some other

nodes• A request from node Y will be denied by node X if

Y’s usage reputation u is above A%while its service reputation s is below B%

– A and B are certain preconfigured thresholds with A>B(For example, A and B can be 80 and 20, respectively)

– While being denied of receiving content, Y can still continue providing content and transport services, thereby improving its service reputation

• A freeloader which has a low service reputation (s < B%) and a high usage reputation (u > A%) will be denied of service

– The freeloader will need to either provide an increased level of service (to increase s) or reduce its content usage (to decrease u) in order to be readmitted again

Idea 2: Differentiated admission control

28

Nodes’ Adaptation in Willingness to Serve

Goal: search for a minimum level of contribution a node needs to provide in order to receive a desired level of service from other nodes

Idea 3: Adaptation system

willingness-to-serve

desired service level C

success rate

• willingness-to-serve : a parameter determining the probability at which the node will grant arriving service requests

• desired service level C: a desired level of service. C is constrained by the system-wide parameters A and B: 1 - B*(1-A)

• observed success rate : the percentage of its content or transport requests that are granted by requested nodes

Intuitively, when a node finds that its success rate is above a desired service level C, it will decrease its willingness-to-serve parameter . On the other hand, when the node finds that its is below C, it will increase its

29

Nodes’ Finite State Machine for the Adaptation of Willingness to Serve

“poor man's equilibrium” problem: if the two admit states were combined, many nodes could be in this combined “Admit” state with small and very low These nodes would never be able to increase their since they cannot enter the “Deny” state due to their u being below A%

Increase

Admit& NotMore

Deny

Admit& More

u>A & s<B(uA or

sB)& ≥C

<C≥C u>A &

s<B(uA or sB)& <C

Increase Decrease A node increases its willingness-to-serve parameter

→ This will improve its service reputation ranking s and cause some of the other nodes to drop their service reputation rankings or raise their usage reputation rankings

→ Some of these nodes may then enter the “Deny” state

→ These nodes will then increase their thereby allowing the current node to improve its success rate

30

Newcomer IssueFor a new node, its initial state is configured to be “Admit & More” and its initial is 0. That means it can receive services without providing services initially.

• If a new node makes many service requests such that its u is above A% and its s is below B%, it will enter the “Deny” state

• As long as its success rate does not reach its desired level of service C, it will stay at the “Admit & More” state and increase its to raise

• When its reaches C, it will enter the “Admit & Not More” state and try to reduce its

Increase

Admit& NotMore

Deny

Admit& More

u>A & s<B(uA or

sB)& ≥C

<C≥C u>A &

s<B(uA or sB)& <C

Increase Decrease

31

System Bootstrap

• Initially there have not been any transactions.

Thus:

– the service or usage credit matrix is initially a zero matrix

– the success rate of every node is zero

• Before the success rate of a node reaches C,

it will increase its to raise

– That means each node will try to obtain its desired level

of service by providing more services

32

Convergence of Nodes’ Adaptation:A Simulation Result

• A = .8, B = .2, C = .4 or .8, and # transactions per round = 10,000• When in the “Admit & Not More” state, a node decreases using

← .95 * , else it increases using ← max (+ .05, 1).Except when = 1, the decreasing rate is smaller than the increasing rate

• The simulation results above show that– generally tracks changes in C as the node will adapt its value– A new node will be able to adapt its to achieve its desired level of

service C within about 16 rounds

1 101 201 301 401 Round #

Desired Level of Service C

Success Rate

Willingness

.8

.4

1

01

0

33

Sampling Heuristic: Use of Benchmark NodesTo determine the current state of a node X, we will only compare X to two benchmark nodes:

Node PA has its usage-reputation ranking percentile at A% andNode PB has its service-reputation ranking percentile at B%

Need not know the exact rankings of a node.

Idea 4: Sampling technique

Admit& NotMore

Deny

Admit& More

•usage reputation higher than PA

and service reputation lower than PB

< C ≥ C

Increase ρ

Increase ρDecrease ρ

• usage reputation not higher than PA

or service reputation not lower than PB

< C

• usage reputation not higher than PA

or service reputation not lower than PB

≥ C

34

A Top-nodes Sampling Heuristic

In computing reputation, we will only use a sampling set, consisting of top service and usage nodes, node X, and two benchmark nodes PA and PB

Background processes consider all nodes to select top service and usage nodes and benchmark nodesPA and PB

Form a sampling set

Construct the service creditmatrix of the sampling set

Compute reputation

Compare X to PA and PB

Receive a request from node X

35

Effectiveness of the Sampling Heuristic: A Simulation Result

When transactions exhibit the “hub” phenomenon, it generally suffices to use a small number of top nodes, such as 10, in the sampling set

FalseAdmit

0

10

20

30

40

50

0 10 20 30Number of Top Nodes

Err

or

Rate

(%

)

FalseDeny

0

10

20

30

40

50

0 10 20 30

Number of Top Nodes

Err

or

Rate

(%

)

0

50

100

150

200

250

300

Node Id

# o

f T

ran

sa

ctio

ns Transaction

Source

TransactionDestination

0 16 32 48 64 80 96 112

Load: mixtures of multiple Laplace distributions for requesting and requested nodes

36

Distributed Trust-enhancement Using a Third-party Node

• For a transaction of sending content from Pi to Pj

– The associated service and usage credits are stored at a third-party node Ph, where h = hash (i, j)

– We can use a distributed hash table mechanism (DHT) such as Chord to maintain the credit matrix

• Generally we can assume that Pi and Pj have no management authority on data stored at third-party nodes

Pi Pj

Ph

(2) k

(1) Content encryptedin session key k

(4) k(3) Request k

(5) Service and usagecredit update

Idea 5: Distributed trust-enhancement scheme

37

Summary and Concluding Remarks• A distributed admission system with following features:

1. reputation-based admission controlConsider the service and usage reputation rankings of a requesting

node as well as the willingness of the requested node2. service and usage reputations computed with eigenvectors

Consider the authorities of individual nodes3. the 3-state model and nodes’ adaptation on

Search for a minimum level of contribution a node needs to provide4. use of sampling with top nodes and benchmark nodes

Reduce the information needed to determine current state of a node5. trust-enhancement with third-party nodes

Provide protection against possible cheating of the system

• Further research into control strategies on and various application-specific scenarios would be useful

IV. Conclusion