differentiated admission for peer-to-peer systems joint work with prof. ht kung, harvard university,...
Post on 18-Dec-2015
223 views
TRANSCRIPT
Differentiated Admission forPeer-to-Peer Systems
Joint Work with Prof. HT Kung, Harvard University, USA
吳俊興
高雄大學 資訊工程學系 助理教授( 中央研究院 資訊科學研究所 合聘助研究員 )
November 17, 2004
2
Outline
I. Introduction to P2P
II. Motivation and Main Ideas– Freeloader Problem and Differentiated Admission
III. System Design1. Reputation-based, differentiated admission control2. Eigenvector-based reputation system3. Adaptation system of willingness-to-serve4. Sampling technique5. Distributed trust-enhancement scheme
IV. Summary and Concluding Remarks
3
Peer-to-peer (P2P) Model
“Peer-to-Peer (P2P) is a way of structuring distributed applications such that the individual nodes have symmetric roles. Rather than being divided into clients and servers each with quite distinct roles, in P2P applications a node may act as both a client and a server.”
Excerpt from the Charter of Peer-to-Peer Research Group, IETF/IRTF, June 24, 2003
http://www.irtf.org/charters/p2prg.html
I. Introduction to P2P
Heterogeneous and Autonomous
4
Client-server Model
U11
U12
U13
U21
U22
U31
S
U32
The server and the network become the bottlenecks and points of failure
•DDoS•Flash Crowd
Clients ServerRequest
Service
Clients and servers each with distinct roles
I. Introduction to P2P
5
Content Distribution Networks
Hosting + Hierarchical Proxies+ DNS Request Routing (Akamai, CacheFlow, etc.)
U11
U12
U13
U21
U22
U31
S
U32
CR1
CR2
CR3
CRb
CRa
CRContent Router
or Peer Node
Overlay Network
I. Introduction to P2P
Name: www2.microsoft.akadns.net (www.microsoft.com.nsatc.net)Addresses: 207.46.249.221, 207.46.134.221, 207.46.249.189, 207.46.134.189, …Aliases: www.microsoft.com, www.microsoft.akadns.net
Name: www.yahoo.akadns.netAddresses: 66.218.71.93, 66.218.70.50, 66.218.71.91, 66.218.71.86, 66.218.71.89, …Aliases: www.yahoo.com
Name: www.google.akadns.netAddresses: 66.102.11.104, 66.102.11.99Aliases: www.google.com
6
Content Distribution Networks (Cont.)
ICPs
Content NetworkProviders (CNPs)
“Content Foundry”(Semiconductor Foundry)
ISPs
ICPs
ICPs need contents, servers, equipments, IT experts, etc.
ISPs
“Serverless ICP”(Fabless IC Design House)
Now, ICPs need only contents and trusted CNPs.
ParadigmShift
I.C. =
Internet Contents
I. Introduction to P2P
7
NapsterU11
U12
U13
U21
U22
DS
U32
U13 P2P Users
Content SearchContent TransferU31
sharing MP3 files
Any two P2P nodes can exchange contents directly+ Running P2P software is much easier than maintaining servers+ Reduce the load of server, even without servers
I. Introduction to P2P
8
Paradigm Shift of Computing System Models
1980~Terminal-Mainframe(Super-computing)
1990~
Client-Server(Micro-computing
/Personal Computer)
2000~
Peer-to-Peer(Macro-computing)
RS-232 Dialup/10M Ethernet ADSL/100M+ Ethernet
VT100/DOS Windows 31/95 Linux/Windows 2K
I. Introduction to P2P
9I. Introduction to P2P
P2P Applications• P2P File Swapping (Sharing)
– Napster, FreeNet, Gnutella, KaZaA, eDonkey, EZPeer, Kuro• P2P Communication
– NetNews (NNTP), Instant Messaging (IM), Skype• P2P Lookup Services and Their Applications
(Distributed Hash Tables and Global Repositories)– IRIS, Chord/CFS, Tapestry/OceanStore, Pastry/PAST, CAN
• P2P Overlay Networking– (Inter-domain Routing – BGP), RON, PDF, Detour, LRR
• P2P Multimedia Streaming (Application-layer Multicast)– CoopNet, Zigzag, Narada, P2Cast, Streamlla
• Proxies and Content Distribution Networks– Squid, Akamai, DigitIsland, Amplicast
• Overlay Testbed– PlanetLab, NetBed/EmuLab
• Other Areas– P2P Gaming, Grid Computing
10
Most Popular Titles in Windows
•From CNet’s download.com, Week Ending Jan 25, 2004•Kazaa is the Top 1 title in the history, more than 2M downloads a week.•Top downloads in SourceForge: eMule, BitTorrent•Taiwan: Kuro, EzPeer, eDonkey, …
# Titles WeeksDownloads
This Week
Total
Downloads1 Kazaa Media Desktop 91 2,376,379 320,322,6272 ICQ Lite 68 614,061 39,306,3703 Ad-aware 62 550,018 27,483,9354 iMesh 196 462,495 65,806,6185 WinZip 380 428,195 110,442,4236 ICQ Pro 2003b 331 337,363 241,257,4367 Spybot - Search & Destroy 48 325,234 10,777,5508 AOL Instant Messenger (AIM) 68 288,657 29,551,9289 Morpheus 143 209,759 119,261,633
10 LimeWire 27 161,299 18,307,602
I. Introduction to P2P - Status
11
Example of P2P File Swapping: KaZaA
Users Online 3,824,411
Total Number of MP3 Files 294,468,383
Total Size of Movies Files 894,435
Total Number of Files Shared 2,936,033,461
Total Size of Shared Files 8,283,375GB
As of July 2003
• 77% of surveyed companies had at least one installation of file-swapping software (AssetMetrix, July 16, 2003)
• Today more KaZaA traffic than Web traffic!I. Introduction to P2P - Status
12
Skype – P2P VoIPBackground
– Created by Niklas Zennstrom and Janus Friis, founders of KaZaA
– Beta launched August 29, 2003– As of 16 Nov 2004: 34,642,017 downloads and
2,551,163,700 Minutes served (1,771,641 days; 4853 years)– Uses 3-16 KB/s while calling and 0-0.5 KB/s while idle
Features– High-quality Internet conference calls– Firewall and NAT traversal– Global decentralized user directory– Intelligent routing for encrypted calls through effective path– Encrypting all calls and instant messages end-to-end
13
Example of Centralized P2P Systems:Napster
• Announced in January 1999 by Shawn Fanning for sharing MP3 files and pulled plug in July 2001
• Centralized server for search, direct file transfer among peer nodes
• Proprietary client-server protocol and client-client protocol
• Relying on the user to choose a ‘best’ source
• Disruptive, proof of concepts• IPR and firewall issues
P P
Directory
I. Introduction to P2P – Basic Models
14
Example of Decentralized P2P Systems: Gnutella
• Open source
• 3/14/2000: Released by NullSoft/AOL, almost immediately withdrawn, and became open source
• Message flooding: serverless, decentralized search by message broadcast, direct file transfer using HTTP
• Limited-scope query
C S
1
2
2
1
11
22 2
3
3Client
Server
Servent (=Serv er + Cli ent)
I. Introduction to P2P – Basic Models
15
Example of Unstructured P2P Systems: Freenet
• Ian Clarke, Scotland, 2000• Distributed depth-first
search, Exhaustive search• File hash key,
lexicographically closest match
• Store-and-forward file transfer
• Anonymity• Open source
C S321 4
56
I. Introduction to P2P – Basic Models
16
Example of Hybrid P2P Systems: FastTrack / KaZaA
• Proprietary software developed by FastTrack in Amsterdam and licensed to many companies
• Summer 2001, Sharman networks, founded in Vanuatu, acquires FastTrack
• Hierarchical supernodes (Ultra-peers)
• Dedicated authentication server and supernode list server
• From user’s perspective, it’s like Google.
• Encrypted files and control data transported using HTTP
• Parallel download• Automatically switch to new server
Supernodes inoverlay mesh
C
S
I. Introduction to P2P – Basic Models
17
Example of Structured P2P Systems:Chord
• Frans Kaashoek, et. al., MIT, 2001
• IRIS: Infrastructure for Resilient Internet Systems, 2003
• Distributed Hashing Table
• Scalable lookup service
• Hyper-cubic structure
2
10
14
4
6
12
8
0 1
3
5
79
11
13
15
I. Introduction to P2P – Basic Models
18
Challenges to P2P Systems
In an ideal P2P system– Each node contributes as he can and consumes as he need– It cooperates to perform much more stably, securely and
efficiently than any single one computer
In a real open P2P system– There are many nodes that consume many more resources
than they contribute– Some nodes or groups of nodes who may attack or cheat
the other normal nodes of the system
These may probably make the system malfunctioned, unfair
or inefficient
II. Freeloader Problem
19
Fair Contributions of Resourcesby Participating Nodes
8-node transaction exampleP1
P2
P3
P4
P5
P6
P7
P8
Fairness is a well-known concern in P2P communities
• P5 is likely a freeloader because he consumes much more resources than he contributes
• P6 is very nice
II. Freeloader Problem
20
Peer Selection IssuesIn an open P2P, it happens often that
A
B
C
D
I want X
I have X
I have XI have XX
A
B
C S
D
I want X
I want P&Q
I want ZI want Y
Server Selectiona requesting peer (client) needs to decide which servers it should request service from
Client Selectiona supplying peer (server) needs to decide which clients it should grant requests first
so many requests
21
Related Works
• Access Control List– White list and black list
• Payment-based Approaches– Mojo Notion
• Reputation-based Approaches– Free Haven, Reputella, NodeRank, PeerTrust,
EigenTrust, PeerRank
II. Freeloader Problem
22
Service Reputation Usage Reputation
P17 4
P25 4
P34 3
P42 2
P58 1
P61 4
P73 4
P85 4
PeerRank - Reputation RankingP1
P2
P3
P4
P5
P6
P7
P8
00021100
00031200
00042000
00000002
00040000
00031000
00021100
00001100
P8
P7
P6
P5
P4
P3
P2
P1
P8P7P6P5P4P3P2P1
Service credit matrix Computed reputation rankings
For illustration, we assume that a node will receive 2 credits for providing content and 1 credit for transporting content. Assume there are 9 transactions:P5 ← P6 P5 ← P6 ← P7
P5 ← P6 ← P7 ← P8
P5 ← P4
P5 ← P4 ← P3
P5 ← P4 ← P3 ← P2
P4 ← P3 ← P2 ← P1 ← P8 ← P7 ← P6
P3 ← P2 ← P1 ← P8 ← P7 and P1 ← P5
II. Freeloader Problem – Main Ideas
23
Admission System: An OverviewWhen node X receives a request from node Y for content, X triggers a series of steps: 1. X grants the request with a probability based on
X’s current willingness-to-serve parameter 2. If granted, X determines whether Y should be
“admitted” or “denied” by figuring out Y’s service and usage reputation ranking from a set of sampling nodes
3. If admitted, X sends Y the requested content and uses a third-party node to record X’s credits for trust-enhancement purposes
An Approach to Solving the Freeloader’s Problem
II. Freeloader Problem – Main Ideas
24
Admission System: Main Ideas1. A reputation-based, differentiated admission control
that allows a node to receive a level of service based on its service and usage reputations in the past
2. An eigenvector-based method that derives service and usage reputations of nodes by computing the largest eigenvalue / eigenvector pairs of the credit matrix associated with past transactions
3. An adaptation system that allows a node to take care of its own interest by contributing resources at a level just sufficient for its desired level of service
4. A sampling technique that uses top service and usage nodes, as well as benchmark nodes, to reduce the cost of computing service and usage reputations of nodes
5. A distributed trust-enhancement scheme that uses third-party nodes to manage and store credits required by reputation computations
II. Freeloader Problem – Main Ideas
25
Eigen-systems
r1 r2
r4 r3
a12r2
a21r1
a24r4
a31r1a41r1
a42r2
a32r2
a34r4
a14r4a23r3a13r3
a43r3
r1’=a11r1+a12r2+a13r3+a14r4
r1
r2
r3
r4
a11 a12 a13 a14
a21 a22 a23 a24
a31 a32 a33 a34
a41 a42 a43 a44
r1’
r2 ’
r3 ’
r4 ’
=
Arnewrold
x’=Ax
Ax=xx: eigen-vector: eigen-value
II. Freeloader Problem – Main Ideas
26
Computing Reputations with EigenvectorsNotations
S: service credit matrix vector s: service reputationsU = ST: usage credit matrix vector u: usage reputations
An iterative method of computing s and u: Node X’s service reputation
s(i+1) = Su(i) and
u(i) = Us(i) That is,
s(i+1) = SUs(i) or s(i+1) = SSTs(i)
–Note that we can view the latter iteration as the power method of computing the largest eigenvalue/eigenvector pair of SST
–This implies that s is the eigenvector of SST corresponding to the largest eigenvalue of SST. Similarly, u is the eigenvector of STS corresponding to the largest eigenvalue of STS
The matrix formulism here parallels to that used in ranking web pages (ref: Kleinberg HITS algorithm)
Idea 1: Eigenvector-based reputation system
n
y
n
y
ysyxxu
yuyxxs
1
1
)(*),(U)(
)(*),(S)(
III. System Design
27
Reputation-based Admission• Transactions → Credit Matrices → Reputations → Rankings
– After an allowed transaction is complete, the participating nodes will update their credits to reflect their roles in the transaction. Thus, the service and usage credit matrices will change accordingly. These changes will in turn affect future reputation rankings of the nodes
• The reputation ranking of nodes is comparative– We use s or u to denote both reputation and ranking depending on the context– Increasing the ranking of a node implies decreasing the rankings of some other
nodes• A request from node Y will be denied by node X if
Y’s usage reputation u is above A%while its service reputation s is below B%
– A and B are certain preconfigured thresholds with A>B(For example, A and B can be 80 and 20, respectively)
– While being denied of receiving content, Y can still continue providing content and transport services, thereby improving its service reputation
• A freeloader which has a low service reputation (s < B%) and a high usage reputation (u > A%) will be denied of service
– The freeloader will need to either provide an increased level of service (to increase s) or reduce its content usage (to decrease u) in order to be readmitted again
Idea 2: Differentiated admission control
28
Nodes’ Adaptation in Willingness to Serve
Goal: search for a minimum level of contribution a node needs to provide in order to receive a desired level of service from other nodes
Idea 3: Adaptation system
willingness-to-serve
desired service level C
success rate
• willingness-to-serve : a parameter determining the probability at which the node will grant arriving service requests
• desired service level C: a desired level of service. C is constrained by the system-wide parameters A and B: 1 - B*(1-A)
• observed success rate : the percentage of its content or transport requests that are granted by requested nodes
Intuitively, when a node finds that its success rate is above a desired service level C, it will decrease its willingness-to-serve parameter . On the other hand, when the node finds that its is below C, it will increase its
29
Nodes’ Finite State Machine for the Adaptation of Willingness to Serve
“poor man's equilibrium” problem: if the two admit states were combined, many nodes could be in this combined “Admit” state with small and very low These nodes would never be able to increase their since they cannot enter the “Deny” state due to their u being below A%
Increase
Admit& NotMore
Deny
Admit& More
u>A & s<B(uA or
sB)& ≥C
<C≥C u>A &
s<B(uA or sB)& <C
Increase Decrease A node increases its willingness-to-serve parameter
→ This will improve its service reputation ranking s and cause some of the other nodes to drop their service reputation rankings or raise their usage reputation rankings
→ Some of these nodes may then enter the “Deny” state
→ These nodes will then increase their thereby allowing the current node to improve its success rate
30
Newcomer IssueFor a new node, its initial state is configured to be “Admit & More” and its initial is 0. That means it can receive services without providing services initially.
• If a new node makes many service requests such that its u is above A% and its s is below B%, it will enter the “Deny” state
• As long as its success rate does not reach its desired level of service C, it will stay at the “Admit & More” state and increase its to raise
• When its reaches C, it will enter the “Admit & Not More” state and try to reduce its
Increase
Admit& NotMore
Deny
Admit& More
u>A & s<B(uA or
sB)& ≥C
<C≥C u>A &
s<B(uA or sB)& <C
Increase Decrease
31
System Bootstrap
• Initially there have not been any transactions.
Thus:
– the service or usage credit matrix is initially a zero matrix
– the success rate of every node is zero
• Before the success rate of a node reaches C,
it will increase its to raise
– That means each node will try to obtain its desired level
of service by providing more services
32
Convergence of Nodes’ Adaptation:A Simulation Result
• A = .8, B = .2, C = .4 or .8, and # transactions per round = 10,000• When in the “Admit & Not More” state, a node decreases using
← .95 * , else it increases using ← max (+ .05, 1).Except when = 1, the decreasing rate is smaller than the increasing rate
• The simulation results above show that– generally tracks changes in C as the node will adapt its value– A new node will be able to adapt its to achieve its desired level of
service C within about 16 rounds
1 101 201 301 401 Round #
Desired Level of Service C
Success Rate
Willingness
.8
.4
1
01
0
33
Sampling Heuristic: Use of Benchmark NodesTo determine the current state of a node X, we will only compare X to two benchmark nodes:
Node PA has its usage-reputation ranking percentile at A% andNode PB has its service-reputation ranking percentile at B%
Need not know the exact rankings of a node.
Idea 4: Sampling technique
Admit& NotMore
Deny
Admit& More
•usage reputation higher than PA
and service reputation lower than PB
< C ≥ C
Increase ρ
Increase ρDecrease ρ
• usage reputation not higher than PA
or service reputation not lower than PB
< C
• usage reputation not higher than PA
or service reputation not lower than PB
≥ C
34
A Top-nodes Sampling Heuristic
In computing reputation, we will only use a sampling set, consisting of top service and usage nodes, node X, and two benchmark nodes PA and PB
Background processes consider all nodes to select top service and usage nodes and benchmark nodesPA and PB
Form a sampling set
Construct the service creditmatrix of the sampling set
Compute reputation
Compare X to PA and PB
Receive a request from node X
35
Effectiveness of the Sampling Heuristic: A Simulation Result
When transactions exhibit the “hub” phenomenon, it generally suffices to use a small number of top nodes, such as 10, in the sampling set
FalseAdmit
0
10
20
30
40
50
0 10 20 30Number of Top Nodes
Err
or
Rate
(%
)
FalseDeny
0
10
20
30
40
50
0 10 20 30
Number of Top Nodes
Err
or
Rate
(%
)
0
50
100
150
200
250
300
Node Id
# o
f T
ran
sa
ctio
ns Transaction
Source
TransactionDestination
0 16 32 48 64 80 96 112
Load: mixtures of multiple Laplace distributions for requesting and requested nodes
36
Distributed Trust-enhancement Using a Third-party Node
• For a transaction of sending content from Pi to Pj
– The associated service and usage credits are stored at a third-party node Ph, where h = hash (i, j)
– We can use a distributed hash table mechanism (DHT) such as Chord to maintain the credit matrix
• Generally we can assume that Pi and Pj have no management authority on data stored at third-party nodes
Pi Pj
Ph
(2) k
(1) Content encryptedin session key k
(4) k(3) Request k
(5) Service and usagecredit update
Idea 5: Distributed trust-enhancement scheme
37
Summary and Concluding Remarks• A distributed admission system with following features:
1. reputation-based admission controlConsider the service and usage reputation rankings of a requesting
node as well as the willingness of the requested node2. service and usage reputations computed with eigenvectors
Consider the authorities of individual nodes3. the 3-state model and nodes’ adaptation on
Search for a minimum level of contribution a node needs to provide4. use of sampling with top nodes and benchmark nodes
Reduce the information needed to determine current state of a node5. trust-enhancement with third-party nodes
Provide protection against possible cheating of the system
• Further research into control strategies on and various application-specific scenarios would be useful
IV. Conclusion