Rosa Leão – 2008
P2P Applications
Outline:
  Evolution
  Napster
  Gnutella
  FastTrack
  eDonkey
  BitTorrent
  Skype

P2P Applications
Traffic evolution:

P2P Applications
Advantages:
  Scalability: each client contributes its own resources
  Cost is distributed among the peers
  Robustness: the architecture provides redundancy, since data can be retrieved from several sources

P2P Applications
First generation:
  Centralized systems
  A central server is used to index the files
  Predefined ports are used for data transfer
  Systems are easy to identify
  Example: Napster

P2P Applications
Second generation:
  Fully distributed systems
  Requests are sent to the neighbors
  Ports are assigned dynamically
  Systems are harder to identify
  Example: Gnutella

P2P Applications
Third generation:
  Combine ideas from centralized and distributed systems
  "Super" nodes have more resources than the other nodes and hold the index files for a subset of peers
  Ports are assigned dynamically
  Ports of well-known applications may be used

P2P Applications
Third generation:
  Parts of a file can be retrieved from several peers simultaneously
  Encryption techniques are applied to the data (e.g., FastTrack)
  Systems are much harder to identify
  Examples: FastTrack, KaZaA, Gnutella2, BitTorrent

Napster
History:
  May 1999: an 18-year-old student at Northeastern University in Boston creates a program called Napster
  Goal: exchange of MP3 music files
  February 2001: Napster reaches its peak of 1.5 million simultaneous users
  July 2001: Napster is shut down by a court order

Napster Architecture

Napster
  Application-level, client-server protocol over point-to-point TCP
  A centralized directory server holds an index of the offered MP3/WMA files
  Clients connect to this server, identify themselves, and send a list of the files they are sharing

Napster
  Other clients can search the index and learn from which clients they can retrieve a file
  A client pings the hosts that apparently have the data and looks for the best transfer rate
Problems:
  Centralized server: single logical point of failure
  No security: passwords in plain text, no authentication, no anonymity
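
The centralized lookup described on the Napster slides can be sketched as follows. This is only an illustration, not the Napster protocol: the class `DirectoryServer`, the function `best_source`, and the address strings are hypothetical names chosen for the example, and transfer rates are passed in rather than measured.

```python
from collections import defaultdict


class DirectoryServer:
    """Toy central index: file name -> set of client addresses (hypothetical)."""

    def __init__(self):
        self.index = defaultdict(set)

    def register(self, client, files):
        # A client connects and sends the list of files it shares.
        for name in files:
            self.index[name].add(client)

    def search(self, name):
        # Other clients learn which hosts apparently have the file.
        return sorted(self.index.get(name, ()))


def best_source(candidates, measured_rate):
    # The client probes the candidates and keeps the one with the best rate.
    return max(candidates, key=measured_rate, default=None)


server = DirectoryServer()
server.register("10.0.0.1:6699", ["a.mp3", "b.mp3"])
server.register("10.0.0.2:6699", ["a.mp3"])
rates = {"10.0.0.1:6699": 20, "10.0.0.2:6699": 80}  # made-up probe results
chosen = best_source(server.search("a.mp3"), rates.get)
```

All lookups go through the single `DirectoryServer` object, which is the single logical point of failure the slide lists under Problems.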

Gnutella
History:
  14 March 2000: released by Nullsoft under the GPL
  Withdrawn by AOL after a couple of hours
  However, the protocol was reverse-engineered and many clones were released

Gnutella
Focus: decentralized method of searching for files
Each application instance serves to:
  store selected files
  route queries (file searches) from and to its neighboring peers
  respond to queries (serve the file) if the file is stored locally

Gnutella
Searching by flooding:
  if you don't have the file, query some partners
  if they don't have it, they contact other partners, up to a maximum hop count
  replies travel back along the reverse path (reverse path forwarding)
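
A minimal sketch of flooding with a hop limit and reverse-path replies, assuming the overlay is given as an adjacency dictionary. The function `flood_query` and the sample topology are hypothetical; real Gnutella messages, identifiers, and duplicate suppression are not modeled beyond a visited set.

```python
from collections import deque


def flood_query(graph, files, origin, wanted, max_hops):
    """Return one reverse path (holder back to origin) for every hit found."""
    parent = {origin: None}          # who forwarded the query to each peer
    queue = deque([(origin, 0)])
    hits = []
    while queue:
        node, hops = queue.popleft()
        if node != origin and wanted in files.get(node, ()):
            # The reply retraces the path the query took, in reverse.
            path, cur = [], node
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            hits.append(path)
            continue                 # a peer that has the file does not forward
        if hops == max_hops:
            continue                 # hop limit reached: drop the query
        for neighbor in graph.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append((neighbor, hops + 1))
    return hits


graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
files = {"D": {"song.mp3"}, "E": {"song.mp3"}}
found = flood_query(graph, files, "A", "song.mp3", max_hops=2)
```

With `max_hops=2` the query reaches D through B, and the reply path is D, B, A. With `max_hops=1` the query dies at B and C and nothing is found.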

Gnutella
Problems:
  freeloading: WWW sites offer search/retrieval from the Gnutella network without providing file sharing or query routing
  prematurely terminated downloads:
    long download times over modems
    modem users run a Gnutella peer only briefly, or the user becomes overloaded
    late 2000: only 10% of downloads succeed
  Search requests grow exponentially with the number of connected users until they overwhelm the Internet
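
One way to get a feel for the cost of flooding is to count the messages a single query generates. The sketch below assumes an idealized tree-like overlay in which every peer has the same number of neighbors and no peer is reached twice; the function name `flood_messages` and the sample values are illustrative assumptions, not measurements.

```python
def flood_messages(fanout, max_hops):
    """Messages sent for one query under the idealized assumptions above."""
    total = 0
    frontier = fanout                # the origin sends to all of its neighbors
    for _ in range(max_hops):
        total += frontier
        frontier *= fanout - 1       # each receiver forwards to its other neighbors
    return total


one_hop = flood_messages(4, 1)
seven_hops = flood_messages(4, 7)
```

Under these assumptions the message count grows geometrically with the hop limit, which is why a fixed maximum hop count is needed at all.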

Gnutella
Solution:
  Create a peer hierarchy based on capabilities
  Previously: all peers were treated as identical, and most modem peers acted as black holes
  Connection preferencing:
    favors routing to well-connected peers
    favors replies to clients that themselves serve a large number of files, which discourages freeloading

FastTrack
Architecture:
  Organized in super nodes (SN) and ordinary nodes (ON) that communicate over TCP connections
  Overlay network

FastTrack
Overlay maintenance:
  a list of potential SNs is included in the software download
  a new peer goes through the list until it finds an operational SN
    it connects and obtains a more up-to-date list
    the node then pings 5 SNs on the list and connects to the one with the smallest RTT
  if the SN goes down, the node obtains an updated list and chooses a new SN
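
The "ping 5 SNs and keep the one with the smallest RTT" step can be sketched like this. The function `choose_supernode` and the callback `measure_rtt` are hypothetical; the callback stands in for a real ping and returns `None` for a node that does not answer.

```python
def choose_supernode(candidates, measure_rtt, sample_size=5):
    """Probe the first `sample_size` candidates and return the fastest one."""
    probed = []
    for sn in candidates[:sample_size]:
        rtt = measure_rtt(sn)        # seconds, or None if the SN is down
        if rtt is not None:
            probed.append((rtt, sn))
    return min(probed)[1] if probed else None


# Made-up RTTs in seconds; "sn2" is unreachable and "sn6" is never probed.
rtts = {"sn1": 0.120, "sn2": None, "sn3": 0.045, "sn4": 0.300, "sn5": 0.080, "sn6": 0.001}
parent_sn = choose_supernode(list(rtts), rtts.get)
```

If the chosen SN later goes down, the same function can be called again on the updated list.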

FastTrack
Searching and sharing files:
  An ON chooses a parent SN and uploads the metadata for the files it is sharing
  Metadata contains: file name, file size, ContentHash, file descriptors (e.g., artist name, album name)
  An ON sends a query to its SN to locate files
  For each match, the SN returns the IP address, port number, and metadata

FastTrack
Signaling traffic is encrypted:
  handshaking (connections between peers)
  metadata uploaded from ONs to SNs
  supernode lists
  queries and replies
File transfer traffic (e.g., MP3s, videos) is not encrypted and is sent within HTTP messages
One of the most popular clients is KaZaA-Lite

eDonkey
Searching and sharing files:
  Consists of servers (index for file location) and clients
  Each server keeps a list of all files shared by the clients connected to it
  A client gathers a list of all potential file providers when it connects to a server
  Most popular clients: eDonkey2000, eMule

eDonkey
File download:
  A client can download from multiple peers
  Each file is divided into pieces of approximately 10 MBytes
  Clients can restrict the upload bandwidth to a given limit
  A mechanism decides which client is served next
  The mechanism gives priority to peers from which blocks have been previously downloaded
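
A rough sketch of the two ideas on this slide: splitting a file into pieces and serving first the peers that have already given us blocks. The constant `PIECE_SIZE` is an assumed round value for "approximately 10 MBytes", and `next_client` is a deliberately simple stand-in; the slides do not describe the actual scoring rule.

```python
PIECE_SIZE = 10 * 1000 * 1000        # assumed value for "approximately 10 MBytes"


def piece_count(file_size):
    """Number of pieces needed for a file of `file_size` bytes."""
    return (file_size + PIECE_SIZE - 1) // PIECE_SIZE


def next_client(waiting, uploaded_to_us):
    """Pick who is served next; `waiting` lists client ids in arrival order."""
    if not waiting:
        return None
    # Peers we downloaded blocks from go first; ties are broken by arrival order.
    return min(waiting, key=lambda c: (c not in uploaded_to_us, waiting.index(c)))


pieces = piece_count(25_000_000)
served = next_client(["x", "y", "z"], uploaded_to_us={"z"})
```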

BitTorrent
Main principles:
  Challenge: maximize the speed of replication
  File is divided into pieces of 256 KB
  A peer can download from multiple peers
  Hybrid architecture: tracker and peers
  Tit-for-tat strategy encourages cooperation and avoids free-riding
  Rarest first algorithm to select pieces
  Created in 2001 by Bram Cohen

BitTorrent
Elements:
  Two kinds of peers:
    Seeds: have a complete copy of the file
    Leechers: are still downloading the file
  Trackers:
    Keep meta-information about the peers that are currently active
    Act as a rendezvous point for the clients of the torrent

BitTorrent
How to obtain the list of peers?
  The peer obtains the IP address of the tracker
  The peer reports its state to the tracker
  The tracker returns a random peer list, typically 50 peers
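
The exchange with the tracker can be sketched as an in-memory object. This is not the tracker wire protocol: the class `Tracker`, its `announce` method, and the state strings are hypothetical, and only the behavior named on the slides (report state, receive a random list of about 50 peers) is modeled.

```python
import random


class Tracker:
    """Toy rendezvous point that remembers the active peers of one torrent."""

    def __init__(self, rng=None):
        self.peers = {}                          # peer id -> last reported state
        self.rng = rng or random.Random()

    def announce(self, peer, state, max_peers=50):
        # The peer reports its state and gets a random subset of the others.
        self.peers[peer] = state
        others = [p for p in self.peers if p != peer]
        return self.rng.sample(others, min(max_peers, len(others)))


tracker = Tracker(random.Random(1))
for i in range(80):
    tracker.announce(f"peer{i}", "leecher")
peer_list = tracker.announce("newcomer", "leecher")
```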

BitTorrent
How to obtain the list of pieces?
  A peer maintains connections to 20-40 peers
  Connected peers exchange their lists of pieces

BitTorrent
Which peers to upload to: choke/unchoke algorithm
  Every 10 seconds a peer re-evaluates the rate at which each of its peers uploads to it
  It chokes the peer with the smallest upload rate
  Every 30 seconds it unchokes one peer regardless of the upload rate that peer offers (optimistic unchoking)
  Seeds apply the same strategy based on download rates
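
One round of the choke/unchoke rule above can be sketched as a pure function. The name `rechoke`, its parameters, and the sample rates are assumptions for illustration; a real client tracks rates over a sliding window and limits the number of unchoked peers, which is not modeled here.

```python
import random


def rechoke(unchoked, rates, now, rng=random):
    """Run one re-evaluation; `now` is the elapsed time in seconds (multiple of 10).

    `rates` maps every connected peer to the rate at which it uploads to us.
    """
    unchoked = set(unchoked)
    if unchoked:
        # Choke the unchoked peer that gives us the least.
        slowest = min(unchoked, key=lambda p: rates.get(p, 0.0))
        unchoked.discard(slowest)
    if now % 30 == 0:
        # Optimistic unchoke: pick a choked peer without looking at its rate.
        choked = sorted(set(rates) - unchoked)
        if choked:
            unchoked.add(rng.choice(choked))
    return unchoked


rates = {"a": 5.0, "b": 1.0, "c": 3.0, "d": 0.0}
after_10s = rechoke({"a", "b", "c"}, rates, now=10)
after_30s = rechoke({"a", "b", "c"}, rates, now=30, rng=random.Random(0))
```

At 10 seconds only the slowest peer (b) is choked. At 30 seconds one of the choked peers is also given a chance, whatever its rate.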

BitTorrent
Which peers to upload to: choke/unchoke algorithm
  Every 10 seconds a peer re-evaluates the rate at which each of its peers uploads to it
  It chokes the peers whose upload rate is zero
  It unchokes the choked peers whose upload rate is greater than zero
  A seed unchokes the peer that has been choked the longest

BitTorrent
Piece selection: rarest first algorithm
  A peer that has no pieces uses a random policy to obtain the first one
  For the other pieces, the peer selects the rarest piece first
  Peers have only a local view (the pieces held by the peers they are connected to)
  Each time the download of a new piece is completed, the peer informs all the peers it is connected to
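
A minimal sketch of rarest first with a random first piece, using only the local view a peer has of its neighbors. The function `pick_piece` and the sample data are hypothetical, and ties between equally rare pieces are broken by lowest index, which is an arbitrary choice made for this example.

```python
import random
from collections import Counter


def pick_piece(have, peer_pieces, rng=random):
    """Choose the next piece index, or None if the neighbors offer nothing new.

    `peer_pieces` maps each connected peer to the set of pieces it announced.
    """
    counts = Counter(p for pieces in peer_pieces.values() for p in pieces)
    candidates = sorted(p for p in counts if p not in have)
    if not candidates:
        return None
    if not have:
        return rng.choice(candidates)            # first piece: random policy
    return min(candidates, key=lambda p: counts[p])   # otherwise: rarest first


neighbors = {"p1": {0, 1, 2}, "p2": {0, 1}, "p3": {0}}
first = pick_piece(set(), neighbors, rng=random.Random(0))
later = pick_piece({0}, neighbors)
```

Here piece 2 is held by one neighbor only, so it is requested before piece 1.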

BitTorrent
Piece selection: strict priority
  A peer breaks pieces of 256 KBytes into sub-pieces of 16 KBytes
  It requests the remaining sub-pieces of a particular piece before sub-pieces of any other piece
  The goal is to have "complete" pieces as soon as possible
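
With the sizes given on the slide, each piece holds 256 / 16 = 16 sub-pieces. The strict priority rule can be sketched as below; the function `next_request` and its data layout are assumptions made for the example.

```python
PIECE_SIZE = 256 * 1024
SUBPIECE_SIZE = 16 * 1024
SUBPIECES_PER_PIECE = PIECE_SIZE // SUBPIECE_SIZE   # 16


def next_request(in_progress, fresh_pieces):
    """Return (piece, sub_piece) to request next, or None if nothing is left.

    `in_progress` maps a started piece to the set of sub-pieces already received;
    `fresh_pieces` lists pieces not started yet, in the order they would be chosen.
    """
    for piece in sorted(in_progress):
        remaining = [s for s in range(SUBPIECES_PER_PIECE) if s not in in_progress[piece]]
        if remaining:
            return piece, remaining[0]           # finish a started piece first
    if fresh_pieces:
        return fresh_pieces[0], 0                # only then start a new piece
    return None


request = next_request({7: {0, 1, 2}}, fresh_pieces=[3])
```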

BitTorrent
Piece selection: end game mode
  A peer sends requests for all of its missing pieces to all of its peers when only a few pieces remain to finish the download
  To keep this from becoming horribly inefficient, the client also sends a cancel to everyone else every time a piece arrives
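
End game mode can be sketched as a set of outstanding (peer, piece) requests. The function names `endgame_requests` and `on_piece_arrived` are hypothetical, and the sketch works on whole pieces as the slide does.

```python
def endgame_requests(missing, peers):
    """Request every missing piece from every connected peer."""
    return {(peer, piece) for piece in missing for peer in peers}


def on_piece_arrived(piece, sender, pending):
    """Return (peers to send a cancel to, requests still outstanding)."""
    cancels = sorted(peer for (peer, p) in pending if p == piece and peer != sender)
    remaining = {(peer, p) for (peer, p) in pending if p != piece}
    return cancels, remaining


pending = endgame_requests([8, 9], ["a", "b", "c"])
cancels, pending_after = on_piece_arrived(9, "b", pending)
```

After piece 9 arrives from b, cancels go to a and c, and only the requests for piece 8 stay outstanding.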

BitTorrent
Updating the list of peers
  If the number of connections drops below 20, the peer recontacts the tracker to obtain additional peers

P2P file sharing applications: traffic distribution

Skype
  Released in 2003 by Niklas Zennström and Janus Friis (who founded FastTrack and Kazaa)
  Proprietary protocol
  P2P network for user directory and NAT/firewall traversal
  Hierarchical infrastructure: supernodes and clients
  Every peer starts as a client node; the Skype software checks whether the client has enough CPU, memory, and network bandwidth and is not behind a NAT or firewall
  Skype promotes a client node that meets these requirements to supernode

Skype
  Supernode functions: user directory service, relaying traffic for computers behind NATs and firewalls
  A Skype client finds a supernode to be its default gateway:
    it tries to establish a UDP connection to the supernode
    it tries to establish a TCP connection on an arbitrary port
    it tries to establish a TCP connection on port 80 or 443
  Skype first tries to establish a direct connection between clients using a NAT traversal mechanism
  If the connection fails, Skype relays the traffic through the gateway supernodes of the Skype clients
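
The three connection attempts can be read as an ordered fallback, which is an assumption about how the list on the slide is meant. The sketch below does no real networking: `connect_to_supernode`, the callback `probe`, and the sample port number are hypothetical, and the Skype protocol itself is proprietary.

```python
def connect_to_supernode(probe, arbitrary_port):
    """Try each transport in turn; `probe(transport, port)` returns True on success."""
    attempts = [
        ("udp", arbitrary_port),     # assumed: UDP uses the same arbitrary port
        ("tcp", arbitrary_port),
        ("tcp", 80),
        ("tcp", 443),
    ]
    for transport, port in attempts:
        if probe(transport, port):
            return transport, port
    return None


# A made-up restrictive firewall that only lets TCP port 443 through.
def only_https(transport, port):
    return transport == "tcp" and port == 443


gateway = connect_to_supernode(only_https, arbitrary_port=5000)
```

Falling back to ports 80 and 443 lets the client get through firewalls that only allow web traffic.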