Content Delivery: HTTP, Web Caching, and Content Distribution Networks
3035/GZ01 Networked Systems, Lecture 17
Kyle Jamieson
Department of Computer Science
University College London
Outline
• Background and HTTP
  – Cookies: client-side state
  – Limitations of HTTP/1.0
  – HTTP range requests
• Interaction between applications and TCP
• Web caching
• Content distribution networks
HTTP: the Hypertext Transfer Protocol
• The web's application-layer protocol
• Client/server model over TCP
  – Client: browser that requests and receives web objects
  – Server: web server that responds to requests with data
• The server accepts connections on TCP port 80
• HTTP itself is "stateless": the protocol maintains no information about past client requests
  – Design goal: don't burden web servers with state maintenance
  – Cookies were added later to carry state client-side (scalable)
• Metadata plays a key role in HTTP's design and operation
An HTTP/1.0 exchange
Request:

GET /index.html HTTP/1.0

Response (HTTP header):

HTTP/1.0 200 OK
MIME-Version: 1.0
Server: CERN/3.0
Date: Thu, 03 Dec 2009 16:15:53 GMT
Content-Type: text/html
Content-Length: 22631
Last-Modified: Wed, 11 Nov 2009 09:10:45 GMT

Response (HTTP body):

<p class="contact_new">Computer Science Department University College London - Gower Street - London - WC1E 6BT - <img src="/images/phone.gif" width="13" height="9" id="phone" alt="Telephone:"/> +44 (0)20 7679 7214 - Copyright © 1999-2007 UCL</p>
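The same exchange is easy to reproduce with a few lines of Python. A minimal sketch, with www.example.com standing in for the UCL server above:

import socket

# Minimal HTTP/1.0 client: one TCP connection per request.
# Host and path are placeholders, not the server from the trace above.
HOST, PATH = "www.example.com", "/index.html"

s = socket.create_connection((HOST, 80))
s.sendall(f"GET {PATH} HTTP/1.0\r\nHost: {HOST}\r\n\r\n".encode())

chunks = []
while (data := s.recv(4096)):   # HTTP/1.0: server closes when done,
    chunks.append(data)         # so read until EOF
s.close()

header, _, body = b"".join(chunks).partition(b"\r\n\r\n")
print(header.decode("latin-1"))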
Cookies: Client-side state
• Recall: the HTTP protocol is stateless
• However: state is useful for session authentication and online sessions (online shopping, banking, etc.)
• Idea: the client keeps small pieces of state ("cookies")
  – Received from the server in responses, sent back to the server in requests
  – At least two possibilities:
    • All state may be encoded into the cookie stored at the client
    • The cookie indexes state held in a database on the server (online shopping)
• Privacy issue: often the user is unaware of cookies, e.g. a search engine returns the link http://ad.doubleclick.net/x26362d2/www.foo.com/foo.html
Cookies allow content personalization

[Timeline: client and server, with the client's cookies.txt file and the server's backend database]
1. The client sends an HTTP GET to ucl.ac.uk; the server creates id 1678 in its backend database
2. The server replies HTTP/1.0 200 OK with Set-cookie: 1678; the client records "ucl.ac.uk: 1678" in cookies.txt
3. The client's later requests carry Cookie: 1678; the server looks up id 1678 and replies HTTP/1.0 200 OK with personalized content
4. One week later, the same Cookie: 1678 still identifies the client's session; the server again looks up id 1678
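A minimal sketch of the server's side of this exchange, using Python's standard library; the in-memory dict and the starting id 1678 stand in for the backend database in the diagram:

from http.server import BaseHTTPRequestHandler, HTTPServer
from http.cookies import SimpleCookie
import itertools

sessions = {}                      # stand-in for the backend database
next_id = itertools.count(1678)    # first client gets id 1678, as above

class CookieHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        cookie = SimpleCookie(self.headers.get("Cookie", ""))
        new_client = "id" not in cookie
        sid = str(next(next_id)) if new_client else cookie["id"].value
        state = sessions.setdefault(sid, {})   # per-client state lives here
        self.send_response(200)
        if new_client:
            self.send_header("Set-Cookie", f"id={sid}")  # client stores this
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(f"session {sid}, state {state}\n".encode())

HTTPServer(("", 8080), CookieHandler).serve_forever()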
Problems with HTTP/1.0
1. Poor use of TCP, especially for short responses
   – Recall: three-way handshake to open a TCP connection, four packets to close it
   – An HTTP message often fits in ≈10 packets, so connection overhead dominates
   – Transfers don't get past slow start, so they don't use the available bandwidth
   – For each embedded image: open a new TCP connection and send another HTTP GET
2. Lack of support for web caching
3. Lack of bandwidth optimization
   – Range requests
   – Data compression
HTTP/1.1 bandwidth optimization
• Data compression
  – Image files are pre-compressed, but much other content is not
  – [1997]: compression would save 40% of the bytes sent via HTTP
  – The client uses the Accept-Encoding header in its HTTP GET to signal which encodings it can decompress, and which it prefers; the server labels the compressed response with a Content-Encoding header
• Range requests
  – Problem: a client wants to resume an incomplete download, or view certain pages of a long document
  – Need to retrieve just a small section of a resource, as in the exchange below
GET /bigfile.html HTTP/1.1
Range: bytes=2000-3999

HTTP/1.1 206 Partial Content
Date: Thu, 10 Feb 2006 20:02:06 GMT
Last-Modified: Wed, 9 Feb 2006 14:58:08 GMT
Content-Range: bytes 2000-3999/100000
Content-Length: 2000
Content-Type: text/html
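A range request is easy to issue from Python's http.client; a sketch, with a placeholder host, assuming the server supports ranges:

import http.client

# Resume-style fetch: ask for bytes 2000-3999 only.
conn = http.client.HTTPConnection("www.example.com")
conn.request("GET", "/bigfile.html", headers={"Range": "bytes=2000-3999"})
resp = conn.getresponse()

print(resp.status)                        # 206 if the server honoured the range
print(resp.getheader("Content-Range"))    # e.g. "bytes 2000-3999/100000"
chunk = resp.read()                       # just the 2000 requested bytes
conn.close()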
Outline
• Background and HTTP
• Interaction between applications and TCP
  – Connection establishment and slow start
  – Web transfers, HTTP/1.1 persistent connections
  – Interactive applications: Nagle's algorithm
• Web caching
• Content distribution networks
TCP connection establishment delay
• Recall the TCP retransmit timeout: RTO = RTT estimate + 4 × RTT variance
  – Initial RTO = 3 seconds
  – Double the RTO on each timeout
• This avoids unnecessary SYN retransmissions if the RTT is large
• But frustrated users hit the stop button (terminating TCP connections) and then the reload button
• Ongoing issue: lowering the initial RTO is a topic of active discussion in the IETF (May '11)
[Timeline: the client's first SYN is lost; it retransmits after 3 s, and again 6 s later (RTO doubling), before finally receiving a SYN-ACK]
Effect of lowering initial RTO
• Today on the Internet: ≈2% of TCP flows lose their first SYN
• Current proposal: retransmit the SYN after 1 second
  – Lowers the penalty for SYN loss from 3 seconds to 1 second
• But ≈2% of TCP flows have RTT > 1 second
  – Why? 802.11 MAC layer, slow dialup lines, slow VPN connections
  – Result: spurious SYN retransmissions
• Tradeoff between spurious SYN retransmission and user wait time
[Timelines: with a 1 s initial RTO, a lost SYN is retransmitted after only 1 s; but on a connection whose RTT exceeds 1 s, the client retransmits the SYN spuriously even though the original SYN-ACK is already on its way]
Delay in the middle of a web transfer
• Now the TCP connection is open and in slow start
• Recall TCP slow start: the sender's cwnd increases by one packet size for each received acknowledgement
  – Short HTTP transfers spend most of their time in slow start
• A small cwnd means that after a loss, the receiver is unlikely to generate three duplicate ACKs
  – Retransmission timeouts are more likely
[Figure: startup behavior of TCP with slow start, reproduced from V. Jacobson, "Congestion Avoidance and Control" (Figure 4). Axes: send time (seconds) vs. packet sequence number (KB); the annotation marks the first ≈10 packets, sent while cwnd is still small. Caption: same conditions as the previous figure (same time of day, same Suns, same network path, same buffer and window sizes), except the machines were running the 4.3+TCP with slow start. No bandwidth is wasted on retransmits, but two seconds are spent on slow start, so the effective bandwidth of this part of the trace is 16 KBps, two times better than Figure 3.]
An HTTP/1.0 page fetch (obsolete)
• Fetch an 8.5 Kbyte page with 10 embedded objects, most < 10 Kbyte
• One HTTP transaction per fetch; open and close a new TCP connection each time
• Every transfer stays in slow start, except for the large object
[Graph: bytes received vs. time (milliseconds) for the whole page fetch]
Improvement: Persistent HTTP connections
• Keep the TCP connection open for many HTTP transactions
  – Reduce TCP connection setup and teardown overhead
  – Remove slow-start performance problems (cwnd keeps growing)
• Keep-Alive mechanism in HTTP/1.0; a persistent TCP connection is the default in HTTP/1.1

HTTP/1.0 with Keep-Alive:

GET /index.html HTTP/1.0
Connection: Keep-Alive

(reply from server)

GET /image1.jpg HTTP/1.0
Connection: Keep-Alive

(reply from server)

GET /image2.jpg HTTP/1.0
Connection: Keep-Alive

HTTP/1.1 (persistent by default):

GET /index.html HTTP/1.1

(reply from server)

GET /image1.jpg HTTP/1.1

(reply from server)

GET /image2.jpg HTTP/1.1
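One way to see persistent connections in action is Python's http.client, which reuses a single TCP connection for successive requests on the same HTTPConnection object (hostname and paths are placeholders):

import http.client

# One TCP connection, three HTTP/1.1 requests: the setup/teardown and
# extra slow starts of one-connection-per-object are avoided.
conn = http.client.HTTPConnection("www.example.com")
for path in ("/index.html", "/image1.jpg", "/image2.jpg"):
    conn.request("GET", path)       # HTTP/1.1, so persistent by default
    resp = conn.getresponse()
    body = resp.read()              # must drain the body before reusing
    print(path, resp.status, len(body))
conn.close()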
Persistent connections avoid unnecessary slow starts
• Fetch an 8.5 Kbyte page with 10 embedded objects, most < 10 Kbyte
• Leave the TCP connection open after the server's response; the next HTTP request reuses it
• Only incur one slow start, but it still takes an RTT to issue each next request
[Graph: bytes received vs. time (milliseconds); after the first slow start, HTTP/1.0 with Keep-Alive finishes well ahead of plain HTTP/1.0, which pays a fresh slow start per object]
HTTP request pipelining
• Normally a client issues an HTTP request only after the server's previous response arrives
• Clients using HTTP pipelining issue multiple requests without waiting for responses, as in the sketch below
• Advantages
  – Reduces latency of page loading
  – Can reduce overhead by packing multiple requests into one packet

GET /index.html HTTP/1.1
GET /image1.jpg HTTP/1.1
GET /image2.jpg HTTP/1.1

[Diagram: without pipelining, the client waits an RTT between requests; with pipelining, all three GETs leave back-to-back and the server returns the responses in order]
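Python's http.client will not pipeline, so this sketch writes all three GETs to a raw socket before reading anything back; the hostname is a placeholder. (In practice many servers and middleboxes handle pipelining poorly, which limited its deployment.)

import socket

HOST = "www.example.com"                      # placeholder host
paths = ("/index.html", "/image1.jpg", "/image2.jpg")

# Build all three requests up front; ask the server to close after the
# last one so that reading until EOF collects all three responses in order.
reqs = []
for i, path in enumerate(paths):
    close = "Connection: close\r\n" if i == len(paths) - 1 else ""
    reqs.append(f"GET {path} HTTP/1.1\r\nHost: {HOST}\r\n{close}\r\n")

s = socket.create_connection((HOST, 80))
s.sendall("".join(reqs).encode())             # all requests leave together

chunks = []
while (data := s.recv(4096)):
    chunks.append(data)
s.close()
print(b"".join(chunks).count(b"HTTP/1.1"))    # expect 3 status lines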
Pipelined requests overlap RTTs
• Fetch an 8.5 Kbyte page with 10 embedded objects, most < 10 Kbyte
• Send multiple HTTP requests simultaneously
• May or may not use the same TCP connection (fairness issues)
[Graph: bytes received vs. time (milliseconds); HTTP/1.1 with pipelined requests finishes ahead of HTTP/1.0 with Keep-Alive, which in turn beats plain HTTP/1.0]
Outline
• Background and HTTP
• Interaction between applications and TCP
  – Connection establishment and slow start
  – Web transfers, HTTP/1.1 persistent connections
  – Interactive applications: Nagle's algorithm
• Web caching
• Content distribution networks
Interactive applications: When exactly should TCP send?
• A server writes data in pieces of size < MSS into a TCP sender socket buffer
• Common case: cwnd ≥ maximum segment size (MSS)
• Transport-layer problem: when should the sender transmit a segment?
  – Early TCP: send immediately when data arrives
    • Wastes capacity by sending small packets ("tinygrams")
  – Send only when a full MSS accumulates? Breaks ssh, remote desktop, and other interactive apps
[Diagram: sender application writing small pieces into the TCP sender's socket buffer, which drains in MSS-sized segments toward the TCP receiver]
Nagle's algorithm
• Proposed in RFC 896, in 1984
• Delay sending until the sender has MSS bytes of data, but
• Use the ACK clock to trigger small-packet transmission
  – Low-RTT connections: the ACK comes back before the next keystroke
  – High-RTT connections: amortize keystrokes, avoid tinygrams

When the application generates new data:
    if available data ≥ MSS and cwnd ≥ MSS:
        send a full segment
    else if there is unacked data in flight:
        buffer the new data until an ACK arrives
    else:
        send immediately
On ACK receipt:
    send pending data (subject to cwnd)
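The pseudocode above, transcribed into a small Python predicate; the argument names are this sketch's own:

def nagle_should_send(buffered_bytes, mss, cwnd, unacked_bytes):
    """Decide whether a TCP sender running Nagle's algorithm
    transmits now or waits for an ACK."""
    if buffered_bytes >= mss and cwnd >= mss:
        return True        # a full segment is ready: send it
    if unacked_bytes > 0:
        return False       # small data + data in flight: wait for the ACK
    return True            # idle connection: send the tinygram immediately

# A fast typist on a high-RTT link: the first keystroke goes out alone,
# later keystrokes are coalesced until the ACK returns.
print(nagle_should_send(1, 1460, 5840, 0))     # True  (nothing in flight)
print(nagle_should_send(1, 1460, 5840, 1))     # False (coalesce until ACK)
print(nagle_should_send(2000, 1460, 5840, 1))  # True  (full segment ready)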
Nagle's algorithm on a WAN
• Various amounts of data flow from slip (the client) to vangogh: 1, 1, 2, 1, 2, 2, 3… bytes
• Data is coalesced until the previously sent data is acknowledged
• Segments 14 and 15 appear to contradict Nagle
  – 14 is in response to 12
  – 15 is in response to 13
From Stevens, TCP/IP Illustrated, vol. 1: things change, however, when the round-trip time (RTT) increases, typically across a WAN. Consider an rlogin connection between the host slip and vangogh.cs.berkeley.edu: to get out of the local network, SLIP links must be traversed, and then the Internet is used, so much longer round-trip times are expected. Figure 19.4 shows the time line of some data flow while characters were being typed quickly on the client (similar to a fast typist); type-of-service information is removed, but the window size advertisements are left in.

Figure 19.4: Data flow using rlogin between slip and vangogh.cs.berkeley.edu [Stevens, TCP/IP Illustrated, vol. 1]

Seg  Time (delta)        Direction          Segment contents
 1   0.000000            slip → vangogh     PSH 5:6(1) ack 47, win 4096
 2   0.197694 (0.1977)   vangogh → slip     PSH 47:48(1) ack 6, win 8192
 3   0.232457 (0.0348)   slip → vangogh     PSH 6:7(1) ack 48, win 4096
 4   0.437593 (0.2051)   vangogh → slip     PSH 48:49(1) ack 7, win 8192
 5   0.464257 (0.0267)   slip → vangogh     PSH 7:9(2) ack 49, win 4095
 6   0.677658 (0.2134)   vangogh → slip     PSH 49:51(2) ack 9, win 8192
 7   0.707709 (0.0301)   slip → vangogh     PSH 9:10(1) ack 51, win 4094
 8   0.917762 (0.2101)   vangogh → slip     PSH 51:52(1) ack 10, win 8192
 9   0.945862 (0.0281)   slip → vangogh     PSH 10:12(2) ack 52, win 4095
10   1.157640 (0.2118)   vangogh → slip     PSH 52:54(2) ack 12, win 8192
11   1.187501 (0.0299)   slip → vangogh     PSH 12:14(2) ack 54, win 4094
12   1.427852 (0.2404)   vangogh → slip     ack 14, win 8190
13   1.428025 (0.0002)   vangogh → slip     PSH 54:56(2) ack 14, win 8192
14   1.457191 (0.0292)   slip → vangogh     PSH 14:17(3) ack 54, win 4096
15   1.478429 (0.0212)   slip → vangogh     PSH 17:18(1) ack 56, win 4096
16   1.727608 (0.2492)   vangogh → slip     PSH 56:59(3) ack 18, win 8191
17   1.762913 (0.0353)   slip → vangogh     PSH 18:21(3) ack 59, win 4093
18   1.997900 (0.2350)   vangogh → slip     PSH 59:60(1) ack 21, win 8189
Nagle’s algorithm at a web server
• An HTTP server should disable Nagle's algorithm (setsockopt(…, TCP_NODELAY, …)) and send each response in a single write call, as in the sketch below
• Suppose MSS = 1460 bytes, the HTTP header is 25 bytes, the HTTP response body is 1000 bytes, and cwnd is large

[Timelines: two servers with Nagle left on.
One write call, write(1:2925): segments 1:1460 and 1461:2920 leave immediately, but the 5-byte tail 2921:2925 is held back until the first ACK returns (wasted time).
Separate write calls, write(1:25) then write(26:1025): the 25-byte header 1:25 leaves alone, and the 1000-byte body 26:1025 is buffered until the header's ACK arrives (wasted time).]
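A minimal sketch of the two fixes, assuming an already-connected socket: disable Nagle with TCP_NODELAY, and hand the kernel the header and body in a single write.

import socket

def send_http_response(conn: socket.socket, header: bytes, body: bytes) -> None:
    # Disable Nagle so no short tail of the reply waits on an ACK...
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # ...and write header + body together, so the header never ships alone.
    conn.sendall(header + body)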
Outline
• Background and HTTP
• Interaction between applications and TCP
• Web caching
  – Conditional GET
  – Cache coherency
  – Web cache hierarchies
• Content distribution networks
Web caching: Introduction
• Why cache web pages?
  1. Reduce load on web servers ("origin" servers)
     • Persistent load
     • Flash crowds (the "Slashdot effect")
  2. Reduce Internet congestion
  3. Reduce bandwidth consumption in the Internet
     • Incentive: reduce ISP peering costs
  4. Improve client fetch latency
  5. Improve availability of web pages for clients

[Graph: requests per second at a web server over one day, peaking near 7000 during a flash crowd. Jung, Krishnamurthy, Rabinovich in WWW '02]
Web caching: Mechanism
• A web proxy sits somewhere on the path from client to origin server
• The user configures the browser to access the web via the proxy
• The browser sends all HTTP requests to the cache
  – Object in cache: the cache returns the object in an HTTP response
  – Otherwise: the cache requests the object from the origin server, then returns it to the client
• Proxies may be chained: requests go from client to server via more than one proxy

[Diagram: clients, web proxy server (aka web cache), origin server]
More about web caching
• The cache acts as both client and server
• Typically the cache is installed by an ISP (university, company, residential ISP)
• The Internet is dense with caches: this enables "poor" content providers to effectively deliver content (but so does peer-to-peer file sharing)

[Diagram: clients, web proxy server (aka web cache), web server]
Caching example
Assumptions
• Average object size: 100 Kbits
• Average request rate from the institution's browsers to origin servers: 15/sec
• Institutional router to origin server RTT: 2 sec

Consequences
• LAN utilization: 15%
• Access link utilization: 100%
• Total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds

[Topology: institutional network with a 10 Mbit/s LAN and a 1.5 Mbit/s access link to the public Internet and origin servers]
Caching example: Provisioning
• Increase the bandwidth of the access link to, say, 10 Mbit/s

Consequences
• LAN utilization: 15%
• Access link utilization: 15%
• Total delay = Internet delay + access delay + LAN delay = 2 sec + milliseconds + milliseconds
• Often a costly upgrade

[Topology: as before, but with a 10 Mbit/s access link]
Caching example: Web cache
• Install a web cache; suppose the hit rate is 0.4

Consequences
• 40% of requests are satisfied almost immediately
• 60% of requests are satisfied by the origin server
• Utilization of the access link drops to 60%, resulting in negligible queueing delays (say 10 milliseconds)
• Total average delay = Internet delay + access delay + LAN delay = 0.6 × (2.01 sec) + 0.4 × milliseconds < 1.4 sec (see the calculation below)

[Topology: as in the first example, with the 1.5 Mbit/s access link, plus a web cache on the institutional network]
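The arithmetic behind the three scenarios, using the slide's own numbers:

# Back-of-envelope numbers from the three scenarios above.
object_bits = 100e3        # 100 Kbit average object
req_per_sec = 15
demand = object_bits * req_per_sec             # 1.5 Mbit/s of traffic

lan_util    = demand / 10e6                    # 10 Mbit/s LAN   -> 0.15
access_util = demand / 1.5e6                   # 1.5 Mbit/s link -> 1.0 (saturated!)

# With a cache of hit rate 0.4, only misses cross the access link:
miss_rate = 0.6
access_util_cached = miss_rate * demand / 1.5e6   # -> 0.6, small queueing delay
avg_delay = miss_rate * 2.01 + 0.4 * 0.01         # ~1.21 s average fetch delay
print(lan_util, access_util, access_util_cached, avg_delay)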
Placing web caches
Where to place caches? ☞ Where are the Internet bottlenecks?
1. Network link from hosting AS to server
2. Access line from client to ISP
3. Client's ISP to backbone AS
4. Client's internal AS
5. Peering points
6. The web server itself

Cache placements and the bottlenecks they relieve:
• Hosting service's AS edge: (1.) ✔, (6.) ✔
• Client's premises: (2.) ✔
• An ISP's AS: (3.) ✔
[Diagram: client, access line, client's ISP AS, backbone AS, hosting AS, and web server, with the numbered bottlenecks marked along the path]
Cache consistency
• Definition: cached objects match the corresponding objects on the origin server
• HTTP/1.1 has mechanisms for ensuring freshness:
  – Expires header on server responses
    Expires: Fri, 30 Oct 1998 14:19:41 GMT
  – Cache-control header on server responses
    Cache-control: no-cache
      • Forces caches to resubmit the request to the origin server every time
    Cache-control: no-store
      • Forbids caches from storing content under any conditions
    Cache-control: max-age=[seconds]
      • Similar to the Expires header, but a relative time
Conditional GET
• Goal: don't send the object if the cache has an up-to-date cached version
• Cache: specify the date of the cached copy in the HTTP request
  – If-Modified-Since: <date>
• Server: the response contains no object if the cached copy is up to date:
  – HTTP/1.0 304 Not Modified

[Diagram: two exchanges between cache and server.
Object not modified: request carries If-Modified-Since: <date>; response is HTTP/1.0 304 Not Modified, with no body.
Object modified: request carries If-Modified-Since: <date>; response is HTTP/1.0 200 OK followed by <data>.]
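A conditional GET from Python's http.client; the host, path, and date are placeholders for the values a real cache would have stored:

import http.client

# Revalidate a cached copy: send If-Modified-Since with the date we
# stored alongside the object.
conn = http.client.HTTPConnection("www.example.com")
conn.request("GET", "/index.html", headers={
    "If-Modified-Since": "Wed, 11 Nov 2009 09:10:45 GMT",
})
resp = conn.getresponse()
if resp.status == 304:
    print("cached copy is fresh, no body sent")   # serve from cache
else:
    body = resp.read()                            # object changed: update cache
conn.close()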
Cooperative caching
• Caches can be arranged together in a cluster
• How to locate an item in a cluster of caches?
  1. Always local (caches in the cluster don't cooperate); results in duplication
  2. Primary plus multicast; uses an inter-cache protocol such as the Internet Cache Protocol (ICP)
  3. Hashing: direct access to the resource, but changing the hash function can result in objects moving about
  4. Consistent hashing
Outline
• Background and HTTP
• Interaction between applications and TCP
• Web caching
• Content distribution networks
  – Mechanism: client DNS redirection
  – Consistent hashing for server selection
  – Operation under failure
Content distribution networks
• Challenge ca. 2002: how to reliably deliver large amounts of content to users worldwide?
  – Flash crowds overwhelm the web server, its access link, or the back-end database infrastructure
  – Content is becoming richer: audio, video
• Possible solutions
  – Web caching: dynamic content and content diversity cause low proxy hit rates (25−40%)
  – Multihoming the server: BGP takes minutes to converge after a route failure
Content distribution networks
• Replicate content at the edge
  – Deploy thousands of content distribution network (CDN) edge servers at many ISPs
  – Push content to edge servers ahead of time
  – Avoid the impairments (loss, delay) of long paths
  – Solve the "flash crowds" problem
• Which CDN server should a client use?
  1. The nearest server to the client
  2. One likely to have the content
  3. One that is available (load, bandwidth)
  4. Adapt the choice quickly (within seconds)

[Diagram: a content server, with CDN edge servers placed near groups of clients]
Sending clients to the CDN for content
• The web client fetches the web page via HTTP GET, as usual
• The content provider rewrites content links to point to the CDN
  – The browser then issues HTTP GETs for rich content to CDN edge servers instead of the content server
  – CDN companies provide automated tools to rewrite links

[Diagram: clients fetch HTML from the content provider's server (news.bbc.co.uk), then fetch the embedded objects from the CDN]
An "Akamaized" URL (ARL)

http://a764.g.akamai.net/f/764/16742/1h/www.1800flowers.com/800f_assets/jet/website/images/flowers/flowers/banners/hp/hpb/btn_findagift.gif

• Serial number (a764, repeated as /764): defines content served from the same set of servers
  – Note: the serial number is the only input to DNS name resolution
• Type code (f): f = static TTL, 6 = If-Modified-Since, 7 = MD5 hash
  – Various ways of ensuring freshness of the object served
• Content provider code (CPC, 16742): unique ID for the content provider
• Object data (1h): f = time, 6 = "000", 7 = MD5 hash value
• Absolute URL (www.1800flowers.com/…): used to retrieve the content from the provider
Mapping requests to CDN servers
• Goal: find the CDN server nearest to the client
  – Establish BGP peering sessions with Internet border routers → a coarse-grained AS-level map of the Internet
  – Combine with live traceroute and loss measurement data between CDN servers
  – Result: a map of the Internet
• Finding an available CDN server
  – Considers server health, the service requested, server load balancing (CPU, disk, network), and network conditions
  – Agents simulate user behavior, downloading objects and reporting failure rates and service times
DNS-based redirection
• Two levels of DNS indirection
  1. Akamai top-level nameservers (TLNSs)
     • Locations: US (4), Europe (4), Asia (1)
     • TLNSs return eight LLNSs in three different regions
       – Chosen to be close to the requesting client (using the map)
       – Handles the complete failure of any two particular regions
  2. Akamai low-level nameservers (LLNSs)
     • Point to the Akamai edge servers, which serve the content
     • Do most of the load balancing
[Diagram: numbered DNS resolution steps. The end user's OS queries its local name server; the query walks through the generic top-level domain nameserver and xyz.com's nameserver (10.10.123.5), is referred to the Akamai TLNS for g.akamai.net (which selects a cluster), and then to an Akamai LLNS at 20.20.123.55 (which selects servers within the cluster), finally yielding a212.g.akamai.net → 30.30.123.5. Slide: Bruce Maggs]
Akamai DNS redirection

[Diagram: the client fetches HTML from the content server (news.bbc.co.uk) on the content provider's network. The TLNS, whose answers carry a 30−60 minute TTL, selects an LLNS in a network close to the client; the LLNS, whose answers carry a TTL of about 20 seconds, selects a server within a nearby edge server cluster.]
Akamai cluster
• Servers pool resources:
  – RAM
  – Disk
  – Throughput
Hashing
• Universe U of all possible objects, set S of servers
  – Objects: web content objects
  – Servers: edge content servers
• A hash function h: U → S assigns objects to servers
  – E.g., h(x) = [ax + b (mod p)] (mod |S|), where
    • p is a prime integer, p > |S|
    • a, b are constant integers chosen uniformly at random from [0, p − 1]
    • x is an object's serial number
Difficulty: Changing the number of servers

With four servers and h(x) = x + 1 (mod 4), the objects with serial numbers 7, 10, 11, 27, 29, 36, 38, 40 map to servers 0, 3, 0, 0, 2, 1, 3, 1. Add one machine, so that h(x) = x + 1 (mod 5), and nearly every object moves to a different server, as the sketch below shows.
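Running the slide's example confirms how badly modular hashing behaves when |S| changes:

# The slide's example: h(x) = x + 1 (mod |S|), growing from 4 to 5 servers.
objects = [7, 10, 11, 27, 29, 36, 38, 40]

before = {x: (x + 1) % 4 for x in objects}
after  = {x: (x + 1) % 5 for x in objects}

moved = [x for x in objects if before[x] != after[x]]
print(f"{len(moved)}/{len(objects)} objects change server")   # 7/8 move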
Consistent hashing
• Low-level DNS servers act independently and may have different ideas about how many and which servers are alive
• Who uses it?
  – Akamai's CDN
  – amazon.com: the Dynamo data store
  – The Last.fm music website
  – The Chord P2P network

D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin, R. Panigrahy. Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. Proceedings of ACM STOC, 1997.
Consistent hashing
Map both objects and servers onto the unit circle [0, 1).
• A server with id number x′ maps to the point h(x′) = [ax′ + b (mod p)] / p
• An object with serial number x maps to the server h(x) = successor([ax + b (mod p)] / p), where successor finds the next server clockwise along the ring

[Diagram: unit circle with server points (e.g., server 0) and content-object points placed around the ring]
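A minimal consistent-hash ring in Python; SHA-1 stands in for the slide's h(x) = [ax + b (mod p)] / p, since both simply place keys on the unit circle:

import bisect, hashlib

def point(key: str) -> float:
    # Position of a key on the unit circle [0, 1).
    h = int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")
    return h / 2**64

class Ring:
    def __init__(self, servers):
        self.ring = sorted((point(s), s) for s in servers)

    def successor(self, obj: str) -> str:
        """First server clockwise from the object's point."""
        i = bisect.bisect(self.ring, (point(obj),))
        return self.ring[i % len(self.ring)][1]   # wrap past 1.0 to 0

ring = Ring(["server0", "server1", "server2"])
print(ring.successor("btn_findagift.gif"))

# Adding a server only re-homes the objects between it and its predecessor:
ring = Ring(["server0", "server1", "server2", "server3"])
print(ring.successor("btn_findagift.gif"))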
Properties of consistent hashing
• Balance: objects are assigned to buckets "randomly"
• Locality: when a server is added or removed, the only objects affected are those that are/were mapped to that bucket
• Load: objects are assigned to buckets evenly, even over a set of views
  – Can be improved by mapping each server to multiple places on the unit circle
• Spread: an object should be mapped to a small number of servers over a set of views
  – Can choose different a, b to make a new hash function
  – Use multiple hash functions to map an object to multiple servers
Adding and removing servers
• Each server is responsible for a different section of the ring
• Adding a server
  – Only need to move content between the predecessor server and the new server (segment AC in the diagram)
• Removing a server
  – Clean shutdown: the server pushes its content to its successor
  – Failures: look up a different server, and push content to the successor in the background

[Diagram: ring with servers and content objects; points A, B, C mark a new server and its neighbors, with segment AC the affected span]
Hardware/server failures
• Linux and Windows servers with large RAM and disk capacity
• Example failures:
  1. Memory modules jumping out of their sockets
  2. Network cards screwed down but not seated in their slots
  3. Hardware component failures: HDD, motherboard, etc.
  4. Others
Operation under failure

[Diagram: a TLNS and LLNSs in networks X, A and C; edge-server clusters A and C; transit networks B and E on the path from the client]

• Network X fails
  – No impact on already-selected LLNSs
  – A different TLNS is chosen next time; the mapping fails over
• A server in cluster A fails
  – Detection using heartbeats
  – Another server (a "buddy") proxy-ARPs for its address (immediately)
  – The LLNS removes the failed server (30−60 min)
• Transit network B fails
  – Route via an intermediate server in transit network E
• Network A or LLNS A fails
  – The client's local nameserver will use another LLNS (30−60 min)
Low-level DNS load balancing
• The LLNS answers each query with a random permutation of the servers. Why? To spread load for one serial number.

Name    Answer (server IPs, in returned order)
a212    10.10.10.1  10.10.10.4  10.10.10.3  10.10.10.2
a213    10.10.10.3  10.10.10.4  10.10.10.2  10.10.10.1
a214    10.10.10.1  10.10.10.2  10.10.10.3  10.10.10.4
a215    10.10.10.2  10.10.10.1  10.10.10.4  10.10.10.3

[Slide: Bruce Maggs]
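A sketch of how an LLNS might generate the answers in the table above: each query gets a fresh random permutation of that serial number's server IPs, since clients tend to try addresses in the order returned.

import random

servers = ["10.10.10.1", "10.10.10.2", "10.10.10.3", "10.10.10.4"]

def dns_answer():
    # Return the same A records in a fresh random order on every query.
    return random.sample(servers, k=len(servers))

for name in ("a212", "a213", "a214", "a215"):
    print(name, dns_answer())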
LLNS DNS-based load balancing
• Input: which servers have which content (from consistent hashing), server load, and proximity to the client (from the DNS request)
• Consider edge server A; two thresholds:
  1. Load > low threshold: add servers to distribute load (apply more hash functions to the consistent hash ring)
  2. Load > high threshold: shed load on that server (remove that hash function from the consistent hash ring)