
Page 1

Web Prefetching Between Low-Bandwidth Clients and Proxies: Potential and Performance

Li Fan, Pei Cao, Wei Lin, and Quinn Jacobson
(University of Wisconsin-Madison), SIGMETRICS 1999

20003611 황호영, CNR Lab.

Among web prefetching work, this paper focuses on low-bandwidth modem clients and discusses prefetching between the client and the proxy.

Because its originality is limited, I do not strongly recommend the paper.

Page 2

1. Introduction
2. Proxy-Initiated Prefetching
3. Traces and Simulator
4. Reducing Client Latency
5. Prediction Algorithm
6. Performance
7. Implementation Experience
8. Conclusion and Critique

Contents

Page 3

One approach to reducing latency is prefetching between caching proxies and browsers.

The majority of the Internet population access the WWW via dial-up modem connections.

The low modem bandwidth is a primary contributor to client latency.

The paper investigates one such technique to reduce latency for modem users.

1. Introduction

Page 4

Proxy-Initiated Prefetching

The proxy can often predict what objects a user might access next.

The modem link to the user often has idle periods as the user is reading the current Web document.

If the objects are cached at the proxy, the proxy can utilize the idle periods to push them to the user, or to have the browser pull them.

Since the proxy only initiates prefetches for objects in its cache, there is no extra Internet traffic.

2. Proxy-Initiated Prefetching (1/3)

Page 5

Assumptions

Users have idle times between requests, because users often read some part of one document before jumping to the next one.

The proxy can predict which Web pages a user will access in the near future, based on reference patterns observed from many users.

The proxy has a cache that holds recently accessed Web pages.

The proxy maintains a history structure. Every time the proxy services a request, it updates the history structure, establishing the connection between past accesses made by the same user and the current request.

The browser cache is assumed to use the LRU (Least-Recently-Used) replacement algorithm (see the sketch below).

2. Proxy-Initiated Prefetching (2/3)
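As a minimal illustration of the LRU browser cache assumed above (not the authors' code; the byte-capacity bookkeeping is my own simplification), a Python sketch:

    from collections import OrderedDict

    class LRUCache:
        """Minimal byte-capacity LRU cache, as assumed for the browser cache."""

        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self.used = 0
            self.entries = OrderedDict()          # url -> object size in bytes

        def get(self, url):
            if url not in self.entries:
                return False                      # miss
            self.entries.move_to_end(url)         # mark as most recently used
            return True                           # hit

        def put(self, url, size):
            if url in self.entries:
                self.used -= self.entries.pop(url)
            while self.entries and self.used + size > self.capacity:
                _, evicted = self.entries.popitem(last=False)   # evict the LRU entry
                self.used -= evicted
            self.entries[url] = size
            self.used += size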

Page 6

Performance Metrics

Request Savings: the number of times that a user request hits in the browser cache or the requested object is already being prefetched, as a percentage of the total number of user requests (broken down into Prefetched, Cached, and Partially Prefetched).

Latency Reduction: the reduction in client latency, as a percentage.

Wasted Bandwidth: the sum of the bytes that are prefetched but never read by the client (see the formulas below).

2. Proxy-Initiated Prefetching (3/3)
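Written out as formulas (my own notation, not the paper's):

    \text{request savings} = \frac{\#\{\text{requests served from the browser cache or an in-progress prefetch}\}}{\#\{\text{user requests}\}} \times 100\%

    \text{latency reduction} = \frac{T_{\text{no prefetch}} - T_{\text{prefetch}}}{T_{\text{no prefetch}}} \times 100\%

    \text{wasted bandwidth} = \sum_{o \,\in\, \text{prefetched but never read}} \text{size}(o)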

Page 7

Traces

We use the HTTP traces gathered from the University of California at Berkeley home dial-up population from November 14-19, 1996.

Simulator

The simulator uses the timing information in the traces to estimate the latency seen by each modem client.

The simulator assumes that each modem link has a bandwidth of 21 kb/s (see the latency sketch below).

The simulator assumes the existence of a proxy between the modem clients and the Internet, with a 16 GB proxy cache.

3. Traces and the Simulator
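A minimal sketch of how per-object transfer latency over the modem link can be estimated (my own simplification; the actual simulator also uses the trace timestamps and accounts for overlapping transfers):

    MODEM_BPS = 21_000          # 21 kb/s modem link, as assumed by the simulator

    def transfer_time(size_bytes, bandwidth_bps=MODEM_BPS):
        """Seconds needed to send an object of size_bytes over the modem link."""
        return size_bytes * 8 / bandwidth_bps

    # e.g. a 10 KB HTML page takes roughly 3.9 seconds at 21 kb/s
    print(round(transfer_time(10 * 1024), 1))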

Page 8

Increase the size of the browser cache.

Use delta compression to transfer modified Web pages between the proxy and clients.

Apply application-level compression to HTML pages (see the sketch below).

4. Reducing Client Latency (1/2)
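A minimal sketch of application-level compression of an HTML page before it is sent over the modem link (illustrative only; the slides do not name a compression scheme, so zlib is my assumption):

    import zlib

    def compress_html(html: str) -> bytes:
        """Compress an HTML page before sending it over the slow modem link."""
        return zlib.compress(html.encode("utf-8"), level=9)

    page = "<html><body>" + "hello world " * 500 + "</body></html>"
    packed = compress_html(page)
    print(len(page), "->", len(packed), "bytes")   # HTML text typically shrinks severalfold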

Page 9

[Figure: cumulative distribution of user idle time in the UCB traces]

About 40% of the requests are preceded by 2 to 128 seconds of idle time, indicating plenty of prefetching opportunities.

4. Reducing Client Latency (2/2)

Page 10

The realistic prediction algorithm is based on Prediction by Partial Matching (PPM).

PPM Predictors

m: prefix depth (the number of past accesses used to predict future ones)

l: search depth (the number of steps the algorithm tries to predict into the future)

t: threshold (only candidates whose probability of access is higher than t, 0 <= t <= 1, are considered for prefetching)

The past m references are matched against the collection of trees to produce the set of candidate URLs for the next l steps. Only URLs whose frequencies of access are larger than t are included (written formally below).

5. Prediction Algorithms (1/4)
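The selection rule, written out (my notation):

    \text{prefetch } u \iff P\bigl(u \text{ requested within the next } l \text{ steps} \mid \text{last } m \text{ accesses}\bigr) > t, \qquad 0 \le t \le 1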

Page 11

PPM Predictors

Finally, the URLs are sorted, first by giving preference to longer prefixes, and then by giving preference to URLs with higher probability within the same prefix (see the sort-key sketch below).

Previously proposed prefetching algorithms:

Padmanabhan and Mogul -> m always equal to 1

Krishnan and Vitter -> l always equal to 1

m > 1: more context might improve the accuracy of the prediction.

l > 1: a URL is not always requested as the immediate next request after another URL, but rather within the next few requests.

Best-performing configuration: m=2, l=4.

5. Prediction Algorithms (2/4)
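A minimal sketch of that ranking rule (assuming each candidate carries its matched prefix length and estimated probability; the tuple layout is mine):

    def rank_candidates(candidates):
        """Sort candidate URLs: longer matched prefix first, then higher probability."""
        # candidates: list of (url, prefix_len, prob) tuples
        return sorted(candidates, key=lambda c: (-c[1], -c[2]))

    print(rank_candidates([("/a", 1, 0.9), ("/b", 2, 0.3), ("/c", 2, 0.7)]))
    # -> [('/c', 2, 0.7), ('/b', 2, 0.3), ('/a', 1, 0.9)]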

Page 12

History Structure

The history structure is a forest of trees of a fixed depth K, where K = m + l.

The history encodes all sequences of accesses up to a maximum length K.

The history structure is updated every time a user makes a request (see the sketch below).

[Figure: the history structure being updated for user access sequence A, B, C (K=3)]

5. Prediction Algorithms (3/4)
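A minimal sketch of such a history structure and the matching step, reconstructed from the slides rather than taken from the paper's code. The forest is kept as a nested dict of counts; per-user context tracking, LRU pruning of the structure, and the deeper l-step search are omitted, so predict() only suggests the immediate next request:

    class PPMHistory:
        """Sketch of the PPM history: a forest of access trees of depth K = m + l."""

        def __init__(self, m=2, l=4):
            self.m, self.l, self.K = m, l, m + l
            # node = {url: {"count": int, "children": node}}
            self.root = {}

        def _walk(self, node, path):
            """Follow path from node; return the children dict at the end, or None."""
            for url in path:
                if url not in node:
                    return None
                node = node[url]["children"]
            return node

        def update(self, recent, new_url):
            """Record a new request under every suffix of the last K-1 accesses."""
            context = recent[-(self.K - 1):]
            for start in range(len(context) + 1):      # every suffix, incl. the empty one
                node = self._walk(self.root, context[start:])
                if node is None:
                    continue
                entry = node.setdefault(new_url, {"count": 0, "children": {}})
                entry["count"] += 1

        def predict(self, recent, t=0.25):
            """Candidate URLs for the next request with estimated probability > t."""
            context = recent[-self.m:]
            candidates = {}
            for start in range(len(context)):           # longer matched prefixes first
                node = self._walk(self.root, context[start:])
                if not node:
                    continue
                total = sum(e["count"] for e in node.values())   # approximate normaliser
                for url, entry in node.items():
                    prob = entry["count"] / total
                    if prob > t and url not in candidates:
                        candidates[url] = (len(context) - start, prob)
            return sorted(candidates, key=lambda u: (-candidates[u][0], -candidates[u][1]))


    # Example: the A, B, C sequence from the slide (K = 3 when m = 2, l = 1)
    hist = PPMHistory(m=2, l=1)
    seq = []
    for url in ["A", "B", "C"]:
        hist.update(seq, url)
        seq.append(url)
    print(hist.predict(["A", "B"]))   # ['C'] once the A -> B -> C pattern has been seen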

Page 13

Every time the modem link to a user is idle, the proxy calls the predictor for the list of candidate URLs.

The proxy then initiates prefetching of the objects in the order specified in the list.

When the user makes a new request, the ongoing prefetching is stopped; a new round of prediction and prefetching starts the next time the link goes idle.

The size of the history structure can be controlled using an LRU algorithm (a sketch of the prefetch loop follows).

5. Prediction Algorithms (4/4)
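A minimal sketch of that control loop on the proxy side (purely illustrative; link_is_idle, new_request_arrived, and push_to_client stand in for proxy internals that the slides do not spell out):

    def prefetch_loop(predictor, proxy_cache, recent, link_is_idle,
                      new_request_arrived, push_to_client, t=0.25):
        """Push predicted objects to the client while the modem link is idle."""
        if not link_is_idle():
            return
        for url in predictor.predict(recent, t):
            if new_request_arrived():          # a real request preempts prefetching
                return
            if url in proxy_cache:             # only objects already cached at the proxy
                push_to_client(url, proxy_cache[url])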

Page 14

Performance of Proxy-Initiated Prefetching

6. Performance (1/6)

Page 15

Assumptions

Prefetch threshold: 50 KB, 8 objects

Browser cache: 16 MB (extended), LRU replacement algorithm

Performance of Proxy-Initiated Prefetching

Decreasing the threshold t increases the wasted bandwidth but helps to generate enough candidates; for l > 1, t = 0.25 is the best choice.

Increasing the search depth l increases both the latency reduction and the wasted bandwidth; l = 4 appears to be the best choice, as a larger l makes little difference.

Increasing the prefix depth m increases both the latency reduction and the wasted bandwidth.

6. Performance (2/6)

Page 16

The accuracy of the prediction algorithm

6. Performance (3/6)

Page 17

The accuracy of the prediction algorithm

attempted: the total number of candidates suggested by the predictor

prefetched: the actual number of objects that are prefetched

used: the number of prefetched objects that are actually accessed by the user

The ratio of used to prefetched is the accuracy of the prediction algorithm (written out below).

Accuracy ranges from 40% for (m, l, t) = (2, 4, 0.125) to 73% for (1, 1, 0.5).

Low-threshold configurations appear to sacrifice accuracy for more prefetches.

6. Performance (4/6)
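As a formula (my notation), with the range reported above:

    \text{accuracy} = \frac{\text{used}}{\text{prefetched}}, \qquad 0.40 \le \text{accuracy} \le 0.73 \ \text{over the tested } (m, l, t) \text{ configurations}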

Page 18

Recommendations for the configuration of PPM

If the highest latency reduction is the goal and some amount of wasted bandwidth can be tolerated, (2, 4, 0.125) is the best choice.

If both high latency reduction and low wasted bandwidth are desired, (2, 4, 0.5) is the best choice.

If limits on storage requirements make smaller m and l desirable, (2, 1, 0.25) and (1, 1, 0.125) are good choices.

6. Performance (5/6)

Page 19

Effects of Implementation Variations

No proxy notification upon browser cache hits: "no-notice"

Prefetching without knowledge of the contents of browser caches: "oblivious"

Limiting the size of the history structure

6. Performance (6/6)

Page 20

Proxy-initiated prefetching

We have implemented proxy-initiated prefetching in the CERN httpd proxy software.

CERN httpd uses a process-based structure and forks a new process each time a new request arrives.

A separate predictor process communicates with the other processes via UDP messages.

The predictor runs in an infinite loop, waiting to receive update and query messages (see the sketch below).

The process checks a shared global array of flags to see whether the modem link is idle.

If it is, it starts pushing the predicted URL objects on the existing connection.

7. Implementation Experience (1/2)
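A minimal sketch of such a UDP-driven predictor loop (illustrative only; the message format, the port, and the PPMHistory sketch from Page 12 are my assumptions, not the CERN httpd implementation):

    import socket

    def predictor_process(history, port=9000):
        """Serve 'update' and 'query' messages from the proxy's request processes."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("127.0.0.1", port))
        while True:                                   # the predictor runs forever
            data, addr = sock.recvfrom(4096)
            kind, _, payload = data.decode().partition(" ")
            urls = payload.split()
            if kind == "update" and urls:             # "update <previous urls...> <new url>"
                history.update(urls[:-1], urls[-1])
            elif kind == "query":                     # "query <recent urls...>"
                candidates = history.predict(urls)
                sock.sendto(" ".join(candidates).encode(), addr)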

Page 21

At the client side, instead of modifying browsers, we set up a copy of the CERN httpd proxy.

The browser requests are first sent to the local proxy.

The local proxy manages its own cache, issues requests to the main proxy, and receives pushed objects.

Measurement

Emulate modem connections on the LAN, and generate workloads that reflect typical browser behavior and Internet latencies.

We have instrumented the Linux kernel to simulate modem connections on our Ethernet LAN (a user-level pacing sketch follows).

7. Implementation Experience (2/2)
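The paper does this by instrumenting the kernel; as a rough user-level stand-in (my own simplification, not the authors' setup), a sender pacing its output to 21 kb/s might look like:

    import time

    MODEM_BPS = 21_000          # target modem bandwidth: 21 kb/s

    def paced_send(send, data, chunk=512, bandwidth_bps=MODEM_BPS):
        """Send data in chunks, sleeping so throughput never exceeds bandwidth_bps."""
        for i in range(0, len(data), chunk):
            send(data[i:i + chunk])
            time.sleep(chunk * 8 / bandwidth_bps)     # time one chunk takes on the modem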

Page 22

Conclusion

We have investigated the potential and performance of one technique, prefetching between low-bandwidth clients and caching proxies, and found that, combined with delta compression, it can reduce user-visible latency by over 23%.

Prediction algorithms based on the PPM compressor perform well.

The technique is easy to implement and can have a considerable effect on the user's Web surfing experience.

8. Conclusion and Critique (1/2)

Page 23

Weaknesses

We assumed fixed user request arrival times in the simulation.

Our calculation of client latency is merely an estimate based on the timestamps recorded in the traces and the modem bandwidth.

We also do not model the proxy in detail.

We have not investigated the implementation of delta compression.

The evaluation does not consider CPU overhead and delay.

The PPM algorithm is a little different from previously proposed prefetching algorithms.

8. Conclusion and Critique (2/2)