
Page 1:

Web Prefetch

張燕光, Department of Computer Science and Information Engineering, National Cheng Kung University

ykchang@mail.ncku.edu.tw

Page 2:

Introduction
• Prefetch a web page before a user actually requests it.
• The ultimate goal of Web prefetching is to reduce what is called User Perceived Latency (UPL) on the Web.
  – The delay that an end user (client) actually experiences when requesting a Web resource.
  – A user perceives Web latency as the time between issuing a request for a resource and the time the Web page is actually displayed in the browser window.
  – The reduction of UPL does not imply a reduction of actual network latency or of network traffic.
  – On the contrary, in most cases network traffic increases even when UPL is reduced.

Page 3:

Introduction
• Sources of User Perceived Latency:
  – Round-trip time (RTT) at the lower level
    • Processing latency in end systems – load of the end system
    • Communication latency over the network – queuing delay and propagation delay
  – Bandwidth
  – Size of the web pages

Page 4:

Introduction
• Besides prefetching, what other methods can reduce the User Perceived Latency (UPL) on the Web?
  – Increase the size of browser caches. Browser caches typically have a limited default size; increasing the size of the cache increases the hit ratio and reduces modem traffic.
  – Use delta compression to transfer modified Web pages between the proxy and clients. That is, if an old copy of the modified page exists in the browser cache, the proxy only sends the difference between the latest version and the old version.
  – Apply application-level compression to HTML pages. Studies have suggested that HTML text can be compressed first and then transferred from one end to the other. HTTP supports application-level compression via the Transfer-Encoding header.
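As a rough illustration of the last point, the following sketch (my own example, not from the slides; the URL is a placeholder) negotiates compression at the application level and decompresses the page on the client side:

import gzip
import urllib.request

# Ask the server for a gzip-compressed body and decompress it locally.
req = urllib.request.Request(
    "http://example.com/",                 # placeholder URL
    headers={"Accept-Encoding": "gzip"},
)
with urllib.request.urlopen(req) as resp:
    body = resp.read()
    if resp.headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)       # recover the original HTML text
print(len(body), "bytes of HTML after decompression")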

Page 5:

Introduction
• Prefetching on a Web system is to “separate” the time when a resource is actually requested by a client from the time that the client chooses to see the resource.

Page 6:

Introduction
• Optimization of T = t1 − t0, in order for T to be:
  – large enough for the resource Ri to be prefetched before the client requests it, and
  – small enough for the resource Ri not to have expired before the client requests it.
• Prediction is the method used to foresee a client’s request before it is placed, since t0 < t1.

Page 7:

Introduction
• Every prefetching system must be designed carefully and put to work only after an extensive trial period and after providing adequate answers to the following basic validation questions:
  – IF prefetching can and must be added to the specific system. Not all Web systems can be facilitated by prefetching.
    • For example, a highly dynamic Web system may require the time tolerance (i.e., t1 − t0 in the previous figure) to be so small before a resource expires that no prefetching approach would be adequate to facilitate it.
  – WHO (referring to the Web system’s components) will take part in the prefetching procedure.
    • Prefetching can be facilitated by all basic components that a Web system consists of (clients, proxies, mediators and servers).
    • One must answer the basic WHO question before going to work, in order to produce prediction policies, etc.

Page 8:

Introduction
  – HOW (referring to the procedure) prefetching will be carried out.
    • Prefetching can be considered as an application-level protocol.
    • This protocol must clearly state (1) the procedures that will be followed in order to initiate, execute and terminate prefetching, and (2) the communication procedures between the Web system components.
    • For example, in some prefetching approaches proprietary text files containing the most popular URLs are transferred between servers, clients and proxies. In order to implement the software modules that will handle these files on the appropriate Web system component, the question of HOW prefetching will be applied must be answered.

Page 9:

Introduction
  – WHAT (referring to the different types of available resources) will be prefetched.
    • As argued above, dynamic content is the most “difficult” candidate for prefetching.
    • Before designing the prefetching system, this question must be answered and the file types that will take part in prefetching should be clearly specified.
  – WHEN (referring to the time of prefetching) a resource will be prefetched.
    • In order to answer this question one must apply a suitable prediction algorithm that receives a number of inputs and provides the answer to the WHEN question as its output.

Page 10:

Introduction
• The effectiveness of prefetching depends on whether there is certain predictability in users’ Web page accesses.
  – The information on access patterns may be derived from servers’ access statistics or from clients’ configuration.
  – Recent studies on WWW traffic show that there are considerable inter-dependencies among consecutive accesses to some Web pages.

Page 11:

Introduction
  – The WWW is a hyperlink-based information system.
  – Constraints: which hyperlinks can be followed from a particular page, and the page contents, may also provide some strong leads as to the order in which the Web pages will be viewed.
  – Users’ personal preferences are also an important factor: the sequence of accesses is ultimately decided by the user’s individual selection.

Page 12:

Introduction
• Prefetching can be performed in 3 ways:
  – Between browser clients and Web servers.
  – Between proxies and Web servers.
  – Between browser clients and proxies.
• Initiating agent:
  – Client-side (Client-Initiated) prefetch
  – Server-side (Server-Initiated) prefetch

Page 13:

Server-Initiated Prefetch
• The server anticipates what hyperlinks are likely to be followed, and preloads the corresponding Web pages to the client.
• The client has to be prefetching-aware so it can deal with preloaded pages correctly.
• This would require extensions to the current HTTP protocol and modifications to both client and server software.

Page 14:

Client-Initiated Prefetch
• It can be done by individual clients in a way transparent to the servers.
• The implementation is therefore much simpler.

Page 15:

Criteria
• Criteria for deciding whether a Web page should be prefetched can be either statistical or deterministic:
  – Statistical: calculate the inter-dependencies of page accesses periodically based on the most recent access logs, and group Web pages with inter-dependencies higher than a certain threshold for prefetching.
  – Deterministic: configured statically by users as part of their personalized user interfaces, or by the page designer as part of the content design (e.g., must-read newspapers).

Page 16:

Bandwidth and Delay Tradeoff
• Statistical prefetching is easy to automate. But some bandwidth will be wasted, and total bandwidth consumption is increased.
• The delay for non-prefetched web pages may increase as a result of the extra load caused by prefetching.
  – When traffic is heavy, aggressive prefetching, such as “get all links”, may actually increase the average latency of all Web pages.

Page 17:

Analysis of Bandwidth and Delay Tradeoff

• P: the hit rate of prefetching, i.e., the probability of a correct prefetch.
• Do: the average retrieval delay without prefetching.
• Assume the retrieval delay is 0 for prefetched pages and Dx is the delay for non-prefetched ones.
• The average delay with prefetching, Dn, is
  Dn = P×0 + (1 – P)×Dx = (1 – P)×Dx
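As a quick numeric illustration (numbers chosen arbitrarily for this example): with a hit rate P = 0.5 and Dx = 1.2×Do, the average delay with prefetching is Dn = (1 – P)×Dx = 0.5 × 1.2×Do = 0.6×Do < Do, so perceived latency drops on average even though each non-prefetched page is individually slower than before.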

Page 18:

Analysis of Bandwidth and Delay Tradeoff
• To ensure that prefetching reduces the average delay, i.e., Dn < Do, we need
  Dx/Do < 1/(1 – P)  ……… (1)
• Assume the delays Do and Dx can be calculated based on an M/M/1 queueing model:
  Do = 1/(1 – Ro) and Dx = 1/(1 – Rx),
  where Ro and Rx are the link utilizations without and with prefetching, respectively.
• Thus, from (1), we have
  P/((Rx – Ro)/Ro) > Ro/(1 – Ro)
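The step from (1) to the last inequality is compressed on the slide. Spelling it out with the M/M/1 expressions above, and noting that prefetching increases traffic, so Rx > Ro:

  Dx/Do = (1 – Ro)/(1 – Rx) < 1/(1 – P)
  ⇒ (1 – P)(1 – Ro) < 1 – Rx
  ⇒ Rx – Ro < P(1 – Ro)
  ⇒ P/(Rx – Ro) > 1/(1 – Ro)
  ⇒ P/((Rx – Ro)/Ro) > Ro/(1 – Ro)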

Page 19:

Analysis of Bandwidth and Delay Tradeoff

• We define the efficiency (E) of prefetching as the ratio of (1) the hit rate of prefetching to (2) the relative traffic increase needed to achieve that hit rate, i.e.,
  E = P/((Rx – Ro)/Ro), and the condition above becomes E > Ro/(1 – Ro).
• This implies that the efficiency of prefetching must be larger than Ro/(1 – Ro); otherwise, the average delay can actually be higher than that without prefetching.

Page 20:

Analysis of Bandwidth and Delay Tradeoff

• Thus, the above inequality can be rewritten as
  Ro < E/(1 + E)
• If E is known, one can calculate the maximum Ro for statistical prefetching to be useful.
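A minimal sketch of this feasibility check (my own illustration; the function names are not from the slides):

def max_utilization(E):
    # Largest link utilization Ro at which prefetching with efficiency E
    # still reduces the average delay: Ro < E/(1 + E).
    return E / (1.0 + E)

def min_efficiency(Ro):
    # Smallest efficiency E needed at utilization Ro: E > Ro/(1 - Ro).
    return Ro / (1.0 - Ro)

print(max_utilization(0.5))   # 1/3, compare Example 1 a few slides below
print(min_efficiency(0.8))    # 4.0, compare Example 2 a few slides below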

Page 21:

Analysis of Bandwidth and Delay Tradeoff

• Feasible regions for E and Ro (above curve)

Page 22:

Analysis of Bandwidth and Delay Tradeoff

• Feasible regions for E and Ro (above curve)

Page 23:

Analysis of Bandwidth and Delay Tradeoff

• It is clear that prefetching is only useful when traffic is very light or the prefetching efficiency is very high.
• Example 1: for E = 0.5 (i.e., for each 1% traffic increase, the prefetch hit rate improves by 0.5%), Ro must be smaller than E/(1 + E) = 1/3 ≈ 0.33 (see the first curve above).

Page 24:

Analysis of Bandwidth and Delay Tradeoff

• Example 2: for Ro = 0.8, E must be larger than 4 (see the second curve above).
• This is because when traffic is heavy, very little extra traffic may result in a substantial increase in queuing delay.
• Unless the prefetching efficiency is very high, the extra delay experienced by non-prefetched pages may outweigh the decrease in the delay of prefetched pages.

Page 25:

Deterministic Client-Initiated Prefetch

• Deterministic prefetch is the most conservative type, as it often has little or no bandwidth overhead.
  – When users know what needs to be prefetched, it can reduce perceived latency, and even ease congestion, at very little cost.
• But its scope of use is limited.
• Configured statically by the users.
• Can be implemented as part of the browser or simply as an add-on, without changing client and server software.

Page 26:

Deterministic Client-Initiated Prefetch

• Batch prefetching
  – Many pages are read on a regular basis, such as newspapers, weekly work reports, etc.
  – Large web pages, e.g., papers with large graphics.
  – Similar to mirroring, but batch prefetching is more flexible as it does not require any central administration.
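A minimal sketch of batch prefetching (my own illustration; the URL list, cache directory name and timeout are assumptions):

import pathlib
import urllib.request

# User-configured list of regularly read pages (hypothetical URLs).
BATCH_URLS = [
    "http://example.com/newspaper/front-page.html",
    "http://example.com/reports/weekly.html",
]
CACHE_DIR = pathlib.Path("prefetch_cache")
CACHE_DIR.mkdir(exist_ok=True)

for url in BATCH_URLS:
    name = url.replace("://", "_").replace("/", "_")   # crude cache key
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            (CACHE_DIR / name).write_bytes(resp.read())
    except OSError as err:
        print("skip", url, err)    # batch prefetching is best-effort

Run from a scheduler (e.g., overnight or at browser start-up), the pages are already local when the user opens them.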

Page 27:

Deterministic Client-Initiated Prefetch

• Start-up prefetching
  – Start prefetching when a browser is started.
  – A set of pages users need to look at that day may be prefetched in the background.
  – It can be integrated with planning tools so that a ToDo web page is constructed each day for the users, and the corresponding web pages are prefetched at start-up for later viewing.

Page 28:

Deterministic Client-Initiated Prefetch

• Pipelining with prefetching
  – The current model for navigation is a series of “click, fetch, and view” operations.
  – As a user usually spends some time (seconds or minutes) on a page, we can potentially pipeline the operation by fetching the next page while the user is looking at the current page.
  – Useful for some information services:
    • On-line newspapers, stock market prices and headline tracking services, where users can easily specify the sequence of pages to be viewed.

Page 29:

Server-Initiated Prefetch
• Predictive prefetch, from Berkeley.
• Idea (similar to transparent content negotiation):
  – Typically, there is a pause after each page is loaded, for reading the loaded page.
  – The server computes the likelihood that a particular page will be accessed next and conveys this information to the client.
  – The client program then decides whether or not to actually prefetch the page.

Page 30:

Predictive Prefetch
  – The server has the opportunity to observe the pattern of accesses from several clients and use this information to make intelligent predictions.
  – The client is in the best position to decide if it should prefetch files, based on whether it already has them cached or on the cost, in terms of CPU time, memory, network bandwidth, and so on, needed to prefetch the data.

Page 31:

Predictive Prefetch
• A dependency graph is constructed to depict the pattern of accesses to the different files stored at the server.
• The graph has a node for every file that has ever been accessed.
• There is an arc from node A to node B if and only if at some point in time B was accessed within w accesses after A, where w is the lookahead window size.

Page 32:

Predictive Prefetch
• The weight on the arc is the ratio of (1) the number of accesses to B within a window after A to (2) the number of accesses to A itself.
• This weight is not actually the probability that B will be requested immediately after A.
• So the weights on arcs emanating from a particular node need not add up to 1. The figure on the next page depicts a portion of a hypothetical dependency graph.

Page 33:

Predictive Prefetch

A small hypothetical dependency graph. Based on past observations, when home.html is accessed there is a certain chance that image.gif will be accessed soon afterwards, and also a certain chance that another file will be accessed; furthermore, if image.gif is accessed, there is a certain chance that a further file will follow soon afterwards. (The specific chances are the arc weights shown in the figure.)

Page 34:

Predictive Prefetch
• The dependency graph is dynamically updated by a process, predictd, as the server receives new requests from each httpd process running on the server machine.
• predictd maintains a ring buffer of size equal to the window size w for each client that is currently connected to this server.
• When predictd receives a new request from an httpd, it inserts the ID of the accessed file into the corresponding ring buffer.
• Only the entries within the same ring buffer are considered related, so only the corresponding arcs in the dependency graph are updated.
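A minimal sketch of this bookkeeping (my own illustration, not the Berkeley code; the window size and names are assumptions):

from collections import defaultdict, deque

W = 4  # lookahead window size w (assumed value)

node_count = defaultdict(int)                      # accesses to each file
arc_count = defaultdict(lambda: defaultdict(int))  # arc_count[A][B]
ring = defaultdict(lambda: deque(maxlen=W))        # last W files per client

def record_access(client_id, file_id):
    # Update the dependency graph when client_id requests file_id.
    node_count[file_id] += 1
    # file_id falls within w accesses after every entry still in this
    # client's ring buffer, so bump the arc from each of them to file_id.
    for earlier in ring[client_id]:
        arc_count[earlier][file_id] += 1
    ring[client_id].append(file_id)

def arc_weight(a, b):
    # Accesses to b within the window after a, divided by accesses to a.
    return arc_count[a][b] / node_count[a] if node_count[a] else 0.0

Entries from different clients never share a ring buffer, which is exactly how false correlations are avoided (next slide).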

Page 35:

Predictive Prefetch
• This logically separates out accesses by different clients and thereby avoids the problem of false correlations.
• However, in some cases, such as clients located behind a proxy cache, predictd will not be able to distinguish between accesses from different clients.
• One way of getting around this problem is to use mechanisms that pass session-state identification between clients and servers even when there is a proxy between them.

Page 36:

Predictive Prefetch
• predictd bases its predictions on the dependency graph.
• When A is accessed, it would make sense to prefetch B if the arc from A to B has a large weight, which implies that there is a good chance of B being accessed soon afterwards.
• In general, predictd declares B a candidate for prefetching if the arc from A to B has a weight higher than the prefetch threshold p.
• It is possible to set this threshold differently for each client, and also to vary it dynamically.
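Continuing the sketch above, candidate selection with a prefetch threshold p could look as follows (the default value of p is an assumption):

def prefetch_candidates(a, p=0.25):
    # Files whose arc weight from `a` exceeds the prefetch threshold,
    # best candidates first.
    return sorted(
        (b for b in arc_count[a] if arc_weight(a, b) > p),
        key=lambda b: arc_weight(a, b),
        reverse=True,
    )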

Page 37:

Server-Initiated Prefetch
• Top-10 approach, from ICS-FORTH.
• Combines the server’s active knowledge of its most popular pages (the Top-10) with client access profiles.
• Based on the cooperation of clients and servers to make successful prefetch operations.
• The server side is responsible for periodically calculating a list of its most popular documents (the Top-10) and serving it to its clients.
• Actually, quite a few servers today regularly calculate their most popular documents, among other statistics.

Page 38:

Server-Initiated Prefetch
• Top-10 approach, from ICS-FORTH (continued).
• Calculating more than just the most popular documents is an obvious extension to the existing functionality.
• Top-10 does not treat all clients equally.
• Time is divided into intervals, and prefetching from any server is activated only after the client has made a sufficient number of requests to that server (> THRESHOLD).
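A minimal sketch of the Top-10 idea (my own illustration; TOP_N, THRESHOLD and the function names are assumptions):

from collections import Counter

TOP_N = 10        # size of the server's popularity list
THRESHOLD = 20    # per-interval request count before a client prefetches

def server_top_list(access_log):
    # access_log: iterable of URLs requested at this server in the interval.
    return [url for url, _ in Counter(access_log).most_common(TOP_N)]

def client_should_prefetch(requests_to_server):
    # The client prefetches a server's list only after it has shown
    # enough interest in that server during the current interval.
    return requests_to_server > THRESHOLD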

Page 39:

Proxy-Initiated Prefetch
• From SIGMETRICS ’99.
• Relies on the proxy to make predictions, and on either the proxy or the browser to perform the prefetch.
• Assumptions:
  – Users have idle times between requests, because users often read some parts of one document before jumping to the next one.
  – The proxy can predict which Web pages a user will access in the near future, based on reference patterns observed from many users.
  – The proxy has a cache that holds recently accessed Web pages.

Page 40:

Proxy-Initiated Prefetch
• The proxy can then either push the Web pages to the user’s browser, or
• piggyback the predictions on regular responses to the browser and let the browser fetch the Web pages.
• Only objects that are already in the proxy cache can be prefetched.
• Thus the approach generates no wide-area network traffic.

Page 41:

Proxy-Initiated Prefetch
• The proxy maintains a history structure.
• Every time the proxy services a request, it updates the history structure, establishing the connection between past accesses made by the same user and the current request.
• When the proxy detects that the connection to a user is idle, it uses the history structure to predict pages that the user might access next, checks which of those pages are in its cache, and generates a list of candidates ordered by their probabilities of access.

Page 42:

Proxy-Initiated Prefetch
• The candidates are pushed or fetched one by one into the browser cache.
• The moment the user issues a new request, prefetching is stopped, and any partially fetched object is discarded at the browser end unless the request is for the object that is being fetched.
• In addition, the proxy clears the list of candidate pages and recomputes a new one the next time.

Page 43:

PPM Predictors
• Based on the Prediction by Partial Matching (PPM) data compressor.
• The algorithms observe patterns from past accesses of all the clients to predict the future accesses of individual clients.
• The patterns captured are of the form: a user is likely to access URL B right after he/she accesses URL A.
• Clearly, only accesses from the same user should be connected; accesses from different users are not related.

Page 44:

PPM Predictors
• The algorithm has three parameters.
• m: the number of past accesses used to predict future ones.
  – It is also called the order of the predictor, or the prefix depth.
• l: the number of steps the algorithm tries to predict into the future.
  – For example, l = 2 means that the algorithm not only tries to predict the immediate next access for the user, but also the access after that.
  – We call l the search depth.
• t: a threshold used to weed out candidates.
  – Only candidates whose probability of access is higher than t, where 0 ≤ t ≤ 1, are considered for prefetching.

Page 45:

PPM Predictors
• The algorithm maintains a data structure (typically a collection of trees) that keeps track of the sequences of l URLs that follow a single URL, that follow a sequence of two URLs, and so on, up to those that follow a sequence of m URLs.
• For prediction, the past reference, the past two references, and so on up to the past m references are matched against the collection of trees to produce the set of URLs for the next l steps.
• Only URLs whose frequencies of access are larger than t are included.
• Finally, the URLs are sorted, first by giving preference to longer prefixes and then by giving preference to URLs with higher probability within the same prefix.

Page 46:

PPM - history structure
• The history structure is a forest of trees of a fixed depth K, where K = m + l.
• The history encodes all dynamic sequences of accesses by any one user, up to a maximum length K.
• One root node is maintained for every page seen, and in this node is a count of how often this page was seen.
• Directly below each root node are all pages ever requested immediately after the root page, and a count of how often the pair of requests occurred.
• The next level encodes all series of three pages, and a count of how often each particular sequence of three pages occurred.

Page 47:

PPM - history structure
• The history structure is updated every time a user makes a request.
• For each user there is a list of the last K pages the user requested.
• The update involves incrementing counters and possibly adding new nodes to the trees.
• Each update changes one node at each level of the history structure.
• The figure on the next page shows an example of the history structure.
• In this example K = 3 and the structure is being updated after a user accesses page C following pages A and B.
• The sequence ABC is updated, the sequence BC is updated, and so is the sequence C (the counter of one node at each level is incremented).
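A minimal sketch of the history structure and this update rule (my own illustration; the values of m and l and all names are assumptions):

from collections import defaultdict, deque

M, L = 2, 1
K = M + L                                        # maximum context length

def new_node():
    return {"count": 0, "children": defaultdict(new_node)}

forest = defaultdict(new_node)                   # one root tree per page seen
recent = defaultdict(lambda: deque(maxlen=K))    # last K pages per user

def record(user, page):
    # After appending the new page, update every suffix of the user's last
    # K accesses: for [A, B, C] the sequences ABC, BC and C are updated,
    # touching one node at each level of the forest.
    recent[user].append(page)
    history = list(recent[user])
    for start in range(len(history)):
        node = forest[history[start]]
        for p in history[start + 1:]:
            node = node["children"][p]
        node["count"] += 1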

Page 48:

PPM - history structure

Page 49:

PPM - history structure
• Encoded into the history structure is the probability of access for URLs following a given sequence of references.
• The predictor looks at a user’s most recent m accesses and processes each sequence of the last n accesses, for n = m, …, 1, separately.
• For each sequence it first finds the corresponding tree and node in the history structure.
• It then follows all paths down from that node for l levels, listing all the URLs at each level along with their counts.

Page 50:

PPM - history structure
• If one URL appears in more than one path, the node counts for that URL are added together.
• It then divides the count of each URL by the count of the sequence, yielding the URL’s relative probability of access.
• It then sorts the list by probability and deletes those entries whose probabilities are less than t.
• Finally, the predictor generates the list of candidates by concatenating the lists from the m sequences, putting the list of the longer sequence (i.e., longer prefix) first.
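Continuing the sketch above, the prediction step described on this and the previous slides could look as follows (the default threshold and the de-duplication of repeated candidates are my own simplifications):

def predict(user, t=0.1):
    # Match the last n accesses for n = m, ..., 1 (longer prefixes first),
    # look ahead l levels below each matched context node, convert counts
    # to probabilities, filter by t, and concatenate the ranked lists.
    history = list(recent[user])
    candidates = []
    for n in range(min(M, len(history)), 0, -1):
        seq = history[-n:]
        node = forest.get(seq[0])
        for p in seq[1:]:
            node = node["children"].get(p) if node else None
        if not node or node["count"] == 0:
            continue
        counts = defaultdict(int)
        frontier = [node]
        for _ in range(L):                           # descend l levels
            nxt = []
            for nd in frontier:
                for url, child in nd["children"].items():
                    counts[url] += child["count"]    # merge duplicate URLs
                    nxt.append(child)
            frontier = nxt
        probs = {u: c / node["count"] for u, c in counts.items()}
        ranked = sorted((u for u in probs if probs[u] > t),
                        key=lambda u: probs[u], reverse=True)
        # Concatenate, keeping the first (longest-prefix) occurrence only.
        candidates.extend(u for u in ranked if u not in candidates)
    return candidates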

Page 51:

Content-based prefetch
• Use the content associated with hyperlinks as a hint to determine which pages to prefetch.

Page 52:

Content-based prefetch
(Architecture diagram showing the following components: Clients, Request Manager, HTML Parser, Prefetching Engine, Prefetching Knowledge, Prefetching Learner, Priority Queue, Cache, Resource Transmitter, and WWW Servers.)

Page 53:

Content-based prefetch
(Flowchart of the prefetch-priority decision. Steps: locate the keyword of the newly fetched page (binary search); check whether it is found; calculate the access priority of each referenced keyword (breadth-first search); locate the keywords of the referenced pages (binary search); decide the access priority of each referenced keyword by overall access frequency; send the requests with higher priorities to the Resource Transmitter.)