Evaluating Content Management Techniques for Web Proxy Caches

Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin

Hewlett-Packard Laboratories

4th International WWW Caching Workshop

陳桂慧 (Chen Kuei-Hui), Systems Laboratory, Graduate Institute of Computer Science and Engineering, Yuan Ze University

1999.10.06

Outline

• Key workload characteristics
• Experimental design
• Simulation results
• Conclusion

Key Workload Characteristics

• Cacheable objects
• Object set sizes
• Object sizes
• Recency of reference
• Frequency of reference
• Turnover

Experimental Design

• Cache size
– 256 MB, 1 GB, 4 GB, 16 GB, 64 GB, 256 GB and 1 TB…
• Cache replacement policy
– LRU, SIZE, GD-Size, LFU, GDSF, LFU-DA
– LAT, HYB
• Performance metrics
– Hit rate
– Byte hit rate
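The two performance metrics can diverge sharply on the same trace, which is why both are reported. A minimal Python sketch, assuming a hypothetical `Request` record (its name and fields are illustrative, not from the slides) that notes each request's size and whether the cache served it:

```python
from dataclasses import dataclass


@dataclass
class Request:
    size_bytes: int  # size of the requested object
    hit: bool        # True if the request was served from the cache


def hit_rate(requests):
    """Fraction of requests served from the cache."""
    return sum(r.hit for r in requests) / len(requests)


def byte_hit_rate(requests):
    """Fraction of requested bytes served from the cache."""
    total = sum(r.size_bytes for r in requests)
    return sum(r.size_bytes for r in requests if r.hit) / total


# Made-up trace: three small hits and one large miss.
trace = [
    Request(1_000, True),
    Request(1_000, True),
    Request(1_000, True),
    Request(97_000, False),
]
print(hit_rate(trace), byte_hit_rate(trace))
```

In this made-up trace three of four requests hit, but they account for only 3,000 of the 100,000 bytes requested, so a high hit rate coexists with a low byte hit rate.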

Replacement Algorithm (1)

• Least-Recently-Used (LRU)
– replaces the object requested least recently.
• SIZE
– replaces the largest object.
• LFU
– replaces the least frequently used object.
• GreedyDual-Size (GD-Size)
– replaces the object with the lowest utility.
– K_i = C_i / S_i + L
– C_i: the cost of fetching object i; S_i: its size; L: a running aging factor
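The three simplest policies above differ only in which attribute of a cached object they inspect when choosing a victim. A small Python sketch of that choice, assuming a hypothetical `Entry` record and made-up objects (none of these names or values come from the slides):

```python
from dataclasses import dataclass


@dataclass
class Entry:
    url: str
    size: int         # object size in bytes
    last_access: int  # logical time of the most recent request
    hits: int         # number of references while cached


def lru_victim(entries):
    """LRU: evict the object requested least recently."""
    return min(entries, key=lambda e: e.last_access)


def size_victim(entries):
    """SIZE: evict the largest object."""
    return max(entries, key=lambda e: e.size)


def lfu_victim(entries):
    """LFU: evict the least frequently used object."""
    return min(entries, key=lambda e: e.hits)


cache = [
    Entry("a.html", size=2_000, last_access=10, hits=9),
    Entry("b.gif", size=50_000, last_access=30, hits=4),
    Entry("c.css", size=1_000, last_access=20, hits=1),
]
for pick in (lru_victim, size_victim, lfu_victim):
    print(pick.__name__, pick(cache).url)
```

On this example each policy picks a different victim: LRU evicts `a.html`, SIZE evicts `b.gif`, and LFU evicts `c.css`.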

Replacement Algorithm (2)

• GreedyDual-Size with Frequency (GDSF)
– K_i = F_i * C_i / S_i + L
• Least Frequently Used with Dynamic Aging (LFU-DA)
– K_i = C_i * F_i + L
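The GD-Size, GDSF and LFU-DA keys combine the same ingredients (reference count F, fetch cost C, size S and the aging term L) in different ways, so one cache can host all three with a pluggable key function. The following Python sketch rests on assumptions that are not in the slides: the class `GreedyDualCache` and its method names are illustrative, the cost is fixed at 1, and L is raised to the key of each evicted object as in GreedyDual-Size.

```python
class GreedyDualCache:
    """Sketch of a GreedyDual-style cache with a pluggable key function."""

    def __init__(self, capacity, key_fn):
        self.capacity = capacity  # total bytes available
        self.key_fn = key_fn      # (freq, cost, size, L) -> key K
        self.L = 0.0              # running aging factor
        self.used = 0             # bytes currently cached
        self.objects = {}         # url -> [key, freq, cost, size]

    def access(self, url, size, cost=1.0):
        """Process one request; return True on a hit, False on a miss."""
        if url in self.objects:
            entry = self.objects[url]
            entry[1] += 1         # one more reference
            entry[0] = self.key_fn(entry[1], entry[2], entry[3], self.L)
            return True
        if size > self.capacity:
            return False          # object can never fit; do not cache it
        while self.used + size > self.capacity:
            victim = min(self.objects, key=lambda u: self.objects[u][0])
            self.L = self.objects[victim][0]  # aging: L becomes the evicted key
            self.used -= self.objects[victim][3]
            del self.objects[victim]
        self.objects[url] = [self.key_fn(1, cost, size, self.L), 1, cost, size]
        self.used += size
        return False


def gd_size(freq, cost, size, L):
    return cost / size + L


def gdsf(freq, cost, size, L):
    return freq * cost / size + L


def lfu_da(freq, cost, size, L):
    return cost * freq + L


def replay(key_fn):
    """Replay a made-up trace and return the URLs left in a 100-byte cache."""
    cache = GreedyDualCache(capacity=100, key_fn=key_fn)
    for url, size in [("big", 80), ("small", 10), ("big", 80), ("new", 20)]:
        cache.access(url, size)
    return sorted(cache.objects)


print(replay(gdsf), replay(lfu_da))
```

On this trace GDSF evicts the large object even though it was referenced twice, while LFU-DA keeps it and evicts the small one. That matches the trade-off in the conclusion: size-aware keys favor hit rate, frequency-only keys favor byte hit rate.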

Hybrid Algorithm (HYB)

• Motivated by Bolot and Hoschka's algorithm:
W1 rtt_i + W2 s_i + (W3 + W4 s_i) / t_i
– t_i: the time since the document was last referenced
– rtt_i: the time it took to retrieve the document
• HYB value of document i:
(clat_ser(i) + WB / cbw_ser(i)) (nref_i ** WN) / s_i
– nref_i: the number of references to document i since it last entered the cache
– s_i: the size in bytes of document i
– WB and WN: constants that set the relative importance of the variables cbw_ser(i) and nref_i
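The HYB expression can be evaluated directly once per-server latency and bandwidth estimates are available. A Python sketch with two assumptions the slides do not state: the document with the lowest value is treated as the replacement candidate, and the values of `wb`, `wn` and the two documents are invented for illustration.

```python
def hyb_value(clat, cbw, nref, size, wb, wn):
    """HYB value of one document.

    clat: estimated connection latency to the document's server (seconds)
    cbw:  estimated bandwidth to that server (bytes per second)
    nref: references to the document since it last entered the cache
    size: document size in bytes
    wb, wn: weighting constants (hypothetical values are used below)
    """
    return (clat + wb / cbw) * (nref ** wn) / size


# Two made-up documents with made-up weights.
docs = {
    "a": hyb_value(clat=0.5, cbw=10_000, nref=4, size=2_000, wb=10_000, wn=0.5),
    "b": hyb_value(clat=0.1, cbw=100_000, nref=1, size=50_000, wb=10_000, wn=0.5),
}
victim = min(docs, key=docs.get)  # assumption: the lowest value is replaced
print(docs, victim)
```

Document `b` is large, rarely referenced and sits on a fast server, so its value is far lower than that of `a` and it becomes the candidate under this assumption.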

Latency Estimation Algorithm (LAT)

• clat_j = (1 - ALPHA) clat_j + ALPHA sclat
• cbw_j = (1 - ALPHA) cbw_j + ALPHA scbw
– clat_j: estimated latency (time) to open a connection to server j
– cbw_j: estimated bandwidth of the connection to server j
– sclat and scbw: the connection establishment latency and bandwidth measured for a document newly retrieved from server j
• d_i = clat_ser(i) + s_i / cbw_ser(i)
– ser(i): the server on which document i resides
– s_i: the document's size
– d_i: the estimated download time of document i; LAT selects for replacement the document with the smallest d_i
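The LAT updates are exponentially weighted moving averages kept per server. A short Python sketch, assuming a hypothetical `ServerEstimate` class, caller-supplied initial estimates, and an ALPHA of 0.125 chosen only for illustration (the slides give no value):

```python
class ServerEstimate:
    """Smoothed latency and bandwidth estimates for one server."""

    def __init__(self, clat, cbw):
        self.clat = clat  # estimated connection latency (seconds)
        self.cbw = cbw    # estimated bandwidth (bytes per second)

    def update(self, sclat, scbw, alpha):
        """Fold one new measurement into both running estimates."""
        self.clat = (1 - alpha) * self.clat + alpha * sclat
        self.cbw = (1 - alpha) * self.cbw + alpha * scbw


def download_time(est, size):
    """Estimated download time: connection latency plus transfer time."""
    return est.clat + size / est.cbw


est = ServerEstimate(clat=1.0, cbw=1000.0)
est.update(sclat=2.0, scbw=3000.0, alpha=0.125)
d = download_time(est, size=2500)
print(est.clat, est.cbw, d)
```

With these numbers the estimates move one eighth of the way toward the new sample: latency goes from 1.0 to 1.125 s and bandwidth from 1000 to 1250 bytes/s, giving an estimated download time of 3.125 s for a 2500-byte document.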

• Comparison of existing replacement policies

[Figure legend, first plot: GD-Size(1), LFU-Aging, SIZE, LFU, GD-Size(P), LRU]
[Figure legend, second plot: LFU-Aging, GD-Size(P), LRU, LFU, GD-Size(1), SIZE]

• Comparison of proposed policies to existing replacement policies

[Figure legend, first plot: GDSF-Hits, GD-Size(1), LFU-Aging, LFU-DA, GD-Size(P), LRU]
[Figure legend, second plot: LFU-Aging, LFU-DA, GD-Size(P), LRU, GDSF-Hits, GD-Size(1)]

Virtual Caches

• An approach that targets both a high hit rate and a high byte hit rate.
– the cache is divided into virtual caches (VCs), each managed with its own replacement policy.
• initially all objects are added to VC 0,

• replacements from VC i are moved to VC i+1,

• replacements from VC n-1 are evicted from the cache.

• all objects that are reaccessed while in the cache (i.e., cache hits) are reinserted in VC 0.

– this allows in-demand objects to stay in the cache for a longer period of time.
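The movement of objects between virtual caches can be sketched as a chain of small caches. The Python sketch below makes simplifying assumptions that are not in the slides: every VC uses LRU as its replacement policy, capacities are counted in objects instead of bytes, and the class name `VirtualCaches` is hypothetical.

```python
from collections import OrderedDict


class VirtualCaches:
    """Chain of virtual caches; each VC here is a plain LRU list."""

    def __init__(self, capacities):
        self.capacities = capacities                    # objects per VC
        self.vcs = [OrderedDict() for _ in capacities]  # oldest entry first

    def _insert(self, level, url):
        if level == len(self.vcs):
            return  # replaced from the last VC: evicted from the cache
        vc = self.vcs[level]
        vc[url] = True
        if len(vc) > self.capacities[level]:
            demoted, _ = vc.popitem(last=False)  # this VC's replacement victim
            self._insert(level + 1, demoted)     # moves on to the next VC

    def access(self, url):
        """Process one request; return True on a hit, False on a miss."""
        for vc in self.vcs:
            if url in vc:
                del vc[url]
                self._insert(0, url)  # cache hits are reinserted in VC 0
                return True
        self._insert(0, url)          # new objects start in VC 0
        return False


vc = VirtualCaches([2, 2])
results = [vc.access(u) for u in "abcdae"]
print(results, [list(level) for level in vc.vcs])
```

In this trace `a` is demoted to VC 1, then promoted back to VC 0 when it is requested again, while `b` is never re-referenced and eventually falls out of the last VC.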

[Figure legend, first plot: GDSF-Hits, VC-HB-75/25, VC-HB-50/50, VC-HB-25/75, LFU-DA, LRU]
[Figure legend, second plot: LFU-DA, VC-HB-25/75, VC-HB-50/50, VC-HB-75/25, LRU, GDSF-Hits]

• Analysis of Virtual Cache Performance – VC0 using GDSF-Hits, VC1 using LFU-DA.

• Analysis of Virtual Cache Management – VC0 using LFU-DA, VC1 using GDSF-Hits.

[Figure legend, first plot: GDSF-Hits, VC-HB-25/75, VC-HB-50/50, VC-HB-75/25, LFU-DA, LRU]
[Figure legend, second plot: LFU-DA, VC-HB-25/75, VC-HB-50/50, VC-HB-75/25, GDSF-Hits, LRU]

• Analysis of Virtual Cache Management – effects of VC order on performance

[Figure legend, first plot: VC-BH-25/75, VC-HB-75/25, VC-BH-50/50, VC-HB-50/50, VC-BH-75/25, VC-HB-25/75]
[Figure legend, second plot: VC-HB-25/75, VC-BH-75/25, VC-HB-50/50, VC-BH-50/50, VC-HB-75/25, VC-BH-25/75]

Conclusion

• Size-based policies achieve higher hit rates than other policies.

• Frequency-based policies are more effective at improving the byte hit rate of a proxy cache.

• Virtual caches are an approach to improving cache performance for multiple metrics (hit rate and byte hit rate) simultaneously.
