Cache Technology Discussion_20130304


TRANSCRIPT

  • CACHE

    sina@

  • INTRO

    "Caching is a temporary location where I store data that I need
    frequently, because the original data is expensive to fetch, so I can
    retrieve it faster."

  • Why do we need cache?

    If every request goes straight to the back-end storage:

    1. The user will get upset, complain, and may never use this
    application again.

    2. The storage layer will pack up its bags and leave your application,
    and that is a big problem (no place to store data).

  • What is cache?

  • Cache Hit

    1. When the client sends a request (let's say he wants to view product
    information) and our application needs to access the product data in our
    storage (the database), it first checks the cache.

    2. If an entry can be found with a tag matching that of the desired data
    (say, the product Id), that entry is used instead. This is known as a
    cache hit (the cache hit is the primary measurement of caching
    effectiveness; we will discuss that later on).

    3. The percentage of accesses that result in cache hits is known as the
    hit rate or hit ratio of the cache.
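
    A minimal sketch of the look-aside flow described above, with hit and
    miss counters so the hit ratio can be computed (load_from_storage is a
    hypothetical stand-in for the real database access):

    # Look-aside cache with hit/miss bookkeeping (illustrative sketch).
    class CountingCache:
        def __init__(self):
            self.entries = {}   # tag (e.g. product id) -> cached value
            self.hits = 0
            self.misses = 0

        def get(self, tag, load_from_storage):
            if tag in self.entries:        # cache hit: serve from the cache
                self.hits += 1
                return self.entries[tag]
            self.misses += 1               # cache miss: go to the back storage
            value = load_from_storage(tag)
            self.entries[tag] = value      # keep it for future requests
            return value

        def hit_ratio(self):
            total = self.hits + self.misses
            return self.hits / total if total else 0.0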

  • Cache Miss

    On the contrary, when the tag isn't found in the cache (no match was
    found), this is known as a cache miss: a hit to the back storage is made,
    the data is fetched, and it is placed in the cache so that future lookups
    will find it and produce a cache hit.

    If we encounter a cache miss, there are two possible scenarios:

    1. There is free space in the cache (the cache hasn't reached its limit),
    so the object that caused the cache miss is retrieved from our storage
    and inserted into the cache.

    2. There is no free space in the cache (the cache has reached its
    capacity), so the object that caused the cache miss is fetched from the
    storage, and then we have to decide which object in the cache to remove
    in order to place the newly retrieved one. This is done by a replacement
    policy (caching algorithm) that decides which entry to remove to make
    more room, which will be discussed below.
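
    A minimal sketch of the two scenarios above; choose_victim stands in for
    whatever replacement policy is in use (the policies are introduced in the
    slides that follow), and fetch_from_storage is likewise hypothetical:

    # Handle a cache miss: fetch from storage, evict only if the cache is full.
    def handle_miss(cache, capacity, key, fetch_from_storage, choose_victim):
        value = fetch_from_storage(key)    # a miss always hits the back storage
        if len(cache) >= capacity:         # scenario 2: no free space left
            victim = choose_victim(cache)  # replacement policy picks an entry
            del cache[victim]
        cache[key] = value                 # scenario 1 (or after making room)
        return value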

  • Storage Cost

    When a cache miss occurs, the data is fetched from the back storage,
    loaded, and placed in the cache. But how much space does the data we just
    fetched take up in the cache memory? This is known as the storage cost.

  • Retrieval Cost

    And when we need to load the data, we need to know how much it costs to
    load it. This is known as the retrieval cost.

  • Replacement Policy

    When a cache miss happens and there is not enough room, the cache ejects
    some other entry in order to make room for the newly fetched data. The
    heuristic used to select the entry to eject is known as the replacement
    policy.

  • Caching Algorithms: Least Frequently Used (LFU)

    I am Least Frequently Used; I count how often an entry is needed by
    incrementing a counter associated with each entry.

    I remove the entry with the lowest counter first. I am not that fast, and
    I am not that good at adaptive behaviour (which means keeping the entries
    that are really needed and discarding the ones that haven't been needed
    for the longest period, based on the access pattern, or in other words
    the request pattern).
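
    A sketch of the LFU bookkeeping just described: keep a counter per entry
    and eject the entry with the smallest count (access_counts is a
    hypothetical name for that per-entry counter map):

    # LFU victim selection: evict the key that was requested the fewest times.
    def lfu_victim(access_counts):
        return min(access_counts, key=access_counts.get)

    # Usage: on every access do access_counts[key] += 1; when the cache is
    # full, remove lfu_victim(access_counts) before inserting the new entry.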

  • Caching Algorithms: Least Recently Used (LRU)

    I am the Least Recently Used cache algorithm; I remove the least recently
    used items first, the ones that haven't been used for the longest time.

    I require keeping track of what was used when, which is expensive if one
    wants to make sure I always discard the least recently used item.

    Web browsers use me for caching. New items are placed at the top of the
    cache. When the cache exceeds its size limit, I discard items from the
    bottom. The trick is that whenever an item is accessed, I place it at the
    top, so items which are frequently accessed tend to stay in the cache.
    There are two ways to implement me, either an array or a linked list
    (which has the least recently used entry at the back and the recently
    used one at the front).

    I am fast and I am adaptive; in other words, I can adapt to the data
    access pattern. I have a large family which completes me, and they are
    even better than me (I do feel jealous sometimes, but it is OK). Some of
    my family members are LRU2 and 2Q (they were implemented in order to
    improve LRU caching).
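
    A minimal LRU sketch along the lines described above, using Python's
    OrderedDict: accessed entries move to one end, evictions come from the
    other end:

    from collections import OrderedDict

    # LRU cache sketch: the least recently used entry is always at the front.
    class LRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()

        def get(self, key):
            if key not in self.entries:
                return None                       # cache miss
            self.entries.move_to_end(key)         # accessed: move to the "top"
            return self.entries[key]

        def put(self, key, value):
            if key in self.entries:
                self.entries.move_to_end(key)
            self.entries[key] = value
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # discard the least recently used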

  • Caching Algorithms: Least Recently Used 2 (LRU2) and Two Queues (2Q)

    I am Least Recently Used 2; some people call me Least Recently Used
    Twice, which I like more. I add entries to the cache the second time they
    are accessed (it takes two accesses to place an entry in the cache); when
    the cache is full, I remove the entry whose second most recent access is
    the oldest. Because of the need to track the two most recent accesses,
    the access overhead increases with the cache size; if I am applied to a
    big cache, that can be a disadvantage. In addition, I have to keep track
    of some items that are not yet in the cache (they haven't been requested
    twice yet). I am better than LRU, and I am also adaptive to access
    patterns.

    I am Two Queues; I add entries to an LRU cache as they are accessed. If
    an entry is accessed again, I move it to a second, larger LRU cache.
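
    A simplified sketch of the two-queue idea just described (first accesses
    go into a small probation LRU, and a second access promotes the entry
    into a larger protected LRU); this is an illustration, not the exact 2Q
    algorithm from the literature:

    from collections import OrderedDict

    class TwoQueueCache:
        def __init__(self, probation_size, protected_size):
            self.probation = OrderedDict()   # entries seen once
            self.protected = OrderedDict()   # entries seen at least twice
            self.probation_size = probation_size
            self.protected_size = protected_size

        def get(self, key):
            if key in self.protected:                     # already promoted
                self.protected.move_to_end(key)
                return self.protected[key]
            if key in self.probation:                     # second access: promote
                value = self.probation.pop(key)
                self.protected[key] = value
                if len(self.protected) > self.protected_size:
                    self.protected.popitem(last=False)
                return value
            return None                                   # miss

        def put(self, key, value):
            if key in self.protected:
                self.protected[key] = value
                self.protected.move_to_end(key)
            elif key in self.probation:
                self.probation[key] = value
            else:
                self.probation[key] = value               # first access: probation
                if len(self.probation) > self.probation_size:
                    self.probation.popitem(last=False)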

  • Caching Algorithms: Adaptive Replacement Cache (ARC)

    I am Adaptive Replacement Cache; some people say I balance between LRU
    and LFU to improve the combined result. Well, that's not 100% true.
    Actually, I am made from two LRU lists: one list, say L1, contains
    entries that have been seen only once recently, while the other list, say
    L2, contains entries that have been seen at least twice recently.

    Items that have been seen twice within a short time have a low
    inter-arrival rate and are therefore thought of as high-frequency. Hence,
    we think of L1 as capturing recency and L2 as capturing frequency, which
    is why most people think I am a balance between LRU and LFU, but that is
    OK, I am not angry about that.

    I am considered one of the best-performing replacement algorithms: a
    self-tuning algorithm and a low-overhead replacement cache. I also keep a
    history of entries equal to the size of the cache; this is to remember
    the entries that were removed, and it lets me see whether a removed entry
    should have stayed and another one should have been removed instead (I
    really have a bad memory). And yes, I am fast and adaptive.

  • Caching Algorithms: Most Recently Used (MRU)

    I am Most Recently Used; in contrast to LRU, I remove the most recently
    used items first. You will surely ask me why. Well, let me tell you
    something: when access is unpredictable, and determining the least
    recently used entry in the cache is a high-time-complexity operation, I
    am the best choice, that's why.

    I am quite common in database memory caches: whenever a cached record is
    used, I move it to the top of the stack, and when there is no room, I
    replace the topmost entry with the new entry.
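
    A sketch of the MRU choice just described, assuming a hypothetical
    last_access map that records when each cached key was last used:

    # MRU victim selection: evict the key that was used most recently.
    def mru_victim(last_access):
        return max(last_access, key=last_access.get)

    # Usage: on every access do last_access[key] = time.monotonic(); when the
    # cache is full, remove mru_victim(last_access) before inserting the new
    # entry.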

  • Caching Algorithms: First In First Out (FIFO)

    I am First In First Out; I am a low-overhead algorithm and require little
    effort to manage the cache entries. The idea is that I keep track of all
    the cache entries in a queue, with the most recent entry at the back and
    the earliest entry at the front. When there is no room and an entry needs
    to be replaced, I remove the entry at the front of the queue (the oldest
    entry) and replace it with the newly fetched entry. I am fast but I am
    not adaptive.

    Hello, I am Second Chance, a modified form of the FIFO replacement
    algorithm known as the Second Chance replacement algorithm. I am better
    than FIFO, at little extra cost for the improvement.

    I am Clock, and I am a more efficient version of FIFO than Second Chance,
    because I don't push cached entries to the back of the list as Second
    Chance does, but I perform the same general function as Second Chance.
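
    A small sketch of the Second Chance idea described above: a FIFO queue of
    keys plus a referenced bit per key; the oldest entry is evicted unless it
    was referenced since entering the queue (plain FIFO is this loop without
    the referenced check):

    from collections import deque

    # Pick a victim, giving referenced entries a second chance at the back.
    # (Assumes the queue is not empty.)
    def second_chance_victim(queue, referenced):
        while True:
            key = queue.popleft()           # oldest entry (front of the queue)
            if referenced.get(key):
                referenced[key] = False     # clear the bit, give a second chance
                queue.append(key)
            else:
                return key                  # evict this one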

  • Distributed Caching

    1. The cached data can be stored in a memory area separate from the cache
    directory itself (which handles the cache entries and so on), for example
    across the network or on disk.

    2. Distributing the cache allows the cache size to be increased.

    3. In this case the retrieval cost will also increase, due to the network
    request time.

    4. This will also lead to a higher hit ratio, due to the larger size of
    the cache.
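
    A back-of-the-envelope sketch of the trade-off above, with made-up
    latencies: a distributed cache costs more per hit than a local one, but
    if its larger size raises the hit ratio enough, the expected cost per
    request still drops:

    # Expected cost per request:
    # hit_ratio * hit cost + (1 - hit_ratio) * miss (storage) cost.
    def expected_cost_ms(hit_ratio, hit_cost_ms, miss_cost_ms):
        return hit_ratio * hit_cost_ms + (1 - hit_ratio) * miss_cost_ms

    print(expected_cost_ms(0.60, 1, 50))   # small local cache: about 20.6 ms
    print(expected_cost_ms(0.90, 5, 50))   # larger distributed cache: about 9.5 ms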

  • Thank You