caching up and down the stack
DESCRIPTION
Presented by James Meickle, developer evangelist at AppNeta, to the Boston Django Meetup on March 28, 2014.
TRANSCRIPT
Caching Up and
Down the Stack
1
James Meickle
Developer evangelist, AppNeta
@jmeickle
Django Boston Meetup Group
March 26, 2014
2
WHAT IS CACHING?
Uncached: Client -> Data Source
Cached: Client -> Cache Intermediary -> Data Source
Fast!
Slow...
5
WHAT GETS CACHED BY CLIENTS?
• Images
• CSS
• JavaScript
• HTML documents
• DNS
6
WHAT GETS CACHED BY APPS?
• HTML documents
• HTML fragments
• Operations on data
• Database queries
• Expensive objects
7
WHAT GETS CACHED BY SERVERS?
• Compiled source
• Packages
• Disk access
• Memory access
• CPU instructions
8
CACHING IN DJANGO: FRONTEND
• Client-side assets
• Full pages
9
OUR SITES
appneta.com
(static site via Jekyll)
10
OUR SITES
internal.tracelytics.com
(dynamic single page Pylons app)
11
CLIENT-SIDE ASSETS
12
CLIENT-SIDE ASSETS
• Use HTTP caches!
  • CDN
  • Intermediate proxies
  • Browser
• Set policy with cache headers
  • Cache-Control
  • Expires
  • ETag
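The three cache headers named on this slide can be sketched with the Python standard library alone. This is an illustration rather than code from the talk: `build_cache_headers` and `respond` are hypothetical helper names, and the one-hour default `max_age` is an arbitrary assumption.

```python
import hashlib
import time
from email.utils import formatdate


def build_cache_headers(body, max_age, now):
    """Return Cache-Control, Expires and ETag headers for a response body."""
    return {
        "Cache-Control": f"max-age={max_age}",
        "Expires": formatdate(now + max_age, usegmt=True),
        # An ETag derived from the content changes only when the body changes.
        "ETag": '"' + hashlib.sha1(body).hexdigest() + '"',
    }


def respond(body, request_headers, max_age=3600, now=None):
    """Return (status, headers, body); answer 304 if the client's copy is current."""
    if now is None:
        now = time.time()
    headers = build_cache_headers(body, max_age, now)
    if request_headers.get("If-None-Match") == headers["ETag"]:
        return 304, headers, b""  # the client reuses its cached copy
    return 200, headers, body
```

A browser or proxy that already holds the asset sends the ETag back in `If-None-Match`, and the server can reply with an empty 304 instead of the full body.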
13
CLIENT-SIDE ASSETS
/tl-layouts_base-compiled-757f5eec3603f60850acfdb86e6701cf104f80ae.css
Request Method: GET
Status Code: 304 Not Modified
Cache-Control: max-age=315360000
Connection: keep-alive
Date: Mon, 18 Feb 2013 22:46:12 GMT
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Last-Modified: Tue, 12 Feb 2013 21:10:20 GMT
Server: nginx/0.8.54
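The response above pairs a digest in the filename with a ten-year max-age, so a changed file gets a new URL and the old one can be cached indefinitely. A minimal sketch of that pattern follows. The 40-character digest is consistent with SHA-1, but the talk does not say how it was produced, so the hash choice and the helper names `fingerprint_name` and `far_future_headers` are assumptions.

```python
import hashlib
import posixpath

TEN_YEARS = 10 * 365 * 24 * 60 * 60  # 315360000 seconds, the max-age shown above


def fingerprint_name(path, content):
    """Insert a SHA-1 digest of the content before the file extension."""
    stem, ext = posixpath.splitext(path)
    return f"{stem}-{hashlib.sha1(content).hexdigest()}{ext}"


def far_future_headers():
    """Headers that let any HTTP cache keep the asset for ten years."""
    return {"Cache-Control": f"max-age={TEN_YEARS}"}
```

Because the name changes whenever the content does, there is nothing to invalidate: templates simply start linking to the new name.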
14
CLIENT-SIDE ASSETS
appneta.com
internal.tracelytics.com
15
CLIENT-SIDE ASSETS (IN DJANGO)
VS
16
CACHING IN DJANGO: BACKEND
• Full pages
• Partial pages
• Objects
• Queries
17
WE’RE STILL TALKING ABOUT PAGES?
Client-side assets
Pages
18
FULL-PAGE HTTP CACHING
Client -> Varnish -> Webserver
Do HTTP caching, but with your rules.
No internet standards necessary!
19
FULL-PAGE HTTP CACHING
• Why do it server-side?
  • Invalidation
  • Amount cached
  • Changing cache policies
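The advantage of controlling the cache yourself can be sketched with a tiny in-process page cache. This is only a stand-in for a real reverse proxy such as Varnish: the `PageCache` class, its method names, and the TTL values are all hypothetical.

```python
import time


class PageCache:
    """In-memory full-page cache keyed by URL path, with a TTL and explicit purge."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds      # policy lives on the server and can change any time
        self.clock = clock
        self._pages = {}            # path -> (expires_at, html)

    def get_or_render(self, path, render):
        entry = self._pages.get(path)
        if entry is not None and entry[0] > self.clock():
            return entry[1]         # still fresh: the application is never called
        html = render(path)
        self._pages[path] = (self.clock() + self.ttl, html)
        return html

    def purge(self, path):
        """Invalidate one page immediately, e.g. right after its content is edited."""
        self._pages.pop(path, None)
```

Unlike a copy sitting in a visitor's browser, an entry here can be purged the moment the underlying content changes.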
20
FULL-PAGE HTTP CACHING (IN DJANGO)
21
FULL-PAGE HTTP CACHING?
appneta.com
24
FRAGMENT CACHING
appneta.com
Ruins everything
25
FRAGMENT CACHING (IN DJANGO)
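Django's template language provides a `{% cache %}` tag for this. The sketch below shows the underlying idea without Django, so it runs on the standard library alone. The names `cached_fragment` and `render_page`, the dict-backed store, and the 300-second TTL are assumptions made for illustration.

```python
import time

_fragments = {}  # (name, vary_on) -> (expires_at, html)


def cached_fragment(name, vary_on, ttl_seconds, render, now=time.monotonic):
    """Return a rendered fragment, re-rendering only when the cached copy has expired."""
    key = (name, tuple(vary_on))
    entry = _fragments.get(key)
    if entry is not None and entry[0] > now():
        return entry[1]
    html = render()
    _fragments[key] = (now() + ttl_seconds, html)
    return html


def render_page(username, render_nav):
    # The navigation is identical for every visitor, so it is cached once;
    # the greeting is personalised and rendered on every request.
    nav = cached_fragment("nav", (), 300, render_nav)
    return f"{nav}<p>Hello, {username}!</p>"
```

One personalised element no longer prevents the rest of the page from being cached.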
29
OBJECT CACHING
def get_item_by_id(key):
    # Look up the item in our database
    return session.query(User)\
        .filter_by(id=key)\
        .first()
31
OBJECT CACHING
def get_item_by_id(key):
    # Check in cache
    val = mc.get(key)

    # If exists, return it
    if val is not None:
        return val

    # If not, get the val, store it in the cache
    val = session.query(User)\
        .filter_by(id=key)\
        .first()
    mc.set(key, val)
    return val
32
OBJECT CACHING
# `decorator` is assumed to be the helper from the third-party decorator package
@decorator
def cache(expensive_func, key):
    # Check in cache
    val = mc.get(key)

    # If exists, return it
    if val is not None:
        return val

    # If not, get the val, store it in the cache
    val = expensive_func(key)
    mc.set(key, val)
    return val
33
OBJECT CACHING
@cache
def get_item_by_id(key):
    # Look up the item in our database
    return session.query(User)\
        .filter_by(id=key)\
        .first()
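The slides rely on a memcached client (`mc`) and a database session that are not shown. A self-contained variant of the same decorator, under stated assumptions, might look like the following: a plain dict stands in for memcached, `USERS` is hypothetical data replacing the database, and a sentinel distinguishes "not cached" from a cached `None`.

```python
import functools

_store = {}          # stand-in for a memcached client such as `mc`
_MISSING = object()  # sentinel, so cached None or falsy values still count as hits


def cache(expensive_func):
    """Cache the result of a one-argument function, keyed by function name and argument."""
    @functools.wraps(expensive_func)
    def wrapper(key):
        cache_key = (expensive_func.__qualname__, key)
        val = _store.get(cache_key, _MISSING)
        if val is not _MISSING:
            return val
        val = expensive_func(key)
        _store[cache_key] = val
        return val
    return wrapper


USERS = {1: "alice", 2: "bob"}  # hypothetical data in place of the database
lookups = []


@cache
def get_item_by_id(key):
    lookups.append(key)  # record each real lookup
    return USERS.get(key)
```

Including the function name in the key keeps two decorated functions from overwriting each other's entries when they are called with the same argument.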
34
OBJECT CACHING (IN DJANGO)
35
QUERY CACHING
Cached
Table Data
DB client
Query Cache
SQL server
Retrieve results from memory…
…or from memcached…
…or cache in the DB itself!
37
QUERY CACHING
mysql> select SQL_CACHE count(*) from traces;
+----------+
| count(*) |
+----------+
|  3135623 |
+----------+
1 row in set (0.56 sec)

mysql> select SQL_CACHE count(*) from traces;
+----------+
| count(*) |
+----------+
|  3135623 |
+----------+
1 row in set (0.00 sec)
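The same effect can be reproduced on the client side by memoising results under the query text and its parameters. The sketch below uses an in-memory SQLite database so it runs anywhere. The `CachingConnection` wrapper is hypothetical, and dropping every cached result on any write is a deliberately coarse invalidation rule chosen for brevity.

```python
import sqlite3


class CachingConnection:
    """Wrap a DB connection and memoise SELECT results by SQL text and parameters."""

    def __init__(self, conn):
        self.conn = conn
        self._results = {}
        self.hits = 0

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self._results:
            self.hits += 1
            return self._results[key]  # served from memory, no database round trip
        rows = self.conn.execute(sql, params).fetchall()
        self._results[key] = rows
        return rows

    def execute_write(self, sql, params=()):
        self.conn.execute(sql, params)
        self.conn.commit()
        self._results.clear()  # any write may change any result, so drop them all


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traces (id INTEGER PRIMARY KEY)")
db = CachingConnection(conn)
```

Repeating `db.query("SELECT count(*) FROM traces")` hits the database once; the second call is answered from the dict until the next write.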
38
QUERY CACHING (IN DJANGO)
39
QUERY CACHING (IN TRACEVIEW)
Cassandra in TraceView: Retrieving data from data warehouse
40
QUERY CACHING (IN TRACEVIEW)
Memcache in TraceView: 97th percentile is 50 ms!
41
WHAT CAN GO WRONG?
• Invalidation
• Fragmentation
• Stampedes
• Complexity
42
INVALIDATION
Uncached: Client -> Data Source
Cached: Client -> Cache Intermediary -> Data Source
Invalidation
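The failure mode behind this diagram is easy to reproduce: once a cache sits between the client and the data source, every write path has to tell the cache. In the sketch below, `UserStore` and its methods are hypothetical names, and a dict stands in for the real data source.

```python
class UserStore:
    """Read-through cache in front of a data source, with and without invalidation."""

    def __init__(self, database):
        self.database = database  # dict standing in for the real data source
        self.cache = {}

    def get(self, user_id):
        if user_id not in self.cache:
            self.cache[user_id] = self.database[user_id]
        return self.cache[user_id]

    def update_without_invalidation(self, user_id, name):
        # Deliberately wrong: the data source changes but the cache is not told,
        # so readers keep seeing the old value.
        self.database[user_id] = name

    def update(self, user_id, name):
        self.database[user_id] = name
        self.cache.pop(user_id, None)  # invalidate, so the next read refetches
```

After `update_without_invalidation`, `get` still returns the stale name; after `update`, it returns the new one.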
43
INVALIDATION
Drupal: Automatically cache everything!
44
INVALIDATION
Memcache GET performance
45
INVALIDATION
Do you know what you’re storing in this call?
46
INVALIDATION
Memcache SET performance
47
FRAGMENTATION
“In the beginning there was NCSA Mosaic, and Mosaic called itself NCSA_Mosaic/2.0 (Windows 3.1), and Mosaic displayed pictures along with text, and there was much rejoicing.”
History of the browser user-agent string
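Assuming the slide's point is that keying a cache on the raw User-Agent header splits it into thousands of near-identical entries, one common mitigation is to collapse the header into a few coarse buckets before building the key. The bucket names and substring checks below are a crude, hypothetical heuristic, not a complete device detector.

```python
def device_class(user_agent):
    """Collapse a raw User-Agent string into one of three coarse buckets."""
    ua = user_agent.lower()
    if "ipad" in ua or "tablet" in ua:
        return "tablet"
    if "mobi" in ua or "iphone" in ua or "android" in ua:
        return "mobile"
    return "desktop"


def cache_key(path, user_agent):
    """Build a page cache key that varies by device class, not by exact browser."""
    return f"{path}|{device_class(user_agent)}"
```

With this key, two different desktop browsers share one cached copy of each page instead of producing one copy per distinct header value.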
48
STAMPEDES
• On a cache miss, extra work is done
• The result is stored in the cache
• What if there are multiple simultaneous misses?
• Every node tries to do the same work at the same time and your cache dies
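One way to blunt a stampede is to let a single caller recompute a missing key while the others wait for its result. The sketch below does this with per-key locks inside one process. `StampedeSafeCache` is a hypothetical name, and coordinating several nodes would need a shared lock instead (for example one built on memcached's atomic `add`), which is not shown here.

```python
import threading


class StampedeSafeCache:
    """Cache where only one thread recomputes a missing key; the rest wait for it."""

    def __init__(self):
        self._values = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the _locks dict itself

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key, compute):
        if key in self._values:
            return self._values[key]
        with self._lock_for(key):
            # Re-check: another thread may have filled the key while we waited.
            if key not in self._values:
                self._values[key] = compute(key)
            return self._values[key]
```

The second membership check inside the lock is what prevents the waiting threads from repeating the expensive work once the first one has finished.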
49
COMPLEXITY
• What caching scheme?
• How many extra servers?
• What happens if they fail?
• What will you do to debug it?
50
TAKEAWAYS
• The ‘how’ of caching:
• What are you caching?
• Where are you caching it?
• How bad is a cache miss?
• How and when are you invalidating?
51
TAKEAWAYS
• The ‘why’ of caching:
• Did it actually get faster?
• Is speed worth extra complexity?
• Don’t guess – measure!
• Always use real-world conditions.
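Measuring can start very small: time the cached and uncached paths, and count hits against misses. The helpers below are a hypothetical sketch (`timed` and `HitCounter` are not from the talk), and a micro-benchmark like this is no substitute for measurements taken under real traffic.

```python
import time


def timed(func, *args, repeat=5):
    """Return the best wall-clock time, in seconds, of `repeat` calls to func(*args)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


class HitCounter:
    """Track cache hits and misses so the hit rate can be reported."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A cache with a low hit rate, or one that saves less time than it costs to maintain, is a candidate for removal rather than tuning.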
52
QUESTIONS
• ?
53
RESOURCES
• Django documentation on caching: https://docs.djangoproject.com/en/dev/topics/cache/
• Varnish caching, via Disqus: http://blog.disqus.com/post/62187806135/scaling-django-to-8-billion-page-views
• Django cache option comparisons: http://codysoyland.com/2010/jan/17/evaluating-django-caching-options/
• More Django-specific tips: http://www.slideshare.net/csky/where-django-caching-bust-at-the-seams
• Guide to cache-related HTTP headers: http://www.mobify.com/blog/beginners-guide-to-http-cache-headers/
54
THANK YOU!
TraceView
Stop by later tonight to hear more about AppNeta or TraceView (make sure to get a sticker!)
April 1: Come drink on our tab after the Boston Python Meetup
April 11: PyCon! Swing by our booth (214) for an awesome shirt
Or, just try us out: http://www.appneta.com/products/traceview/