Caching the Uncacheable: Leveraging Your CDN to Cache Dynamic Content
DESCRIPTION
June 25, 2014. Hooman Beheshti, VP Technology at Fastly, discusses how using a real-time, modern CDN that provides instant cache invalidation and real-time analytics allows for instantaneous control over dynamic content caching. In this session, he looks at the challenges CDNs face with dynamic content and how you can use programmatic means to fully integrate your applications with your CDN.
TRANSCRIPT
Caching The Uncacheable: Leveraging Your CDN to Cache Dynamic Content
Hooman Beheshti, VP Technology
Dynamic Content Is Really Interesting!
What Is Dynamic Content?
• Stuff that’s not static!
• With web traffic, generally the base HTML – Big deal because it’s blocking – And sometimes the largest object → longer download
• Could be other things too – AJAX calls – API calls
• More…
Blocking
Classically, with dynamic content…
Caching
Dynamic Content Caching Problems
• Serving stale pages – Lack of good invalidation framework
Caching vs. Invalidation
We tried…
Dynamic Content Caching Problems
• Serving stale pages – Lack of good invalidation framework
• Real-time visibility – Real-time analytics/stats – Real-time logging
CDNs and Dynamic Content
• Generally, handling dynamic content has been a matter of transport – Optimize from-origin delivery – “DSA” (Dynamic Site Acceleration) – Middle mile optimizations – TCP tweaks
Dynamic Content, Traditionally
CDN Node
Client
Origin
Some TCP Tweaks
Dynamic Content, Traditionally
CDN Node CDN Node
Client
Origin
Lots of TCP Tweaks
Dynamic Content, Traditionally
• We sometimes do micro caching of HTML – Short TTL for HTML content – Not foolproof
• Ex: news stories faux pas!
• ESI (Edge Side Includes) – Partial caching – Hard and onerous
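A minimal Python sketch of micro caching, to show why a short TTL is not foolproof. The class name MicroCache, the fetch_from_origin callback, and the injectable clock are assumptions made for illustration; they are not from the talk or from any CDN's API.

```python
import time
from typing import Callable, Dict, Tuple


class MicroCache:
    """Tiny TTL cache: an entry is served until it is `ttl_seconds` old."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str, fetch_from_origin: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            # Still within the TTL: serve the cached copy, even if the
            # origin has published a newer version in the meantime.
            return entry[1]
        body = fetch_from_origin(url)
        self._store[url] = (now, body)
        return body
```

If the origin publishes a correction one second after the page is cached, readers keep getting the old copy until the TTL runs out. That window is the weakness that the short TTL only shrinks and never removes.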
Actually…
• Dynamic content is more cacheable than we think
• Static for short periods of time
• Unpredictable invalidation – Standard HTTP caching rules aren’t good enough
A Lot Better!
CDN Node CDN Node
Client
Origin
Blocking
So Many Benefits!
• Performance – Faster time to first byte – Faster start render – Happy users!
• Offload – Less work for our servers – Less bandwidth at origin
What would make it better?
Programmatic Invalidation
• Invalidation API
• Granular
• Instantaneous – Big problem with classic CDNs (multi-minute purges)
Power of the Purge!
• Instant purging: – As a page gets published, a purge command also gets published
– Instant means: predictable and deterministic behavior
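A hedged Python sketch of "publish, then purge". It assumes the CDN accepts an HTTP request with the PURGE method against the page URL; check your provider's documentation for the real endpoint and authentication. The function names, the save_to_origin and send_purge callbacks, and the example URL are hypothetical.

```python
import urllib.request
from typing import Callable


def build_purge_request(page_url: str) -> urllib.request.Request:
    # Assumption: the CDN treats "PURGE <url>" as "invalidate this object".
    return urllib.request.Request(page_url, method="PURGE")


def publish_page(page_url: str,
                 save_to_origin: Callable[[str], None],
                 send_purge: Callable[[urllib.request.Request], object],
                 ) -> urllib.request.Request:
    """Save the new version first, then invalidate the cached copy."""
    save_to_origin(page_url)
    request = build_purge_request(page_url)
    send_purge(request)
    return request
```

In real use, send_purge could be urllib.request.urlopen plus whatever credentials the CDN requires. The order matters: purging before the origin has the new version risks re-caching the old one.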
Power of the Purge!
• Purge dependencies – Surrogate Keys – Using tags to purge entire chunks of content at once
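The idea of purging by tag can be sketched with an in-memory cache in Python. This illustrates the concept only; the TaggedCache class, its method names, and the example keys are assumptions, not how any particular CDN implements surrogate keys.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """Cache whose objects carry tags, so one tag can purge many objects."""

    def __init__(self) -> None:
        self._objects: Dict[str, str] = {}
        self._urls_by_key: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: str, surrogate_keys: Iterable[str]) -> None:
        self._objects[url] = body
        for key in surrogate_keys:
            self._urls_by_key[key].add(url)

    def lookup(self, url: str) -> Optional[str]:
        return self._objects.get(url)

    def purge_key(self, key: str) -> int:
        """Remove every cached object tagged with `key`; return the count."""
        purged = 0
        for url in self._urls_by_key.pop(key, set()):
            # An object may already be gone if another of its tags was
            # purged earlier; pop(..., None) tolerates that.
            if self._objects.pop(url, None) is not None:
                purged += 1
        return purged
```

Tagging every article page with its section means one purge_key("section-news") call clears the whole section, while pages with other tags stay cached.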
More than just Invalidation…
The Influence of Clouds
• The CDN is an extension of the app
• No longer a black box
• Real-time integration with the app
• Infrastructure as code – Your content => You need control
Control
• Programmability – Configuration API – Invalidation API – Instantaneous and real time – Granular caching
• Ex: Geo-based caching
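One way to picture geo-based caching is a cache key that includes the visitor's country, sketched here in Python. The GeoCache class, the cache_key helper, and the two-letter country codes are illustrative assumptions; a real CDN would derive the country at the edge.

```python
from typing import Callable, Dict, Tuple


def cache_key(url: str, country_code: str) -> Tuple[str, str]:
    # Normalize the country so "us" and "US" share one cached object.
    return (url, country_code.upper())


class GeoCache:
    """Keeps one cached variant of each URL per country."""

    def __init__(self, fetch_from_origin: Callable[[str, str], str]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[Tuple[str, str], str] = {}
        self.origin_fetches = 0

    def get(self, url: str, country_code: str) -> str:
        key = cache_key(url, country_code)
        if key not in self._store:
            self.origin_fetches += 1
            self._store[key] = self._fetch(url, key[1])
        return self._store[key]
```

The granularity is a trade-off: every extra dimension in the key multiplies the number of objects and lowers the hit ratio, so the key should only vary on what actually changes the response.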
Control at the Edge
• Moving app logic to the edge
• VCL – Varnish Configuration Language – Script-like configuration for functionality at the edge
Visibility
• Real time analytics – Network stats – HTTP stats (status codes, etc.) – Caching stats (hits, misses, etc.) – Stats API
• Logging – Real time logs – Streaming to various log destinations
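As a small illustration of a caching stat, the Python sketch below computes a hit ratio from a stream of per-request cache states. The "HIT" and "MISS" labels and the hit_ratio function are assumptions about what a log line or stats feed might carry.

```python
from collections import Counter
from typing import Iterable


def hit_ratio(cache_states: Iterable[str]) -> float:
    """Fraction of HIT among HIT and MISS; other states are ignored."""
    counts = Counter(state.upper() for state in cache_states)
    total = counts["HIT"] + counts["MISS"]
    return counts["HIT"] / total if total else 0.0
```

Watching this number in real time is what makes aggressive caching of dynamic content safe to try: a bad change shows up within seconds instead of in tomorrow's report.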
Example: CMS + Purge
WordPress: Before
CDN Node
Cache
WordPress: After
CDN Node
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 55,666
Cache-Control: Long Time, totally!
WordPress: After
CDN Node PURGE
(Has to be instantaneous!)
WordPress: After
CDN Node
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 55,666
Cache-Control: Long Time, totally!
Example: Beacon Termination at the Edge
Before
CDN Node
Origin
Log Analysis
http://collector.site.com/beacon.img?a=1&b=2&c=3
Before
CDN Node
Origin
Log Analysis
HTTP/1.1 200 OK
Pragma: no-cache
Expires: Wed, 19 Apr 2000 11:43:00 GMT
Cache-Control: no-cache, no-store
Last-Modified: Wed, 21 Jan 2004 19:51:30 GMT
Content-Type: image/gif
Date: Fri, 20 Jun 2014 12:22:20 GMT
Server: Apache
Content-Length: 35
http://collector.site.com/beacon.img?a=1&b=2&c=3
After
CDN Node
Origin
http://collector.site.com/beacon.img?a=1&b=2&c=3
After
CDN Node
Origin
HTTP/1.1 200 OK
Pragma: no-cache
Expires: Wed, 19 Apr 2000 11:43:00 GMT
Cache-Control: no-cache, no-store
Last-Modified: Wed, 21 Jan 2004 19:51:30 GMT
Content-Type: image/gif
Date: Fri, 20 Jun 2014 12:22:20 GMT
Server: Apache
Content-Length: 35
http://collector.site.com/beacon.img?a=1&b=2&c=3
After
CDN Node
Origin
HTTP/1.1 204 No Content
Date: Sat, 21 Jun 2014 23:21:12 GMT
Server: Awesome Server
Content-Length: 0
http://collector.site.com/beacon.img?a=1&b=2&c=3
After
CDN Node
Origin
Syslog / S3 / FTP/etc
http://collector.site.com/beacon.img?a=1&b=2&c=3
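The "After" flow above can be sketched in Python: the edge answers the beacon itself with a 204 and hands the query parameters to a log stream, so the origin never sees the request. The terminate_beacon function and the list used as a stand-in log sink are assumptions for illustration; on a real CDN this logic would live in edge configuration rather than Python.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qsl, urlsplit


def terminate_beacon(url: str, log_sink: List[str]) -> Tuple[int, Dict[str, str]]:
    """Log the beacon's data and reply 204 without contacting the origin."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    # One log line per beacon: the path followed by key=value pairs.
    fields = [parts.path] + [f"{key}={value}" for key, value in params]
    log_sink.append(" ".join(fields))
    return 204, {"Content-Length": "0"}
```

The data the beacon carried ends up in the streamed logs (Syslog, S3, FTP, and so on), which is where the log analysis was reading it from anyway.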
Example: Edge-generated Content
JSON Data Center ID
CDN Node
Origin
http://www.site.com/which_datacenter.js
JSON Data Center ID
CDN Node
Origin
{ "datacenter" : "SJC" }
http://www.site.com/which_datacenter.js
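A rough Python equivalent of generating this response at the edge, with no origin request. The talk does this in VCL; the function name which_datacenter_response and the application/json content type are assumptions, and the data center ID would come from the edge node itself.

```python
import json
from typing import Dict, Tuple


def which_datacenter_response(datacenter_id: str) -> Tuple[int, Dict[str, str], str]:
    """Build a synthetic JSON response naming the serving data center."""
    body = json.dumps({"datacenter": datacenter_id})
    headers = {
        "Content-Type": "application/json",  # assumed; not shown on the slide
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return 200, headers, body
```

Calling which_datacenter_response("SJC") yields a body that parses to the same object shown on the slide.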
VCL Snippet
More Examples
• Caching with tracking cookies: – http://www.fastly.com/blog/how-to-cache-with-tracking-cookies
• API Caching: – http://www.fastly.com/blog/api-caching-part-iii (part 3, with links to previous two parts)
• Log Streaming: – http://www.fastly.com/blog/tips-for-streaming-logs
Let’s Sum Up!
Summary
• Dynamic content can be cached – We need instant purging – We need real-time logs and stats
• Real-time integration of our CDN with our app is cool! – Extensive/granular API to control the CDN – Control and visibility at the edge lets us be really creative
• Never use “Long Time, totally!” in a Cache-Control header!
Thank you!