What We Can Learn from CDNs about Web Development, Deployment, and Performance

Post on 18-May-2015

Category:

Technology

DESCRIPTION

CDNs have become a core part of internet infrastructure, and application owners are building them into development and product roadmaps for improved efficiency, transparency and performance. In his talk, Hooman shares recent learnings about the world of CDNs, how they're changing, and how Devs, Ops, and DevOps can integrate with them for optimal deployment and performance. Hooman Beheshti is VP of Technology at Fastly, where he develops web performance services for the world's smartest CDN platform. A pioneer in the application acceleration space, Hooman helped design one of the original load balancers while at Radware and has held senior technology positions with Strangeloop Networks and Crescendo Networks. He has worked on the core technologies that make the Internet work faster for nearly 20 years and is an expert and frequent speaker on the subjects of load balancing, application performance, and content delivery networks.

TRANSCRIPT

What we can learn from CDNs about Web Development, Deployment, and Performance

Hooman Beheshti, VP Technology

WebPerf Meetup, NYC, May 13, 2014

Who am I?

• Early Load Balancing Vendors
  – Radware
  – Crescendo Networks

• Front End Optimization
  – Strangeloop Networks

• Off the grid for a year!

• Joined Fastly 6 weeks ago

So, really…

What I’ve Learned From Working at a CDN Company for 6 Weeks!!

Lesson:

CDNs Are Not Solved!

We Don’t Cache As Much As We Should!

• HTML and other dynamic content

• Worse cache hit rate than we think
  – Especially for long tail content

• Mobile Apps, APIs, etc.

Making Changes SUCKS!!

• Configuration changes take way too long
  – People are used to making changes real-time
  – CDNs aren’t classically good at this
  – Phone??

• Purging is a real problem
  – Slow
  – Difficult
  – Not granular enough

Lots of Room

• New Demands from Customers

• Plenty of room for differentiation

• Can’t take some things for granted:
  – DNS
  – Routing
  – TCP
  – SCALE!

• Plus: lots of room to be creative at the edge!

Lesson:

There’s More to the Web Than the Web!

Non “web” Traffic

• Video
  – HLS (HTTP Live Streaming)
  – HTTP-based small video chunks
  – Unique by URL

• APIs
  – Instant purging can let API calls be cacheable
  – Another example of dynamic content cached at the edge

• Mobile Apps

Lesson:

People Use Their CDNs Wrong

CDNs offer a toolset

• The black box approach isn’t always good

• Configuration isn’t trivial
  – And a lot still depends on configuration

• Can’t depend on the CDN to solve all your problems

• Don’t exacerbate your problems!

http://bigqueri.es/t/sites-that-deliver-images-using-gzip-deflate-encoding/220

Gzipping Images

• Not a very good thing for performance
  – Extra bytes
  – Extra work for the browser

• But was this the Surrogate’s fault?
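The "extra bytes" point can be checked with a small sketch. It assumes random bytes as a stand-in for already-compressed image data and repetitive markup as a stand-in for HTML; the payloads and the helper name `gzip_overhead` are illustrative and not from the talk.

```python
import gzip
import os


def gzip_overhead(payload: bytes) -> int:
    """Return gzipped size minus original size (positive means extra bytes)."""
    return len(gzip.compress(payload)) - len(payload)


# Assumption: random bytes behave like the already-compressed data inside
# a JPEG or PNG, which gzip cannot shrink any further.
image_like = os.urandom(50_000)

# Assumption: repetitive markup stands in for compressible text content.
text_like = b"<li class='item'>hello world</li>\n" * 1_500

print("image-like overhead:", gzip_overhead(image_like), "bytes")
print("text-like overhead: ", gzip_overhead(text_like), "bytes")
```

The image-like payload comes out larger than it went in (gzip framing plus block headers), while the text-like payload shrinks considerably. The browser still has to inflate both.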

More Examples

• Bad caching headers
  – max-age, s-maxage have a lot of power!

• Bad TCP connection management at origin

• Not Gzipping (actual, compressible content) for origin fetches
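To illustrate why max-age and s-maxage carry so much weight, here is a minimal sketch of how a shared cache such as a CDN might derive a freshness lifetime from `Cache-Control`. In standard HTTP caching, s-maxage takes precedence over max-age for shared caches. The function names are hypothetical, and the parser is deliberately simplified: it ignores `Expires` and heuristics, and it treats `no-cache` as "do not reuse" rather than "revalidate first".

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control header into {directive: value-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_ttl(header: str) -> Optional[int]:
    """Freshness lifetime in seconds as seen by a shared cache.

    Returns 0 when the response should not be reused from a shared cache,
    and None when the header gives no explicit lifetime.
    """
    d = parse_cache_control(header)
    # Simplification: all three directives are treated as "not reusable".
    if "no-store" in d or "private" in d or "no-cache" in d:
        return 0
    for name in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        value = d.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return None


print(shared_cache_ttl("max-age=60, s-maxage=3600"))
print(shared_cache_ttl("private, max-age=300"))
```

With a header like `max-age=60, s-maxage=3600`, browsers keep the object for a minute while the edge may keep it for an hour, which is one way to get long edge lifetimes without long browser lifetimes.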

 

With Great Power…

Lesson:

Dynamic Content Is Really Interesting!

What Is Dynamic Content?

• Stuff that’s not static!

• With web traffic, generally the base HTML
  – Big deal because it’s blocking
  – And sometimes the largest object; longer download

• Some AJAX

• More…

Blocking

Classically, with dynamic content…

Caching

Caching vs. Invalidation

We tried…

Dynamic Content Caching Problems

• Serving stale pages
  – Lack of good invalidation framework

• Visibility

• Logging

 

CDNs and Dynamic Content

• Generally, handling dynamic content has been a matter of transport
  – Middle mile optimizations
  – TCP tweaks

• Some edge micro caching, but not easy

• ESI

 

Actually…

• Dynamic content is more cacheable than we think

• Static for short periods of time

• Unpredictable invalidation
  – Standard HTTP caching rules aren’t good enough
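The idea that dynamic content is "static for short periods of time" can be sketched as a micro-cache: keep a rendered page for a very short TTL so that a burst of requests triggers only one origin render. The class and parameter names below are hypothetical, and the sketch uses TTL expiry only, which is exactly the limitation the last bullet points at.

```python
import time
from typing import Callable, Dict, Tuple


class MicroCache:
    """Cache rendered responses for a very short, fixed TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._store: Dict[str, Tuple[float, str]] = {}
        self.origin_renders = 0

    def get(self, url: str, render: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: served without touching origin
        self.origin_renders += 1
        body = render(url)
        self._store[url] = (now, body)
        return body


# A burst of requests for the same page within the TTL window.
cache = MicroCache(ttl_seconds=1.0)
for _ in range(1000):
    cache.get("/home", lambda url: f"<html>{url}</html>")
print("origin renders for 1000 requests:", cache.origin_renders)
```

Even a one-second TTL removes most origin work under load, but a page that changes unpredictably can still be served stale until the TTL runs out.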

So Many Benefits!

• Performance
  – Faster time to first byte
  – Faster start render
  – Happy users!

• Offload
  – Less work for our servers
  – Less bandwidth at origin

What Would Make It Better?

• Programmatic Invalidation
  – Granular
  – Instantaneous

• Control at the edge, and not just for web pages
  – Real-time log files
  – Imagine terminating beacons at the edge!
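One way to picture granular, programmatic invalidation is a cache in which every object carries tags and the application purges by tag when the underlying data changes. The sketch below is an in-memory model only; the class, method names, URLs, and tags are hypothetical and do not represent any CDN's actual API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """In-memory cache whose entries can be purged by tag."""

    def __init__(self) -> None:
        self._objects: Dict[str, str] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: str, tags: Iterable[str]) -> None:
        self._objects[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[str]:
        return self._objects.get(url)

    def purge_tag(self, tag: str) -> int:
        """Drop every cached object carrying `tag`; return how many were removed."""
        removed = 0
        for url in self._urls_by_tag.pop(tag, set()):
            if self._objects.pop(url, None) is not None:
                removed += 1
        return removed


cache = TaggedCache()
cache.put("/products/42", "<html>product 42</html>", tags=["product-42"])
cache.put("/category/shoes", "<html>shoes</html>", tags=["product-42", "category-shoes"])
cache.put("/about", "<html>about</html>", tags=["static"])

# The application changes product 42 and invalidates only what depends on it.
print("purged:", cache.purge_tag("product-42"))
print("still cached:", cache.get("/about") is not None)
```

The application decides when content is stale, so pages can be cached for a long time and still be dropped the moment the data behind them changes.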

 

Lesson:

Integrating is Awesome!

The Influence of Clouds

• DevOps people like programmability and integration

• The CDN is no longer a black box mechanism, necessarily

• Cliché Alert: Content as a Service!

Real Time Integration

• Tap in to the CDN:
  – Instantaneous configuration changes
  – Instantaneous cache purge and invalidation
  – Real time stats and logs

• Infrastructure as code
  – Expect extensive APIs
  – Apps need to naturally extend to the CDN
  – Your content => you need control

About Time!!

Lesson:

Measurement is Still Hard

“Still”

• In the world of FEO
  – Webpagetest.org
  – RUM
  – Synthetic testing vendors

• In the world of CDNs
  – Same as far as client performance goes
  – Some new things…

Client-side Measurement in CDNs

• Cache hit ratio
  – How do you test and measure?

• Long tail content?

• DNS and edge node selection

• TTFB out of datacenter
  – Memory hit vs disk hit vs mid-tier hit vs miss

• RUM and synthetic (Cedexis, Catchpoint, etc.)

• There’s still gaming going on!

Let’s Test It!

• 3 Objects on the same CDN (anonymous)
  – Cedexis object
  – Small image from Alexa 5000 site
  – Long tail object: ~40 times every 3-4 hours

• Use Catchpoint last mile clients in US
  – Test every 15 minutes
  – ~11,500 total tests across all test nodes

• Focus measurement on:
  – Connect time (TCP)
  – Wait time (TTFB)
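The two metrics in focus can be approximated from any client with plain sockets. The sketch below assumes plain HTTP without TLS, takes connect time as the duration of the TCP handshake, and takes wait time as the gap between sending the request and receiving the first response byte. A testing vendor's exact definitions may differ, and the function name `measure` is hypothetical.

```python
import socket
import time
from typing import Tuple


def measure(host: str, port: int = 80, path: str = "/",
            timeout: float = 5.0) -> Tuple[float, float]:
    """Return (connect_ms, wait_ms) for a single plain-HTTP GET."""
    t0 = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    t1 = time.perf_counter()  # TCP handshake finished
    try:
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n\r\n"
        )
        sock.sendall(request.encode("ascii"))
        t2 = time.perf_counter()  # request fully handed to the kernel
        first = sock.recv(1)      # blocks until the first response byte
        t3 = time.perf_counter()
    finally:
        sock.close()
    if not first:
        raise ConnectionError("server closed the connection without responding")
    return (t1 - t0) * 1000.0, (t3 - t2) * 1000.0
```

Calling `measure("example.com")` repeatedly and taking the median of each value gives numbers comparable in spirit to the connect and wait columns that follow, though a single vantage point says little about last-mile clients.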

Cedexis Object, Alexa 5000, and Long Tail

              Connect (median)   Wait (median)
Cedexis       14ms               19ms
Alexa 5000    14ms               24ms            26%
Long Tail     15ms               29ms            20%

Cedexis Object, Alexa 5000, and Long Tail

        Cedexis Object           Alexa 5000               Long Tail
        Count    TCP    TTFB     Count    TCP    TTFB     Count    TCP    TTFB
Mem     11,074   14ms   19ms     6,741    14ms   20ms     481      14ms   19ms
Disk    428      12ms   24ms     4,692    14ms   31ms     9,626    15ms   28ms
Miss    1        6ms    38ms     28       13ms   45ms     1,355    16ms   51ms

Cedexis Object: 99.99% (Mem: 96.27%, Disk: 3.72%)

Alexa 5000: 99.76% (Mem: 58.82%, Disk: 40.94%)

Long Tail: 88.17% (Mem: 4.19%, Disk: 83.98%)
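The percentages on these slides follow from the Mem/Disk/Miss counts: the hit rate is (Mem + Disk) / total, and the Mem and Disk shares are each count over the total. The sketch below recomputes them. The pairing of counts to objects is inferred from the percentages on the slides, and the results agree with the slide figures to within about 0.01 percentage points of rounding.

```python
from typing import Dict

# Counts taken from the Mem / Disk / Miss rows of the results table.
RESULTS: Dict[str, Dict[str, int]] = {
    "Cedexis Object": {"mem": 11074, "disk": 428, "miss": 1},
    "Alexa 5000": {"mem": 6741, "disk": 4692, "miss": 28},
    "Long Tail": {"mem": 481, "disk": 9626, "miss": 1355},
}


def breakdown(counts: Dict[str, int]) -> Dict[str, float]:
    """Return hit rate and memory/disk shares as percentages of all tests."""
    total = sum(counts.values())
    return {
        "hit": 100.0 * (counts["mem"] + counts["disk"]) / total,
        "mem": 100.0 * counts["mem"] / total,
        "disk": 100.0 * counts["disk"] / total,
    }


for name, counts in RESULTS.items():
    b = breakdown(counts)
    print(f"{name}: hit {b['hit']:.2f}%  mem {b['mem']:.2f}%  disk {b['disk']:.2f}%")
```

All three objects are "cached", yet the long tail object is served from memory on only about 4% of requests, which is why hit rate alone hides the TTFB difference.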

Measurement!

• Not only do I care about:
  – Cache hit rate
  – Long tail
  – Measuring the right thing

• Fetching from disk could suck!
  – SSDs!

• Caching ≠ Caching

Lesson:

It’s Not Only About…

…Performance!!!

Security

• Certificate management

• Perimeter security

• DDoS protection <- benefit of scale!

Flexibility, Visibility, and Control

• Integration

• Programmability

• APIs

• Instant purging

• Real time logs

Fun at the Edge!

• Synthetic responses
  – Example: node ID for Cedexis measurements

• Cookie manipulation
  – Remove/inject/replace/recall

• Beacon termination
  – 204 responses
  – Real time logs
  – Awesome!
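Beacon termination can be pictured as a handler that records the beacon's data and answers immediately with an empty 204, so the request never travels to origin. The sketch below is a plain WSGI application standing in for edge logic. The name `beacon_app`, the log format, and the use of stderr as the "real time log" are all assumptions made for illustration.

```python
import json
import sys
import time
from typing import Callable, Iterable


def beacon_app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """WSGI app: log the beacon's path and query string, reply 204 No Content."""
    record = {
        "ts": time.time(),
        "path": environ.get("PATH_INFO", ""),
        "query": environ.get("QUERY_STRING", ""),
    }
    # Assumption: stderr stands in for a real-time log stream.
    sys.stderr.write(json.dumps(record) + "\n")
    start_response("204 No Content", [("Cache-Control", "no-store")])
    return []  # a 204 response carries no body
```

Any WSGI server can host this for experimentation, for example `wsgiref.simple_server.make_server("", 8000, beacon_app).serve_forever()` from the Python standard library.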

PERFORMANCE!! It’s still pretty damn important!

Thank you!

hooman@fastly.com  
