
Page 1:

SA B17 Storage Foundation Deep Dive: Performance and Tuning

Oscar Wahlberg, Sr. Principal Technical Product Manager

Page 2:

SYMANTEC VISION 2012

Performance Tuning is not easy, but common sense goes a long way

Nope, sorry. There's no easy button...

• Almost every situation is unique: application, HW, load, etc.

• The goal of today's session is to build your toolset
  - The know-how to collect useful information
  - The ability to identify bottlenecks
  - Increased knowledge of common benchmarking tools
  - An improved understanding of when and how to tune Storage Foundation

Page 3:

Agenda

1 Storage Foundation Architecture
2 Methodology and Tools
3 Identify and Remediate Bottlenecks
4 Tuning and Best Practices for common use cases
5 Additional Resources

Page 4:

Storage Foundation Architecture

Veritas File System
  - A method for organizing data blocks to make them easy to access
  - Files are tracked in structures called inodes
  - Information about files is stored in metadata

Veritas Volume Manager
  - Abstracts the file system from individual LUNs
  - Increases performance by spreading I/O across multiple LUNs
  - Protects data by using RAID technology

Dynamic Multi-Pathing Layer
  - Provides device independent fault recovery
  - Discovers LUN characteristics

Diagram labels (I/O stack): Application, VxFS, VxVM, DMP (Regular I/O, Error Analysis), OS SCSI Drivers, HBA Drivers, Disk

Page 5:

Agenda

1 Storage Foundation Architecture
2 Methodology and Tools
3 Identify and Remediate Bottlenecks
4 Tuning and Best Practices for common use cases
5 Additional Resources

Page 6:

How do you measure performance?

Look at the Bigger Picture:
  - Optimal performance means a balanced system
  - Balance memory and CPU consumption with performance requirements

Start by answering three fundamental questions:
  - Why are you measuring?
  - What do you need to measure?
  - How are you going to measure?

Diagram labels: I/O, DMP, HBA Ports, Switch, Storage Ports

Page 7:

What type of data do you need to collect?

Application Performance
  - Can be operations/second or time to perform a single operation

Fundamental OS Statistics
  - CPU utilization (mpstat, sar, prstat, etc.)
  - Memory utilization (vmstat)
  - I/O throughput and operations per second (iostat)
  - Network statistics (netstat)
  - System Activity Reporter (sar)

Storage Foundation Statistics
  - Volume Manager statistics (vxstat and vxdmpadm iostat)
  - File System statistics (vxfsstat)
  - DMP layer statistics (vxdmpadm iostat)

Collect this data at regular intervals (every 5-30 seconds)
  - sar is a great tool to save historical data

Diagram labels: SymCLI / HiCommand, SF

Page 8:

Common Benchmarking Tools

IOzone
• A file system benchmark utility
• Simulates a variety of workloads and systems
• Can simulate buffered and direct I/O, as well as multiple threads
• Less useful in single-threaded mode

Database Benchmarks
• TPC-C and TPC-H from TPC.org: the industry standard
• Swingbench
• 3rd party software to capture and replay production workloads

Postmark
• Simulates an email server workload
• Large number of small files created as a baseline
• Single-threaded

Unix Utilities
• Includes dd, tar, mkfile and cp
• Often used for back-of-napkin numbers
• These tools are single-threaded
• Even when used in tandem, they do not simulate a real-world workload

Page 9:

Benchmarking Tools: What is vxbench?

  - vxbench is a tool available on AIX, HP-UX, Linux and Solaris
  - Used for benchmarking I/O loads on raw disk or file systems
  - Generates various I/O workloads such as
    • Sequential and random reads/writes
    • Asynchronous I/Os, and memory mapped (mmap) operations

vxbench_platform -w workload [options] file_name

  - The vxbench utility is part of the SFHA Support package: VRTSspt
  - It can be found on the install media, or on SORT at http://sort.symantec.com

Page 10:

Agenda

1 Storage Foundation Architecture
2 Methodology and Tools
3 Identify and Remediate Bottlenecks
4 Tuning and Best Practices for common use cases
5 Additional Resources

Page 11:

How to identify a Volume Manager Bottleneck

• Use vxstat to measure volume level performance
  - Operations/s, throughput and average service times

• Unusual numbers warrant a deeper look and drill down
  - For example: high service times can indicate hot spots or faulty hardware

Diagram labels: Application(s), File System, Volume Manager, DMP, OS Kernel Drivers

Page 12:

High I/O service time is rarely a good thing

• A single disk with a very high average write time indicates a problem
• A single slow disk in a stripe set lowers the performance of the entire volume

# vxstat -g maildg -s -i5 mailvol

                                  AVG TIME(ms)
TYP NAME                 WRITE    READ    WRITE
sd  3pardata0_96-01     587136    0.00    11.85
sd  3pardata0_97-01     588801    0.00   111.96
sd  3pardata0_98-01     586136    0.00    11.05
sd  3pardata0_99-01     588136    0.00    12.85
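The "one slow disk" pattern in the vxstat sample can be spotted mechanically. The following is a minimal Python sketch, not a Veritas tool: it takes the four average write times from the slide and flags any subdisk whose time is well above the median. The function name and the factor of 3 are illustrative assumptions.

```python
from statistics import median

# Average write service times (ms) per subdisk, copied from the vxstat sample.
write_ms = {
    "3pardata0_96-01": 11.85,
    "3pardata0_97-01": 111.96,
    "3pardata0_98-01": 11.05,
    "3pardata0_99-01": 12.85,
}

def slow_subdisks(times_ms, factor=3.0):
    """Return subdisks whose service time exceeds factor x the median.

    The factor of 3 is an arbitrary illustrative threshold, not a Veritas rule.
    """
    baseline = median(times_ms.values())
    return sorted(name for name, ms in times_ms.items() if ms > factor * baseline)

print(slow_subdisks(write_ms))
```

Comparing against the median rather than the mean keeps the single outlier from inflating its own baseline.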

Page 13:

High I/O service time is rarely a good thing

# vxstat -g maildg -u m -i5 mailvol

                OPERATIONS         BLOCKS       AVG TIME(ms)
TYP NAME       READ   WRITE     READ   WRITE    READ   WRITE
vol mailvol1      0    5227        0   112.1    0.00   84.66

# vxdmpadm -u m -s iostat show groupby=ctlr ctlr=c1 interval=5

             OPERATIONS/SEC       BYTES/SEC
CTLRNAME     READS   WRITES     READS   WRITES
c1               0     1832         0   113.78

• High latency on the volume indicates a potential problem lower in the stack
• Use DMP statistics to drill down: 114 MB/s is ~90% of a 1 Gbit HBA
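The "~90% of a 1 Gbit HBA" figure is simple unit conversion. Here is a small Python sketch of that arithmetic, assuming 1 MB means 10**6 bytes and ignoring protocol and encoding overhead (both are assumptions; the slide does not state them). The function name is hypothetical.

```python
def link_utilization(mbytes_per_sec, link_gbit_per_sec=1.0):
    """Fraction of raw line rate used, assuming 1 MB = 10**6 bytes."""
    mbit_per_sec = mbytes_per_sec * 8          # bytes -> bits
    return mbit_per_sec / (link_gbit_per_sec * 1000)

util = link_utilization(113.78)                # write rate on controller c1
print(f"{util:.0%} of a 1 Gbit link")
```

A controller running this close to line rate is a plausible cause of the high volume latency, which is why the drill-down moves from vxstat to the DMP controller statistics.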

Page 14:

Better environment control with DMP I/O Statistics

# vxdmpadm -u m -s iostat show groupby=ctlr interval=30 ctlr=c1
cpu usage = 878us    per cpu memory = 32768b
                 OPERATIONS/SEC         BLOCKS/SEC
CTLRNAME         READS   WRITES      READS     WRITES
c1               33400     9040    233.24m    229.73m

# vxdmpadm -u m -s iostat show groupby=enclosure interval=30
cpu usage = 1396us    per cpu memory = 32768b
                 OPERATIONS/SEC         BLOCKS/SEC
ENCLOSURENAME    READS   WRITES      READS     WRITES
3pardata0        17200     4000    118.97m     73.65m

• Use DMP to drill down or summarize statistics per controller or array

Page 15:

Recommended Tuning for DMP in SF 5.x/6.0

• Use the Minimum Queue or Adaptive Minimum Queue I/O policy

• Increase the DMP thread count on large systems to improve recovery scalability
  - dmp_daemon_count = number of CPU cores / 2

• Reduce and manage the DMP failover time
  - Two parameters to look at: dmp_failed_io_threshold and dmp_retry_count

An extensive whitepaper on DMP is available online on Symantec Connect. Keyword: DMP
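The dmp_daemon_count rule of thumb is a one-line calculation. The Python sketch below only illustrates the arithmetic; the helper name and the floor of 1 for single-core systems are assumptions that the slide does not state.

```python
def suggested_dmp_daemon_count(cpu_cores):
    """Half the core count, never below 1 (the floor is an assumption)."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return max(1, cpu_cores // 2)

for cores in (1, 8, 32):
    print(cores, "cores ->", suggested_dmp_daemon_count(cores))
```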

Page 16:

Tuning VxVM for better I/O performance

Create a balanced I/O sub-system
• Volume layouts: Concat or Stripe

• Stripe unit size
  - Medium (default 64 KB) for random access workloads; increase for larger concurrent I/Os

• Use online relayout capabilities to adapt in a dynamic environment

Page 17:

Overview of VxFS Tuning

• Buffered I/O
  - Application data that is cached in memory (in the OS page cache)
  - Tunable parameters control read-ahead of data into the page cache, flushing of dirty data from the cache, and freeing up cache

• Direct I/O
  - Application data that is not cached. Suitable for databases
  - Direct I/O is not the default, but can be enabled with mount options

• Metadata Caches
  - VxFS metadata caches have size limits that are automatically chosen based on the amount of available memory, but manual tuning can sometimes help

• Intent Log
  - Depending on the workload, the default intent log size (which is based on file system size) may not be large enough

Page 18:

When can VxFS tuning changes be made and when do they take effect?

At file system create time
  - File system creation: block size

When a file system is mounted
  - Direct I/O, Concurrent I/O, Asynchronous I/O behavior

Parameters that can be changed online through the vxtunefs utility
  - Examples: read-ahead, write-flushing parameters

VxFS kernel module parameters
  - Example: maximum sizes for VxFS metadata caches
  - Requires a reboot, or kernel module reload

Page 19:

Collect and Display VxFS statistics

VxFS makes statistics and counters visible through vxfsstat
  - Reports current/maximum size, hit rate and age for metadata caches
  - Identifies when intent log wrap-around is causing flushing of metadata
  - Counters can identify the number of sequential versus random reads/writes

Through vxfsstat an administrator can collect absolute values, or collect statistics over a specific period of time
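Absolute counters become useful rates once two snapshots are subtracted. The Python sketch below shows that step under stated assumptions: the snapshots are already parsed into dictionaries, and the "before" values and the 30 second interval are made up for illustration.

```python
def counter_rates(before, after, interval_sec):
    """Per-second rate for every counter present in both snapshots."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return {
        name: (after[name] - before[name]) / interval_sec
        for name in before
        if name in after
    }

# Hypothetical snapshots taken 30 seconds apart (the t0 values are invented).
t0 = {"vxi_read_seq": 250000, "vxi_read_rand": 2}
t1 = {"vxi_read_seq": 253328, "vxi_read_rand": 2}
print(counter_rates(t0, t1, 30))
```

Keeping the raw snapshots as well as the rates preserves the historical data needed for later comparisons.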

Page 20:

Use vxfsstat to identify a sequential workload

# vxfsstat -v /datawarehouse | grep vxi_read
vxi_read_dio                0    vxi_read_rand               2
vxi_read_seq           253328    vxi_setattr_nochange        0

# vxfsstat -v /datawarehouse | grep vxi_write
vxi_write_logonly           0    vxi_write_rand             20
vxi_write_seq         2633043    vxi_anonpgin                0

A workload with primarily sequential reads will have vxi_read_seq >> vxi_read_rand

Tuning the read ahead can yield huge performance benefits for sequential workloads
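The "vxi_read_seq >> vxi_read_rand" test can be turned into a number. This Python sketch parses the counter lines shown on the slide and reports the sequential share of reads and writes. It assumes the "name value name value" layout seen in the sample; real vxfsstat output may be laid out differently, and the helper names are hypothetical.

```python
import re

# Counter lines copied from the vxfsstat sample on the slide.
SAMPLE = """\
vxi_read_dio 0 vxi_read_rand 2
vxi_read_seq 253328 vxi_setattr_nochange 0
vxi_write_logonly 0 vxi_write_rand 20
vxi_write_seq 2633043 vxi_anonpgin 0
"""

def parse_counters(text):
    """Turn 'name value name value' lines into a dict of ints."""
    return {name: int(value) for name, value in re.findall(r"(vxi_\w+)\s+(\d+)", text)}

def sequential_share(counters, kind):
    """Share of 'read' or 'write' operations counted as sequential."""
    seq = counters[f"vxi_{kind}_seq"]
    rand = counters[f"vxi_{kind}_rand"]
    total = seq + rand
    return seq / total if total else 0.0

counters = parse_counters(SAMPLE)
for kind in ("read", "write"):
    print(kind, f"{sequential_share(counters, kind):.4%}")
```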

Page 21:

Tuning VxFS read ahead behavior

# vxtunefs /datawarehouse | grep read_
read_pref_io = 65536
read_nstream = 4
read_ahead = 1

Read Ahead behavior is controlled by the per file system tunable read_ahead
• Can be set to Off/On/Enhanced

The amount of read ahead is controlled by two parameters:
• Preferred read I/O size, read_pref_io
• Number of read streams/threads, read_nstream

When VxVM is used these tunables will be set based on volume geometry
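As far as the vxtunefs documentation describes it, the read-ahead size is the product of read_pref_io and read_nstream. The Python sketch below works that out for the values shown by vxtunefs; treat the formula as an assumption to verify against the documentation for your release. The 1 MB variant is a hypothetical tuned value, not a setting from the slide.

```python
def read_ahead_bytes(read_pref_io, read_nstream):
    """Read-ahead size as read_pref_io x read_nstream (assumed formula)."""
    return read_pref_io * read_nstream

current = read_ahead_bytes(65536, 4)           # values from the vxtunefs output
tuned = read_ahead_bytes(1024 * 1024, 4)       # hypothetical: read_pref_io raised to 1 MB

print(current // 1024, "KB now;", tuned // (1024 * 1024), "MB if read_pref_io were 1 MB")
```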

Page 22:

VxFS Read Ahead recommendations

• Make sure Read Ahead is turned on (default)
  - Use enhanced read-ahead mode if you have multiple threads accessing the same file

• Increase read_pref_io, up to 1 MB
  - Be aware that increasing read_pref_io may put a higher I/O load on your system

• Increase read_nstream if you use concat volumes on disks striped in an array

• Experiment with read-ahead tuning. It can be done online via vxtunefs

Page 23:

Example: VxFS Buffer Cache size

# vxfsstat -b /var/mail
buffer cache statistics
   1528576 Kbyte current     1528576 maximum
 850259661 lookups             69.84% hit rate
       102 sec recycle age [not limited by maximum]

• A cache hit rate of >90% is desirable
• A low recycle age (<100s) indicates pressure on the inode cache
• Tuning one cache may have implications on the other caches
• The buffer cache size is a kernel tunable
• On Solaris: "set vxfs:vx_bc_bufhwm=2097152" in /etc/system

Page 24:

Example after tuning: VxFS Buffer Cache size

# vxfsstat -b /var/mail
buffer cache statistics
   1728556 Kbyte current     2097152 maximum
 120259661 lookups             92.84% hit rate
      3605 sec recycle age [not limited by maximum]

After increasing the buffer cache by 512M:
• The buffer no longer grows to the maximum size
• The cache hit rate increased to >90%
• The recycle time increased
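The rules of thumb from the buffer cache example (hit rate above 90%, recycle age not below about 100 seconds, cache not pinned at its maximum) can be written down as a small check. The Python sketch below applies them to the before and after numbers from the two vxfsstat -b samples. The class, function name and message strings are illustrative assumptions, not part of any Veritas tool.

```python
from dataclasses import dataclass

@dataclass
class BufferCacheSample:
    current_kb: int
    maximum_kb: int
    hit_rate_pct: float
    recycle_age_sec: int

def findings(sample, min_hit_rate=90.0, min_recycle_age=100):
    """List the rule-of-thumb checks that a sample fails."""
    notes = []
    if sample.hit_rate_pct < min_hit_rate:
        notes.append("hit rate below target")
    if sample.recycle_age_sec < min_recycle_age:
        notes.append("low recycle age")
    if sample.current_kb >= sample.maximum_kb:
        notes.append("cache has grown to its maximum")
    return notes

before = BufferCacheSample(1528576, 1528576, 69.84, 102)
after = BufferCacheSample(1728556, 2097152, 92.84, 3605)

print("before:", findings(before))
print("after: ", findings(after))
```

Note that the 102 second recycle age in the first sample sits just above the 100 second threshold, so only the hit rate and the maximum-size checks flag it.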

Page 25:

Tuning is often Reactive

• "My application is running slow and I don't know why. Can you fix it?"

• Application design problems need to be addressed in the application (80/20 rule)
  - Tweaking the server or storage layers will often not yield enough

• One of the keys that allows you to become more proactive is to have historical data to reference

• Tuning is an iterative process: it never stops

Page 26:

What happens if you need to call Support?

• Support will help troubleshoot, but it's "best effort"

• If a problem or bug is identified, SF engineering takes over

• What can you do to help things along?
  - Create a crisp explanation of the problem
  - Describe how to reproduce the problem
  - Collect a VRTSexplorer
  - Collect relevant statistics
  - Use FirstLook

• Download FirstLook

Page 27:

Agenda

1 Storage Foundation Architecture
2 Methodology and Tools
3 Identify and Remediate Bottlenecks
4 Tuning and Best Practices for common use cases
5 Additional Resources

Page 28:

OLTP Workload

• Workload characteristics
  - Storage Foundation provides the performance of raw volumes with the ease of managing files
    • No caching: use the memory for the DB cache; avoid double-buffering and copy overhead
    • Ease up on locking: the DB maintains its own locks and issues "safe" concurrent accesses
  - I/O characteristics of an OLTP database
    • Sequential, synchronous writes to the database recovery log (aka redo log)
    • Most writes to DB tables are random and asynchronous in nature
    • Reads from DB tables are mostly random

Page 29:

OLTP Workload: Best Practices

• Best Practices
  - Use ODM for Oracle databases; the cio mount option for other databases
  - Separate volume and FS for the DB recovery log (aka redo log)
  - Stripe all data volumes
  - Use an 8 KB block size for the file systems hosting the data files
  - Strive to create a balanced I/O subsystem

• Additional tuning
  - DB cache size (has a big impact)
  - Sequential DRL tuning for redo logs if they are mirrored using SF

Page 30:

De-duplication: Overview (New in 6.0)

Feature Overview
• Periodic, out-of-band deduplication
• De-duplicates at the file system level
• Leverages shared extents in VxFS, improving read I/O performance via caching
• CLI based scheduler
• CLI based analyzer for de-duplication
• Requires VxFS Layout v9

Use Case
• Virtual Machine Storage: ~80% reduction
• Unstructured data
• Code repositories
• Anything else with "repeated" data

Diagram labels: Without De-duplication, With VxFS De-duplication

Page 31:

De-duplication: Performance Details (New in 6.0)

Time/GB as a function of unique data
• Illustrates that as the data becomes more unique, the deduplication process takes longer (more work to do). It also saves more space!

Time taken for dedupe as a function of chunk size
• As the chunk size increases, the time remains low for bigger datasets
• As a result, for really big datasets a big chunk size may be advisable if you are willing to compromise on storage savings

Page 32:

Deduplication Considerations and Best Practices (New in 6.0)

• Choose your chunk size: it's not easy to change
  - The chunk size is a create-time tunable that is not easily changed
  - Can be varied from 4 KB to 128 KB

• When in doubt, use the default chunk size of 4 KB

• Use the dry-run capability for modeling

• Schedule your deduplication run for non-production hours

Page 33:

Compression: Details (New in 6.0)

# vxcompress -L etcfile3
%Comp   Physical    Logical   %Exts   Alg-Str   Filename
  67%    64.4 MB   194.9 MB    100%    gzip-5   etcfile3

# fsadm -S compressed /test1
Mountpoint   Size(KB)   Available(KB)   Used(KB)   Logical_Size(KB)   Compr
/test1        1048576          957837      26890              30663     12%

Feature Overview
• Compress files, directories, or the whole file system
• Compression happens at the extent level
• Reads decompress in memory, not on disk
• Can reduce replication and snapshot times
• Hydration performance impact is minimal
• Feature available on Solaris and Linux
• Tunable algorithm, block size and CPU use

Use cases
• Archival data
• Long term data retention is the name of the game

Diagram labels: vxcompress, File1, Compression Block, Ext ... Ext, Compressed Extents, Storage Saved
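The percentages in both samples are consistent with a simple "space saved" ratio, 1 - physical / logical. The Python sketch below reproduces them. That the tools compute the figures exactly this way is an assumption; the arithmetic merely matches the reported values. The function name is hypothetical.

```python
def space_saved(physical, logical):
    """Fraction of logical size saved on disk: 1 - physical / logical."""
    if logical <= 0:
        raise ValueError("logical size must be positive")
    return 1 - physical / logical

# vxcompress -L sample: 64.4 MB physical for 194.9 MB logical.
print(f"file:        {space_saved(64.4, 194.9):.0%}")
# fsadm -S compressed sample: 26890 KB used for 30663 KB logical.
print(f"file system: {space_saved(26890, 30663):.0%}")
```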

Page 34:

Compression: Performance Details (New in 6.0)

• Compression is CPU heavy and the CPU load should be considered carefully

• Reading from compressed files can also result in performance degradation due to the increased I/O

• However, with multi-core CPUs becoming standard in large enterprises, CPU time should be readily available

• The total space savings and time to compress/uncompress will vary greatly depending on:
  • Server type and available CPU
  • File/data type and compression settings

Page 35:

Compression: Considerations and Best Practices (New in 6.0)

• Compression works best for archival data with low update rates

• Compress data during non-production hours to minimize performance impact

• Experiment with the number of compression threads to find your sweet spot

• Compression requires additional CPU cycles to uncompress data into memory when it's read from disk

• Choose your algorithm strength with caution: it's a trade-off

• Compression can increase fragmentation

Page 36:

Dealing with Fragmented file systems

• Badly fragmented file systems still cause problems in real life

• Types of fragmentation
  - Extent: a single file split into many small chunks
  - Free Space: free space only available in very small chunks

• Three ways to reorganize (defragment) VxFS
  - Extent based (fsadm -e): improves access performance of current data
  - Free Space (fsadm -C): improves new data writes
  - Directory optimization (fsadm -d): improves directory lookup

Page 37:

Using fsadm to identify a fragmented file system (Enhanced in 6.0)

Warning: This used to be rocket science!

# fsadm -t vxfs -E ...
Free Extents by Size
         1: 2077988            2: 2104073            4: 1371895
         8: 2226679           16: 1618029           32: 1000385
        64: 53134            128: 1667             256: 480
       512: 352             1024: 302             2048: 244
      4096: 172             8192: 107            16384: 76
     32768: 122            65536: 5             131072: 0
    262144: 0             524288: 0            1048576: 0
   2097152: 0            4194304: 0            8388608: 0
  16777216: 0           33554432: 0           67108864: 0
 134217728: 0          268435456: 0          536870912: 0
1073741824: 0         2147483648: 0

# fsadm -t vxfs -E /data
File System Extent Fragmentation Report
Free Space Fragmentation Index : 80
File Fragmentation Index       : 60

# Files Fragmented by Fragmentation Index
      0     1-25    26-50    51-75   76-100
7838181     3774    82836   281144  8080449
...

Fragmentation indexes are new in 6.0
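Before the fragmentation indexes existed, the "Free Extents by Size" histogram had to be interpreted by hand. The Python sketch below shows one way to summarize it: the share of free space held in small extents. It assumes the size key is expressed in file system blocks and uses 64 blocks as an arbitrary cut-off for "small"; neither comes from the slide, and this is not how fsadm computes its own index.

```python
# Nonzero rows of the "Free Extents by Size" sample: extent size -> count.
# Assumption: the size key is in file system blocks.
FREE_EXTENTS = {
    1: 2077988, 2: 2104073, 4: 1371895, 8: 2226679,
    16: 1618029, 32: 1000385, 64: 53134, 128: 1667,
    256: 480, 512: 352, 1024: 302, 2048: 244,
    4096: 172, 8192: 107, 16384: 76, 32768: 122, 65536: 5,
}

def small_extent_share(histogram, small_below=64):
    """Fraction of free blocks that sit in extents smaller than small_below."""
    total = sum(size * count for size, count in histogram.items())
    small = sum(size * count for size, count in histogram.items() if size < small_below)
    return small / total if total else 0.0

share = small_extent_share(FREE_EXTENTS)
print(f"{share:.1%} of free space is in extents under 64 blocks")
```

A high share of free space in small extents is in line with the high Free Space Fragmentation Index reported for /data.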

Page 38:

How to effectively run defragmentation in production

• Best Practices
  - Start early: don't wait until it's too late
  - Run regularly: maybe a file system a night or one per week
  - Run for X amount of time (fsadm -T <seconds> …)
  - If you have a single file/dir you are worried about, use "fsadm -e -f /file"
  - Use the latest version of Storage Foundation you can
    • Some enhancements have been backported to earlier versions, but not all

• Preventing excessive fragmentation is a good way to get consistent file system performance

Page 39:

Performance Optimization: Fast Mirror Resync v4 (Enhanced in 6.0)

What's new
  - Improved Performance
    - Sequential logging vs bitmap based DRL
  - Improved Scalability
    - DRL independence from:
      - Volume size
      - Volume workload

Page 40: SAB17## Storage#Founda1on#Deep#Dive# … B17.pdf · Example#aaer#tuning:#VxFS#Buffer#Cache#size# # vxfsstat$–b$/var/mail$ buffer$cache$stasLcs$ $1728556$Kbyte$current$$$2097152$maximum$

SYMANTEC  VISION  2012  

Does  FMR4  maZer  in  real  life?  

• On  the  leb,  the  best  case  scenario  100%  random  writes,  100-­‐200%  improvement  !  

• On  the  right,  SPECsfs  (NFS  File  serving  workload),  a  throughput  increase  of  30-­‐50%  compared  to  FMR3  

[Figure, left: Performance Gain FMR3 / FMR4, Pure Random Writes (8k). X-axis: Number of Threads (1, 2, 4, 8, 16). Y-axis: Throughput (KB/sec). Series: FMR3, FMR4.]

[Figure, right: Performance Gain FMR3 / FMR4, NFS File serving workload. X-axis: Response Time (msec). Y-axis: NFS Operations / sec. Series: FMR3 Def Memsz, FMR3 Memsz=128m, FMR4.]


Performance Optimization: Partitioned Directories (New in 6.0)

What's new
- Partition directories into hidden sub-directories
  - Parallel update events
  - Read events parallel to update events
- Transparent to user
- Supported on disk layout 8+
- New per-mountpoint tunable parameters
  - pdir_enable = 0|1
  - pdir_threshold = value
- Partitioned status added to vxfsstat -x, fsdb

Details:
- Off by default for Solaris, AIX and Linux
- 32k default threshold for AIX, HP-UX, and Linux
- 4K for Solaris
- Directories do NOT get converted once under threshold


Performance Optimization: Delayed Allocation (New in 6.0)

What's new
- Improved extended write performance in VxFS
- Reduced fragmentation

Targets
- Applications using extended writes
- Applications with fast-moving, temporary files

• To turn off: vxtunefs -o dalloc_enable=0 mount_point
  On HP-UX, use /etc/vx/tunefstab to turn it off at boot time.

• To modify the usage threshold at which delayed allocation is automatically disabled: vxtunefs -o dalloc_limit=value_from_50_to_95
  Default value: 90

• Internal testing showed up to a 40% improvement in throughput
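The interaction between the two tunables can be modelled in a few lines. This is a sketch of the behaviour as described on the slide, not the VxFS implementation: the function names are hypothetical, and treating "usage equal to the limit" as disabled is an assumption, since the slide does not say whether the comparison is inclusive.

```python
# Values taken from the slide: allowed range 50 to 95, default 90.
DALLOC_LIMIT_MIN = 50
DALLOC_LIMIT_MAX = 95
DALLOC_LIMIT_DEFAULT = 90


def validate_dalloc_limit(limit):
    """Reject a dalloc_limit outside the range given on the slide."""
    if not DALLOC_LIMIT_MIN <= limit <= DALLOC_LIMIT_MAX:
        raise ValueError(
            f"dalloc_limit must be between {DALLOC_LIMIT_MIN} and {DALLOC_LIMIT_MAX}"
        )
    return limit


def delayed_allocation_active(used_percent, limit=DALLOC_LIMIT_DEFAULT, enabled=True):
    """Model whether delayed allocation would be in effect.

    enabled mirrors dalloc_enable (1 or 0). Assumption: delayed allocation
    is switched off once file system usage reaches the limit.
    """
    validate_dalloc_limit(limit)
    return bool(enabled) and used_percent < limit


if __name__ == "__main__":
    for used in (60.0, 89.9, 92.5):
        print(used, delayed_allocation_active(used))
```

With the default limit of 90, a file system that is 92.5% full would no longer use delayed allocation in this model, even though `dalloc_enable` is still set.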




Storage Foundation Performance Tuning Guide

• Based on real-world experience and research from our Performance Engineering Group

• Available since SF 5.1SP1
– But applicable to many more versions

• Available online on SORT


sort.symantec.com  


Performance Tuning with Storage Foundation

Tuning is often Reactive

• Use Storage Foundation and OS tools to understand what is happening
• Attempt to find a way to duplicate the workload

Understand the Bottleneck

• Look at the bigger picture
• Resource contention often masquerades as latency and bandwidth issues
• Tuning the incorrect piece of the stack can often have bad results

Tuning is an iterative process

• Change one thing at a time
• Evaluate the changes and start again
• Work towards being proactive
• Collect a baseline when things perform well
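The "collect a baseline, change one thing, evaluate" loop can be supported by a small comparison helper. This is a minimal sketch: the metric names and numbers are hypothetical placeholders, and how the measurements are collected (OS tools, Storage Foundation statistics, a benchmark) is left open.

```python
def percent_change(baseline, current):
    """Relative change of a metric versus its baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline value must be non-zero")
    return (current - baseline) * 100.0 / baseline


def compare_to_baseline(baseline, current):
    """Compare every metric present in both runs; keys are metric names."""
    return {
        name: percent_change(baseline[name], current[name])
        for name in baseline
        if name in current
    }


if __name__ == "__main__":
    # Hypothetical numbers: a baseline taken when the system performed well,
    # and a measurement taken after changing exactly one tunable.
    baseline = {"throughput_kb_s": 40000.0, "latency_ms": 5.0}
    after_change = {"throughput_kb_s": 50000.0, "latency_ms": 4.0}
    for metric, change in compare_to_baseline(baseline, after_change).items():
        print(f"{metric}: {change:+.1f}%")
```

Keeping one such comparison per change makes it easy to attribute a gain or regression to a single tunable and to revert it before trying the next one.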


Thank  you!  

Copyright © 2012 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.


Oscar Wahlberg