2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

AGO Fluxgate Data: Extracting value from an imperfect time series. Kevin Urban, NJIT, 2015-Mar-02

Upload: inverseuniverse

Post on 10-Aug-2015


Page 1: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

AGO Fluxgate Data

Extracting value from an imperfect time series

Kevin Urban, NJIT, 2015-Mar-02

Page 2: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

This presentation is NOT about “Perfect Data”

Perfect Data

• Evenly sampled: No need for downsampling to use FFT techniques; no need to use more sophisticated non-FFT techniques.

• Continuously sampled: No missing data; no need to interpolate or downsample.

• Properly calibrated: The time series values are exact to a specified uncertainty (in contrast to time series that have an inexact constant offset but an exact derivative).

• No noise contamination: The spectra are resolved all the way through to the high-frequency end of the spectrum (in contrast to noisy time series, which have a “noise floor” in the spectral domain that tends to flatten out and dominate the high-frequency end of the spectrum).

Page 3: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

This presentation is about “Imperfect Data”

Imperfect, Evenly Sampled Data

• Improperly calibrated: The time series values are not exact. In addition to a constant offset, there is an improper scaling that changes slowly, over intervals much longer than the scales of interest (e.g., over months or years when we care about periods of 3-20 min).

• Noise contamination: The high-frequency end of the spectrum is dominated by a noise floor, which affects how one can analyze and transform the data.

Reserved for future talks

• Data gaps: Some missing data, presumed small relative to the scales of interest (e.g., 3 contiguous seconds when we care about periods of 3 min or greater) or moderately sized (e.g., a 1 min data gap when we care about periods of 3-10 min); need to interpolate or downsample.

• Unevenly sampled data: To use FFT techniques, one needs evenly sampled data, so one must either downsample to an evenly sampled time series or resort to alternative techniques (e.g., Lomb-Scargle).

Page 4: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Inexact values up to a constant offset: exact derivative

• Absolute: Some people care about the absolute value of the geomagnetic field; these people are usually geologists of some variety.

• Variometer: Magnetosphere/ionosphere scientists are often less stringent, caring mostly about the field’s derivative, or relative variations. That is, the data’s mean offset from zero is unimportant, so one might as well set the mean to zero (a zero-mean sequence), which is necessary to fully benefit from many spectral techniques (e.g., windowing).

[Figure: two time series plotted against time. Panel “Absolute Magnetometer Data”: vertical axis from 44990 to 45015 nT. Panel “Variometer-Quality Data”: vertical axis from -10 to 15 nT.]

Variometer Data

Page 5: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

In the spectral domain, there is a trivial difference between “absolute” and “variometer” data

Spectrally, the only difference between the two data types is in the “DC offset,” or “zero frequency,” power contribution: the non-varying, constant background component.

• This is just ONE SPECTRAL VALUE. Geologists care about this term immensely in order to study the gradual decay and/or growth of the main field over years, centuries, etc. However, this term is largely irrelevant to many magnetosphere-ionosphere studies, where we are interested in changes in the field on the order of hours, minutes, seconds, and shorter!

[Figure: power spectra of the “Absolute Magnetometer Data” and the “Variometer-Quality Data.”]

Every spectral component except the first is identical!

Variometer Data
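The claim that only the zero-frequency term differs can be checked numerically. The following is a minimal sketch on synthetic data, not the AGO data: the random series, the helper name `power_spectrum`, and the 45000 nT background (borrowed from the figure axis) are all assumptions made for illustration.

```python
import numpy as np

def power_spectrum(x):
    """Raw periodogram |FFT|^2 / N at the non-negative DFT frequencies."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x)) ** 2 / x.size

# Synthetic "variometer-quality" series: zero-mean fluctuations, in nT.
rng = np.random.default_rng(0)
variometer = rng.standard_normal(3600)
variometer -= variometer.mean()

# Synthetic "absolute" series: the same fluctuations on a constant background.
absolute = variometer + 45000.0

ps_var = power_spectrum(variometer)
ps_abs = power_spectrum(absolute)

# Index 0 is the zero-frequency (DC) term; every other index should agree.
print("non-DC terms agree:", np.allclose(ps_var[1:], ps_abs[1:]))
print("DC term (absolute, variometer):", ps_abs[0], ps_var[0])
```

Removing the mean before transforming, as done for the variometer series here, is the "zero-mean sequence" step mentioned earlier in the talk.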

Page 6: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Example: Calibration Issue

What if the variometer’s calibration between registered voltages and actual field values is off by a constant factor?

Spectrally, we get the same information concerning peaks. However, the actual values may no longer be fully trustworthy.

[Figure legend: Black: data; Red: 0.85*data]

The detrended versions of these power spectra are identical when one uses a robust detrending scheme (shown later).

Inexact values; inexact derivative up to scale factor

Scaled Variometer Data
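A constant calibration factor c multiplies every spectral value by c squared, which is a constant vertical shift of 2*log10(c) in the log domain. The sketch below illustrates this on a synthetic random-walk series; the series, the helper name `power_spectrum`, and the variable names are assumptions for illustration, and only the factor 0.85 comes from the slide.

```python
import numpy as np

def power_spectrum(x):
    """Raw periodogram |FFT|^2 / N at the non-negative DFT frequencies."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x)) ** 2 / x.size

# Synthetic red-noise-like series (a random walk), made zero-mean.
rng = np.random.default_rng(1)
data = np.cumsum(rng.standard_normal(3600))
data -= data.mean()

scale = 0.85  # constant calibration error, as in the slide's legend

# Drop index 0 (zero frequency) before taking logarithms.
ps_true = power_spectrum(data)[1:]
ps_scaled = power_spectrum(scale * data)[1:]

# The difference of the log spectra is the same at every frequency.
log_offset = np.log10(ps_scaled) - np.log10(ps_true)
print("offset is constant:", np.allclose(log_offset, 2 * np.log10(scale)))
```

Because the shift is the same at every frequency, peak locations and relative peak heights are untouched, which is why the spectral morphology survives the calibration error.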

Page 7: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Example: Evolving Calibration Issue

What if the variometer is inaccessible (e.g., lost 10 feet under ice, but still recording) and one notices the mean spectral amplitudes are unnaturally decaying over time?

Possible causes (fluxgate magnetometer under ice in Antarctica):

• Calibration sensors degrading in quality

• Slow rotation of the magnetometer out of its initial coordinate system due to slow ice flow

• Slow rotation of the Earth’s main field, effectively rotating the magnetometer out of its presumed coordinate system

[Figure legend, power spectra: Black: PSD(data); Orange: PSD(0.333*data); Blue: PSD(0.11*data)]

[Figure legend, time series: Black: data; Orange: 0.333*data; Blue: 0.11*data]

NOTHING TO FEAR: One can still extract value from such data. The detrended versions of these power spectra are identical when one uses a robust detrending scheme (next few slides).

Scaled Variometer Data

Page 8: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Extracting value from “Imperfect Data”

Given that we have slowly evolving, improperly scaled variometer data, exactly what value can we still extract from it, and how?

Page 9: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

The Background Power Law [BPL]

Geomagnetic power spectra often appear to fluctuate about a background power law.

* Note the two uses of “power” here:

(1) “Power spectra” refers to signal “energy” (or signal variance) decomposed by frequency.

(2) “Power law” refers to a dependence on frequency raised to a fixed exponent (e.g., inverse square root, quadratic, etc.).
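For concreteness, a background power law can be written in a common notation. The symbols $C$ (amplitude) and $\beta$ (spectral index) are labels introduced here for illustration; they do not appear on the slides.

```latex
\mathrm{BPL}(f) = C\, f^{-\beta}
\quad\Longleftrightarrow\quad
\log \mathrm{BPL}(f) = \log C - \beta \log f
```

In the log-log domain this is a straight line with logarithmic slope $-\beta$ and logarithmic offset $\log C$, which is why a straight-line fit in that domain is a natural way to estimate it.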

 

Page 10: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

The Detrended PSD

Sometimes called a Relative PSD, Residual PSD, or Whitened PSD. One may even call it a “decorrelated spectrum.”

“Relative” makes sense in the regular-regular domain, since PSD(f) = DPSD(f) * BPL(f): detrended spectra are enhancements/depletions relative to the BPL.

“Residual” makes sense in the log-log domain, since log(PSD) = log(DPSD) + log(BPL): detrended spectra are the residuals of the log(BPL)-subtracted log(PSD).

“Whitened” because a *properly* detrended colored noise spectrum is a white noise spectrum.

“Decorrelated” because detrended spectral values are uncorrelated.

BPL = “Background Power Law”; PSD = “Power Spectral Density”; DPSD = “Detrended PSD”

Page 11: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

The Detrended PSD (continued)

“Detrended PSD” is appropriate in both the regular-regular and log-log domains: removal of the background power law amounts to additive detrending in the log domain and multiplicative detrending in the regular domain.

IMHO, “Detrended PSD” is unambiguous (its meaning is fairly straightforward and easily communicated) and unassuming (it states only that you’ve detrended a power spectrum, not that you did it correctly).

The terms “whitened spectrum” and “decorrelated spectrum” both presume you’ve properly whitened your spectrum, which is not always the case (next few slides!).

GOAL: We want a “detrended PSD” that is robust against the aforementioned calibration errors and also properly decorrelates/whitens our power spectra.

Page 12: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

How to NOT detrend: First Differencing (“Pre-Whitening”)

Pro: The peaks and relative differences (spectral morphology) remain unchanged.

Con: The unaware data analyst might assume one location had greater power fluctuations than another (in the case of one properly calibrated and one improperly calibrated magnetometer).

Con: Detrending the spectrum via “pre-whitening” (first-differencing the time series) is not fully robust against the aforementioned calibration issues.

Con: The spectra are NOT decorrelated, i.e., the spectra are typically not whitened, despite the name “pre-whitening.”

Page 13: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Where “Pre-Whitening” Goes Wrong

In practice, most people use first differencing to pre-whiten a discrete-time sequence. However, one may choose any numerical derivative without avoiding the shortcomings of this method.

If you work out the math in the continuous-time setting using the normal derivative, you will find that the method of pre-whitening assumes your spectra have a BPL with a spectral index of 2, i.e., a Brownian motion spectrum. In reality, the spectral index of geomagnetic time series varies between 1.5 and 2.5 throughout the day, by latitude, and by geomagnetic activity.
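The continuous-time argument can be sketched as follows. The notation is an assumption made for illustration: $x(t)$ is the time series, $y(t)$ its derivative, and the background spectrum is written as $C f^{-\beta}$ with spectral index $\beta$; none of these symbols appear on the slides.

```latex
\begin{align}
y(t) = \frac{dx}{dt}
  &\;\Longrightarrow\; Y(f) = i\,2\pi f\, X(f)
  \;\Longrightarrow\; \mathrm{PSD}_y(f) = (2\pi f)^2\, \mathrm{PSD}_x(f), \\
\mathrm{PSD}_x(f) = C f^{-\beta}
  &\;\Longrightarrow\; \mathrm{PSD}_y(f) = 4\pi^2 C\, f^{\,2-\beta}.
\end{align}
```

The differentiated spectrum is flat (white) only when $\beta = 2$; for any other index a residual slope of $2-\beta$ remains. The amplitude $C$, which carries any calibration scale factor, also passes straight through. For the discrete first difference at sampling rate $f_s$, the factor $(2\pi f)^2$ is replaced by $4\sin^2(\pi f / f_s)$, which behaves like $(2\pi f / f_s)^2$ at frequencies well below $f_s$.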

Page 14: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

How to NOT detrend a PSD: Least-Squares Log-Linear Fit over Entire Spectrum

Pro: As with “pre-whitening,” the peaks and relative differences (spectral morphology) remain unchanged.

Pro: Unlike pre-whitening, this method at least is robust against calibration issues: the 3 spectra are identical. This is because no assumption is made about the logarithmic slope and offset: they are estimated, not prescribed.

Con: The unaware data analyst might assume the lower frequency bands have much greater power fluctuations than the higher frequency bands.

Con: Like pre-whitening, the spectra are typically not fully whitened/decorrelated using this method.

Page 15: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Where the Least-Squares Log-Linear Fit over the Entire Spectrum Goes Wrong!

Theoretically, this should work, but in practice a magnetometer has a “noise floor” (NEXT SLIDE!).

Page 16: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

PSD Noise Floor

In most geomagnetic power spectra obtained via fluxgate magnetometers, one encounters a “noise floor” in the high-frequency range of the PSD. The noise floor is the high-frequency region of the spectra dominated by white noise power rather than signal power. A noise floor is very easy to see in the log-log domain.

The noise floor limits which frequency bands are amenable to estimation of geophysical signal power. For example, a magnetometer that samples the geomagnetic field at 1 Hz has a Nyquist period of 2 s, which means that we should be able to resolve the spectral power of periodicities as short as ~2 seconds. However, in many magnetometer time series I’ve worked with, geomagnetic PSD estimates cannot be resolved at periodicities shorter than about 30-45 seconds (top half of the Pc3 band).
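To illustrate how a noise floor distorts a straight-line fit in the log-log domain, here is a small sketch on a noiseless model spectrum. Everything in it is an assumption chosen for illustration: the true spectral index of 2, the crossover frequency of 0.1 Hz where signal and white noise are equal, and the helper name `loglog_fit`. The low-frequency fit uses the lowest 5% of frequencies, the fraction discussed later in this talk.

```python
import numpy as np

def loglog_fit(freqs, psd):
    """Least-squares straight-line fit of log10(psd) against log10(freqs)."""
    slope, offset = np.polyfit(np.log10(freqs), np.log10(psd), 1)
    return slope, offset

# Positive DFT frequencies for a 1-hour window sampled at 1 Hz.
n_samples, fs = 3600, 1.0
freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)[1:]

# Model spectrum: power law with index 2 plus a flat (white) noise floor.
true_index = 2.0
crossover_hz = 0.1                       # assumed signal/noise crossover
noise_level = crossover_hz ** (-true_index)
psd_model = freqs ** (-true_index) + noise_level

# Fit over the entire spectrum versus the lowest 5% of frequencies only.
slope_all, _ = loglog_fit(freqs, psd_model)
n_low = int(0.05 * freqs.size)
slope_low, _ = loglog_fit(freqs[:n_low], psd_model[:n_low])

print("slope, whole spectrum:", slope_all)
print("slope, lowest 5%:     ", slope_low)
```

In this model the whole-spectrum slope comes out much shallower than the true value of -2, because most of the linearly spaced DFT frequencies sit in the flat noise floor, while the low-frequency fit stays close to -2.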

Page 17: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

How to More Reliably Detrend the PSD

Least-Squares Log-Linear over the first 5% of the lower frequencies

-- this way has more pros than the last two methods

-- however, there exist yet more sophisticated ways that some argue are much better

Why just 5%?

(i) In many types of time series of measurements (not just magnetometers) there exists a point in the higher frequencies where the signal power is no longer stronger than the white noise power. As demonstrated, an undetected high-frequency noise floor will kill your fit if the fit is over the whole spectrum.

(ii) Even without a noise floor, any relative enhancement or depletion across a high-frequency band will severely bias the logarithmic slope and offset. (Bands are defined logarithmically, while the DFT frequencies are spaced linearly.)

EXAMPLE: In a 1-hour window of a 1-Hz time series, the Pc4-Pc6 bands constitute ~4.4% of all the frequencies, while the Pc3 band makes up ~15.6%. The rest is usually the noise floor (~80% of the DFT frequencies!). Should one fit over the Pc3-Pc6 band? (HINT: No.)

Any power bump or lull across the Pc3 band will strongly dominate the fit. Even if one fits over the lowest quarter of the Pc3 band, that is ~70 Pc3 frequencies, which is almost the number of frequencies in Pc4-6.
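The percentages in the example can be reproduced by counting DFT frequencies per band. This sketch assumes the conventional band edges of 10-45 s for Pc3 and 45 s and longer for Pc4 through Pc6; those edges are not stated on the slide, and the variable names are illustrative.

```python
import numpy as np

n_samples = 3600   # 1-hour window
dt = 1.0           # 1-Hz sampling, so 1 s between samples

# Positive DFT indices up to the Nyquist frequency, and their periods in seconds.
k = np.arange(1, n_samples // 2 + 1)
periods = n_samples * dt / k

# Assumed band edges (seconds): Pc3 is 10-45 s; Pc4-Pc6 is 45 s and longer.
pc4_to_pc6 = periods >= 45.0
pc3 = (periods >= 10.0) & (periods < 45.0)
shorter = periods < 10.0

n_total = k.size
counts = {
    "Pc4-Pc6": int(pc4_to_pc6.sum()),
    "Pc3": int(pc3.sum()),
    "shorter than Pc3": int(shorter.sum()),
}
fractions = {name: count / n_total for name, count in counts.items()}

for name in counts:
    print(f"{name}: {counts[name]} frequencies ({100 * fractions[name]:.1f}%)")
```

Under these assumed edges the counts are 80, 280, and 1440 out of 1800 positive frequencies, matching the ~4.4%, ~15.6%, and ~80% quoted above; a quarter of the Pc3 count is 70, as stated.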

 

Page 18: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

For geomagnetic fluxgate data: Just fit over the Pc4-Pc6 bands, which constitute just under the first 5% of low frequencies in the spectrum. This is in line with what many statisticians recommend and is comparable to maximum-likelihood parameter estimation (which I have not tried). One may even include the low ~10% of the Pc3 band (~28 frequencies for a 1-hr window of a 1-Hz time series).

Least-Squares Log-Linear over the first 5% of the lower frequencies
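One way the low-frequency fit and the resulting detrended PSD might be computed is sketched below. This is an illustration under stated assumptions, not the code used for the talk: the function names `periodogram` and `detrend_psd`, the simple unwindowed periodogram, and the `low_fraction` parameter are all hypothetical choices.

```python
import numpy as np

def periodogram(x, fs=1.0):
    """Unwindowed periodogram of the mean-removed series (zero frequency dropped)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * x.size)
    return freqs[1:], psd[1:]

def detrend_psd(freqs, psd, low_fraction=0.05):
    """Fit a line to log10(psd) vs log10(freqs) over the lowest frequencies,
    then divide the whole spectrum by the fitted background power law."""
    n_fit = max(2, int(np.ceil(low_fraction * freqs.size)))
    log_f = np.log10(freqs)
    slope, offset = np.polyfit(log_f[:n_fit], np.log10(psd[:n_fit]), 1)
    background = 10.0 ** (offset + slope * log_f)
    return psd / background, slope, offset

if __name__ == "__main__":
    # Synthetic random-walk series standing in for one hour of 1-Hz data.
    rng = np.random.default_rng(0)
    series = np.cumsum(rng.standard_normal(3600))
    for scale in (1.0, 0.333, 0.11):
        f, p = periodogram(scale * series)
        dpsd, slope, offset = detrend_psd(f, p)
        print(f"scale={scale}: slope={slope:.3f}, offset={offset:.3f}")
```

Because the slope and offset are estimated rather than prescribed, a constant scale factor changes only the fitted offset; the fitted slope and the detrended spectrum should come out the same for all three scaled copies.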

Page 19: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Another bit about the Noise Floor

The noise floor can actually be used to recalibrate the data from a magnetometer. We showed that the mis-calibration results in a logarithmic offset, and nothing more. If one has data from the magnetometer during a time when it was known to be properly calibrated, then one can shift the spectra by the appropriate logarithmic offset during dates when the magnetometer was mis-calibrated.

For the AGOs, the magnetometers were properly calibrated in the late 1990s. If absolute power data are desired, we can likely develop a scheme for adjusting data in later years. (The magnetometers are no longer accessible to calibrate: they are deep down inside the ice.)

[Figure legend, power spectra: Black: PSD(data); Orange: PSD(0.333*data); Blue: PSD(0.11*data)]

[Figure legend, time series: Black: data; Orange: 0.333*data; Blue: 0.11*data]

See Slide 5 again for context.
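A recalibration along these lines might look like the sketch below. It is not an existing scheme: it assumes the noise floor is scaled by the same calibration factor as the signal (as in the scaled spectra in the figure), estimates the floor as the median of the highest-frequency PSD values, and uses hypothetical names (`noise_floor_level`, `recalibrate_psd`, `floor_fraction`).

```python
import numpy as np

def noise_floor_level(psd, floor_fraction=0.5):
    """Median PSD over the highest-frequency part of the spectrum.

    Assumes psd is ordered by increasing frequency and that the chosen
    top fraction of frequencies is dominated by the noise floor.
    """
    psd = np.asarray(psd, dtype=float)
    n_floor = max(1, int(floor_fraction * psd.size))
    return float(np.median(psd[-n_floor:]))

def recalibrate_psd(psd, reference_floor, floor_fraction=0.5):
    """Shift a spectrum so that its noise floor matches a reference level.

    Returns the shifted spectrum and the implied amplitude scale factor
    (the square root of the power ratio).
    """
    psd = np.asarray(psd, dtype=float)
    power_ratio = noise_floor_level(psd, floor_fraction) / reference_floor
    return psd / power_ratio, float(np.sqrt(power_ratio))

if __name__ == "__main__":
    # Toy spectra: a reference from a well-calibrated period, and the same
    # spectrum from data scaled by 0.11.
    reference = np.array([100.0, 10.0, 1.0, 1.0, 1.0, 1.0])
    miscalibrated = 0.11 ** 2 * reference
    fixed, gain = recalibrate_psd(miscalibrated, noise_floor_level(reference))
    print("recovered spectrum:", fixed)
    print("implied amplitude scale factor:", gain)
```

Dividing by a constant power ratio is the same as subtracting a constant in the log domain, which is the "logarithmic offset" shift described above. Whether the instrument noise really scales with the calibration error would need to be checked for the actual magnetometers.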

Page 20: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

The Detrended PSD: not just good for imperfect data

Let’s assume our magnetometer is perfectly calibrated and pretend t1 is a spectrum computed on a geomagnetically quiet day, and that t2 is a spectrum computed during the passage of a coronal mass ejection [CME].

[Figure: two panels of Log(PSD) versus Log(Frequency), one for t1 and one for t2, each shown with its detrended spectrum.]

The raw geomagnetic power spectra are strongly correlated over time, e.g., when a CME strikes the Earth, the geomagnetic noise power (i.e., the BPL) increases across the spectrum. Strong correlations found between power in separate bands during, say, solar wind activity are therefore fairly trivial.

The detrended spectrum, however, shows us information about strong, coherent waves (enhancements above the noise) and evidence of band-filtering (significant depletions below the noise spectrum). If these were real spectra, we would notice that the wave activity, although enhanced by the solar wind along with the BPL, is actually fairly independent of it.

Page 21: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Limitations of the Detrended PSD in isolation

Two of these spectra have about the same logarithmic slope (aka “spectral index”), but vastly different logarithmic offsets.

Two of them have the same logarithmic offset, but vastly different spectral indices.

However, if one properly detrends the spectra, the detrended spectrum is the same in each case. We just said this was a good thing! But it is not always a good thing.

However, we do not look at just the DPSD. When computing the linear fits, we need not throw out this additional information (the fitted slope and offset).

Even with poorly calibrated data, the spectral index is left unharmed. The offset will differ, but only between magnetometers, for example. It is still very useful for a given magnetometer to gauge how the total power is varying over time.

Page 22: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

DATA QUALITY RECAP

THE MOST IMPORTANT TAKEAWAY: Quality issues in the time domain do not necessarily map to the frequency domain, and those that do can be controlled and mitigated.

• Absolute field data and variometer-quality data essentially have the same power spectrum.

• Spectral morphology (shape and log slope) is “invariant under calibration error.” I.e., a poorly calibrated magnetometer still gives us a lot of relevant information.

• If necessary, one can re-adjust the poorly calibrated data to approximately absolute values if one knows what the noise floor is supposed to be.

• However, magnetometers of varying calibration quality can always be compared using detrended power spectra, which, when done properly, are “invariant under calibration error.”

• Although there are many ways to detrend spectra, not all detrending schemes are created equal: one should choose a scheme that will live up to the synonyms of detrended spectra: whitened, decorrelated.

• Detrended power spectra are great for telling you which bands hold coherent wave energy and which bands have been filtered (somehow). However, they hold no information concerning the background geomagnetic noise (logarithmic slope and offset / absolute values).

Page 23: 2015-Mar-02: AGO Fluxgate Data: Extracting value from an imperfect time series

Conclusion

You can trust my spectral data and data products derived from the spectral data.