2015-mar-02: ago fluxgate data: extracting value from an imperfect time series
Post on 10-Aug-2015
AGO Fluxgate Data: Extracting value from an imperfect time series
Kevin Urban, NJIT, 2015-Mar-02
This presentation is NOT about “Perfect Data”
Perfect Data
• Evenly sampled: No need for downsampling to use FFT techniques; no need to use more sophisticated non-FFT techniques.
• Continuously sampled: No missing data; no need to interpolate or downsample.
• Properly calibrated: The time series values are exact to a specified uncertainty (in contrast to time series that have an inexact constant offset but an exact derivative).
• No noise contamination: The spectra are resolved all the way through to the high-frequency end of the spectrum (in contrast to noisy time series, which have a “noise floor” in the spectral domain that tends to flatten out and dominate the high-frequency end of the spectrum).
This presentation is about “Imperfect Data”
Imperfect, Evenly Sampled Data
• Improperly calibrated: The time series values are not exact. In addition to a constant offset, there exists an improper scaling, which slowly changes over intervals much longer than the scales of interest (e.g., over months or years when we care about periods of 3–20 mins).
• Noise contamination: The high-frequency end of the spectrum is dominated by a noise floor, which affects how one can analyze and transform the data.
Reserved for future talks
• Data gaps: Some missing data, presumed small relative to the scales of interest (e.g., 3 contiguous seconds when we care about periods of 3 mins or greater) or moderately sized (e.g., a 1-min data gap when we care about periods of 3–10 mins); need to interpolate or downsample.
• Unevenly sampled data: To use FFT techniques, one needs evenly sampled data, so one must either downsample to an evenly sampled time series or resort to alternative techniques (e.g., Lomb-Scargle).
Inexact values up to a constant offset: exact derivative
• Absolute: Some people care about the absolute value of the geomagnetic field; these people are usually geologists of some variety.
• Variometer: Magnetosphere/ionosphere scientists are often less stringent, caring mostly about the field’s derivative, or relative variations; i.e., the data’s mean offset from zero is trivial, so one might as well standardize the mean offset to zero (Zero Mean Sequence), which is necessary to fully benefit from many spectral techniques (e.g., windowing).
[Figure: two time series panels plotted as nT versus time. “Absolute Magnetometer Data” spans roughly 44990 to 45015 nT; “Variometer-Quality Data” spans roughly -10 to 15 nT.]
Variometer Data
Spectrally, the only difference between the two data types is in the “DC offset” (or “zero frequency”) power contribution: the non-varying, constant background component.
• This is just ONE SPECTRAL VALUE. Geologists care about this term immensely in order to study the gradual decay and/or growth of the main field over years, centuries, etc. However, this term is largely irrelevant to many magnetosphere-ionosphere studies, where we are interested in changes in the field on the order of hours, minutes, seconds, and shorter!
In the spectral domain, there is a trivial difference between “absolute” and “variometer” data
[Figure: power spectra of the “Absolute Magnetometer Data” and “Variometer-Quality Data” time series.]
Every spectral component except the first is identical!
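A minimal sketch of this claim, using a synthetic random-walk series as a stand-in for real fluxgate data. The 45000 nT background, the window length, and the helper name `periodogram` are illustrative assumptions, not values or code from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3600                                          # assumed: one hour of 1-Hz samples
variation = 0.1 * np.cumsum(rng.normal(size=n))   # synthetic field variations (nT)
absolute = 45000.0 + variation                    # "absolute" series with a large constant background
variometer = absolute - absolute.mean()           # zero-mean ("variometer-quality") series

def periodogram(x):
    """Raw one-sided periodogram: squared magnitude of the real FFT."""
    return np.abs(np.fft.rfft(x)) ** 2

p_abs = periodogram(absolute)
p_var = periodogram(variometer)

# Only bin 0 (the DC term) should differ between the two spectra.
print("DC power, absolute vs. variometer:", p_abs[0], p_var[0])
print("all other bins equal:", np.allclose(p_abs[1:], p_var[1:], rtol=1e-5))
```

Removing the mean changes only the zero-frequency bin, so any analysis that ignores that bin sees the same spectrum for both data types.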
Example: Calibration Issue
What if the variometer’s calibration between registered voltages and actual field values is off by a constant factor?
Spectrally we get the same information concerning peaks. However, the actual values themselves may no longer be trustworthy.
[Figure legend: black = data; red = 0.85*data]
The detrended versions of these power spectra are identical when one uses a robust detrending scheme (shown later).
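To illustrate why a constant calibration factor is harmless to spectral shape, here is a small sketch on synthetic data (the random-walk input and the helper `periodogram` are assumptions for illustration). Scaling the series by 0.85 multiplies every spectral value by 0.85 squared, which is a constant shift of about -0.14 in log10 power:

```python
import numpy as np

rng = np.random.default_rng(1)
data = np.cumsum(rng.normal(size=3600))      # synthetic stand-in for variometer data
data -= data.mean()

def periodogram(x):
    """Raw one-sided periodogram with the DC bin dropped."""
    return np.abs(np.fft.rfft(x))[1:] ** 2

p_true = periodogram(data)
p_scaled = periodogram(0.85 * data)          # miscalibrated by a constant factor

log_offset = np.log10(p_scaled) - np.log10(p_true)
expected_offset = 2 * np.log10(0.85)         # same shift at every frequency

print("offset range:", log_offset.min(), log_offset.max())
print("expected:", expected_offset)
```

Because the shift is the same at every frequency, peaks and the logarithmic slope are untouched; only the logarithmic offset moves.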
Inexact values; inexact derivative up to scale factor
Scaled Variometer Data
Example: Evolving Calibration Issue
What if the variometer is inaccessible (e.g., lost 10 feet under ice, but still recording) and one notices the mean spectral amplitudes are unnaturally decaying over time?
Possible causes (fluxgate magnetometer under ice in Antarctica):
• Calibration sensors degrading in quality
• Slow rotation of the magnetometer out of its initial coordinate system due to slow ice flow
• Slow rotation of the Earth’s main field, effectively rotating the magnetometer out of its presumed coordinate system
[Figure legend, spectra: black = PSD(data); orange = PSD(0.333*data); blue = PSD(0.11*data)]
[Figure legend, time series: black = data; orange = 0.333*data; blue = 0.11*data]
NOTHING TO FEAR: One can still extract value from such data. The detrended versions of these power spectra are identical when one uses a robust detrending scheme (next few slides).
Extracting value from “Imperfect Data”
Given we have slowly evolving, improperly scaled variometer data, exactly what value can we still extract from it, and how?
The Background Power Law [BPL]
Geomagnetic power spectra often appear to fluctuate about a background power law.
* Note the two uses of “power” here:
(1) “Power spectra” refers to signal “energy” (or signal variance) decomposed by frequency.
(2) “Power law” refers to an exponent (e.g., inverse square root, quadratic, etc.).
The Detrended PSD
Sometimes called a Relative PSD, Residual PSD, or Whitened PSD. One may even call it a “decorrelated spectrum.”
• “Relative” makes sense in the regular-regular domain since PSD(f) = DPSD(f) * BPL(f) -- detrended spectra are enhancements/depletions relative to the BPL.
• “Residual” makes sense in the log-log domain since log(PSD) = log(DPSD) + log(BPL) -- detrended spectra are the residuals of the log(BPL)-subtracted log(PSD).
• “Whitened” because a *properly* detrended colored noise spectrum is a white noise spectrum.
• “Decorrelated” because detrended spectral values are uncorrelated.
BPL = “Background Power Law” PSD = “Power Spectral Density”
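Written out for a pure power-law background, the two domains look as follows. The symbols A (amplitude) and β (spectral index, i.e., the negative of the logarithmic slope) are notation assumed here for illustration; the slides do not name them:

```latex
\mathrm{BPL}(f) = A\,f^{-\beta}
\quad\Longrightarrow\quad
\mathrm{PSD}(f) = \mathrm{DPSD}(f)\,A\,f^{-\beta},
\qquad
\log \mathrm{PSD}(f) = \log \mathrm{DPSD}(f) + \log A - \beta \log f
```

In the log-log domain the background is a straight line with slope -β and offset log A, so estimating the BPL reduces to a linear fit, and the DPSD is what remains after that line is subtracted.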
The Detrended PSD (continued)
“Detrended PSD” is appropriate in both the regular-regular and log-log domains: removal of the background power law amounts to additive detrending in the log domain and multiplicative detrending in the regular domain. IMHO, “Detrended PSD” is unambiguous (its meaning is fairly straightforward and easily communicated) and unassuming (it states only that you’ve detrended a power spectrum, not that you did it correctly). The terms “whitened spectrum” and “decorrelated spectrum” both presume you’ve properly whitened your spectrum, which is not always the case (next few slides!).
GOAL: We want a “detrended PSD” that is robust against the aforementioned calibration errors and also properly decorrelates/whitens our power spectra.
How to NOT detrend: First Differencing (“Pre-Whitening”)
Pro: The peaks and relative differences (spectral morphology) remain unchanged.
Con: The unaware data analyst might assume one location had greater power fluctuations than another (in the case of one properly calibrated and one improperly calibrated magnetometer).
Con: Detrending the spectrum via “pre-whitening” (first-differencing the time series) is not fully robust against the aforementioned calibration issues.
Con: Spectra are NOT decorrelated; i.e., the spectra are typically not whitened, despite the name “pre-whitening.”
Where “Pre-Whitening” Goes Wrong
In practice most people use first differencing to pre-whiten a discrete-time sequence. However, no choice of numerical derivative avoids the shortcomings of this method. If you work out the math in the continuous-time setting using the normal derivative, you will find that the method of pre-whitening assumes your spectra have a BPL with a spectral index of 2, i.e., a Brownian Motion spectrum. In reality, the spectral index of geomagnetic time series varies between 1.5 and 2.5 throughout the day, by latitude, and by geomagnetic activity.
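The discrete-time version of that argument can be checked numerically. The first-difference filter y[n] = x[n] - x[n-1] has power gain 4 sin²(πf/fs), which behaves like f² at low frequencies, so it flattens only a background with spectral index 2. The sketch below assumes a 1-Hz sampling rate and a 1-hour window purely for illustration:

```python
import numpy as np

fs = 1.0                                     # assumed sampling rate (Hz)
f = np.fft.rfftfreq(3600, d=1 / fs)[1:]      # positive DFT frequencies, DC dropped

# Power gain of the first-difference filter y[n] = x[n] - x[n-1]
gain = 4 * np.sin(np.pi * f / fs) ** 2

# Log-log slope of the gain over the lowest 5% of frequencies
low = slice(0, len(f) // 20)
gain_slope = np.polyfit(np.log10(f[low]), np.log10(gain[low]), 1)[0]
print("low-frequency log-log slope of the gain:", gain_slope)

# Spectral index left over after differencing a pure power law f**(-beta)
for beta in (1.5, 2.0, 2.5):
    print("beta =", beta, "-> residual index ~", round(beta - gain_slope, 2))
```

A residual index other than zero means the differenced spectrum still tilts, which is why pre-whitening leaves a trend whenever the true index is not 2.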
How to NOT detrend a PSD: Least-Squares Log-Linear Fit over Entire Spectrum
Pro: As with “pre-whitening,” the peaks and relative differences (spectral morphology) remain unchanged.
Pro: Unlike pre-whitening, this method at least is robust against calibration issues: the 3 spectra are identical. This is because no assumption is made about the logarithmic slope and offset: they are estimated, not prescribed.
Con: The unaware data analyst might assume the lower frequency bands have much greater power fluctuations than the higher frequency bands.
Con: Like pre-whitening, the spectra are typically not fully whitened/decorrelated using this method.
Where the Least-Squares Log-Linear Fit over the Entire Spectrum Goes Wrong!
Theoretically, this should work, but in practice a magnetometer has a “noise floor” -- NEXT SLIDE!
PSD Noise Floor
In most geomagnetic power spectra obtained via fluxgate magnetometers, one encounters a “noise floor” in the high-frequency range of the PSD. The noise floor is the high-frequency region of the spectrum dominated by white noise power rather than signal power. A noise floor is very easy to see in the log-log domain.
The noise floor limits which frequency bands are amenable to estimation of geophysical signal power. For example, a magnetometer that samples the geomagnetic field at 1 Hz has a Nyquist period of 2 s, which means that we should be able to resolve the spectral power of periodicities as short as ~2 seconds. However, in many magnetometer time series I’ve worked with, geomagnetic PSD estimates cannot be resolved until about 30-45 second periodicities (top half of the Pc3 band).
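The effect of a noise floor on a whole-spectrum fit can be sketched with an idealized, noise-free model spectrum. The spectral index of 2, the amplitude, and the floor level (placed so the floor takes over near 10-second periods) are all illustrative assumptions, not measured AGO values:

```python
import numpy as np

f = np.fft.rfftfreq(3600, d=1.0)[1:]         # assumed: 1-hour window of 1-Hz data
beta_true = 2.0                              # assumed spectral index of the background
signal = 1e-2 * f ** (-beta_true)            # idealized background power law
floor = 1.0                                  # assumed white-noise level
psd = signal + floor                         # what the instrument would report

def loglog_slope(freqs, power):
    """Least-squares slope of log10(power) against log10(frequency)."""
    return np.polyfit(np.log10(freqs), np.log10(power), 1)[0]

slope_all = loglog_slope(f, psd)             # fit over the entire spectrum
n_low = len(f) // 20                         # lowest 5% of frequencies
slope_low = loglog_slope(f[:n_low], psd[:n_low])

print("true slope:", -beta_true)
print("fit over entire spectrum:", slope_all)
print("fit over lowest 5%:", slope_low)
```

In this toy model the whole-spectrum fit comes out far shallower than the true slope because most DFT frequencies sit on the flat floor, while the low-frequency fit stays close to it.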
How to More Reliably Detrend the PSD
Least-Squares Log-Linear over the first 5% of the lower frequencies
-- this way has more pros than the last two methods
-- however, there exist yet more sophisticated ways that some argue are much better
Why just 5%?
(i) In many types of time series of measurements (not just magnetometers) there exists a point in the higher frequencies where the signal power is no longer stronger than the white noise power. As demonstrated, an undetected high-frequency noise floor will kill your fit if the fit is over the whole spectrum.
(ii) Even without a noise floor, any relative enhancement or depletion across a high-frequency band will severely bias the logarithmic slope and offset. (Bands are defined logarithmically, while the DFT frequencies are spaced linearly.)
EXAMPLE: In a 1-hour window of a 1-Hz time series, the Pc4-Pc6 bands constitute ~4.4% of all the frequencies, while the Pc3 band makes up ~15.6%. The rest is usually the noise floor (~80% of the DFT frequencies!). Should one fit over the Pc3-Pc6 bands? (HINT: No.)
Any power bump or lull across the Pc3 band will strongly dominate the fit. Even if one fits over only the lowest quarter of the Pc3 band, that is ~70 Pc3 frequencies, which is almost as many frequencies as there are in Pc4-Pc6.
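The percentages in this example can be reproduced by counting DFT bins. The sketch assumes the conventional band edges (Pc3: 10-45 s; Pc4-Pc6: 45 s and longer) and an arbitrary choice about which side of each edge is inclusive:

```python
import numpy as np

n = 3600                                     # 1-hour window of 1-Hz samples
k = np.arange(1, n // 2 + 1)                 # positive DFT bin indices (DC excluded)
period = n / k                               # period of each bin in seconds (dt = 1 s)

pc4_to_pc6 = period >= 45                    # assumed edge: periods of 45 s and longer
pc3 = (period >= 10) & (period < 45)         # assumed edges: 10 s up to 45 s
shorter = period < 10                        # Pc1-Pc2 range, usually the noise floor

for name, mask in [("Pc4-Pc6", pc4_to_pc6), ("Pc3", pc3), ("< 10 s", shorter)]:
    print(f"{name}: {mask.sum()} bins, {100 * mask.mean():.1f}%")
```

With these edges the count is 80 bins for Pc4-Pc6, 280 for Pc3, and 1440 for the shorter periods, out of 1800, which matches the ~4.4%, ~15.6%, and ~80% quoted above.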
For geomagnetic fluxgate data: Just fit over the Pc4-Pc6 bands, which constitute just under the first 5% of low frequencies in the spectrum. This is in line with what many statisticians recommend and is comparable to maximum-likelihood parameter estimation (which I have not tried). One may even include the low ~10% of the Pc3 band (~28 frequencies for a 1-hr window of a 1-Hz time series).
Least-Squares Log-Linear over the first 5% of the lower frequencies
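A minimal sketch of such a low-frequency fit, assuming a raw periodogram as the PSD estimate and synthetic random-walk data. The function name `detrend_psd` and its `low_fraction` parameter are hypothetical names chosen for illustration, not the author's code:

```python
import numpy as np

def detrend_psd(freqs, psd, low_fraction=0.05):
    """Fit log10(PSD) = offset + slope*log10(f) on the lowest frequencies, then divide the fit out."""
    n_fit = max(2, int(round(low_fraction * len(freqs))))
    slope, offset = np.polyfit(np.log10(freqs[:n_fit]), np.log10(psd[:n_fit]), 1)
    bpl = 10.0 ** (offset + slope * np.log10(freqs))   # fitted background power law
    return psd / bpl, slope, offset

def periodogram(x):
    """Raw one-sided periodogram with the DC bin dropped."""
    return np.abs(np.fft.rfft(x))[1:] ** 2

rng = np.random.default_rng(2)
data = np.cumsum(rng.normal(size=3600))      # synthetic stand-in for 1 hour of 1-Hz data
data -= data.mean()
freqs = np.fft.rfftfreq(data.size, d=1.0)[1:]

# Same series under three calibration factors, as in the scaled-variometer example
results = {c: detrend_psd(freqs, periodogram(c * data)) for c in (1.0, 0.333, 0.11)}
for c, (dpsd, slope, offset) in results.items():
    print(f"scale {c}: slope {slope:.3f}, offset {offset:.3f}")
```

Because the slope and offset are estimated rather than prescribed, the scale factor is absorbed entirely by the offset: the slope and the detrended spectrum come out the same for all three factors.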
Another bit about the Noise Floor
The noise floor can actually be used to recalibrate the data from a magnetometer. We showed that the mis-calibration results in a logarithmic offset, and nothing more. If one has data from the magnetometer during a time when it was known to be properly calibrated, then one can shift the spectra by the appropriate logarithmic offset during dates when the magnetometer was mis-calibrated. For the AGOs, the magnetometers were properly calibrated in the late 1990s. If absolute power data is desired, we can likely develop a scheme for adjusting data in later years. (The magnetometers are no longer accessible to calibrate -- they are deep down inside the ice.)
[Figure legend, spectra: black = PSD(data); orange = PSD(0.333*data); blue = PSD(0.11*data)]
[Figure legend, time series: black = data; orange = 0.333*data; blue = 0.11*data]
See Slide 5 again for context.
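One way such a recalibration scheme might look, sketched on an idealized model spectrum. It assumes the true noise-floor level is constant over time and that mis-calibration scales the whole spectrum, floor included; the helper `noise_floor_level`, the top-20% window, and all numerical values are hypothetical:

```python
import numpy as np

f = np.fft.rfftfreq(3600, d=1.0)[1:]
psd_ref = 1e-2 * f ** -2.0 + 1.0             # idealized spectrum from a well-calibrated epoch
true_scale = 0.333                           # calibration factor, unknown in practice
psd_now = true_scale ** 2 * psd_ref          # spectrum from a mis-calibrated epoch

def noise_floor_level(psd, top_fraction=0.2):
    """Median log10 power over the highest-frequency bins, taken as the noise floor."""
    n_top = int(round(top_fraction * len(psd)))
    return np.median(np.log10(psd[-n_top:]))

# Logarithmic offset between the reference floor and the current floor
log_shift = noise_floor_level(psd_ref) - noise_floor_level(psd_now)

psd_recalibrated = psd_now * 10.0 ** log_shift   # shift the spectrum back up
estimated_scale = 10.0 ** (-log_shift / 2)       # implied time-domain scale factor
print("estimated scale factor:", estimated_scale)
```

With real data the floor estimate would be noisy, so averaging over many windows from each epoch would probably be needed before trusting the shift.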
The Detrended PSD: not just good for imperfect data
The raw geomagnetic power spectra are strongly correlated over time; e.g., when a CME strikes the Earth, the geomagnetic noise power (i.e., the BPL) increases across the spectrum. Finding strong correlations between the power in separate bands during, say, solar wind activity is therefore fairly trivial.
The detrended spectrum, however, shows us information about strong, coherent waves (enhancements above the noise) and evidence of band-filtering (significant depletions below the noise spectrum). If these were real spectra, we would notice that the wave activity, although enhanced by the solar wind along with the BPL, is actually fairly independent of it.
Let’s assume our magnetometer is perfectly calibrated and pretend t1 is a spectrum computed on a geomagnetically quiet day, and that t2 is a spectrum computed during the passage of a coronal mass ejection [CME].
[Figure: Log(PSD) versus Log(Frequency) for two spectra, t1 and t2, each shown alongside its detrended version.]
Two of these spectra have about the same logarithmic slope (aka “spectral index”), but vastly different logarithmic offsets. Two of them have the same logarithmic offset, but vastly different spectral indices. However, if one properly detrends the spectrum, the detrended spectrum is the same for both. We just said this was a good thing! But it is not always a good thing.
Limitations of the Detrended PSD in isolation
However, we do not look at just the DPSD. When computing the linear fits, we need not throw out this additional information. Even with poorly calibrated data, the spectral index is left unharmed. The offset will differ, but only between magnetometers, for example. It is still very useful for a given magnetometer to gauge how the total power is varying over time.
DATA QUALITY RECAP
THE MOST IMPORTANT TAKEAWAY: Quality issues in the time domain do not necessarily map to the frequency domain, and those that do can be controlled and mitigated.
• Absolute field data and variometer-quality data essentially have the same power spectrum.
• Spectral morphology (shape and log slope) is “invariant under calibration error.” I.e., a poorly calibrated magnetometer still gives us a lot of relevant information.
• If necessary, one can re-adjust the poorly calibrated data to approximately absolute values if one knows what the noise floor is supposed to be.
• However, magnetometers of varying calibration quality can always be compared using detrended power spectra, which, when done properly, are “invariant under calibration error.”
• Although there are many ways to detrend spectra, not all detrending schemes are created equal: one should choose a scheme that will live up to the synonyms of detrended spectra: whitened, decorrelated.
• Detrended power spectra are great for telling you which bands hold coherent wave energy and which bands have been filtered (somehow). However, they hold no information concerning the background geomagnetic noise (logarithmic slope and offset / absolute values).
Conclusion
You can trust my spectral data and data products derived from the spectral data.