basic time series analyzing variable star data for the amateur astronomer
TRANSCRIPT
Basic Time Series
Analyzing variable star data
for the amateur astronomer
What Is Time Series?
Single variable x that changes over time t
  can be multiple variables, W W W A T
Light curve: x = brightness (magnitude)
Each observation consists of two numbers
Time t is considered perfectly precise
Data (observation/measurement/estimate) x is not perfectly precise
Two meanings of “Time Series”
TS is a process, how the variable changes over time
TS is an execution of the process, often called a realization of the TS process
A realization (observed TS) consists of pairs of numbers (tn, xn), one such pair for each observation
Goals of TS analysis
Use the process to define the behavior of its realizations
Use a realization, i.e. observed data (tn, xn), to discover the process
This is our main goal
Special Needs
Astronomical data create special circumstances for time series analysis
Mainly because the data are irregularly spaced in time – uneven sampling
Sometimes the time spacing (the “sampling”) is even pathological – with big gaps that have periods all their own
Analysis Step 1
Plot the data and look at the graph! (visual inspection)
Eye+brain combination is the world’s best pattern-recognition system
BUT – also the most easily fooled (“pictures in the clouds”)
Use visual inspection to get ideas
Confirm them with numerical analysis
Data = Signal + Noise
True brightness is a function of time f(t)
  it’s probably smooth (or nearly so)
There’s some measurement error ε
  it’s random
  it’s almost certainly not smooth
Additive model: data xn at time tn is sum of signal f(tn) and noise εn
xn = f(tn) + εn
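The additive model can be tried out on made-up data. The sketch below is only an illustration: the signal (mean magnitude 8.0 with a 30-day sinusoid of semi-amplitude 0.5), the 150-day span, the noise level and the random seed are all assumptions chosen for the example, not values from any real star.

```python
import math
import random

def simulate_light_curve(n=150, span_days=150.0, sigma=0.2, seed=1):
    """Return (t, x): unevenly sampled times and noisy magnitudes."""
    rng = random.Random(seed)
    # Uneven sampling: observation times drawn at random, then sorted.
    t = sorted(rng.uniform(0.0, span_days) for _ in range(n))
    # Hypothetical smooth signal f(t): an assumed 30-day sinusoid.
    f = [8.0 + 0.5 * math.sin(2.0 * math.pi * tn / 30.0) for tn in t]
    # i.i.d. Gaussian noise with mean 0 and standard deviation sigma.
    eps = [rng.gauss(0.0, sigma) for _ in t]
    # Additive model: x_n = f(t_n) + eps_n
    x = [fn + en for fn, en in zip(f, eps)]
    return t, x

t, x = simulate_light_curve()
```

Plotting x against t from a run like this is a handy way to practice visual inspection on data where the true signal is known.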
Noise is Random
That’s its definition!
Deterministic part = signal
Random part = noise
Usually – the true brightness is deterministic, therefore it’s the signal
Usually – the noise is measurement error
Achieve the Goal
Means we have to figure out how the signal behaves and how the noise behaves
For light curves, we usually just assume how the noise behaves
But we still should determine its parameters
What Determines Random?
Probability distribution (pdf or pmf)
pdf: probability that the value falls in a small range of width dε, centered on ε, is
  Probability = P(ε) dε
pmf: probability that the value is ε is P(ε)
pdf/pmf has some mean value μ
pdf/pmf has some standard deviation σ
Most Common Noise Model
i.i.d. = “independent identically distributed”
Each noise value is independent of others
  P12(x1, x2) = P1(x1) P2(x2)
They’re all identically distributed
  P1(x) = P2(x) for every value x
What is the Distribution?
Most common is Gaussian (a.k.a. Normal)
$P(\varepsilon) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(\varepsilon-\mu)^2 / (2\sigma^2)}$
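To make "Probability = P(ε) dε" concrete, the Gaussian density can be coded directly and summed over many small ranges. This is a rough sketch: the function names, the default σ = 0.2 and the number of steps are assumptions made for the example.

```python
import math

def gaussian_pdf(eps, mu=0.0, sigma=0.2):
    """Gaussian probability density P(eps) with mean mu, standard deviation sigma."""
    z = (eps - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def probability_between(a, b, mu=0.0, sigma=0.2, steps=2000):
    """Add up P(eps) * d_eps over small ranges (midpoint rule) from a to b."""
    d_eps = (b - a) / steps
    return sum(gaussian_pdf(a + (i + 0.5) * d_eps, mu, sigma) * d_eps
               for i in range(steps))

within_2_sigma = probability_between(-0.4, 0.4)  # range of plus or minus 2 sigma
total = probability_between(-2.0, 2.0)           # range of plus or minus 10 sigma
```

With these settings within_2_sigma should come out close to 0.95 and total close to 1, which is where the "about 2 standard errors" rule for 95% confidence comes from.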
Noise Parameters
μ = mean = <ε>
  Usually assumed zero (i.e., data unbiased)
σ² = variance = <(ε-μ)²>
σ = √(σ²) = standard deviation
  Typical value is 0.2 mag. for visual data
  Smaller for CCD/photoelectric (we hope!)
  Note: don’t disparage visual data; what they lack in individual precision they make up by the power of sheer numbers
Is the default noise model right?
No! We know it’s wrong
Bias: μ values not zero
NOT identically distributed – different observers have different μ, σ values
Sometimes not even independent (autocorrelated noise)
BUT – i.i.d. Gaussian is still a useful working hypothesis, so W W W A T
Even if …
Even if we know the form of the noise …
We still have to figure out its parameters
Is it unbiased (i.e. centered at zero so μ = 0)?
How big does it tend to be (what’s σ)?
And …
We still have to separate the signal from the noise
And of course figure out the form of the signal, i.e.,
Figure out the process which determines the signal
Whew!
Simplest Possible Signal
None at all!
f(t) = constant = β0
This is the null hypothesis for many tests
But we can’t be sure f(t) is constant …
… that’s only a model of the signal
Separate Signal from Noise
We already said data = signal + noise
Therefore data – signal = noise
Approximate signal by model
Approximate noise by residuals
data – model = residuals
xn – yn = Rn
If model is correct, residuals are all noise
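Here is a small sketch of "data – model = residuals" for the simplest model, a constant. The magnitudes are made up for illustration, and estimating the constant by the plain average of the data is an assumption of this example.

```python
def residuals(t, x, model):
    """R_n = x_n - y_n, where y_n = model(t_n)."""
    return [xn - model(tn) for tn, xn in zip(t, x)]

# Made-up observations, for illustration only.
t = [0.0, 1.3, 2.1, 4.7, 5.0]   # times (days)
x = [8.1, 7.9, 8.3, 8.0, 7.7]   # magnitudes

beta0 = sum(x) / len(x)          # constant model: y_n = beta0 for every t_n
R = residuals(t, x, lambda tn: beta0)
```

Any other model (a trend, a sinusoid) can be passed in place of the constant, since `residuals` only needs a function of time.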
Estimate Noise Parameters
Use residuals Rn to estimate noise parameters
Estimate mean μ by average
$\bar{R} = \frac{1}{N} \sum_{j=1}^{N} R_j$
Estimate standard deviation σ by sample standard deviation
$s = \sqrt{\frac{1}{N-1} \sum_{j=1}^{N} (R_j - \bar{R})^2}$
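The two estimates, the average of the residuals and their sample standard deviation (with N - 1 in the denominator), take only a few lines. The residual values below are made up for the example.

```python
import math

def mean_and_sample_std(R):
    """Return (R_bar, s): average and sample standard deviation of residuals."""
    N = len(R)
    R_bar = sum(R) / N
    # Sample standard deviation: divide the sum of squares by N - 1, not N.
    s = math.sqrt(sum((Rj - R_bar) ** 2 for Rj in R) / (N - 1))
    return R_bar, s

R = [0.1, -0.1, 0.3, 0.0, -0.3]   # made-up residuals (magnitudes)
R_bar, s = mean_and_sample_std(R)
```

Python's standard library gives the same numbers through `statistics.fmean` and `statistics.stdev`.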
Averages
When we average i.i.d. noise we expect to get the mean
Standard deviation of the average (usually called the standard error) is less than standard deviation of the data
$\mathrm{s.e.} = \sigma(\mathrm{ave}) = \frac{\sigma(\mathrm{raw})}{\sqrt{N}}$
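The standard error rule can be checked by brute force: average N noise values, repeat many times, and see how much the averages scatter. This is a sketch on simulated i.i.d. Gaussian noise; σ(raw) = 0.2, N = 25, the number of trials and the seed are assumptions for the example.

```python
import math
import random
import statistics

def standard_error(sigma_raw, N):
    """Standard deviation of an average of N i.i.d. values."""
    return sigma_raw / math.sqrt(N)

rng = random.Random(42)
sigma_raw, N, trials = 0.2, 25, 4000

# Each trial: the average of N simulated noise values.
averages = [statistics.fmean([rng.gauss(0.0, sigma_raw) for _ in range(N)])
            for _ in range(trials)]

observed = statistics.stdev(averages)       # scatter of the averages
predicted = standard_error(sigma_raw, N)    # 0.2 / sqrt(25)
```

The observed scatter of the averages should agree with the predicted value to within a few percent.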
Confidence Interval
95% confidence interval is the range in which we expect the average to lie, 95% of the time
About 2 standard errors above or below the expected value
$95\%\ \mathrm{C.I.} = \langle x \rangle \pm 2\,\sigma(\mathrm{ave}) = \langle x \rangle \pm 2\,\sigma(\mathrm{raw})/\sqrt{N}$
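A rough 95% confidence interval for an average follows directly from the "about 2 standard errors" rule. In this sketch σ(raw) is estimated by the sample standard deviation of the data, which is an assumption of the example; the magnitudes are made up, and for very small N the factor 2 is only approximate.

```python
import math
import statistics

def confidence_interval_95(x):
    """Approximate 95% C.I. for the average: average +/- 2 standard errors."""
    N = len(x)
    average = statistics.fmean(x)
    std_error = statistics.stdev(x) / math.sqrt(N)   # sigma(raw) / sqrt(N)
    return average - 2.0 * std_error, average + 2.0 * std_error

x = [8.1, 7.9, 8.3, 8.0, 7.7]   # made-up magnitudes
low, high = confidence_interval_95(x)
```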
Does average change?
Divide time into bins
  Usually of equal time width (often 10 days)
  Sometimes of equal number of data N
Compute average and standard deviation within each bin
IF signal is constant AND noise is consistent, THEN expected value of data average will be constant
So: do the “bin averages” show more variation than is expected from noise?
ANOVA test
Compare variance of averages to variance of data (ANalysis Of VAriance = ANOVA)
In other words… compare variance between bins to variance within bins
“F-test” gives a “p-value,” the probability of getting a result at least that extreme IF the data are just noise
Low p-value ⇒ probably NOT just noise
  Either we haven’t found all the signal
  Or the noise isn’t the simple kind
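The binning and the F statistic can be sketched with the standard library alone. The helper names and the tiny data set are made up for illustration. Turning F into a p-value needs the F distribution, which this sketch leaves out; if SciPy is installed, `scipy.stats.f.sf(F, df_between, df_within)` is one way to get it.

```python
import statistics

def bin_by_time(t, x, width):
    """Group data values into bins of equal time width."""
    bins = {}
    for tn, xn in zip(t, x):
        bins.setdefault(int(tn // width), []).append(xn)
    return [bins[k] for k in sorted(bins)]

def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.fmean(g) for g in groups]
    # Variation of the bin averages around the grand average.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the data around their own bin average.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    F = (ss_between / df_between) / (ss_within / df_within)
    return F, df_between, df_within

# Made-up data: nine observations falling into three 10-day bins.
t = [1.0, 4.0, 8.0, 12.0, 15.0, 19.0, 21.0, 26.0, 28.0]
x = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 3.0, 4.0, 5.0]
groups = bin_by_time(t, x, width=10.0)
F, df_between, df_within = anova_f(groups)
```

Note that `anova_f` assumes at least two bins and more data points than bins, otherwise one of the degrees of freedom is zero.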
ANOVA test
50-day averages:
Fstat     df.between  df.within  p
0.315563  2           147        0.729871
NOT significant
10-day averages:
Fstat     df.between  df.within  p
0.728138  14          135        0.743133
NOT significant
ANOVA test
50-day averages:
Fstat     df.between  df.within  p
13.25758  2           147        5e-06
IS significant
10-day averages:
Fstat     df.between  df.within  p
2.546476  14          135        0.002879
IS significant
Averages Rule!
Excellent way to reduce the noise
  because σ(ave) = σ(raw) / √N
Excellent way to measure the noise
Very little change to signal
  unless signal changes faster than averaging time
So in most cases averages smooth the data, i.e., reduce noise but not signal
Decompose the Signal
Additive model: sum of component signals
Non-periodic part
  sometimes called trend
  sometimes called secular variation
Repeating (periodic) part
  or almost-periodic (pseudoperiodic) part
  can be multiple periodic parts (multiperiodic)
f(t) = S(t) + P(t)
Periodic Signal
Discover that it’s periodic!
Find the period P
  Or frequency ν
  Pν = 1    ν = 1 / P    P = 1 / ν
Find amplitude A = size of variation
  Often use A to denote the semi-amplitude, which is half the full amplitude
Find waveform (i.e., cycle shape)
Periodogram
Searches for periodic behavior
Test many frequencies (i.e., many periods)
For each frequency, compute a power
Higher power ⇒ more likely it’s periodic with that frequency (that period)
Plot of power vs frequency is a periodogram, a.k.a. power spectrum
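The "test many frequencies, compute a power for each" recipe can be sketched as follows. The power used here is a plain Fourier-style sum evaluated at the actual observation times. It is an illustration of the idea only, not the Lomb-Scargle or DCDFT method, and the simulated data (30-day period, semi-amplitude 0.5, noise 0.2 mag, 200 random times over 300 days) are assumptions of the example.

```python
import math
import random

def fourier_power(t, x, freq):
    """Plain Fourier-style power at one trial frequency (cycles per day)."""
    N = len(x)
    mean = sum(x) / N
    c = sum((xn - mean) * math.cos(2.0 * math.pi * freq * tn)
            for tn, xn in zip(t, x))
    s = sum((xn - mean) * math.sin(2.0 * math.pi * freq * tn)
            for tn, xn in zip(t, x))
    return (c * c + s * s) / N

def periodogram(t, x, freqs):
    """Power at each trial frequency."""
    return [fourier_power(t, x, nu) for nu in freqs]

# Simulated, unevenly sampled data with an assumed 30-day period.
rng = random.Random(7)
t = sorted(rng.uniform(0.0, 300.0) for _ in range(200))
x = [8.0 + 0.5 * math.sin(2.0 * math.pi * tn / 30.0) + rng.gauss(0.0, 0.2)
     for tn in t]

freqs = [0.001 * k for k in range(1, 201)]   # 0.001 to 0.2 cycles/day
power = periodogram(t, x, freqs)
best_freq = freqs[power.index(max(power))]
best_period = 1.0 / best_freq                # P = 1 / nu
```

For this simulated star the highest peak should fall near ν = 1/30 cycles per day. Plotting power against freqs gives the periodogram itself.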
Periodograms
Fourier analysis ⇒ Fourier periodogram
  Don’t use DFT or FFT because of uneven time sampling
  Use Lomb-Scargle modified periodogram OR DCDFT (date-compensated discrete Fourier transform)
Folded light curve ⇒ AoV periodogram
Many more … these are the most common
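Folding a light curve means replacing each observation time by its phase within the cycle for a trial period. A minimal sketch, assuming an arbitrary epoch t0 = 0 and made-up observation times:

```python
def fold(t, period, t0=0.0):
    """Phase in [0, 1) of each time for a trial period (t0 is an assumed epoch)."""
    return [((tn - t0) / period) % 1.0 for tn in t]

t = [0.0, 15.0, 30.0, 45.0, 61.5]   # made-up observation times (days)
phases = fold(t, period=30.0)
```

If the trial period is right, plotting brightness against phase stacks all the cycles on top of one another. Binning the data by phase and asking whether the bin averages differ by more than the noise allows is the idea behind using ANOVA on a folded light curve.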
DCDFT periodogram
AoV periodogram
Lots lots more …
Non-periodic signals
Periodic but not perfectly periodic (parameters are changing)
What if the noise is something “different”?
Come to the next workshop!
Enjoy observing variables
See your own data used in real scientific study (AJ, ApJ, MNRAS, A&A, PASP, …)
Participate in monitoring and observing programs
Assist in space science and astronomy
Make your own discoveries!
http://www.aavso.org/