basic time series analyzing variable star data for the amateur astronomer

Basic Time Series

Analyzing variable star data

for the amateur astronomer

What Is Time Series?

Single variable x that changes over time t
    can be multiple variables, W W W A T
Light curve: x = brightness (magnitude)
Each observation consists of two numbers
Time t is considered perfectly precise
Data (observation/measurement/estimate) x is not perfectly precise

Two meanings of “Time Series”

TS is a process, how the variable changes over time
TS is an execution of the process, often called a realization of the TS process
A realization (observed TS) consists of pairs of numbers (t_n, x_n), one such pair for each observation

Goals of TS analysis

Use the process to define the behavior of its realizations
Use a realization, i.e. observed data (t_n, x_n), to discover the process
    This is our main goal

Special Needs

Astronomical data create special circumstances for time series analysis
Mainly because the data are irregularly spaced in time – uneven sampling
Sometimes the time spacing (the “sampling”) is even pathological – with big gaps that have periods all their own

Analysis Step 1

Plot the data and look at the graph! (visual inspection)
Eye+brain combination is the world’s best pattern-recognition system
BUT – also the most easily fooled (“pictures in the clouds”)
Use visual inspection to get ideas
Confirm them with numerical analysis

Data = Signal + Noise

True brightness is a function of time f(t)
    it’s probably smooth (or nearly so)
There’s some measurement error ε
    it’s random
    it’s almost certainly not smooth
Additive model: data x_n at time t_n is sum of signal f(t_n) and noise ε_n

x_n = f(t_n) + ε_n
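The additive model can be tried out on made-up data. The sketch below is only an illustration: the sinusoidal `signal`, its mean magnitude, semi-amplitude and period, and the 0.2 mag noise level are all assumed values, not taken from any real star.

```python
import math
import random

def signal(t, mean_mag=8.0, semi_amp=0.5, period=100.0):
    """Hypothetical smooth signal f(t): a sinusoid, in magnitudes."""
    return mean_mag + semi_amp * math.sin(2.0 * math.pi * t / period)

def simulate(n_obs=200, span_days=500.0, sigma=0.2, seed=1):
    """Return unevenly sampled times t_n and data x_n = f(t_n) + eps_n."""
    rng = random.Random(seed)
    # Random observation times stand in for irregular sampling.
    times = sorted(rng.uniform(0.0, span_days) for _ in range(n_obs))
    # Each noise value eps_n is an independent Gaussian draw.
    data = [signal(t) + rng.gauss(0.0, sigma) for t in times]
    return times, data

times, data = simulate()
```

Plotting `data` against `times` gives a noisy light curve; subtracting `signal(t)` from each point leaves only the noise.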

Noise is Random

That’s its definition!
Deterministic part = signal
Random part = noise
Usually – the true brightness is deterministic, therefore it’s the signal
Usually – the noise is measurement error

Achieve the Goal

Means we have to figure out how the signal behaves and how the noise behaves
For light curves, we usually just assume how the noise behaves
But we still should determine its parameters

What Determines Random?

Probability distribution (pdf or pmf)
pdf: probability that the value falls in a small range of width dε, centered on ε, is
    Probability = P(ε) dε
pmf: probability that the value is ε is P(ε)
pdf/pmf has some mean value μ
pdf/pmf has some standard deviation σ

Most Common Noise Model

i.i.d. = “independent identically distributed”
Each noise value is independent of others
    P_12(x_1, x_2) = P_1(x_1) P_2(x_2)
They’re all identically distributed
    P_1(x) = P_2(x) for every value x

What is the Distribution?

Most common is Gaussian (a.k.a. Normal)

$$P(\varepsilon) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(\varepsilon-\mu)^2 / (2\sigma^2)}$$
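As a quick check on the Gaussian form, the sketch below evaluates the density directly and compares it with Python's standard-library `statistics.NormalDist`. The function name `gaussian_pdf` and the value σ = 0.2 mag are chosen here only for illustration.

```python
import math
from statistics import NormalDist

def gaussian_pdf(eps, mu=0.0, sigma=1.0):
    """Gaussian probability density P(eps) with mean mu and standard deviation sigma."""
    return math.exp(-((eps - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

sigma = 0.2                      # assumed noise level, in magnitudes
noise = NormalDist(mu=0.0, sigma=sigma)

# Fraction of noise values expected within 2 sigma of the mean (a bit over 95%).
frac_within_2sigma = noise.cdf(2.0 * sigma) - noise.cdf(-2.0 * sigma)
```

That fraction is why "about 2 standard deviations" keeps turning up as the rule of thumb for a 95% range.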

Noise Parameters

μ = mean = <ε>
    Usually assumed zero (i.e., data unbiased)
σ² = variance = <(ε-μ)²>
σ = √(σ²) = standard deviation
    Typical value is 0.2 mag. for visual data
    Smaller for CCD/photoelectric (we hope!)
Note: don’t disparage visual data; what they lack in individual precision they make up by the power of sheer numbers

Is the default noise model right?

No! We know it’s wrong
Bias: μ values not zero
NOT identically distributed – different observers have different μ, σ values
Sometimes not even independent (autocorrelated noise)
BUT – i.i.d. Gaussian is still a useful working hypothesis, so W W W A T

Even if …

Even if we know the form of the noise …
We still have to figure out its parameters
    Is it unbiased (i.e. centered at zero so μ = 0)?
    How big does it tend to be (what’s σ)?

And …

We still have to separate the signal from the noise
And of course figure out the form of the signal, i.e.,
figure out the process which determines the signal
Whew!

Simplest Possible Signal

None at all!
f(t) = constant = β_o
This is the null hypothesis for many tests
But we can’t be sure f(t) is constant …
… that’s only a model of the signal

Separate Signal from Noise

We already said data = signal + noise
Therefore data – signal = noise
Approximate signal by model
Approximate noise by residuals
data – model = residuals
x_n – y_n = R_n
If model is correct, residuals are all noise

Estimate Noise Parameters

Use residuals R_n to estimate noise parameters
Estimate mean μ by average

$$\bar{R} = \frac{1}{N} \sum_{j=1}^{N} R_j$$

Estimate standard deviation σ by sample standard deviation

$$s = \sqrt{\frac{1}{N-1} \sum_{j=1}^{N} \left(R_j - \bar{R}\right)^2}$$
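A minimal sketch of these two estimates, using the constant model (the average of the data) as the assumed signal. The five magnitudes are invented for the example, and `noise_estimates` is a name chosen here.

```python
from statistics import mean, stdev

def noise_estimates(data, model):
    """Return (average, sample standard deviation) of the residuals R_n = x_n - y_n."""
    residuals = [x - y for x, y in zip(data, model)]
    # statistics.stdev divides by N - 1, matching the sample standard deviation.
    return mean(residuals), stdev(residuals)

data = [8.1, 7.9, 8.3, 8.0, 7.7]        # made-up magnitudes
model = [mean(data)] * len(data)         # constant model: f(t) = average of the data
r_bar, s = noise_estimates(data, model)
```

With a constant model fitted by the average, the residual mean is zero by construction, so only `s` carries information here. With an independently chosen model, a nonzero `r_bar` would point to bias.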

Averages

When we average i.i.d. noise we expect to get the mean
Standard deviation of the average (usually called the standard error) is less than standard deviation of the data

$$\mathrm{s.e.} = \sigma(\mathrm{ave}) = \frac{\sigma(\mathrm{raw})}{\sqrt{N}}$$

Confidence Interval

95% confidence interval is the range in which we expect the average to lie, 95% of the time
About 2 standard errors above or below the expected value

$$95\%\ \mathrm{C.I.} = \bar{x} \pm 2\,\sigma(\mathrm{ave}) = \bar{x} \pm \frac{2\,\sigma(\mathrm{raw})}{\sqrt{N}}$$
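A small sketch of the "2 standard errors" rule, with σ(raw) estimated by the sample standard deviation. The function name and the five magnitudes are made up for illustration. For very small N the factor 2 is only rough; a Student's t factor would be larger.

```python
import math
from statistics import mean, stdev

def confidence_interval_95(data):
    """Approximate 95% confidence interval: average +/- 2 standard errors."""
    average = mean(data)
    std_error = stdev(data) / math.sqrt(len(data))
    return average - 2.0 * std_error, average + 2.0 * std_error

low, high = confidence_interval_95([8.1, 7.9, 8.3, 8.0, 7.7])
```

For these numbers the average is 8.0 and the standard error is 0.1 mag, so the interval runs from about 7.8 to 8.2.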

Does average change?

Divide time into bins
    Usually of equal time width (often 10 days)
    Sometimes of equal number of data N
Compute average and standard deviation within each bin
IF signal is constant AND noise is consistent, THEN expected value of data average will be constant
So: do the “bin averages” show more variation than is expected from noise?
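Binning by equal time width might look like the sketch below. The function name `bin_averages`, the choice to start the first bin at the earliest observation, and the sample data are assumptions made for the example.

```python
from statistics import mean, stdev

def bin_averages(times, data, width=10.0):
    """Group observations into equal-width time bins.

    Returns one (bin center, count, average, standard deviation) row
    per non-empty bin, in time order.
    """
    t0 = min(times)
    bins = {}
    for t, x in zip(times, data):
        bins.setdefault(int((t - t0) // width), []).append(x)
    rows = []
    for index in sorted(bins):
        values = bins[index]
        # A single point has no sample standard deviation.
        spread = stdev(values) if len(values) > 1 else float("nan")
        rows.append((t0 + (index + 0.5) * width, len(values), mean(values), spread))
    return rows

rows = bin_averages([0.0, 3.0, 9.5, 12.0, 18.0, 31.0],
                    [8.0, 8.2, 8.1, 7.9, 8.1, 8.4])
```

Empty bins are simply skipped, which matters for light curves with large gaps.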

ANOVA test

Compare variance of averages to variance of data (ANalysis Of VAriance = ANOVA)
In other words… compare variance between bins to variance within bins
“F-test” gives a “p-value,” probability of getting a result at least that extreme IF the data are just noise
Low p-value → probably NOT just noise
    Either we haven’t found all the signal
    Or the noise isn’t the simple kind
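The F statistic itself is simple to compute from the bins. The sketch below is the usual one-way ANOVA calculation, with a function name chosen here and invented magnitudes. It stops at the F statistic and the degrees of freedom; turning them into a p-value needs the F distribution, which is not in the Python standard library (a statistics package would supply it).

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA for a list of bins, each a list of values.

    Returns (F statistic, df between, df within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Variation of the bin averages around the grand average.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the data around their own bin average.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = anova_f([[8.1, 8.2, 8.3],
                                         [8.2, 8.3, 8.4],
                                         [8.5, 8.6, 8.7]])
```

An F statistic near 1 means the bin averages vary about as much as the noise predicts; a much larger value means they vary more.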


Averages  Fstat     df.between  df.within  p         Result
50-day    0.315563  2           147        0.729871  NOT significant
10-day    0.728138  14          135        0.743133  NOT significant


Averages  Fstat     df.between  df.within  p         Result
50-day    13.25758  2           147        5e-06     IS significant
10-day    2.546476  14          135        0.002879  IS significant

Averages Rule!

Excellent way to reduce the noise
    because σ(ave) = σ(raw) / √N
Excellent way to measure the noise
Very little change to signal
    unless signal changes faster than averaging time
So in most cases averages smooth the data, i.e., reduce noise but not signal

Decompose the Signal

Additive model: sum of component signals
Non-periodic part
    sometimes called trend
    sometimes called secular variation
Repeating (periodic) part
    or almost-periodic (pseudoperiodic) part
    can be multiple periodic parts (multiperiodic)

f(t) = S(t) + P(t)
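One simple way to peel off the non-periodic part is to assume, purely for illustration, that S(t) is a straight line and fit it by ordinary least squares. The function names below are chosen here, and a real trend may well need a more flexible model.

```python
from statistics import mean

def fit_linear_trend(times, data):
    """Least-squares straight line S(t) = intercept + slope * t."""
    t_bar, x_bar = mean(times), mean(data)
    slope = (sum((t - t_bar) * (x - x_bar) for t, x in zip(times, data))
             / sum((t - t_bar) ** 2 for t in times))
    intercept = x_bar - slope * t_bar
    return slope, intercept

def detrend(times, data):
    """Subtract the fitted line, leaving periodic part plus noise (if the line is adequate)."""
    slope, intercept = fit_linear_trend(times, data)
    return [x - (intercept + slope * t) for t, x in zip(times, data)]

times = [0.0, 10.0, 25.0, 40.0]
data = [8.0 + 0.01 * t for t in times]    # made-up data: pure linear trend
slope, intercept = fit_linear_trend(times, data)
```

The detrended values can then be searched for periodic behavior without the trend getting in the way.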

Periodic Signal

Discover that it’s periodic!
Find the period P
    Or frequency ν
    Pν = 1    ν = 1 / P    P = 1 / ν
Find amplitude A = size of variation
    Often use A to denote the semi-amplitude, which is half the full amplitude
Find waveform (i.e., cycle shape)
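Once a trial period is in hand, the cycle shape can be inspected by folding: each time is converted to a phase, the fraction of a cycle elapsed since some reference epoch. This is a minimal sketch; the function name `fold` and the default epoch of 0 are assumptions for the example.

```python
def fold(times, data, period, epoch=0.0):
    """Return (phase, value) pairs sorted by phase, with phase between 0 and 1."""
    folded = [(((t - epoch) / period) % 1.0, x) for t, x in zip(times, data)]
    return sorted(folded)

# Made-up observations folded on an assumed 10-day period.
folded = fold([0.0, 2.5, 5.0, 12.5], [8.0, 8.4, 8.1, 8.5], period=10.0)
```

If the period is right, points from different cycles line up along one curve when value is plotted against phase; if it is wrong, they scatter.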

Periodogram

Searches for periodic behavior
Test many frequencies (i.e., many periods)
For each frequency, compute a power
Higher power → more likely it’s periodic with that frequency (that period)
Plot of power vs frequency is a periodogram, a.k.a. power spectrum

Periodograms

Fourier analysis → Fourier periodogram
    Don’t use DFT or FFT because of uneven time sampling
    Use Lomb-Scargle modified periodogram OR
    DCDFT (date-compensated discrete Fourier transform)
Folded light curve → AoV periodogram
Many more … these are the most common
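For a feel of how a periodogram for unevenly sampled data works, here is a sketch of the Lomb-Scargle power in its commonly quoted normalized form (a time offset τ per frequency, power divided by twice the sample variance). It is not the DCDFT, and the test data (100-day period, 0.5 mag semi-amplitude, 0.2 mag noise), the frequency grid and all names are assumptions for the example.

```python
import math
import random
from statistics import mean, variance

def lomb_scargle_power(times, data, frequency):
    """Normalized Lomb-Scargle power at one frequency (cycles per unit time, > 0)."""
    omega = 2.0 * math.pi * frequency
    x_bar = mean(data)
    # The offset tau makes the sine and cosine terms orthogonal on these sample times.
    tau = math.atan2(sum(math.sin(2.0 * omega * t) for t in times),
                     sum(math.cos(2.0 * omega * t) for t in times)) / (2.0 * omega)
    cos_sum = sin_sum = cos_sq = sin_sq = 0.0
    for t, x in zip(times, data):
        c = math.cos(omega * (t - tau))
        s = math.sin(omega * (t - tau))
        cos_sum += (x - x_bar) * c
        sin_sum += (x - x_bar) * s
        cos_sq += c * c
        sin_sq += s * s
    return (cos_sum ** 2 / cos_sq + sin_sum ** 2 / sin_sq) / (2.0 * variance(data))

# Made-up test data: uneven sampling of a 100-day sinusoid plus Gaussian noise.
rng = random.Random(3)
times = sorted(rng.uniform(0.0, 1000.0) for _ in range(150))
data = [8.0 + 0.5 * math.sin(2.0 * math.pi * t / 100.0) + rng.gauss(0.0, 0.2)
        for t in times]

frequencies = [0.0005 * k for k in range(1, 101)]      # 0.0005 to 0.05 cycles/day
powers = [lomb_scargle_power(times, data, f) for f in frequencies]
best_frequency = frequencies[powers.index(max(powers))]
```

The highest peak should land near 0.01 cycles/day, i.e. a period near 100 days. For real work, an established package is a safer choice than a hand-rolled loop like this.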

DCDFT periodogram

AoV periodogram

Lots lots more …

Non-periodic signals
Periodic but not perfectly periodic (parameters are changing)
What if the noise is something “different”?
Come to the next workshop!

Enjoy observing variables
See your own data used in real scientific study (AJ, ApJ, MNRAS, A&A, PASP, …)
Participate in monitoring and observing programs
Assist in space science and astronomy
Make your own discoveries!

http://www.aavso.org/
