stat481/581: introductiontotime seriesanalysis
TRANSCRIPT
![Page 1: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/1.jpg)
STAT481/581:Introduction to TimeSeries Analysis
Dynamic regression models
OTexts.org/fpp3/
![Page 2: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/2.jpg)
Outline
1 Regression with ARIMA errors
2 Stochastic and deterministic trends
3 Dynamic harmonic regression
4 Lagged predictors
2
![Page 3: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/3.jpg)
Outline
1 Regression with ARIMA errors
2 Stochastic and deterministic trends
3 Dynamic harmonic regression
4 Lagged predictors
3
![Page 4: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/4.jpg)
Regression with ARIMA errors
Regression modelsyt = β0 + β1x1,t + · · ·+ βkxk,t + εt ,
yt modeled as function of k explanatory variablesx1,t , . . . , xk,t .In regression, we assume that εt was WN.Now we want to allow εt to be autocorrelated.
Example: ARIMA(1,1,1) errorsyt = β0 + β1x1,t + · · ·+ βkxk,t + ηt ,
(1− φ1B)(1− B)ηt = (1 + θ1B)εt ,
where εt is white noise. 4
![Page 5: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/5.jpg)
Residuals and errors
Example: ηt = ARIMA(1,1,1)
yt = β0 + β1x1,t + · · ·+ βkxk,t + ηt ,
(1− φ1B)(1− B)ηt = (1 + θ1B)εt ,
Be careful in distinguishing ηt from εt .Only the errors εt are assumed to be white noise.In ordinary regression, ηt is assumed to be whitenoise and so ηt = εt .
5
![Page 6: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/6.jpg)
Estimation
If we minimize ∑η2
t (by using ordinary regression):
1 Estimated coefficients β̂0, . . . , β̂k are no longeroptimal as some information ignored;
2 Statistical tests associated with the model (e.g.,t-tests on the coefficients) are incorrect.
3 p-values for coefficients usually too small(“spurious regression”).
4 AIC of fitted models misleading.
Minimizing ∑ε2
t avoids these problems.Maximizing likelihood similar to minimizing ∑
ε2t .
6
![Page 7: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/7.jpg)
Stationarity
Regression with ARMA errors
yt = β0 + β1x1,t + · · ·+ βkxk,t + ηt ,
where ηt is an ARMA process.
All variables in the model must be stationary.If we estimate the model while any of these arenon-stationary, the estimated coefficients can beincorrect.Difference variables until all stationary.If necessary, apply same differencing to allvariables.
7
![Page 8: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/8.jpg)
Stationarity
Model with ARIMA(1,1,1) errors
yt = β0 + β1x1,t + · · ·+ βkxk,t + ηt ,
(1− φ1B)(1− B)ηt = (1 + θ1B)εt ,
Equivalent to model with ARIMA(1,0,1) errors
y ′t = β1x ′1,t + · · ·+ βkx ′k,t + η′t ,
(1− φ1B)η′t = (1 + θ1B)εt ,
where y ′t = yt − yt−1, x ′t,i = xt,i − xt−1,i andη′t = ηt − ηt−1.
8
![Page 9: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/9.jpg)
Regression with ARIMA errors
Any regression with an ARIMA error can be rewritten as aregression with an ARMA error by differencing allvariables with the same differencing operator as in theARIMA model.Original data
yt = β0 + β1x1,t + · · ·+ βkxk,t + ηt
where φ(B)(1− B)dηt = θ(B)εt
After differencing all variablesy ′
t = β1x ′1,t + · · ·+ βkx ′
k,t + η′t .
where φ(B)ηt = θ(B)εt
and y ′t = (1− B)dyt 9
![Page 10: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/10.jpg)
Model selection
Fit regression model with automatically selectedARIMA errors. (R will take care of differencingbefore estimation.)Check that εt series looks like white noise.
Selecting predictorsAICc can be calculated for final model.Repeat procedure for all subsets of predictors to beconsidered, and select model with lowest AICc value.
10
![Page 11: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/11.jpg)
US personal consumption and income
Consum
ptionIncom
e
1970 1980 1990 2000 2010
−2
−1
0
1
2
−2.5
0.0
2.5
Year
Quarterly changes in US consumption and personal income
11
![Page 12: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/12.jpg)
US personal consumption and income
−2
−1
0
1
2
−2.5 0.0 2.5Income
Con
sum
ptio
n
Quarterly changes in US consumption and personal income
12
![Page 13: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/13.jpg)
US personal consumption and income
No need for transformations or furtherdifferencing.Increase in income does not necessarily translateinto instant increase in consumption (e.g., afterthe loss of a job, it may take a few months forexpenses to be reduced to allow for the newcircumstances). We will ignore this for now.
13
![Page 14: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/14.jpg)
US personal consumption and income
fit <- us_change %>% model(ARIMA(Consumption ~ Income))report(fit)
## Series: Consumption## Model: LM w/ ARIMA(1,0,2) errors#### Coefficients:## ar1 ma1 ma2 Income intercept## 0.6922 -0.5758 0.1984 0.2028 0.5990## s.e. 0.1159 0.1301 0.0756 0.0461 0.0884#### sigma^2 estimated as 0.3219: log likelihood=-156.9## AIC=325.9 AICc=326.4 BIC=345.3
Write down the equations for the fitted model.14
![Page 15: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/15.jpg)
US personal consumption and income
residuals(fit, type='regression') %>%
gg_tsdisplay(.resid) + ggtitle("Regression errors")
−2
−1
0
1
1970 1980 1990 2000 2010Time
.res
id
Regression errors
−0.1
0.0
0.1
0.2
0.3
2 4 6 8 10 12 14 16 18 20 22lag [1Q]
acf
−2
−1
0
1
Jan Apr Jul OctTime
.res
id
1979
1989
1999
2009
15
![Page 16: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/16.jpg)
US personal consumption and income
residuals(fit, type='response') %>%
gg_tsdisplay(.resid) + ggtitle("ARIMA errors")
−2
−1
0
1
1970 1980 1990 2000 2010Time
.res
id
ARIMA errors
−0.1
0.0
0.1
2 4 6 8 10 12 14 16 18 20 22lag [1Q]
acf
−2
−1
0
1
Jan Apr Jul OctTime
.res
id
1979
1989
1999
2009
16
![Page 17: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/17.jpg)
US personal consumption and income
augment(fit) %>%
features(.resid, ljung_box, dof = 5, lag = 12)
## # A tibble: 1 x 3## .model lb_stat lb_pvalue## <chr> <dbl> <dbl>## 1 ARIMA(Consumption ~ Income) 6.35 0.500
17
![Page 18: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/18.jpg)
US personal consumption and income
us_change_future <- new_data(us_change, 8) %>%
mutate(Income = mean(us_change$Income))forecast(fit, new_data = us_change_future) %>%
autoplot(us_change) +
labs(x = "Year", y = "Percentage change",title = "Forecasts from regression with ARIMA(1,0,2) errors")
−2
−1
0
1
2
1980 2000 2020Year
Per
cent
age
chan
ge
.level
80
95
Forecasts from regression with ARIMA(1,0,2) errors
18
![Page 19: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/19.jpg)
Forecasting
To forecast a regression model with ARIMAerrors, we need to forecast the regression part ofthe model and the ARIMA part of the modeland combine the results.Some predictors are known into the future (e.g.,time, dummies).Separate forecasting models may be needed forother predictors.Forecast intervals ignore the uncertainty inforecasting the predictors.
19
![Page 20: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/20.jpg)
Daily electricity demand
Model daily electricity demand as a function of temperatureusing quadratic regression with ARMA errors.vic_elec_daily %>%
ggplot(aes(x=Temperature, y=Demand, colour=Day_Type)) +geom_point() +labs(x = "Maximum temperature", y = "Electricity demand (GW)")
200
250
300
350
10 20 30 40Maximum temperature
Ele
ctric
ity d
eman
d (G
W)
Day_Type
Holiday
Weekday
Weekend
20
![Page 21: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/21.jpg)
Daily electricity demand
vic_elec_daily %>%gather("var", "value", Demand, Temperature) %>%ggplot(aes(x = Date, y = value)) + geom_line() +facet_grid(vars(var), scales = "free_y")
Dem
andTem
perature
Jan 2014 Apr 2014 Jul 2014 Oct 2014 Jan 2015
200
250
300
350
10
20
30
40
Date
valu
e
21
![Page 22: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/22.jpg)
Daily electricity demand
fit <- vic_elec_daily %>%model(ARIMA(Demand ~ Temperature + I(Temperature^2) +
(Day_Type=="Weekday")))report(fit)
## Series: Demand## Model: LM w/ ARIMA(2,1,2)(0,0,2)[7] errors#### Coefficients:## ar1 ar2 ma1 ma2 sma1 sma2## 1.1521 -0.2750 -1.3851 0.4071 0.1589 0.3103## s.e. 0.6265 0.4812 0.6082 0.5804 0.0591 0.0538## Temperature I(Temperature^2)## -7.947 0.1865## s.e. 0.492 0.0097## Day_Type == "Weekday"TRUE## 31.825## s.e. 1.019#### sigma^2 estimated as 48.82: log likelihood=-1220## AIC=2461 AICc=2462 BIC=2500
22
![Page 23: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/23.jpg)
Daily electricity demand
augment(fit) %>%gg_tsdisplay(.resid, plot_type = "histogram")
−20
0
20
Jan 2014 Apr 2014 Jul 2014 Oct 2014 Jan 2015Date
.res
id
−0.1
0.0
0.1
0.2
7 14 21lag [1D]
acf
0
20
40
60
−20 0 20 40.resid
coun
t
23
![Page 24: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/24.jpg)
Daily electricity demand
augment(fit) %>%features(.resid, ljung_box, dof = 8, lag = 14)
## # A tibble: 1 x 3## .model lb_stat lb_pvalue## <chr> <dbl> <dbl>## 1 "ARIMA(Demand ~ Temperature + I(Tempera~ 38.1 1.05e-6
24
![Page 25: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/25.jpg)
Daily electricity demand
# Forecast one day ahead
vic_next_day <- new_data(vic_elec_daily, 1) %>%
mutate(Temperature = 26, Day_Type = "Holiday")forecast(fit, vic_next_day)
## # A fable: 1 x 6 [1D]## # Key: .model [1]## .model Date Demand .distribution Temperature## <chr> <date> <dbl> <dist> <dbl>## 1 "ARIM~ 2015-01-01 161. N(161, 49) 26## # ... with 1 more variable: Day_Type <chr>
25
![Page 26: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/26.jpg)
Daily electricity demand
vic_elec_future <- new_data(vic_elec_daily, 14) %>%
mutate(Temperature = 26,Holiday = c(TRUE, rep(FALSE, 13)),Day_Type = case_when(
Holiday ~ "Holiday",wday(Date) %in% 2:6 ~ "Weekday",TRUE ~ "Weekend"
))
26
![Page 27: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/27.jpg)
Daily electricity demand
forecast(fit, vic_elec_future) %>%
autoplot(vic_elec_daily) + ylab("Electricity demand (GW)")
150
200
250
300
350
Jan 2014 Apr 2014 Jul 2014 Oct 2014 Jan 2015Date
Ele
ctric
ity d
eman
d (G
W)
.level
80
95
27
![Page 28: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/28.jpg)
Outline
1 Regression with ARIMA errors
2 Stochastic and deterministic trends
3 Dynamic harmonic regression
4 Lagged predictors
28
![Page 29: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/29.jpg)
Stochastic & deterministic trends
Deterministic trend
yt = β0 + β1t + ηt
where ηt is ARMA process.
Stochastic trend
yt = β0 + β1t + ηt
where ηt is ARIMA process with d ≥ 1.Difference both sides until ηt is stationary:
y ′t = β1 + η′t
where η′t is ARMA process.29
![Page 30: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/30.jpg)
International visitors
2
4
6
1980 1990 2000 2010Year
mill
ions
of p
eopl
e
Total annual international visitors to Australia
30
![Page 31: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/31.jpg)
International visitorsDeterministic trend
fit_deterministic <- aus_visitors %>%model(Deterministic = ARIMA(value ~ trend() + pdq(d = 0)))
report(fit_deterministic)
## Series: value## Model: LM w/ ARIMA(2,0,0) errors#### Coefficients:## ar1 ar2 trend() intercept## 1.113 -0.3805 0.1710 0.4156## s.e. 0.160 0.1585 0.0088 0.1897#### sigma^2 estimated as 0.02979: log likelihood=13.6## AIC=-17.2 AICc=-15.2 BIC=-9.28
31
![Page 32: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/32.jpg)
International visitors
yt = 0.42 + 0.17t + ηt
ηt = 1.11ηt−1 − 0.38ηt−2 + εt
εt ∼ NID(0, 0.0298).
32
![Page 33: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/33.jpg)
International visitorsStochastic trend
fit_stochastic <- aus_visitors %>%model(Stochastic = ARIMA(value ~ pdq(d=1)))
report(fit_stochastic)
## Series: value## Model: ARIMA(0,1,1) w/ drift#### Coefficients:## ma1 constant## 0.3006 0.1735## s.e. 0.1647 0.0390#### sigma^2 estimated as 0.03376: log likelihood=10.62## AIC=-15.24 AICc=-14.46 BIC=-10.57
yt − yt−1 = 0.17 + εt
yt = y0 + 0.17t + ηt
ηt = ηt−1 + 0.30εt−1 + εt
εt ∼ NID(0, 0.0338).
33
![Page 34: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/34.jpg)
International visitors
Determ
inisticS
tochastic
1980 1990 2000 2010 2020
2.5
5.0
7.5
10.0
2.5
5.0
7.5
10.0
Year
Vis
itors
to A
ustr
alia
(m
illio
ns)
.level
80
95
Forecasts from trend models
34
![Page 35: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/35.jpg)
International visitors
2.5
5.0
7.5
10.0
1980 1990 2000 2010 2020Year
Vis
itors
to A
ustr
alia
(m
illio
ns)
Forecast
Deterministic
Stochastic
.level
80
95
Forecasts from trend models
35
![Page 36: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/36.jpg)
Forecasting with trend
Point forecasts are almost identical, butprediction intervals differ.Stochastic trends have much wider predictionintervals because the errors are non-stationary.Be careful of forecasting with deterministictrends too far ahead.
36
![Page 37: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/37.jpg)
Outline
1 Regression with ARIMA errors
2 Stochastic and deterministic trends
3 Dynamic harmonic regression
4 Lagged predictors
37
![Page 38: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/38.jpg)
Dynamic harmonic regression
Combine Fourier terms with ARIMA errorsAdvantages
it allows any length seasonality;for data with more than one seasonal period, youcan include Fourier terms of different frequencies;the seasonal pattern is smooth for small values of K(but more wiggly seasonality can be handled byincreasing K );the short-term dynamics are easily handled with asimple ARMA error.
Disadvantagesseasonality is assumed to be fixed 38
![Page 39: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/39.jpg)
Eating-out expenditure
aus_cafe <- aus_retail %>% filter(Industry == "Cafes, restaurants and takeaway food services",year(Month) %in% 2004:2018
) %>% summarise(Turnover = sum(Turnover))aus_cafe %>% autoplot(Turnover)
2000
2500
3000
3500
4000
2005 2010 2015Month [1M]
Turn
over
39
![Page 40: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/40.jpg)
Eating-out expenditure
fit <- aus_cafe %>% model(K = 1 = ARIMA(log(Turnover) ~ fourier(K = 1) + PDQ(0,0,0)),K = 2 = ARIMA(log(Turnover) ~ fourier(K = 2) + PDQ(0,0,0)),K = 3 = ARIMA(log(Turnover) ~ fourier(K = 3) + PDQ(0,0,0)),K = 4 = ARIMA(log(Turnover) ~ fourier(K = 4) + PDQ(0,0,0)),K = 5 = ARIMA(log(Turnover) ~ fourier(K = 5) + PDQ(0,0,0)),K = 6 = ARIMA(log(Turnover) ~ fourier(K = 6) + PDQ(0,0,0)))
glance(fit)
.model sigma2 AIC AICc BIC
K = 1 0.0017 -616.5 -615.4 -587.8K = 2 0.0011 -699.7 -697.8 -661.5K = 3 0.0008 -763.2 -761.3 -725.0K = 4 0.0005 -821.6 -818.2 -770.6K = 5 0.0003 -919.5 -916.9 -874.8K = 6 0.0003 -920.1 -917.5 -875.4 40
![Page 41: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/41.jpg)
Eating-out expenditure
AICc = −615.4
2000
3000
4000
5000
2005 2010 2015 2020Month
Turn
over .level
80
95
Log transformed LM w/ ARIMA(2,1,3) errors, fourier(K = 1)
41
![Page 42: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/42.jpg)
Eating-out expenditure
AICc = −697.8
2000
3000
4000
5000
2005 2010 2015 2020Month
Turn
over .level
80
95
Log transformed LM w/ ARIMA(5,1,1) errors, fourier(K = 2)
42
![Page 43: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/43.jpg)
Eating-out expenditure
AICc = −761.3
2000
3000
4000
5000
2005 2010 2015 2020Month
Turn
over .level
80
95
Log transformed LM w/ ARIMA(3,1,1) errors, fourier(K = 3)
43
![Page 44: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/44.jpg)
Eating-out expenditure
AICc = −818.2
2000
3000
4000
5000
2005 2010 2015 2020Month
Turn
over .level
80
95
Log transformed LM w/ ARIMA(1,1,5) errors, fourier(K = 4)
44
![Page 45: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/45.jpg)
Eating-out expenditure
AICc = −916.9
2000
3000
4000
5000
2005 2010 2015 2020Month
Turn
over .level
80
95
Log transformed LM w/ ARIMA(2,1,0) errors, fourier(K = 5)
45
![Page 46: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/46.jpg)
Eating-out expenditure
AICc = −917.5
2000
3000
4000
5000
2005 2010 2015 2020Month
Turn
over .level
80
95
Log transformed LM w/ ARIMA(0,1,1) errors, fourier(K = 6)
46
![Page 47: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/47.jpg)
Example: weekly gasoline productsgasoline <- as_tsibble(fpp2::gasoline)fit <- gasoline %>% model(ARIMA(value ~ fourier(K = 13) + PDQ(0,0,0)))report(fit)
## Series: value## Model: LM w/ ARIMA(0,1,1) errors#### Coefficients:## ma1 fourier(K = 13)C1_52 fourier(K = 13)S1_52## -0.8934 -0.1121 -0.2300## s.e. 0.0132 0.0123 0.0122## fourier(K = 13)C2_52 fourier(K = 13)S2_52## 0.0420 0.0317## s.e. 0.0099 0.0099## fourier(K = 13)C3_52 fourier(K = 13)S3_52## 0.0832 0.0346## s.e. 0.0094 0.0094## fourier(K = 13)C4_52 fourier(K = 13)S4_52## 0.0185 0.0398## s.e. 0.0092 0.0092## fourier(K = 13)C5_52 fourier(K = 13)S5_52## -0.0315 0.0009## s.e. 0.0091 0.0091## fourier(K = 13)C6_52 fourier(K = 13)S6_52## -0.0522 0.000## s.e. 0.0090 0.009## fourier(K = 13)C7_52 fourier(K = 13)S7_52## -0.0173 0.0053## s.e. 0.0090 0.0090## fourier(K = 13)C8_52 fourier(K = 13)S8_52## 0.0074 0.0048## s.e. 0.0090 0.0090## fourier(K = 13)C9_52 fourier(K = 13)S9_52## -0.0024 -0.0035## s.e. 0.0090 0.0090## fourier(K = 13)C10_52 fourier(K = 13)S10_52## 0.0151 -0.0037## s.e. 0.0090 0.0090## fourier(K = 13)C11_52 fourier(K = 13)S11_52## -0.0144 0.0191## s.e. 0.0090 0.0090## fourier(K = 13)C12_52 fourier(K = 13)S12_52## -0.0227 -0.0052## s.e. 0.0090 0.0090## fourier(K = 13)C13_52 fourier(K = 13)S13_52 intercept## -0.0035 0.0038 0.0014## s.e. 0.0090 0.0090 0.0007#### sigma^2 estimated as 0.06168: log likelihood=-21.96## AIC=101.9 AICc=103.2 BIC=253
47
![Page 48: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/48.jpg)
Example: weekly gasoline products
forecast(fit, h = "3 years") %>%autoplot(gasoline)
7
8
9
10
1990 2000 2010 2020index
valu
e
.level
80
95
48
![Page 49: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/49.jpg)
5-minute call centre volume
(calls <- read_tsv("http://robjhyndman.com/data/callcenter.txt") %>%gather("date", "volume", -X1) %>% transmute(time = X1, date = as.Date(date, format = "%d/%m/%Y"),datetime = as_datetime(date) + time, volume) %>%
as_tsibble(index = datetime))
## # A tsibble: 27,716 x 4 [5m] <UTC>## time date datetime volume## <time> <date> <dttm> <dbl>## 1 07:00 2003-03-03 2003-03-03 07:00:00 111## 2 07:05 2003-03-03 2003-03-03 07:05:00 113## 3 07:10 2003-03-03 2003-03-03 07:10:00 76## 4 07:15 2003-03-03 2003-03-03 07:15:00 82## 5 07:20 2003-03-03 2003-03-03 07:20:00 91## 6 07:25 2003-03-03 2003-03-03 07:25:00 87## 7 07:30 2003-03-03 2003-03-03 07:30:00 75## 8 07:35 2003-03-03 2003-03-03 07:35:00 89## 9 07:40 2003-03-03 2003-03-03 07:40:00 99## 10 07:45 2003-03-03 2003-03-03 07:45:00 125## # ... with 27,706 more rows
49
![Page 50: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/50.jpg)
5-minute call centre volume
calls %>% fill_gaps() %>% autoplot(volume)
0
100
200
300
400
Apr Jul Octdatetime [5m]
volu
me
50
![Page 51: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/51.jpg)
5-minute call centre volume
calls %>% fill_gaps() %>%gg_season(volume, period = "day", alpha = 0.1) +guides(colour = FALSE)
0
100
200
300
400
00 06 12 18 00datetime
volu
me
51
![Page 52: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/52.jpg)
5-minute call centre volume
library(sugrrants)calls %>% filter(month(date, label = TRUE) == "Apr") %>%ggplot(aes(x = time, y = volume)) +geom_line() + facet_calendar(date)
Apr 28 Apr 29 Apr 30
Apr 21 Apr 22 Apr 23 Apr 24 Apr 25 Apr 26 Apr 27
Apr 14 Apr 15 Apr 16 Apr 17 Apr 18 Apr 19 Apr 20
Apr 07 Apr 08 Apr 09 Apr 10 Apr 11 Apr 12 Apr 13
Apr 01 Apr 02 Apr 03 Apr 04 Apr 05 Apr 06
10:00:0014:00:0018:00:0010:00:0014:00:0018:00:0010:00:0014:00:0018:00:00
100200300
100200300
100200300
100200300
time
volu
me
52
![Page 53: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/53.jpg)
5-minute call centre volumecalls_mdl <- calls %>%
mutate(idx = row_number()) %>%update_tsibble(index = idx)
fit <- calls_mdl %>%model(ARIMA(volume ~ fourier(169, K = 10) + pdq(d=0) + PDQ(0,0,0)))
report(fit)
## Series: volume## Model: LM w/ ARIMA(1,0,3) errors#### Coefficients:## ar1 ma1 ma2 ma3## 0.9894 -0.7383 -0.0333 -0.0282## s.e. 0.0010 0.0061 0.0075 0.0060## fourier(169, K = 10)C1_169## -79.0702## s.e. 0.7001## fourier(169, K = 10)S1_169## 55.2985## s.e. 0.7006## fourier(169, K = 10)C2_169## -32.3615## s.e. 0.3784## fourier(169, K = 10)S2_169## 13.7417## s.e. 0.3786## fourier(169, K = 10)C3_169## -9.3180## s.e. 0.2725## fourier(169, K = 10)S3_169## -13.6446## s.e. 0.2726## fourier(169, K = 10)C4_169## -2.791## s.e. 0.223## fourier(169, K = 10)S4_169## -9.508## s.e. 0.223## fourier(169, K = 10)C5_169## 2.8975## s.e. 0.1957## fourier(169, K = 10)S5_169## -2.2323## s.e. 0.1957## fourier(169, K = 10)C6_169## 3.308## s.e. 0.179## fourier(169, K = 10)S6_169## 0.174## s.e. 0.179## fourier(169, K = 10)C7_169## 0.2968## s.e. 0.1680## fourier(169, K = 10)S7_169## 0.857## s.e. 0.168## fourier(169, K = 10)C8_169## -1.3878## s.e. 0.1604## fourier(169, K = 10)S8_169## 0.8633## s.e. 0.1604## fourier(169, K = 10)C9_169## -0.3410## s.e. 0.1548## fourier(169, K = 10)S9_169## -0.9754## s.e. 0.1548## fourier(169, K = 10)C10_169## 0.8050## s.e. 0.1507## fourier(169, K = 10)S10_169 intercept## -1.1803 192.079## s.e. 0.1507 1.769#### sigma^2 estimated as 242.5: log likelihood=-115411## AIC=230874 AICc=230874 BIC=231088
53
![Page 54: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/54.jpg)
5-minute call centre volume
augment(fit) %>%
gg_tsdisplay(.resid, plot_type = "histogram", lag_max = 338)
−100
−50
0
50
100
150
0 10000 20000idx
.res
id
−0.025
0.000
0.025
0.050
100 200 300lag [1]
acf
0
250
500
750
1000
−100 −50 0 50 100 150.resid
coun
t
54
![Page 55: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/55.jpg)
5-minute call centre volume
fit %>% forecast(h = 1690) %>%autoplot(calls_mdl)
0
100
200
300
400
0 10000 20000 30000idx
volu
me .level
80
95
55
![Page 56: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/56.jpg)
Outline
1 Regression with ARIMA errors
2 Stochastic and deterministic trends
3 Dynamic harmonic regression
4 Lagged predictors
56
![Page 57: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/57.jpg)
Lagged predictors
Sometimes a change in xt does not affect yt
instantaneouslyyt = sales, xt = advertising.yt = stream flow, xt = rainfall.yt = size of herd, xt = breeding stock.
These are dynamic systems with input (xt) andoutput (yt).xt is often a leading indicator.There can be multiple predictors.
57
![Page 58: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/58.jpg)
Lagged predictors
The model include present and past values ofpredictor: xt , xt−1, xt−2, . . . .
yt = a + ν0xt + ν1xt−1 + · · ·+ νkxt−k + ηt
where ηt is an ARIMA process.
58
![Page 59: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/59.jpg)
Lagged predictors
Rewrite model asyt = a + (ν0 + ν1B + ν2B2 + · · ·+ νkBk)xt + ηt
= a + ν(B)xt + ηt .
ν(B) is called a transfer function since itdescribes how change in xt is transferred to yt .x can influence y , but y is not allowed toinfluence x .
59
![Page 60: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/60.jpg)
Example: Insurance quotes and TV ad-verts
Quotes
TV.advert
2002 2003 2004 2005
8
10
12
14
16
18
6
7
8
9
10
11
Year
Insurance advertising and quotations
60
![Page 61: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/61.jpg)
Example: Insurance quotes and TV ad-verts
fit <- insurance %>%# Restrict data so models use same fitting periodmutate(Quotes = c(NA,NA,NA,Quotes[4:40])) %>%# Estimate modelsmodel(ARIMA(Quotes ~ pdq(d = 0) + TV.advert),ARIMA(Quotes ~ pdq(d = 0) + TV.advert + lag(TV.advert)),ARIMA(Quotes ~ pdq(d = 0) + TV.advert + lag(TV.advert) +
lag(TV.advert, 2)),ARIMA(Quotes ~ pdq(d = 0) + TV.advert + lag(TV.advert) +
lag(TV.advert, 2) + lag(TV.advert, 3)))
61
![Page 62: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/62.jpg)
Example: Insurance quotes and TV ad-verts
glance(fit)
Lag order sigma2 AIC AICc BIC
0 0.2650 66.56 68.33 75.011 0.2094 58.09 59.85 66.532 0.2150 60.03 62.58 70.173 0.2056 60.31 64.96 73.83
62
![Page 63: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/63.jpg)
Example: Insurance quotes and TV ad-verts
fit <- insurance %>%model(ARIMA(Quotes ~ pdq(3, 0, 0) + TV.advert + lag(TV.advert)))
report(fit)
## Series: Quotes## Model: LM w/ ARIMA(3,0,0) errors#### Coefficients:## ar1 ar2 ar3 TV.advert lag(TV.advert)## 1.4117 -0.9317 0.3591 1.2564 0.1625## s.e. 0.1698 0.2545 0.1592 0.0667 0.0591## intercept## 2.0393## s.e. 0.9931#### sigma^2 estimated as 0.2165: log likelihood=-23.89## AIC=61.78 AICc=65.28 BIC=73.6
yt = 2.04 + 1.26xt + 0.16xt−1 + ηt ,
ηt = 1.41ηt−1 − 0.93ηt−2 + 0.36ηt−3 + εt ,
63
![Page 64: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/64.jpg)
Example: Insurance quotes and TV ad-verts
advert_a <- new_data(insurance, 20) %>%mutate(TV.advert = 10)
forecast(fit, advert_a) %>% autoplot(insurance)
8
10
12
14
16
18
2002 2004 2006index
Quo
tes .level
80
95
64
![Page 65: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/65.jpg)
Example: Insurance quotes and TV ad-verts
advert_b <- new_data(insurance, 20) %>%mutate(TV.advert = 8)
forecast(fit, advert_b) %>% autoplot(insurance)
8
10
12
14
16
18
2002 2004 2006index
Quo
tes .level
80
95
65
![Page 66: STAT481/581: IntroductiontoTime SeriesAnalysis](https://reader031.vdocuments.pub/reader031/viewer/2022011803/61d360eb205275107d4c5f76/html5/thumbnails/66.jpg)
Example: Insurance quotes and TV ad-verts
advert_c <- new_data(insurance, 20) %>%mutate(TV.advert = 6)
forecast(fit, advert_c) %>% autoplot(insurance)
8
10
12
14
16
18
2002 2004 2006index
Quo
tes .level
80
95
66