---
title: "STAT 436 / 536 - Lecture 17"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(fig.align= 'center')
knitr::opts_chunk$set(fig.height= 4)
knitr::opts_chunk$set(fig.width = 7)
library(readr)
library(ggplot2)
library(forecast)
library(tseries)
library(vars)
```

## Multivariate Time Series
- Multivariate time series data consist of recordings of multiple variables at the same time.

- For instance, consider the prices of conventional and organic avocados in the western region of the United States.

```{r}
# Weekly average avocado prices, one series per type (conventional / organic)
avo <- read_csv('http://math.montana.edu/ahoegh/teaching/timeseries/data/avocado_west.csv')
ggplot(data = avo, aes(y = AveragePrice, x = Date)) + geom_line(aes(color = type))
```
\vfill

#### Spurious Regression
- In general, we use regression to assess relationships between a set of variables.
\vfill

- The textbook has an example about Australian electricity and chocolate production sharing an increasing trend.
\vfill

- More generally in time series analyses, spurious regression results when both series contain stochastic trends that happen to coincide, even if the series are otherwise unrelated.
\vfill
\newpage

- Consider an example with two simulated random walks.

```{r}
set.seed(10)
# Initialize; only x[1] and y[1] are kept, the rest are overwritten below
x <- rnorm(100)
y <- rnorm(100)
# Build two random walks from independent noise terms
for (i in 2:100){
  x[i] <- x[i-1] + rnorm(1)
  y[i] <- y[i-1] + rnorm(1)
}
comb <- data.frame(time = rep(1:100,2), val=c(x,y), var = rep(c('x','y'), each = 100))
ggplot(data=comb, aes(y=val, x=time)) + geom_line(aes(color = var))
cor(x,y)
```
\vfill

- Each time series is a random walk, but the noise terms are uncorrelated, so any correlation between the two series is spurious.
\vfill

- Now consider the correlation between the two differenced time series.

```{r}
cor(diff(x), diff(y))
```
\vfill

- The result is a much smaller correlation.

- __Q__: How do you anticipate these results changing for a different seed or a longer time series?
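- One way to explore this question is a small simulation. The sketch below is not part of the original lecture code: `sim_cor()` is a hypothetical helper, and the seed, series lengths, and number of replications are arbitrary choices. It repeats the random walk experiment many times and summarizes how variable the correlation is for the levels and for the differences.

```{r}
# Hypothetical helper: simulate two independent random walks of length n and
# return the correlation of the levels and of the first differences.
sim_cor <- function(n) {
  x <- cumsum(rnorm(n))
  y <- cumsum(rnorm(n))
  c(levels = cor(x, y), diffs = cor(diff(x), diff(y)))
}

set.seed(436)
cors_100  <- replicate(1000, sim_cor(100))   # 2 x 1000 matrix
cors_1000 <- replicate(1000, sim_cor(1000))  # 2 x 1000 matrix

# Spread of the sample correlations across replications
apply(cors_100, 1, sd)
apply(cors_1000, 1, sd)
```

- Comparing the two summaries shows whether a longer series shrinks the spread of the correlation between the levels, or only the spread of the correlation between the differences.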
\vfill
\newpage

- With spurious regression, we often see that the $t$ or $z$ scores in the regression are quite high, even though the series are unrelated.
\vfill

- Looking at the residuals can also highlight potential issues with time series regression models.

```{r}
ggtsdisplay(residuals(lm(y~x)))
```
\vfill

#### Unit Root Testing
- In addition to the techniques that we have seen for testing for stationarity, we can directly look for unit roots.
\vfill

- The Dickey-Fuller test is one hypothesis test for assessing whether a unit root exists. The `adf.test()` function in `tseries` implements the augmented version of this test.
\vfill

- Note that the alternative hypothesis is that the series is stationary,
```{r}
adf.test(x)
adf.test(y)
```
so a large p-value means that we fail to reject the null hypothesis of a unit root.
\vfill

#### Cointegration
- Two time series can have unit roots *and* be related.
\vfill

- Two non-stationary time series $\{x_t\}$ and $\{y_t\}$ are cointegrated if some linear combination $a x_t + b y_t$, with $a$ and $b$ nonzero, is stationary.
\vfill

- As an example of cointegrated time series, consider two time series that are generated from a latent mean process.

```{r}
# The latent mean is a random walk; x2 and y2 are noisy observations of it
latent.mean <- rep(0, 500)
for (i in 2:500){
  latent.mean[i] <- latent.mean[i-1] + rnorm(1)
}
x2 <- latent.mean + rnorm(500)
y2 <- latent.mean + rnorm(500)
coint.df <- data.frame(time = rep(1:500,3), val = c(latent.mean, x2, y2), var = rep(c('latent.mean','x','y'), each = 500))
ggplot(data = coint.df, aes(x = time, y = val)) + geom_line(aes(color = var))
```
\vfill
\newpage

- The Phillips-Ouliaris test can be used to assess whether two time series are cointegrated.

```{r}
po.test(cbind(x2,y2))
```
In this case, we'd reject the null hypothesis that the two time series *are not* cointegrated.
\vfill

#### Vector Autoregressive Models
- One approach to model multivariate time series is to extend our ARIMA model framework.
\vfill

- Two time series, $\{x_t\}$ and $\{y_t\}$, follow a vector autoregressive process of order 1 (VAR(1)) if

$$
\begin{aligned}
x_t &= \theta_{11} x_{t-1} + \theta_{12} y_{t-1} + w_{x,t} \\
y_t &= \theta_{21} x_{t-1} + \theta_{22} y_{t-1} + w_{y,t}
\end{aligned}
$$

where $\{w_{x,t}\}$ and $\{w_{y,t}\}$ are white noise series that may be correlated with each other at the same time point.
\vfill

- Similar to the univariate case, characteristic equations can be defined to assess whether the models are stationary.
\vfill

- VAR models can be fit using the `ar()` function in `stats` or the `VAR()` function in `vars`.
```{r}
VAR(cbind(x2,y2), p = 1)
```
\vfill
\newpage

- The `VARselect()` function can also be used to choose the order of a model based on information criteria (AIC, HQ, SC, and FPE).

```{r}
VARselect(cbind(x2,y2), lag.max = 5, type="const")
```
\vfill

- The `predict(n.ahead =)` and `resid()` functions can be applied to fitted `VAR()` objects.
\vfill

#### State-Space approach
- More general multivariate time series models can be fit using a state-space approach.
\vfill

- The `dlm` package in R has the capability to fit models of this type.
\vfill
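- As a rough illustration of the state-space idea, the sketch below fits a simple model in which two observed series share one latent random walk, similar in spirit to the cointegration example. It is not from the original lecture. It assumes the `dlm` package is installed, simulates its own data (`latent` and `obs` are illustrative names), and fixes all variances at the values used in the simulation rather than estimating them.

```{r}
library(dlm)  # assumed to be installed; not loaded in the setup chunk

set.seed(536)
n <- 200
latent <- cumsum(rnorm(n))            # latent random-walk mean
obs <- cbind(x = latent + rnorm(n),   # two noisy observations of the same state
             y = latent + rnorm(n))

# Observation equation: obs_t = FF * mu_t + v_t,      v_t ~ N(0, V)
# State equation:       mu_t  = GG * mu_{t-1} + w_t,  w_t ~ N(0, W)
# Variances are fixed at their simulated values (an assumption of this sketch).
mod <- dlm(FF = matrix(1, nrow = 2, ncol = 1), V = diag(2),
           GG = matrix(1), W = matrix(1),
           m0 = 0, C0 = matrix(1e7))

filt <- dlmFilter(obs, mod)               # Kalman filter
mu_hat <- as.numeric(dropFirst(filt$m))   # filtered estimate of the latent mean

cor(mu_hat, latent)
```

- In practice the variances are unknown; `dlmMLE()` can estimate them by maximum likelihood given a function that builds the model from a parameter vector.
\vfill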