---
title: "STAT 436 / 536 - Lecture 8"
date: September 28, 2018
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(fig.align= 'center')
knitr::opts_chunk$set(fig.height= 3)
knitr::opts_chunk$set(fig.width = 5)
```

## Autoregressive Models

- The random walk model can be written as a special case of the more general model
  $$x_t = \alpha x_{t-1} + w_t,$$
  with $\alpha = 1$.

\vfill

- If a time series can be written as
  $$x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \dots + \alpha_p x_{t-p} + w_t,$$
  where $w_t$ is white noise, then it is an autoregressive process of order $p$, abbreviated AR($p$).

\vfill

- The AR model can also be written in terms of the backward shift operator **B**.
  $$\theta_p(\boldsymbol{B})x_t = (1 - \alpha_1\boldsymbol{B} - \alpha_2\boldsymbol{B}^2 - \dots - \alpha_p\boldsymbol{B}^p)x_t = w_t$$

\vfill

- We have seen that the random walk is a special case of an AR(1) model.

\vfill

- The name autoregressive comes from the fact that $x_t$ is regressed on past values of the same series.

\vfill

- The prediction (a point estimate) at time $t$ is given by plugging in point estimates for the $\alpha$ values.
  $$\hat{x}_t = \hat{\alpha}_1 x_{t-1} + \hat{\alpha}_2 x_{t-2} + \dots + \hat{\alpha}_p x_{t-p}$$

\vfill

\newpage

- Stationarity of an AR process can be determined from the polynomial $\theta_p(\boldsymbol{B})$, where $\boldsymbol{B}$ is treated as a number. The equation $\theta_p(\boldsymbol{B}) = 0$ is known as the characteristic equation.

\vfill

- The roots of the characteristic equation determine the stationarity of the series. Every root must be greater than one in absolute value (modulus, for complex roots) for the series to be stationary.
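The root condition can be checked numerically. The following is a minimal sketch, assuming base R's `polyroot()`; the helper name `ar_root_moduli` is hypothetical and not part of the lecture, and the coefficient values are illustrative.

```{r}
# Hypothetical helper: moduli of the roots of the characteristic polynomial
# 1 - alpha_1 B - alpha_2 B^2 - ... - alpha_p B^p, treating B as a number.
ar_root_moduli <- function(alpha){
  # polyroot() expects coefficients in increasing order of power
  Mod(polyroot(c(1, -alpha)))
}

ar_root_moduli(0.5)          # AR(1) with alpha = 0.5: single root at B = 2
ar_root_moduli(1)            # random walk: single root at B = 1
ar_root_moduli(c(0.5, 0.2))  # an illustrative AR(2): two roots
```

A modulus above one for every root indicates stationarity. The random walk has its root exactly on the unit circle, so it fails the condition.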
\vfill

- Consider the AR(1) model, $x_t = \frac{1}{2}x_{t-1} + w_t$

\vfill
\vfill

- Consider the AR(2) model, $x_t = x_{t-1} + \frac{1}{4}x_{t-2} + w_t$

\vfill
\vfill

- Consider the random walk model $x_t = x_{t-1} + w_t$

\vfill
\vfill

- For an AR(1) process, $x_t = \alpha x_{t-1} + w_t$, with $|\alpha| < 1$ and white noise $w_t$ of variance $\sigma^2$, the second order properties are:
  $$\mu_x = 0, \qquad \gamma_k = \frac{\alpha^k \sigma^2}{1 - \alpha^2}$$

\vfill

- The autocorrelation function for an AR(1) process is
  $$\rho_k = \alpha^k, \qquad k \geq 0$$

\vfill

\newpage

- Write a function to simulate an AR(1) process

```{r, eval = FALSE}
simAR <- function(alpha, sigma, time.pts){
  # function to simulate an AR(1) process
  # inputs: alpha - the alpha coefficient
  #       : sigma - standard deviation of noise
  #       : time.pts - number of time points
  # outputs: the time series vector as a ts object
}
```

```{r, echo = FALSE}
library(ggfortify)
library(dplyr)

simAR <- function(alpha, sigma, time.pts){
  # function to simulate an AR(1) process
  # inputs: alpha - the alpha coefficient
  #       : sigma - standard deviation of noise
  #       : time.pts - number of time points
  # outputs: the time series vector as a ts object
  x <- rep(0, time.pts)  # the series starts at x[1] = 0
  # seq_len(...)[-1] is empty when time.pts is 1, unlike 2:time.pts
  for (t in seq_len(time.pts)[-1]){
    x[t] <- alpha * x[t-1] + rnorm(1, 0, sigma)
  }
  return(ts(x))
}

# named ar.short so that the object does not mask stats::ar()
ar.short <- simAR(alpha = .8, sigma = 1, time.pts = 50)
ar.short %>% autoplot
```

\vfill

- Now let's examine the correlogram

```{r}
set.seed(09192018)
ar.series <- simAR(alpha=.8, sigma=1, time.pts = 500)
acf.ar <- ar.series %>% acf
acf.ar
```

The theoretical autocorrelations, $\rho_k = 0.8^k$, are fairly close to the empirical values above.

\vfill

- The autocorrelation will be non-zero for all lags, even though the model for time $t$ only depends directly on the value from time $t-1$.

\vfill
\vfill

- The

\vfill
\vfill

```{r}
set.seed(09192018)
ar.series <- simAR(alpha=.8, sigma=1, time.pts = 500)
pacf.ar <- ar.series %>% pacf
pacf.ar
```

\vfill

- The

\vfill

- The `ar()` function in R can be used to fit AR models and has several useful properties

\vfill

    - *the*

\vfill

    - *the*

\vfill

    - *the*

\vfill

```{r}
ar.vals <- ar(ar.series, order.max = 2)
ar.vals
predict(ar.vals, n.ahead = 5)
```

\vfill
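To see what the fitted object from `ar()` holds, the pieces can be pulled out individually. This is a minimal sketch, assuming the series is simulated with `arima.sim()` from the stats package rather than the lecture's own simulation function; the seed and the object names `x`, `fit`, and `fc` are illustrative.

```{r}
set.seed(436)
# Simulate a stationary AR(1) series with alpha = 0.8 (illustrative values)
x <- arima.sim(model = list(ar = 0.8), n = 500)

# By default ar() selects the order by AIC, up to order.max
fit <- ar(x, order.max = 2)
fit$order     # selected order
fit$ar        # estimated alpha coefficient(s)
fit$var.pred  # estimated variance of the noise term

# predict() returns point forecasts and their standard errors
fc <- predict(fit, n.ahead = 5)
fc$pred
fc$se
```

The standard errors in `fc$se` do not decrease with the forecast horizon, which reflects growing uncertainty further ahead.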