---
title: "STAT 436 / 536 - Lecture 1"
date: August 27, 2018
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE)
library(datasets)
library(ggplot2)
library(dplyr)
```

### Prediction Intervals Recap

436/536: Define a 95 percent prediction interval.

\vfill
\vfill

536: In words, describe $p(Y_{t+1}|Y_{1:t})$.

\vfill
\vfill

## Time Series Forecasting Exercise

For the following exercises, you will be asked to forecast future observed values. Rather than only a point estimate, you will be asked for 95 percent prediction intervals, because a major goal in my classes is to think in terms of distributions, particularly when considering uncertainty.

\vfill

This will be a competition, where the goal is to have the lowest cumulative score. The score will be a linear function of the specified width of your prediction intervals. Scoring specifics are given for each situation below, but if a prediction interval does not contain the requested value, the penalty will be 500.

Scoring for prediction interval width:

- Lake Huron: width * 100
- Nile River: width
- Airline Passengers: width

\vfill

In each situation, consider the following questions:

1. What did you use to make predictions?
2. Did the level of uncertainty differ between the predictions? If so, why?

\vfill

\newpage

#### Lake Huron Depth

Predict the depth of Lake Huron in feet, or more specifically give a prediction interval, for:

1. 1966:
\vfill
2. 1970:
\vfill
3. 1972:
\vfill

```{r, echo=FALSE, fig.align='center', fig.width=8, fig.height=8}
# Annual Lake Huron levels, 1875-1972; hide the years to be predicted
huron.depth <- data.frame(year = 1875:1972, depth = as.numeric(LakeHuron))
huron.depth[huron.depth$year > 1965, 'depth'] <- NA
label.Dates <- seq(1875, 1972, 10)
ggplot(data = huron.depth, aes(x = year, y = depth)) +
  theme_gray() + geom_line() + geom_point() + ylim(575, 585) +
  labs(title = "Depth of Lake Huron", y = "Feet") +
  scale_x_continuous(labels = label.Dates, breaks = label.Dates)
```

\newpage

#### Airline Passengers

Predict airline passenger counts in thousands, or more specifically give a prediction interval, for:

1. January 1960:
\vfill
2. July 1960:
\vfill
3. December 1960:
\vfill

```{r, echo=FALSE, fig.align='center', fig.width=8, fig.height=8}
# Monthly passenger counts, 1949-1960; hide the 12 months of 1960
air.passengers <- data.frame(year = rep(1949:1960, each = 12),
                             month = rep(1:12, 12),
                             passengers = c(AirPassengers[-(133:144)], rep(NA, 12)))
# Place each observation at mid-month so it can be plotted on a date axis
air.passengers$strDates <- paste(air.passengers$month, '/15/', air.passengers$year, sep = '')
air.passengers$date <- as.Date(air.passengers$strDates, "%m/%d/%Y")
label.Dates <- paste('Jan', 1949:1960, sep = '')
break.dates <- as.Date(paste(1, '/15/', 1949:1960, sep = ''), "%m/%d/%Y")
ggplot(data = air.passengers, aes(x = date, y = passengers)) +
  geom_line() + geom_point() + ylim(0, 700) +
  labs(title = "Monthly Airline Passenger Count", y = "Number of Passengers (thousands)") +
  scale_x_date(labels = label.Dates, breaks = break.dates) +
  theme(axis.text.x = element_text(angle = 90))
```

\newpage

#### Nile River Flow

Predict Nile flow in million cubic meters, or more specifically give a prediction interval, for:

1. 1911:
\vfill
2. 1913:
\vfill
3. 1916:
\vfill

```{r, echo=FALSE, fig.align='center', fig.width=8, fig.height=8}
# Annual Nile flow, 1871-1970; hide 1911-1919, which includes the years to be predicted
nile.flow <- data.frame(year = 1871:1970, flow = as.numeric(Nile))
nile.flow[nile.flow$year > 1910 & nile.flow$year < 1920, 'flow'] <- NA
label.Dates <- seq(1871, 1970, 10)
ggplot(data = nile.flow, aes(x = year, y = flow)) +
  theme_gray() + geom_line() + geom_point() + ylim(0, 1500) +
  labs(title = "Annual flow on Nile River", y = "Million cubic meters") +
  scale_x_continuous(labels = label.Dates, breaks = label.Dates)
```

\newpage
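
#### Scoring Sketch

The scoring rule from the start of the exercise can be sketched in a few lines of R. This is only an illustration, not the official scoring code: `score_interval` is a hypothetical helper, the interval endpoints are made-up guesses, and it assumes that the 500-point penalty is added on top of the width score when an interval misses the observed value (the handout does not say whether the penalty replaces or adds to the width score).

```{r}
# Hypothetical helper, not part of the original exercise.
# Assumption: a miss adds `penalty` on top of the width-based score.
score_interval <- function(lower, upper, observed,
                           multiplier = 1, penalty = 500) {
  stopifnot(lower <= upper)
  width_score <- (upper - lower) * multiplier
  missed <- observed < lower | observed > upper
  width_score + ifelse(missed, penalty, 0)
}

# Made-up interval for Lake Huron in 1966, scored with the width * 100 rule
observed_1966 <- as.numeric(window(LakeHuron, start = 1966, end = 1966))
score_interval(578, 581, observed_1966, multiplier = 100)
```

Under this rule a very wide interval is safe but expensive, while a narrow interval is cheap only if it still contains the observed value, which is the trade-off the competition is meant to highlight.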