Please use D2L to turn in both the PDF/Word output and your R Markdown file.

For this exercise, we will continue working with the bakery dataset.

Q1. Bread Basket (100 pts)

This question will focus on making predictions using a dataset with purchases at the Bread Basket bakery. The dataset can be downloaded from http://math.montana.edu/ahoegh/teaching/timeseries/data/BreadBasket.csv. In particular, your goal is to estimate the number of pastries to bake.

a.

The code below creates a data frame that has the number of pastries sold on a day as well as the number of drinks (coffee + tea) sold on the previous day.

Create a data visualization to assess the relationship between the previous day’s drink counts and the current day’s pastries.

library(tidyverse) # provides read_csv(), the dplyr verbs, and %>%
bakery_sales <- read_csv('http://math.montana.edu/ahoegh/teaching/timeseries/data/BreadBasket.csv')
## Parsed with column specification:
## cols(
##   Date = col_date(format = ""),
##   Time = col_time(format = ""),
##   Transaction = col_integer(),
##   Item = col_character()
## )
pastry_count <- bakery_sales %>% filter(Item %in% c('Pastry','Scandinavian','Medialuna','Muffin','Scone')) %>% group_by(Date) %>% tally() %>% rename(num.pastry = n) 

drink_count <- bakery_sales %>% filter(Item %in% c('Coffee','Tea')) %>% group_by(Date) %>% tally() %>% rename(num.drink = n) 

combined <- bind_cols(pastry_count[-1,],data.frame(prev_drink = drink_count$num.drink[-158]))
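One possible starting point for the visualization is a scatterplot with a fitted trend line. The sketch below is only an illustration: `toy` is a hypothetical stand-in with made-up values that borrows the column names of `combined`, and the choice of a straight-line smoother is an assumption, not a requirement of the exercise.

```r
library(ggplot2)

# Hypothetical stand-in with the same column names as `combined`;
# the values are made up, so swap in the real `combined` data frame.
toy <- data.frame(
  num.pastry = c(12, 20, 15, 30, 25, 18),
  prev_drink = c(40, 65, 50, 90, 80, 55)
)

# Scatterplot of pastries sold against the previous day's drink count,
# with a least-squares line overlaid to show the overall trend
p <- ggplot(toy, aes(x = prev_drink, y = num.pastry)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Drinks sold on the previous day",
       y = "Pastries sold",
       title = "Pastry sales vs. previous day's drink sales")
print(p)
```

Replacing `toy` with `combined` in the `ggplot()` call should give the plot for the bakery data.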
b.

The baker hopes to make just enough pastries for a given day, but making too many pastries and making too few both have financial implications. You will build a predictive model to determine how many pastries should be prepared on a given day. Decode the loss function below and describe it to the baker.

pastry_loss <- function(pastries.made, pastries.sold){
  ########################################################################
  # Description: loss function for cost of too many or too few pastries
  # ARGS: pastries.made - (prediction for) number of pastries prepared for a day
  #       pastries.sold - number of pastries sold for a day
  # RETURNS:
  #       lost.income - money lost by making too many or too few pastries
  ########################################################################
  if (pastries.made > pastries.sold){
    message('too many pastries made')
    lost.income = (pastries.made - pastries.sold) * 1.50 # cost of making pastry
  } else if (pastries.made < pastries.sold){
    message('too few pastries made')
    lost.income = (pastries.sold - pastries.made) * 2 # profit per pastry
  } else if (pastries.made == pastries.sold){
    message('CONGRATS! correct number of pastries made')
    lost.income = 0
  }
  return(lost.income)
}
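To get a feel for the asymmetry in this loss, it can help to evaluate it at a few values. The sketch below uses `pastry_loss_vec`, a hypothetical vectorized helper (not part of the assignment code) that is meant to return the same dollar amounts as `pastry_loss` without printing messages.

```r
# Hypothetical vectorized equivalent of pastry_loss():
# 1.50 per unsold pastry, 2 per pastry of unmet demand, 0 when they match
pastry_loss_vec <- function(pastries.made, pastries.sold) {
  extra   <- pmax(pastries.made - pastries.sold, 0) # pastries left over
  missing <- pmax(pastries.sold - pastries.made, 0) # sales that were missed
  1.50 * extra + 2 * missing
}

# Being 6 pastries over costs less than being 6 pastries under
over  <- pastry_loss_vec(46, 40) # 6 * 1.50 = 9
under <- pastry_loss_vec(40, 46) # 6 * 2    = 12
exact <- pastry_loss_vec(40, 40) # 0
```

Because the helper works on whole vectors, a full set of daily predictions can be scored in one call and totaled with `sum()`.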
c.

Fit at least one predictive model for the number of pastries to prepare each day. Compare your predictive model with the baker’s plan of preparing 46 pastries each day (the maximum number of pastries sold on a day during the first 100 days). Use the first 100 days to fit a model and make predictions for the remaining days. You can use the previous day’s information to make predictions; note that the data frame already contains the previous day’s coffee and tea sales.

num.preddays <- nrow(combined) - 100
baker_predictions <- rep(46, num.preddays)
baker_loss <- rep(0, num.preddays)

# evaluate the baker's plan on the held-out days (rows 101 onward), not the training days
for (days in 1:num.preddays){
  baker_loss[days] <- pastry_loss(pastries.made = baker_predictions[days], pastries.sold = combined$num.pastry[100 + days])
}

The baker’s plan lost a total of 2319 dollars of potential revenue, largely by making too many pastries.
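As a rough illustration of one way to approach the comparison, the sketch below fits a linear regression on the first 100 days and then shifts its predictions toward the quantile that this loss favors. With a cost of 1.50 per extra pastry and 2 per missing pastry, expected loss is minimized at the 2 / (2 + 1.5) ≈ 0.57 quantile of demand rather than at the mean. Everything here is an assumption made for illustration: `toy` is simulated data standing in for `combined`, the residual-quantile shift presumes roughly constant residual spread, and `loss` is a hypothetical vectorized form of `pastry_loss`.

```r
# Simulated stand-in for `combined` (made-up values, for illustration only)
set.seed(1)
n <- 157
prev_drink <- rpois(n, lambda = 40)
toy <- data.frame(
  prev_drink = prev_drink,
  num.pastry = rpois(n, lambda = 5 + 0.3 * prev_drink)
)

# First 100 days for fitting, remaining days for evaluation
train <- toy[1:100, ]
test  <- toy[101:n, ]

# Linear regression of pastries on the previous day's drink count
fit <- lm(num.pastry ~ prev_drink, data = train)

# Shift the fitted mean by the 2 / (2 + 1.5) quantile of the training
# residuals, then round to a whole number of pastries
target_q <- 2 / (2 + 1.5)
shift <- quantile(residuals(fit), probs = target_q)
model_predictions <- round(predict(fit, newdata = test) + shift)

# Hypothetical vectorized form of the same loss as pastry_loss()
loss <- function(made, sold) {
  1.5 * pmax(made - sold, 0) + 2 * pmax(sold - made, 0)
}

# Total loss for the model and for a "training maximum every day" plan
model_total <- sum(loss(model_predictions, test$num.pastry))
baker_total <- sum(loss(rep(max(train$num.pastry), nrow(test)), test$num.pastry))
```

On the real data, `toy` would be replaced by `combined`, and the two totals can then be compared directly. Other models (for example, ones using day of the week) would slot into the same fit, predict, and score pattern.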

d.

Briefly summarize the predictive method you chose and contrast this with the baker’s choice.