---
title: "HW 10 Solutions, 23 points"
header-includes: \usepackage{placeins} \usepackage{verbatim}
date: "April 4, 2018"
output: pdf_document
---

```{r, echo = FALSE}
source("http://www.math.montana.edu/parker/courses/STAT411/diagANOVA.r")
library(Sleuth3)
```

# #1 exercise 14 on page 264 re: the {\it Pace of Life and Heart Disease}

## #1(a)
```{r}
summary(ex0914)
pairs(ex0914)
```

## #1(b)
We will fit the following model:
  $$\hat \mu\{Heart|Bank, Walk, Talk\}=\beta_0 + \beta_1Bank + \beta_2 Walk + \beta_3 Talk.$$
```{r}
m.3param=lm(Heart ~ Bank + Walk + Talk,data=ex0914)
diagANOVA(m.3param)
m.2param=lm(Heart ~ Bank + Walk,data=ex0914)
diagANOVA(m.2param)
```

## #1(c)

These residual plots look great; all model assumptions appear to be satisfied.
```{r}
diagANOVA(m.3param)
```

## #1(d)
R's output is:
```{r}
summary(m.3param)
```
Hence the estimated regression equation is
  $$\hat \mu\{Heart|Bank, Walk, Talk\}=3.18 + 0.41Bank + 0.45Walk - 0.18Talk.$$
  The SEs are ...
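
The standard errors can be read from the `Std. Error` column of the summary output above. As a minimal sketch (the name `se.3param` is introduced here for illustration only, and the model is refit so the chunk runs on its own), they can also be pulled out directly:

```{r}
library(Sleuth3)  # provides ex0914
# Same 3-parameter model as above, refit so this chunk is self-contained
m.3param = lm(Heart ~ Bank + Walk + Talk, data = ex0914)
# The coefficient table has one row per term; keep only the SE column
se.3param = coef(summary(m.3param))[, "Std. Error"]
round(se.3param, 3)
```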

## #1(e)
The evidence fails to suggest that there is an association between Talk and Heart after accounting for Bank and Walk.   This suggests that we consider the 2-parameter model
  $$\hat \mu\{Heart|Bank, Walk\}=\beta_0 + \beta_1Bank + \beta_2 Walk.$$

An extra sum of squares test comparing this 2-parameter model to the 3-parameter model ($p$ = 0.425) fails to suggest that the 3-parameter model better explains the data (i.e., that adding Talk reduces the sum of squared errors by more than chance alone would).   Because of this result, AND because the 2-parameter model appears to meet the MLR model assumptions, one might conclude that the 2-parameter model is sufficient to describe these data.
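
One way to carry out this extra sum of squares test is to pass the two nested fits to `anova()`. The sketch below refits both models so that it runs on its own; the name `ess` is introduced here for illustration only. Because the models differ by a single term, the $F$-test $p$-value should agree with the $t$-test $p$-value for Talk in `summary(m.3param)`.

```{r}
library(Sleuth3)  # provides ex0914
# Full (3-parameter) and reduced (2-parameter) models, as fit in #1(b)
m.3param = lm(Heart ~ Bank + Walk + Talk, data = ex0914)
m.2param = lm(Heart ~ Bank + Walk, data = ex0914)
# Extra sum of squares F-test: reduced model first, full model second
ess = anova(m.2param, m.3param)
ess
```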


# #3 exercise 18 on page 266 re: evolution
```{r}
summary(ex0918)
```

## #3(a)
Comparing Model I to Model II
```{r}
m1 = lm(Ratio ~ Continent + Latitude,data=ex0918)
summary(m1)

m2 = lm(Ratio ~ Continent*Latitude,data=ex0918)
summary(m2)


m2.log = lm(log10(Ratio) ~ Continent*Latitude,data=ex0918)
summary(m2.log)
```

Based on the extra sum of squares test, the evidence suggests that the model with an interaction (Model II, \texttt{m2}) better explains the data.  Because of this result AND because Model II appears to meet the MLR assumptions (see #3(b)), we might conclude that Model II is "better".
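
The extra sum of squares test referred to here can be run with `anova()` on the two nested fits. This is a sketch that refits Models I and II so it runs on its own; the name `ess.int` is introduced here for illustration only. It assumes, as in the rest of this document, that Continent is a two-level factor, so the models differ by a single interaction coefficient.

```{r}
library(Sleuth3)  # provides ex0918
# Model I (parallel lines) and Model II (separate slopes), as fit above
m1 = lm(Ratio ~ Continent + Latitude, data = ex0918)
m2 = lm(Ratio ~ Continent*Latitude, data = ex0918)
# Extra sum of squares F-test: reduced model first, full model second
ess.int = anova(m1, m2)
ess.int
```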

## #3(b) 
Checking the assumptions: the assumption of constant variance could be suspect, but this conclusion is based on a single unusual datum with a large negative residual and a mean (fitted value) of about 0.82.   For a real data set, we would follow up with the researcher and ask them to validate this unusual datum:  was it entered into the data set correctly? were there any deviations from the study implementation or sampling plan that might explain why this datum is unusual? 
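
To know which observation to ask the researcher about, one option is to pull out the row with the most negative residual from Model II. This is a sketch only; the names `i.min` and `unusual` are introduced here for illustration, and the model is refit so the chunk runs on its own.

```{r}
library(Sleuth3)  # provides ex0918
m2 = lm(Ratio ~ Continent*Latitude, data = ex0918)
# Position (named by row label) of the most negative residual
i.min = which.min(resid(m2))
# Look up that row by its label and attach its fitted value and residual
unusual = ex0918[names(i.min), ]
unusual$fitted = fitted(m2)[i.min]
unusual$residual = resid(m2)[i.min]
unusual
```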
```{r}
diagANOVA(m2)
diagANOVA(m2.log)
```

## #3(c)
The fit regression model for Europe is:
  $$\hat\mu\{Ratio|Latitude, Continent = Europe\} = 0.799 + 0.0005Latitude.$$
  
The fit regression model for North America is:
  $$\hat\mu\{Ratio|Latitude, Continent = \text{North America}\} = (0.799+0.052) + (0.0005 - 0.001)Latitude.$$
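
Collecting terms, and using only the rounded coefficients shown above (so the result is approximate), the North America line simplifies to
  $$\hat\mu\{Ratio|Latitude, Continent = \text{North America}\} \approx 0.851 - 0.0005Latitude.$$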
  
## #3(d)
```{r}
plot(Ratio ~ Latitude,pch=as.numeric(Continent),col=as.numeric(Continent),data=ex0918)
Lat = seq(35,57,.1)
lines(Lat, 0.799 + 0.0005*Lat,col=1,lty=1)
lines(Lat, 0.799+0.052 + (0.0005-0.001)*Lat,col=2,lty=2)
```
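
As an alternative sketch, the two lines can be drawn from the fitted coefficients rather than from rounded values typed in by hand. The name `est` is introduced here for illustration only, and the coefficient order noted in the comments assumes Continent is a two-level factor whose baseline level is Europe, as in #3(c).

```{r}
library(Sleuth3)  # provides ex0918
m2 = lm(Ratio ~ Continent*Latitude, data = ex0918)
# Assumed order: intercept, continent indicator, Latitude, interaction
est = coef(m2)
plot(Ratio ~ Latitude, pch = as.numeric(Continent), col = as.numeric(Continent), data = ex0918)
# Baseline continent: intercept est[1], slope est[3]
abline(a = est[1], b = est[3], col = 1, lty = 1)
# Other continent: intercept est[1] + est[2], slope est[3] + est[4]
abline(a = est[1] + est[2], b = est[3] + est[4], col = 2, lty = 2)
```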

## #3(e)
The model we are using to answer this question is
  $$\mu\{Ratio|Latitude, Continent\}=\beta_0 + \beta_1D_{NA}(Continent) + \beta_2Latitude + \beta_3D_{NA}(Continent) \times Latitude.$$
  
This question is asking us to test $H_0: \beta_3=0$ vs. $H_a: \beta_3\ne 0$.   The evidence suggests that, yes, there is a difference in the ``speed of evolution'' of the female fly in Europe and North America ($t=-2.5$, $p=0.0219$).

A 95% CI for $\beta_3$ is found by
```{r}
confint(m2)
```
Hence, the data suggest with 95% confidence that the change in the mean ratio of basal length to wing size for every 1 degree of latitude farther north is between 0.0002 and 0.00186 lower in North America than in Europe.
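
To show only the interval for $\beta_3$, `confint()` can be restricted with its `parm` argument. In this sketch the name `ci.int` is introduced for illustration, the model is refit so the chunk runs on its own, and position 4 assumes the interaction is the fourth coefficient (Continent being a two-level factor).

```{r}
library(Sleuth3)  # provides ex0918
m2 = lm(Ratio ~ Continent*Latitude, data = ex0918)
# 95% CI for the fourth coefficient only (assumed to be the interaction)
ci.int = confint(m2, parm = 4, level = 0.95)
ci.int
```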

## #3(f)
```{r}
m3 = lm(Ratio ~ Continent*I(Latitude-45.68),data=ex0918)
summary(m3)
```

Hence the evidence suggests that, at a latitude of 45.68 degrees, the female fly's ratio is larger on the average in North America (estimate = 0.006 larger, $p$ = 0.019) compared to Europe.
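
To see why the Continent coefficient in \texttt{m3} answers this question, write the centered model as
  $$\mu\{Ratio|Latitude, Continent\}=\beta_0 + \beta_1D_{NA}(Continent) + \beta_2(Latitude-45.68) + \beta_3D_{NA}(Continent) \times (Latitude-45.68).$$
The difference in mean ratio between North America and Europe at a given latitude is then
  $$\beta_1 + \beta_3(Latitude-45.68),$$
which equals $\beta_1$ when $Latitude = 45.68$. As a rough check using the rounded estimates from #3(c), the difference at that latitude is about $0.052 - 0.001 \times 45.68 \approx 0.006$, which agrees with the estimate reported above.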