title: "HW 10 Solutions, 23 points"
date: "April 4, 2018"

# #1 exercise 14 on page 264 re: the {\it Pace of Life and Heart Disease}
## #1(a)
```{r}
library(Sleuth3)  # provides the ex0914 and ex0918 data sets
summary(ex0914)
pairs(ex0914)
```
## #1(b)
We will fit the following model:
$$\mu\{Heart|Bank, Walk, Talk\}=\beta_0 + \beta_1Bank + \beta_2 Walk + \beta_3 Talk.$$
```{r}
m.3param=lm(Heart ~ Bank + Walk + Talk,data=ex0914)
diagANOVA(m.3param)
m.2param=lm(Heart ~ Bank + Walk,data=ex0914)
diagANOVA(m.2param)
```
## #1(c)
The residual plots show no obvious problems; all of the model assumptions appear to be satisfied.
```{r}
diagANOVA(m.3param)
```
## #1(d)
R's output is:
```{r}
summary(m.3param)
```
Hence the estimated regression equation is
$$\hat \mu\{Heart|Bank, Walk, Talk\}=3.18 + 0.41Bank + 0.45Walk - 0.18Talk.$$
The SEs are ...
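
The standard errors can be read from the `Std. Error` column of the coefficient table. The chunk below is a minimal sketch, assuming the `Sleuth3` package (which supplies `ex0914`) is installed; the object name `coef.table` is introduced here for illustration only.

```{r}
library(Sleuth3)  # assumption: ex0914 comes from the Sleuth3 package

# Refit the model from #1(b) so that this chunk runs on its own
m.3param <- lm(Heart ~ Bank + Walk + Talk, data = ex0914)

# Coefficient table: one row per term, with columns
# Estimate, Std. Error, t value, and Pr(>|t|)
coef.table <- coef(summary(m.3param))  # hypothetical name
coef.table[, c("Estimate", "Std. Error")]
```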
## #1(e)
The evidence fails to suggest that there is an association between Talk and Heart after accounting for Bank and Walk. This suggests that we consider the 2-parameter model
$$\mu\{Heart|Bank, Walk\}=\beta_0 + \beta_1Bank + \beta_2 Walk.$$
An extra sum of squares test comparing this 2-parameter model to the 3-parameter model ($p$ = 0.425) fails to suggest that the 3-parameter model better explains the data (i.e., adding Talk does not significantly reduce the residual sum of squares). Because of this result, and because the 2-parameter model appears to meet the MLR model assumptions, one might conclude that the 2-parameter model is sufficient to describe these data.
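
The extra sum of squares test can be run with `anova()`, passing the reduced model first and the full model second. This is a sketch only, assuming the `Sleuth3` package is installed; the object name `ess` is hypothetical.

```{r}
library(Sleuth3)  # assumption: ex0914 comes from the Sleuth3 package

# Refit both models so that this chunk runs on its own
m.2param <- lm(Heart ~ Bank + Walk, data = ex0914)
m.3param <- lm(Heart ~ Bank + Walk + Talk, data = ex0914)

# Extra sum of squares F-test: reduced model first, full model second
ess <- anova(m.2param, m.3param)  # hypothetical name
ess
```

Because the two models differ by a single term, this F-test is equivalent to the t-test for Talk in `summary(m.3param)`: the F statistic is the square of that t statistic and the p-values agree.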

# #3 exercise 18 on page 266 re: evolution
```{r}
summary(ex0918)
```
## #3(a)
Comparing Model I to Model II:
```{r}
m1 = lm(Ratio ~ Continent + Latitude,data=ex0918)
summary(m1)
m2 = lm(Ratio ~ Continent*Latitude,data=ex0918)
summary(m2)
anova(m1, m2)  # extra sum of squares F-test of Model I vs. Model II
m2.log = lm(log10(Ratio) ~ Continent*Latitude,data=ex0918)
summary(m2.log)
```
Based on the extra sum of squares test, the evidence suggests that the model with an interaction (Model II, \texttt{m2}) better explains the data. Because of this result, and because Model II appears to meet the MLR assumptions (see #3(b)), we might conclude that Model II is "better".
## #3(b)
Checking the assumptions: the constant-variance assumption could be suspect, but that impression rests on a single unusual datum with a large negative residual and a mean (fitted value) of about 0.82. For a real data set, we would follow up with the researcher and ask them to validate this unusual datum: Was it entered into the data set correctly? Were there any deviations from the study implementation or sampling plan that might explain why this datum is unusual?
```{r}
diagANOVA(m2)
diagANOVA(m2.log)
```
## #3(c)
The fitted regression model for Europe is:
$$\hat\mu\{Ratio|Latitude, Continent = \text{Europe}\} = 0.799 + 0.0005Latitude.$$
The fitted regression model for North America is:
$$\hat\mu\{Ratio|Latitude, Continent = \text{North America}\} = (0.799+0.052) + (0.0005 - 0.001)Latitude.$$
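Collecting terms in the North America equation, using the rounded coefficients shown above, gives its intercept and slope directly:
$$\hat\mu\{Ratio|Latitude, Continent = \text{North America}\} = 0.851 - 0.0005Latitude.$$
At this level of rounding, the two fitted lines have slopes of equal size and opposite sign; the unrounded estimates in `summary(m2)` may differ slightly.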
## #3(d)
```{r}
plot(Ratio ~ Latitude,pch=as.numeric(Continent),col=as.numeric(Continent),data=ex0918)
Lat = seq(35,57,.1)
lines(Lat, 0.799 + 0.0005*Lat,col=1,lty=1)
lines(Lat, 0.799+0.052 + (0.0005-0.001)*Lat,col=2,lty=2)
```
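
The lines above use rounded coefficients typed in by hand. An alternative, sketched here under the assumption that the `Sleuth3` package is installed, is to let `predict()` compute the fitted lines from the model itself; the names `conts` and `new.data` are hypothetical.

```{r}
library(Sleuth3)  # assumption: ex0918 comes from the Sleuth3 package

# Refit Model II so that this chunk runs on its own
m2 <- lm(Ratio ~ Continent * Latitude, data = ex0918)

Lat <- seq(35, 57, 0.1)
conts <- levels(factor(ex0918$Continent))  # continent labels, in factor order

plot(Ratio ~ Latitude, pch = as.numeric(factor(Continent)),
     col = as.numeric(factor(Continent)), data = ex0918)

# One fitted line per continent, colored to match its points
for (i in seq_along(conts)) {
  new.data <- data.frame(Continent = factor(conts[i], levels = conts),
                         Latitude = Lat)
  lines(Lat, predict(m2, newdata = new.data), col = i, lty = i)
}
```

This avoids rounding error in the plotted lines and keeps the plot in step with the model if the data change.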
## #3(e)
The model we are using to answer this question is
$$\mu\{Ratio|Latitude, Continent\}=\beta_0 + \beta_1D_{NA}(Continent) + \beta_2Latitude + \beta_3D_{NA}(Continent) \times Latitude.$$
This question is asking us to test $H_0: \beta_3=0$ vs. $H_a: \beta_3\ne 0$. The evidence suggests that, yes, there is a difference in the "speed of evolution" of the female fly in Europe and North America ($t=-2.5$, $p=0.0219$).
A 95% CI for $\beta_3$ is found by:
```{r}
confint(m2)
```
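Hence, the data suggest with 95% confidence that the slope of the mean ratio of basal length to wing size on latitude is between 0.0002 and 0.0019 lower in North America than in Europe. That is, for every 1 degree of latitude farther north that the flies live, the change in the mean ratio is smaller in North America by that amount. (Note that $\beta_3$ is a difference between the two continents' slopes, not a slope itself, and that R prints the lower limit in scientific notation as roughly -1.86e-03.)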
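
Since `confint(m2)` returns intervals for all four coefficients, it can help to pull out only the interaction row. The sketch below assumes the `Sleuth3` package is installed and identifies the interaction term by the `:` in its row name rather than by a hard-coded label; the object name `ci` is hypothetical.

```{r}
library(Sleuth3)  # assumption: ex0918 comes from the Sleuth3 package

# Refit Model II so that this chunk runs on its own
m2 <- lm(Ratio ~ Continent * Latitude, data = ex0918)

# 95% confidence intervals for every coefficient
ci <- confint(m2, level = 0.95)  # hypothetical name

# Keep only the interaction row, whose name contains ":"
ci[grep(":", rownames(ci)), , drop = FALSE]
```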
## #3(f)
```{r}
m3 = lm(Ratio ~ Continent*I(Latitude-45.68),data=ex0918)
summary(m3)
```
Hence the evidence suggests that, at a latitude of 45.68 degrees (the value subtracted from Latitude in \texttt{m3}), the female fly's mean ratio is larger in North America than in Europe (estimated difference = 0.006, $p$ = 0.019).
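
As a rough check, the estimate of 0.006 agrees with the rounded Model II coefficients from #3(c). At a latitude of 45.68, the North America minus Europe difference in fitted means is approximately
$$0.052 + (-0.001)(45.68) = 0.052 - 0.04568 \approx 0.006.$$
Because the coefficients are rounded, the agreement is only approximate. Centering Latitude in \texttt{m3} makes the Continent coefficient report this difference directly, together with its standard error and $p$-value.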