---
title: "HW 10 Solutions, 23 points"
header-includes:
  - \usepackage{placeins}
  - \usepackage{verbatim}
date: "April 4, 2018"
output: pdf_document
---

```{r, echo = FALSE}
source("http://www.math.montana.edu/parker/courses/STAT411/diagANOVA.r")
library(Sleuth3)
```

# #1 exercise 14 on page 264 re: the {\it Pace of Life and Heart Disease}

## #1(a)

```{r}
summary(ex0914)
pairs(ex0914)
```

## #1(b)

We will fit the following model:
$$\mu\{Heart|Bank, Walk, Talk\}=\beta_0 + \beta_1Bank + \beta_2 Walk + \beta_3 Talk.$$

```{r}
# Full model with all three explanatory variables
m.3param=lm(Heart ~ Bank + Walk + Talk,data=ex0914)
diagANOVA(m.3param)
# Reduced model without Talk, used in #1(e)
m.2param=lm(Heart ~ Bank + Walk,data=ex0914)
diagANOVA(m.2param)
```

## #1(c)

These residual plots look fine; all model assumptions appear to be satisfied.

```{r}
diagANOVA(m.3param)
```

## #1(d)

R's output is:

```{r}
summary(m.3param)
```

Hence the estimated regression equation is
$$\hat \mu\{Heart|Bank, Walk, Talk\}=3.18 + 0.41Bank + 0.45Walk - 0.18Talk.$$
The SEs are ...

## #1(e)

The evidence fails to suggest that there is an association between Talk and Heart after accounting for Bank and Walk. This suggests that we consider the 2-parameter model
$$\mu\{Heart|Bank, Walk\}=\beta_0 + \beta_1Bank + \beta_2 Walk.$$
An extra sum of squares test comparing this 2-parameter model to the 3-parameter model ($p$ = 0.425) fails to suggest that the 3-parameter model better explains the data (i.e., that adding Talk meaningfully reduces the residual sum of squares). Because of this result, AND because the 2-parameter model appears to meet the MLR model assumptions, one might conclude that the 2-parameter model is sufficient to describe these data.
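
The extra sum of squares test described in #1(e) can be reproduced with `anova()`. The following is a minimal sketch, assuming the `Sleuth3` package is installed; the names `fit.reduced`, `fit.full`, and `ess` are introduced here for illustration only and are not part of the original solution.

```{r}
library(Sleuth3)

# Reduced model (Bank + Walk) and full model (Bank + Walk + Talk)
fit.reduced <- lm(Heart ~ Bank + Walk, data = ex0914)
fit.full    <- lm(Heart ~ Bank + Walk + Talk, data = ex0914)

# Extra sum of squares F-test: does adding Talk reduce the residual
# sum of squares by more than chance alone would explain?
ess <- anova(fit.reduced, fit.full)
ess

# Only one term is dropped, so the F-test p-value equals the p-value
# of the t-test for Talk in the full model's coefficient table.
coef(summary(fit.full))["Talk", "Pr(>|t|)"]
```

Because the two models differ by a single coefficient, the $F$ statistic here is the square of the $t$ statistic for Talk, so either test leads to the same conclusion.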
# #3 exercise 18 on page 266 re: evolution

```{r}
summary(ex0918)
```

## #3(a)

Comparing Model I to Model II

```{r}
# Model I: parallel lines (no interaction)
m1 = lm(Ratio ~ Continent + Latitude,data=ex0918)
summary(m1)
# Model II: separate slopes for each continent
m2 = lm(Ratio ~ Continent*Latitude,data=ex0918)
summary(m2)
# Extra sum of squares test of Model I against Model II
anova(m1, m2)
# Model II on the log10 scale, used for the diagnostics in #3(b)
m2.log = lm(log10(Ratio) ~ Continent*Latitude,data=ex0918)
summary(m2.log)
```

Based on the extra sum of squares test, the evidence suggests that the model with an interaction (Model II, \texttt{m2}) better explains the data. Because of this result AND because Model II appears to meet the MLR assumptions (see #3(b)), we might conclude that Model II is "better".

## #3(b)

Checking the assumptions: the assumption of constant variance could be suspect, but this conclusion is based on a single unusual datum with a large negative residual and a mean (fitted value) of about 0.82. For a real data set, we would follow up with the researcher and ask them to validate this unusual datum: Was it entered into the data set correctly? Were there any deviations from the study implementation or sampling plan that might explain why this datum is unusual?

```{r}
diagANOVA(m2)
diagANOVA(m2.log)
```

## #3(c)

The fitted regression model for Europe is:
$$\hat\mu\{Ratio|Latitude, Continent = Europe\} = 0.799 + 0.0005Latitude.$$
The fitted regression model for North America is:
$$\hat\mu\{Ratio|Latitude, Continent = North\ America\} = (0.799+0.052) + (0.0005 - 0.001)Latitude.$$

## #3(d)

```{r}
plot(Ratio ~ Latitude,pch=as.numeric(Continent),col=as.numeric(Continent),data=ex0918)
Lat = seq(35,57,.1)
# Fitted line for Europe, from #3(c)
lines(Lat, 0.799 + 0.0005*Lat,col=1,lty=1)
# Fitted line for North America, from #3(c)
lines(Lat, 0.799+0.052 + (0.0005-0.001)*Lat,col=2,lty=2)
```

## #3(e)

The model we are using to answer this question is
$$\mu\{Ratio|Latitude, Continent\}=\beta_0 + \beta_1D_{NA}(Continent) + \beta_2Latitude + \beta_3D_{NA}(Continent) \times Latitude.$$
This question is asking us to test $H_0: \beta_3=0$ vs. $H_a: \beta_3\ne 0$.
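
The test statistic and $p$-value for this hypothesis can be read from the interaction row of the model's coefficient table. The following is a minimal sketch, assuming the `Sleuth3` package is installed; `fit.int`, `coefs`, and `int.row` are names introduced here for illustration only. The interaction row is located by searching for ":" rather than by a hard-coded label, because the label depends on how R orders the levels of `Continent`. If the baseline level is not Europe, the sign of the estimate and of $t$ flips, but the $p$-value is unchanged.

```{r}
library(Sleuth3)

# Separate-slopes model: Ratio on Continent, Latitude, and their interaction
fit.int <- lm(Ratio ~ Continent * Latitude, data = ex0918)

# Coefficient table: estimate, standard error, t statistic, two-sided p-value
coefs <- coef(summary(fit.int))

# The interaction term is the only row whose name contains ":"
int.row <- grep(":", rownames(coefs), fixed = TRUE)
coefs[int.row, c("Estimate", "Std. Error", "t value", "Pr(>|t|)")]
```
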
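The evidence suggests that, yes, there is a difference in the ``speed of evolution'' of the female fly in Europe and North America ($t=-2.5$, $p=0.0219$). A 95% CI for $\beta_3$ is found by

```{r}
confint(m2)
```

Hence, the data suggest with 95% confidence that the slope of the mean ratio of basal length to wing size on latitude is between 0.0002 and 0.00186 lower in North America than in Europe. That is, for every 1 degree of latitude farther north that the flies live, the mean ratio changes by that much less in North America than it does in Europe.

## #3(f)

```{r}
# Center Latitude at 45.68 so the Continent coefficient is the difference at that latitude
m3 = lm(Ratio ~ Continent*I(Latitude-45.68),data=ex0918)
summary(m3)
```

Hence the evidence suggests that, at a latitude of 45.68 degrees (the value at which Latitude is centered in \texttt{m3}), the female fly's ratio is larger on average in North America than in Europe (estimate = 0.006 larger, $p$ = 0.019).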
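
To see why centering Latitude turns the Continent coefficient into a difference at latitude 45.68, one can compare it with the gap between the two fitted lines of the uncentered model at that latitude. The following is a minimal sketch, assuming the `Sleuth3` package is installed; `fit.raw`, `fit.ctr`, `at.center`, and `gap` are names introduced here for illustration only.

```{r}
library(Sleuth3)

# Same model with and without centering Latitude at 45.68
fit.raw <- lm(Ratio ~ Continent * Latitude, data = ex0918)
fit.ctr <- lm(Ratio ~ Continent * I(Latitude - 45.68), data = ex0918)

# One row per continent, both evaluated at latitude 45.68
lv <- levels(factor(ex0918$Continent))
at.center <- data.frame(Continent = factor(lv, levels = lv), Latitude = 45.68)

# Difference in fitted means at 45.68 (second level minus first level)
gap <- unname(diff(predict(fit.raw, newdata = at.center)))
gap

# The Continent coefficient of the centered model is the same quantity
unname(coef(fit.ctr)[2])
```

The two models are reparameterizations of each other, so they give identical fitted values. Only the meaning of the Continent coefficient changes: in the uncentered model it is the difference at latitude 0, which lies far outside the range of the data.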