- Write up a report of the effects of tension and wool type
(A or B) on the number of breaks. Include the following
points:
If you could choose one combination of tension and wool type to
minimize breaks, which would it be?
Is your favorite wool type always better?
Is your favorite tension tension always best?
Do the residuals give any indication of a problem with the
assumptions of ordinary least squares regression? If so, fix it.
Fit the full interaction model and interpret each line of the
table printed by the anova command in R.
- What size is the full model matrix? Write out the X
matrix for this model showing each unique row.
- What size is XT X? Use R to
create it. Start with:
myX <- with(warpbreaks, cbind(1, wool=="A",wool=="B", tension=="L", ...)
Interaction columns really are elementwise products of main
effects columns. Don't show the whole X, but do print
out XT X. Find its rank.
(XtX <- crossprod(myX))
qr(XtX)$rank
- Look at the model matrix R uses (apply the
model.matrix function to your full fitted model) and its
crossproduct. Verify or refute the claim: by deleting certain
rows and the same columns of the
full XT X we can obtain the one R
gives. Verify or refute: the R crossproduct model matrix is of
full column rank. Use the ginv function in the MASS package to
find a generalized inverse of XTX.
- Create a zero matrix the same size as
full XT X. In the rows and columns
which are not removed to get R's smaller matrix, set in the
values of the generalized inverse above.
RsXtXinv <- MASS::ginv( crossprod(model.matrix(my.fitted.model)))
myXtXinv <- matrix(0,12,12) ## or ## 0 * myXtX
omit.rows <- c(....) ## fill this in from above
myXtXinv[-omit.rows, -omit.rows] <- RsXtXinv
Demonstrate (to machine precision) that the resulting matrix is
a generalized inverse of XT X.
Note: in HW1 we used the zapsmall function to convert things
very close to zero to zero. You'll want to subtract the answer
you should get, and use zapsmall on the differences to
demonstrating equality. Alternatively, you could do it by hand
to get exact answers (factor out any 9's or 1/9ths).
- Compute XT y and demonstrate that
premultiplying by your generalized inverse above gives the same
solutions that R spits out, with some estimates (which ones?) set
to 0.
-
Build a table with a row for each estimated coefficient from
R. In terms of the greek letters in the model, what is that
coefficient estimating?
- This is getting long, but we need to see that answers are
not unique. Set these options and refit your model:
options(contrasts=c("contr.SAS","contr.poly"))
Again, find a ginv for R's crossproduct model matrix, and
substitute its values into a matrix of 0's to get a different
(confirm that) generalized inverse
of XTX (confirm it is a ginverse).
Compare the coefficient estimates you get from multiplication
of (XTX)g XT y
with the R estimates for this new model.
-
Again, build a table with a row for each estimated coefficient from
R. In terms of the greek letters in the model, what is that
coefficient estimating? These should be different from the
ones above.
- For each estimated coefficient vector, demonstrate that we
get the same fitted values, Xb (You found two b's using
the same full X).