Homework #5 STAT 506 Spring 2013

Due date: February 8, 2013
  1. Exercise 11.1 p 248.
    The book asks us to write out the models. No fitting needed.
  2. Exercise 11.4 p 249, but use these data (a subset with missing values and singletons removed).
    1. Plot square root CD4 percentage over time with regression lines.
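      library(lattice)
      cd4$vdate <- as.Date(cd4$vdate)   # make sure visit dates are Date objects
      # points ("p") plus a least-squares line ("r") for each child
      xyplot(sqrt(cd4pct) ~ vdate, data = cd4, groups = newpid, type = c("p", "r"))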
      
    2. Create a fit for each kid using lmList in the nlme package. (The intervals command creates 95% CIs for each kid's intercept and slope.)
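       library(nlme)
       # time is in days since day 7128 (1989-07-08 in R's Date numbering),
       # so each intercept is the fitted sqrt(cd4pct) on that date
       cd4.kidfit <- lmList(sqrt(cd4pct) ~ I(as.numeric(vdate) - 7128) | factor(newpid), cd4[, 3:5])
       plot(intervals(cd4.kidfit))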
      
    3. Extract coefficient vectors for each kid and obtain the two predictors we need.
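       # summary()$coefficients is a 3-d array: [kid, statistic, coefficient];
       # statistic 1 is the estimate
       intercepts <- summary(cd4.kidfit)$coefficients[, 1, 1]
       slopes <- summary(cd4.kidfit)$coefficients[, 1, 2]
       plot(slopes ~ intercepts)
       # one value per kid, in the same (sorted newpid) order as the fits
       age1 <- with(cd4, tapply(baseage, newpid, min))
       trt <- with(cd4, tapply(treatmnt, newpid, min))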
      
      Obtain estimates for the whole population by fitting a model for the intercepts as a function of initial age and treatment, then fitting a similar model for the slopes. Plot these fits along with their "data points" (in quotes because they are estimates, not data).
    4. Do the same in SAS using PROC GLM (or PROC REG) and a by newpid; statement to fit a line to each kid. Save the parameter estimates by including the right option in the PROC line. Examine the structure of the output dataset and fit the models above in SAS. Report on how well the results match the R output.
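
  The two-stage idea in part 2 (a line per kid, then models for the per-kid intercepts and slopes) can be sketched on simulated data. This is only an illustration under stated assumptions, not the solution for the real cd4 file: the data frame sim, the variables time and y, the 1/2 treatment coding, and every numeric value below are made up for the example.

```r
library(nlme)

set.seed(506)

# Hypothetical stand-in for the cd4 data: 30 kids, 5 visits each.
n.kids <- 30
kids <- data.frame(
  newpid   = 1:n.kids,
  baseage  = runif(n.kids, 0.5, 8),            # made-up age at first visit
  treatmnt = rep(1:2, length.out = n.kids)     # assumed 1/2 treatment coding
)
visits <- expand.grid(newpid = 1:n.kids, time = c(0, 0.25, 0.5, 0.75, 1))
sim <- merge(kids, visits)
sim$y <- 5 - 0.1 * sim$baseage + 0.3 * (sim$treatmnt == 2) -
  0.5 * sim$time + rnorm(nrow(sim), sd = 0.5)
sim$newpid <- factor(sim$newpid)

# Stage 1: a separate least-squares line for each kid.
kidfit <- lmList(y ~ time | newpid, data = sim)
cf <- coef(kidfit)                             # one row per kid

# Collect the estimates and attach kid-level predictors by id, not by position.
est <- data.frame(
  newpid    = rownames(cf),
  intercept = cf[["(Intercept)"]],
  slope     = cf[["time"]]
)
idx <- match(est$newpid, kids$newpid)
est$baseage  <- kids$baseage[idx]
est$treatmnt <- factor(kids$treatmnt[idx])

# Stage 2: model the estimated intercepts and slopes across kids.
int.fit   <- lm(intercept ~ baseage + treatmnt, data = est)
slope.fit <- lm(slope ~ baseage + treatmnt, data = est)

# Plot the intercept "data points" with one fitted line per treatment group.
plot(intercept ~ baseage, data = est, pch = as.integer(est$treatmnt))
b <- coef(int.fit)
abline(b[1], b[2], lty = 1)                    # treatment 1
abline(b[1] + b[3], b[2], lty = 2)             # treatment 2
```

  Matching on newpid rather than relying on row order guards against the estimates and the predictors being sorted differently. The same plot can be repeated with slope.fit for the slopes.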


    Author: Jim Robison-Cox
    Last Updated: