Homework #4 STAT 506 Spring 2013

Due date: February 1, 2013
  1. Exercise 10.1 p 231-2.
    To load the data in R, load the arm package and type data(lalonde). Lalonde and subsequent authors compare the results of a randomized experiment with the results we might get using controls drawn from general survey data. The underlying question is "How well do propensity scores and similar techniques work for answering causal questions?" In particular, are these methods biased relative to the experimental treatment effects we compute in part (a)? See Smith & Todd for more comparison and interpretation.
    a. Use the experimental data to get (i) the difference in means of y = RE78 for treated and controls and (ii) a regression-adjusted estimate of the treatment effect. Use these (csv) data from Rajeev H. Dehejia's web site. Evaluate the appropriateness and precision of each estimate.
    b. Download Gelman's data and estimate treatment effects as you did in (a), using the CPS data as controls (keep samples 1 and 2; ignore sample 3). You will need to build columns u74 and u75, which indicate whether the person had no earnings in 1974 and in 1975, respectively (u for unemployed).
    c. Estimate the causal effect from the constructed data using propensity score matching. One choice of propensity model is given in the help for the matching function in arm. You may use that, but I want to see two different propensity score models, the output from each, and a discussion of which is preferred. Include estimates of the treatment effect as well.
    d. What did we estimate in (b) and (c)?
    e. Redo (a), (b), and (c) excluding earnings in 1974. When we drop that variable, more data are available for the experimental subjects, so use these data. What does this say about ignorability?
    f. Can we do propensity score matching in SAS? Search the SAS site to find a macro (like an R function) for the matching.
      1. Provide a URL to the PDF file and a citation. Discuss: how do the authors compare the treated and control groups to show that matching has "worked"?
      2. What proc is used to create the propensity scores?
      3. Use R to write the matched data from your favorite matching in (c) to a file. Read the file into SAS and estimate the treatment effect (after appropriate adjustments). Confirm: SAS should agree with R.
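The workflow in Exercise 1 (difference in means, regression adjustment, propensity score matching, a balance check, and export for SAS) can be sketched in base R as follows. This is only a minimal sketch on simulated data, not the Lalonde data: the variable names (x, treat, y), the true effect of 2, and the output file name are all assumptions made for illustration. The matching step is a hand-rolled nearest-neighbor match with replacement, which may differ from what the matching function in arm does.

```r
# Hypothetical simulated data (NOT the Lalonde data); true treatment effect is 2.
set.seed(506)
n <- 2000
x <- rnorm(n)                                  # one pre-treatment covariate
treat <- rbinom(n, 1, plogis(-1 + 1.5 * x))    # selection into treatment depends on x
y <- 1 + 2 * treat + 3 * x + rnorm(n)
dat <- data.frame(y, treat, x)

# (i) Raw difference in means; confounded by x in observational data.
diff_means <- mean(dat$y[dat$treat == 1]) - mean(dat$y[dat$treat == 0])

# (ii) Regression-adjusted estimate of the treatment effect.
fit_adj <- lm(y ~ treat + x, data = dat)
reg_adj <- unname(coef(fit_adj)["treat"])

# Propensity scores from a logistic regression, kept on the logit (link) scale.
ps_fit <- glm(treat ~ x, family = binomial(link = "logit"), data = dat)
dat$pscore <- predict(ps_fit, type = "link")

# Nearest-neighbor matching with replacement:
# for each treated unit, pick the control with the closest propensity score.
treated_idx <- which(dat$treat == 1)
control_idx <- which(dat$treat == 0)
nearest <- sapply(treated_idx, function(i) {
  control_idx[which.min(abs(dat$pscore[control_idx] - dat$pscore[i]))]
})

# Matched estimate: average treated-minus-matched-control difference.
att_match <- mean(dat$y[treated_idx] - dat$y[nearest])

# Balance check: standardized difference in x before and after matching.
std_diff <- function(a, b) (mean(a) - mean(b)) / sd(a)
bal_before <- std_diff(dat$x[treated_idx], dat$x[control_idx])
bal_after  <- std_diff(dat$x[treated_idx], dat$x[nearest])

# Write the matched data (controls may appear more than once) for use in SAS.
matched <- rbind(dat[treated_idx, ], dat[nearest, ])
out_file <- file.path(tempdir(), "matched.csv")   # hypothetical file name
write.csv(matched, out_file, row.names = FALSE)

print(round(c(diff_means = diff_means, reg_adj = reg_adj, att_match = att_match), 2))
print(round(c(bal_before = bal_before, bal_after = bal_after), 2))
```

Because controls are matched with replacement, the same control row can appear several times in the written file, which is one reason an analysis of the matched data may need adjustment.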
  2. Exercise 10.2 p 232. You may use either R or SAS. The main point here is the reasoning behind regression discontinuity analysis, but it is also good practice to be able to build a model with a "change point". Link to Gelman's bypass folder.
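A model with a "change point" at a known cutoff can be sketched in base R as follows. This is a minimal sketch on simulated data, not the data for Exercise 10.2: the cutoff at 0, the jump of 2, the bandwidth h, and all variable names are assumptions made for illustration. Centering the running variable at the cutoff makes the coefficient on the treatment indicator the estimated jump at the cutoff.

```r
# Hypothetical simulated data with a known cutoff (NOT the Exercise 10.2 data).
set.seed(10)
n <- 500
cutoff <- 0                                  # assumed known threshold
x <- runif(n, -1, 1)                         # running (assignment) variable
z <- as.numeric(x >= cutoff)                 # treatment is assigned by the rule
xc <- x - cutoff                             # running variable centered at the cutoff
y <- 1 + 0.5 * xc + 2 * z + 1 * z * xc + rnorm(n, sd = 0.5)   # true jump is 2
dat <- data.frame(y, x, xc, z)

# Separate intercept and slope on each side of the cutoff.
# The coefficient on z is the discontinuity in E[y] at x = cutoff.
fit <- lm(y ~ z * xc, data = dat)
jump <- unname(coef(fit)["z"])

# Local version: use only observations within a bandwidth h of the cutoff,
# which relies less on the linear form holding far from the threshold.
h <- 0.5                                     # hypothetical bandwidth
fit_local <- lm(y ~ z * xc, data = dat, subset = abs(xc) <= h)
jump_local <- unname(coef(fit_local)["z"])

print(round(c(jump = jump, jump_local = jump_local), 2))
print(round(summary(fit)$coefficients, 3))
```

The narrower window trades precision for robustness: fewer observations give a larger standard error, but the comparison is between units that are more alike on the running variable.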

Author: Jim Robison-Cox