Homework #10 STAT 505 Fall 2011

Due date: November 28, 2011
Include all R code as an appendix.
Include R output as needed to explain your analysis, but not in big chunks. It's your job to distill and explain which parts are important to look at.
  1. Exercise 9.13, p. 197.
    I created a big CSV data file here, and there is a Word document describing the variables. It doesn't say why some incumbent values are 3s. I'm guessing that a third-party candidate was the incumbent. We'll have to treat them as NAs.
    1. Take data from a particular year not ending in 2 and pull out districts where the election was contested AND the election 2 years before was contested. Estimate the incumbency effect and the party effect.
    2. Plot fitted model and data. Discuss political interpretation of coefficient estimates.
    3. What have we assumed? What does it mean to treat "incumbency" as a treatment variable?
  2. Exercise 10.1, pp. 231-2.
    Load the arm package and type data(lalonde). Lalonde and subsequent authors are comparing results of an experiment to results we might get if we use controls from general survey data. The underlying question is "How well do propensity scores and other such techniques work to answer causal questions?" In particular, are these methods biased compared to the actual treatment effects we compute in part (a)? See Smith & Todd for more comparison and interpretation.
    1. Use the experimental data to get (i) the difference in means of y = RE78 between treated and controls and (ii) a regression-adjusted estimate of the treatment effect. Use these (csv) data from Rajeev H. Dehejia's web site. Evaluate the appropriateness and precision of each estimate.
    2. Download Gelman's data and estimate treatment effects as you did in (a), using the CPS data as controls (keep samples 1 and 2; ignore sample 3). You will need to build columns u74 and u75, which indicate whether the person had no earnings in 1974 or 1975, respectively (u for unemployed).
    3. Estimate the causal effect from the constructed data using propensity score matching. One choice of propensity score model is given in the help page for the matching function in arm. You may use that, but I want to see two different propensity score models, the output from each, and a discussion of which is preferred. Include estimates of the treatment effect as well.
    4. What did we estimate in (b) and (c)?
    5. Redo (a), (b), and (c) excluding earnings in 1974. When we drop that variable, more data are available for the experimental subjects, so use these data for (a), and use these data for (b) and (c). What does this say about ignorability?
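
The three kinds of estimate asked for in Exercise 10.1 (difference in means, regression adjustment, propensity score matching) can be sketched on simulated data. This is only an illustration under stated assumptions: the data are made up, not the Lalonde or CPS files; the column names (re75, u75, re78, treat) merely mimic those data sets; true_effect is a hypothetical value chosen for the simulation; and the matching step is a hand-rolled nearest-neighbor match in base R, not the matching function from arm.

```r
# Simulated stand-in for an observational study. NOT the Lalonde data.
set.seed(505)
n    <- 2000
age  <- round(runif(n, 18, 55))
educ <- sample(8:16, n, replace = TRUE)
# Prior earnings in $1000s, censored at zero so that some people have none.
re75 <- pmax(0, rnorm(n, mean = 5 + 0.3 * (educ - 12), sd = 4))
u75  <- as.numeric(re75 == 0)            # 1 = no earnings in 1975

# Assumption: people with low prior earnings are more likely to be treated,
# so a raw comparison of treated and controls is confounded.
treat <- rbinom(n, 1, plogis(-1 + 0.8 * u75 - 0.15 * re75))

true_effect <- 1.5                        # hypothetical, set by the simulation
re78 <- 2 + 0.6 * re75 + 0.2 * (educ - 12) +
  true_effect * treat + rnorm(n, sd = 3)
dat <- data.frame(re78, treat, age, educ, re75, u75)

# (i) Difference in means between treated and controls.
diff_means <- with(dat, mean(re78[treat == 1]) - mean(re78[treat == 0]))

# (ii) Regression-adjusted estimate: coefficient on treat given covariates.
fit_adj <- lm(re78 ~ treat + age + educ + re75 + u75, data = dat)
est_adj <- unname(coef(fit_adj)["treat"])

# (iii) Propensity score matching: fit a logistic model for treatment,
# then match each treated unit to the control with the closest score
# on the logit scale (1 nearest neighbor, with replacement).
ps_fit <- glm(treat ~ age + educ + re75 + u75, family = binomial, data = dat)
dat$pscore <- predict(ps_fit, type = "link")
treated  <- which(dat$treat == 1)
controls <- which(dat$treat == 0)
match_id <- sapply(treated, function(i) {
  controls[which.min(abs(dat$pscore[controls] - dat$pscore[i]))]
})
# Average treated-minus-matched-control difference, taken over treated units.
est_match <- mean(dat$re78[treated] - dat$re78[match_id])

round(c(diff_means = diff_means, regression = est_adj, matching = est_match), 2)
```

Because the simulation builds in selection on prior earnings, the raw difference in means should sit below the adjusted estimates. Changing the formula passed to glm is one way to compare two candidate propensity score models, and checking covariate balance in the matched sample is a reasonable basis for preferring one over the other.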


Author: Jim Robison-Cox