1 Section 2 Randomized and Natural (or Quasi) Experiments
2 Randomized versus natural (or quasi) experiments
Definitions:
- A (randomized) experiment is consciously designed and implemented by researchers. It deliberately uses a treatment group and a control group with random assignment (e.g. clinical trials of drugs).
- A natural or quasi-experiment has a source of variation that is as if randomly assigned, but this variation was not part of a deliberate randomized treatment-and-control design.
3 Randomized Experiments
How can randomization solve the evaluation problem?
- The comparison group is selected with a randomization device that randomly excludes some fraction of program applicants from the program.
- By construction there is no selection into treatment (if randomization worked).
- Main advantage: program participants and nonparticipants are comparable, i.e. they have the same distribution of observables and unobservables.
Formally, randomization leads to
  E(Y1 | D=1) = E(Y1 | D=0) = E(Y1) and E(Y0 | D=0) = E(Y0 | D=1) = E(Y0),
so
  ATE = E(Y1) - E(Y0) = E(Y1 | D=1) - E(Y0 | D=0)   (the last two expectations are observed).
Note: ATE = E(Y1 | D=1) - E(Y0 | D=1) = TTE (treatment effect on the treated).
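The comparability argument above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the data-generating process, the unobservable called `ability`, the constant effect of 2.0 and all function names are hypothetical and not part of the lecture. It compares the difference in means under random assignment with the same difference when people select into treatment on the unobservable.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0  # assumed constant treatment effect, so ATE = TTE = 2.0


def simulate(n=20000, seed=0):
    """Draw hypothetical individuals: (ability, y0, y1, coin flip)."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)             # unobservable
        y0 = 1.0 + ability + rng.gauss(0.0, 1.0)  # outcome without treatment
        y1 = y0 + TRUE_EFFECT                     # outcome with treatment
        coin = rng.random() < 0.5                 # randomization device
        people.append((ability, y0, y1, coin))
    return people


def difference_in_means(people, assigned):
    """E(Y1 | D=1) - E(Y0 | D=0), using only the observed outcome of each person."""
    treated = [p[2] for p, d in zip(people, assigned) if d]
    controls = [p[1] for p, d in zip(people, assigned) if not d]
    return mean(treated) - mean(controls)


people = simulate()
randomized = [p[3] for p in people]           # D set by the coin flip
self_selected = [p[0] > 0 for p in people]    # D chosen on the unobservable

print("randomized   :", round(difference_in_means(people, randomized), 3))
print("self-selected:", round(difference_in_means(people, self_selected), 3))
```

Under random assignment the difference in means should land close to the assumed effect; under self-selection it also picks up the difference in `ability` between the two groups.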
4 Examples of randomized experiments
- Large-scale experiments, e.g. in the US/Canada: US National JTPA (Job Training Partnership Act) Study, Tennessee class size experiment (STAR)
- More recently, randomized experiments in developing countries: small experiments addressing very specific questions, for example microfinance experiments by Dean Karlan, and experiments on education (e.g. schooling inputs) by Michael Kremer and Esther Duflo, etc.
- Example of a large-scale and very successful conditional cash transfer program: Progresa/Oportunidades in Mexico
5 Conditional cash transfer program: Progresa/Oportunidades
- By 2000, Progresa covered approximately 2.6 million families (1/3 of rural families or 10% of all families in Mexico) and operated in almost 50,000 rural villages in 31 states.
- Progresa's budget is about US$800 million, or 0.2% of GDP.
- It significantly improved the education and health of children by giving financial incentives ("conditional cash") to keep children in school, to undertake preventive health care measures (e.g. vaccinations) and to take nutritional supplements.
- The cash transfer is large: on average approximately 1/3 of household income, given to the mother of the family.
- It served as a role model for similar programs all over the world, e.g. in Argentina, Colombia, Nicaragua.
6 Conditional cash transfer program: Progresa/Oportunidades
Rural Progresa/Oportunidades uses a controlled randomized design:
- In 1998, 506 of the 50,000 Progresa villages were randomly assigned to treatment and control groups.
- Eligible households (i.e. poor households, as determined by a proxy means test) in treatment villages received benefits immediately, while benefits for eligible households in control villages were postponed for 2 years.
  T (treatment villages): eligible and ineligible households
  C (control villages): eligible and ineligible households
- Households were informed about their eligibility, and the take-up rate was 97%.
Timeline of the survey (approximately 14,500 households with over 80,000 individuals):
- Baseline periods: 1997, March 1998
- Treatment and control: October 1998, March 1999, November 1999
- Treatment also for former controls: 2000
Progresa/Oportunidades was introduced in urban areas in 2000 (not randomized due to political opposition).
7 Problems with Experiments in Practice
A) Threats to Internal Validity
1. Failure to randomize (or imperfect randomization)
2. Failure to follow the treatment protocol ("partial compliance")
   - Some controls get the treatment, some treated drop out of the program
   - Differential attrition (e.g. in a job training program, controls who find jobs move out of town)
3. Experimental effects
   - Experimenter bias: treatment is associated with extra effort
   - Subject behavior might be affected by being in an experiment (Hawthorne effect)
Just as in regression analysis with observational data, threats to the internal validity of a regression with experimental data imply that Cov(D,U) is not equal to zero, so OLS (the difference estimator) is biased.
8 Problems with Experiments in Practice
B) Threats to External Validity
- Nonrepresentative sample
- Nonrepresentative treatment (that is, program or policy)
- General equilibrium effects (the effect of a program can depend on its scale) and peer effects
- Treatment vs. eligibility effects (what do you want to measure: the effect on those who take up the program or the effect on those eligible?)
9 Solutions to dropout of treatment and contamination bias
1) Define the treatment as intent-to-treat (the offer of treatment), in which case dropout is not a problem.
2) Alternative solution for dropout and contamination bias:
   - Initial random assignment: R = 0/1
   - Decision to participate: D = 0/1
   - Dropout of treatment: R=1 and D=0
   - Contamination bias: R=0 and D=1
   Notation: p0 = P(D=1 | R=0), p1 = P(D=1 | R=1)
   We observe R, D, p0, p1, Y0 if D=0 and Y1 if D=1.
10 Notation: p0 = P(D=1 | R=0) and p1 = P(D=1 | R=1)
  E(Y | R=0) = E(Y1 | R=0)*p0 + E(Y0 | R=0)*(1 - p0)
  E(Y | R=1) = E(Y1 | R=1)*p1 + E(Y0 | R=1)*(1 - p1)
Because of randomization: E(Y1 | R=1) = E(Y1 | R=0) = E(Y1) (and the same for Y0)
[Assumption: dropouts are unaffected by being in the treatment group]
  E(Y | R=1) - E(Y | R=0) = E(Y1)*(p1 - p0) - E(Y0)*(p1 - p0)
  ATE = E(Y1) - E(Y0) = [E(Y | R=1) - E(Y | R=0)] / (p1 - p0)   [Wald estimator]
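The Wald estimator above can be computed from sample analogues of E(Y | R) and P(D=1 | R). The following is a minimal sketch, not a definitive implementation: the function names, the simulated dropout rate (20%), the contamination rate (10%) and the effect of 2.0 are all hypothetical. In the simulation, participation is assumed to be unrelated to potential outcomes given R, which is the condition under which the decomposition on this slide holds.

```python
import random
from statistics import mean


def wald_estimate(y, r, d):
    """[E(Y | R=1) - E(Y | R=0)] / (p1 - p0) with sample means."""
    y_r1 = [yi for yi, ri in zip(y, r) if ri == 1]
    y_r0 = [yi for yi, ri in zip(y, r) if ri == 0]
    p1 = mean([di for di, ri in zip(d, r) if ri == 1])  # P(D=1 | R=1)
    p0 = mean([di for di, ri in zip(d, r) if ri == 0])  # P(D=1 | R=0)
    if p1 == p0:
        raise ValueError("assignment does not shift participation (p1 == p0)")
    return (mean(y_r1) - mean(y_r0)) / (p1 - p0)


def simulate(n=20000, effect=2.0, seed=1):
    """Hypothetical experiment with 20% dropout and 10% contamination."""
    rng = random.Random(seed)
    y, r, d = [], [], []
    for _ in range(n):
        ri = 1 if rng.random() < 0.5 else 0            # random assignment
        take_up = 0.8 if ri == 1 else 0.1              # p1 = 0.8, p0 = 0.1
        di = 1 if rng.random() < take_up else 0        # actual participation
        y0 = rng.gauss(1.0, 1.0)
        y.append(y0 + effect if di == 1 else y0)       # observed outcome
        r.append(ri)
        d.append(di)
    return y, r, d


y, r, d = simulate()
itt = (mean([yi for yi, ri in zip(y, r) if ri == 1])
       - mean([yi for yi, ri in zip(y, r) if ri == 0]))
print("intent-to-treat difference:", round(itt, 3))
print("Wald estimate             :", round(wald_estimate(y, r, d), 3))
```

The intent-to-treat difference is scaled down by p1 - p0; dividing by that gap recovers the effect of actually receiving the treatment.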
11 What to take into account when conducting a randomized experiment?
1) What to randomize on:
   1. Randomize eligibility
   2. Randomize after acceptance into the program
   3. Randomize incentives for take-up among eligibles
On option 2:
   R=1 if randomized in (treatment group), R=0 if randomized out (control group)
   D denotes whether someone applies to the program and is subject to randomization [here D=1 for everyone who takes part in the randomization]
   Random assignment implies:
     For the treatment group: E(Y1 | X, D=1, R=1) = E(Y1 | X, D=1)
     For the control group:   E(Y0 | X, D=1, R=0) = E(Y0 | X, D=1)
   The experiment gives TTE = E(Y1 - Y0 | X, D=1)
12 2) Power calculations
Definition: the power of a design is the probability that, for a given effect size and a given statistical significance level, we will be able to reject the hypothesis of zero effect.
Design choices that affect the power of an experiment:
- Sample size
- Minimum effect size that the researcher wants to be able to detect
- Multiple treatment groups
- Partial compliance
- Control variables (it is important to know how much of the residual variance they absorb)
Standard software exists for the single-site case.
Multi-site power analyses get very complicated: one needs to know the variation in impacts across sites and the intraclass correlations.
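For the single-site case, the definition of power can be turned into a short calculation. The sketch below uses a normal approximation for a two-sided test of a difference in means with a common outcome standard deviation; these are simplifying assumptions, and the function names and the `compliance_gap` parameter (standing in for p1 - p0 under partial compliance) are hypothetical. It ignores clustering, so it does not cover the multi-site case.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power(effect, sigma, n_total, share_treated=0.5, alpha=0.05,
          compliance_gap=1.0):
    """Approximate power to reject a zero effect in a two-sided test."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means between the two arms.
    se = sigma / sqrt(share_treated * (1 - share_treated) * n_total)
    # Partial compliance shrinks the detectable intent-to-treat difference.
    shift = effect * compliance_gap / se
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def required_sample_size(effect, sigma, target_power=0.8, share_treated=0.5,
                         alpha=0.05, compliance_gap=1.0):
    """Approximate total sample size needed to reach the target power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(target_power)
    itt = effect * compliance_gap
    n = ((z_crit + z_power) ** 2 * sigma ** 2
         / (share_treated * (1 - share_treated) * itt ** 2))
    return ceil(n)


# Effect of 0.2 standard deviations, equal-sized arms.
print("power with N = 786      :", round(power(0.2, 1.0, 786), 3))
print("N for 80% power         :", required_sample_size(0.2, 1.0))
print("N with compliance gap .5:", required_sample_size(0.2, 1.0,
                                                        compliance_gap=0.5))
```

In this approximation, halving the compliance gap roughly quadruples the required sample size, which is why partial compliance appears in the list of design choices.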
13 3) Choosing the sites in multi-site experiments
- External validity: choose sites at random
- Realistic impacts: choose sites that are not the first to implement the treatment
- Efficacy: choose sites that will do the best job of implementing the treatment
- Avoid contamination: choose sites with little or no contact of any sort with each other
Examples: Teach for America, PROGRESA/OPORTUNIDADES in Mexico
15 Randomized versus natural (or quasi) experiments
- The idea for evaluating quasi-experiments follows that of real randomized experiments: find an exogenous source of variation (i.e. a variable that affects participation but does not directly affect the outcome)!
- BUT calling an experiment "natural" does not make this so: the variation is usually not natural at all but comes, for example, from changes in laws (apart from "natural natural experiments" such as studies of identical twins, see overview article), and it is not automatically exogenous.
- It is important to understand the source of variation that helps to identify the treatment parameter of interest (see Meyer).
16 How to evaluate these programs?
NOTE: the term natural or quasi-experiment does not imply which approach is used to evaluate it! Many different program evaluation approaches have been used to evaluate quasi-experiments (depending on the context of the quasi-experiment), e.g. difference-in-differences or IV.
Two types of quasi-experiments:
1. The treatment (D) is as if randomly assigned (perhaps conditional on some control variables)
2. A variable (Z) that influences the treatment (D) is as if randomly assigned (instrumental variable approach)
17 How to evaluate these programs? Randomized experiment
In an ideal randomized controlled experiment the treatment level D is randomly assigned:
  Y = a + b*D + U
- If D is randomly assigned (for example by computer), then U and D are independently distributed, so E(U | D) = 0.
- OLS then yields an unbiased estimator of the causal effect b.
- When the treatment is binary, the causal effect b is just the difference in the mean outcome (Y) between the treatment and the control group (show this using counterfactual notation).
- This difference in means is sometimes called the differences estimator.
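One way to show in counterfactual notation that b equals the difference in means is sketched below. It assumes that the observed outcome is Y = D*Y1 + (1-D)*Y0 and it chooses a = E(Y0) and b = E(Y1) - E(Y0), which is one convenient way of defining the regression coefficients and not necessarily the derivation used in the lecture.

```latex
\begin{align*}
Y &= D\,Y_1 + (1-D)\,Y_0 = a + b\,D + U,
  \qquad a = E(Y_0),\quad b = E(Y_1) - E(Y_0),\quad U = Y - a - b\,D \\[4pt]
E(U \mid D=0) &= E(Y_0 \mid D=0) - E(Y_0) = 0
  \qquad \text{(randomization: } E(Y_0 \mid D=0) = E(Y_0)\text{)} \\
E(U \mid D=1) &= E(Y_1 \mid D=1) - E(Y_0) - \bigl[E(Y_1) - E(Y_0)\bigr]
  = E(Y_1 \mid D=1) - E(Y_1) = 0 \\[4pt]
E(Y \mid D=1) - E(Y \mid D=0) &= (a + b) - a = b = E(Y_1) - E(Y_0) = \text{ATE}
\end{align*}
```

So under random assignment E(U | D) = 0 holds by construction, and the OLS slope on the binary D, which is the sample analogue of E(Y | D=1) - E(Y | D=0), estimates the ATE.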
18 How to evaluate these programs? Randomized experiment
  Y = a + b*D + U
OLS yields an unbiased estimator of the causal effect b.
Usually one adds other controls to the model:
  Y = a + b*D + c*X + U
Why?
1. Check whether randomization worked: if D is randomly assigned, then the OLS estimates of b with and without the controls X should be similar. If they aren't, this suggests that D was not randomly assigned.
   Note: to check directly for randomization, regress the treatment indicator D on the controls X and do an F-test.
2. Increase efficiency: a more precise estimator of b (smaller standard errors)
3. Adjust for conditional randomization (apply conditional randomization if interested in treatment effects for different groups)
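The first two reasons can be illustrated with simulated data. This is a rough sketch under stated assumptions: a single control X, homoskedastic errors, an invented data-generating process, and hypothetical helper names. The regression with the control is computed by partialling X out of both Y and D (the Frisch-Waugh-Lovell result) rather than with a regression library, and with one control the F-statistic of the balance regression of D on X is simply the squared t-statistic.

```python
import random
from math import sqrt
from statistics import mean


def ols_simple(x, y, extra_params=0):
    """Slope, its standard error and residuals from regressing y on x.

    extra_params adds to the degrees-of-freedom correction when x and y
    are residuals from an earlier partialling-out step.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2 - extra_params)
    return b, sqrt(sigma2 / sxx), resid


def effect_with_control(d, x, y):
    """Coefficient on D in Y = a + b*D + c*X + U via partialling out X."""
    d_resid = ols_simple(x, d)[2]
    y_resid = ols_simple(x, y)[2]
    b, se, _ = ols_simple(d_resid, y_resid, extra_params=1)
    return b, se


# Hypothetical data: D is randomized, X strongly predicts Y, true b = 2.
rng = random.Random(2)
n = 5000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
d = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
y = [1.0 + 2.0 * di + 3.0 * xi + rng.gauss(0.0, 1.0) for di, xi in zip(d, x)]

b_plain, se_plain, _ = ols_simple(d, y)
b_ctrl, se_ctrl = effect_with_control(d, x, y)
balance_slope, balance_se, _ = ols_simple(x, d)  # regress D on X
balance_f = (balance_slope / balance_se) ** 2

print("without control: b =", round(b_plain, 3), "se =", round(se_plain, 3))
print("with control   : b =", round(b_ctrl, 3), "se =", round(se_ctrl, 3))
print("balance F-statistic (D on X):", round(balance_f, 3))
```

If randomization worked, both estimates of b should be close to each other, the one with the control should have the smaller standard error, and the balance F-statistic should be small.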
19 TO DO
- Preconditions: read Wooldridge, Chapters 1 and 2, on causation, conditional expectations and the law of iterated expectations
- [Read about the counterfactual framework: Rubin]
- Read about randomized experiments: chapter of the Handbook of Development Economics by Duflo et al (sect 1, 2, 6, 8)
- Read about natural (quasi) experiments: Meyer. This will also be the paper to read for the following section on difference-in-differences estimation. (Rosenzweig/Wolpin on natural natural experiments: a nice overview)