Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries
http://www.archive.org/details/empiricalstrategooangr
WORKING PAPER
DEPARTMENT OF ECONOMICS

EMPIRICAL STRATEGIES IN LABOR ECONOMICS
Joshua D. Angrist
Alan B. Krueger

No. 98-07Rev. October 1998

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASS. 02142
Empirical Strategies in Labor Economics

Joshua D. Angrist, MIT and NBER
and
Alan B. Krueger, Princeton University and NBER

October 1998

*We thank Eric Bettinger, Lucia Breierova, Kristen Harknett, Aaron Siskind, Diane Whitmore, Eric Wang, and Steve Wu for research assistance. For helpful comments and discussions we thank Alberto Abadie, Daron Acemoglu, Jere Behrman, David Card, Angus Deaton, Jeff Kling, Guido Imbens, Chris Mazingo, Steve Pischke, and Cecilia Rouse. Of course, errors and omissions are solely the work of the authors. This paper was prepared for the Handbook of Labor Economics.
EMPIRICAL STRATEGIES IN LABOR ECONOMICS
JOSHUA D. ANGRIST AND ALAN B. KRUEGER
Massachusetts Institute of Technology and Princeton University

Contents
1. Introduction
2. Identification strategies for causal relationships
   2.1 The range of causal questions
   2.2 Identification in regression models
       2.2.1 Control for confounding variables
       2.2.2 Fixed-effects and differences-in-differences
       2.2.3 Instrumental variables
       2.2.4 Regression-discontinuity designs
   2.3 Consequences of heterogeneity and nonlinearity
       2.3.1 Regression and the conditional expectation function
       2.3.2 Matching instead of regression
       2.3.3 Matching using the propensity score
       2.3.4 Interpreting instrumental variables estimates
   2.4 Refutability
3. Data collection strategies
   3.1 Secondary sources
   3.2 Primary data collection and survey methods
   3.3 Administrative data and record linkage
   3.4 Combining samples
4. Measurement issues
   4.1 Measurement error models
   4.2 The extent of measurement error in labor data
   4.3 Weighting and allocated values
5. Summary
Appendix
References
ABSTRACT

Empirical Strategies in Labor Economics

This chapter provides an overview of the methodological and practical issues that arise when estimating causal relationships that are of interest to labor economists. The subject matter includes identification, data collection, and measurement problems. Four identification strategies are discussed, and five empirical examples - the effects of schooling, unions, immigration, military service and class size - illustrate the methodological points. In discussing each example, we adopt an experimentalist perspective that draws a clear distinction between variables that have causal effects, control variables, and outcome variables. The chapter also discusses secondary data sets, primary data collection strategies, and administrative data. The section on measurement issues focuses on recent empirical examples, presents a summary of empirical findings on the reliability of key labor market data, and briefly reviews the role of survey sampling weights and the allocation of missing values in empirical research.

JEL Numbers: J00, J31, C10, C81
1. Introduction

Empirical analysis is more common and relies on more diverse sources of data in labor economics than in economics more generally. Table 1, which updates Stafford's (1986, Table 7.2) survey of research in labor economics, bears out this claim. Indeed, almost 80% of recent articles published in labor economics contain some empirical work, and a striking two-thirds analyzed micro data. In the 1970s, micro data became more common in studies of the labor market than time-series data, and by the mid-90s the use of micro data outnumbered time-series data by a factor of over ten to one. The use of micro and time-series data is more evenly split in other fields of economics.

In addition to using micro data more often, labor economists have come to rely on a wider range of data sets than other economists. The fraction of published papers using data other than what is in standard public-use files reached 38 percent in the period from 1994 to 1997. The files in the "all other micro data sets" category in Table 1 include primary data sets collected by individual researchers, customized public use files, administrative records, and administrative-survey links. This is noteworthy because about ten years ago, in his Handbook of Econometrics survey of economic data issues, Griliches (1986, p. 1466) observed: "... since it is the 'badness' of the data that provides us with our living, perhaps it is not at all surprising that we have shown little interest in improving it, in getting involved in the grubby task of designing and collecting original data sets of our own." The growing list of papers involving some sort of original data collection suggests this situation may be changing; examples include Freeman and Hall (1986), Ashenfelter and Krueger (1994), Anderson and Meyer (1994), Card and Krueger (1994, 1998), Dominitz and Manski (1997), Imbens, Rubin and Sacerdote (1997), and Angrist (1998).

Labor economics has also come to be distinguished by the use of cutting edge econometric and statistical methods.
This claim is supported by the observation that outside of time-series econometrics, many and perhaps most innovations in econometric technique and style since the 1970s were largely motivated by research on labor-related topics. These innovations include sample selection models, nonparametric methods for censored data and survival analysis, quantile regression, and the renewed interest in statistical and identification problems related to instrumental variables estimators and quasi-experimental methods.

What do labor economists do with all the data they analyze? A broad distinction can be made between two types of empirical research in labor economics: descriptive analysis and causal inference. Descriptive analysis can establish facts about the labor market that need to be explained by theoretical reasoning and yield new insights into economic trends. The importance of ostensibly mundane descriptive analysis can be captured by Sherlock Holmes's admonition that: "It is a capital offense to theorize before all the facts are in." A great deal of important research falls under the descriptive heading, including work on trends in poverty rates, labor force participation, and wage levels. A good example of descriptive research of major importance is the work documenting the increase in wage dispersion in the 1980s (see, e.g., Levy, 1987; Murphy and Welch, 1992; Katz and Murphy, 1992; Juhn, Murphy, and Pierce, 1993). This research has inspired a vigorous search for the causes of changes in the wage distribution.

In contrast with descriptive analysis, causal inference research seeks to determine the effects of particular interventions or policies, or to estimate features of the behavioral relationships suggested by economic theory. Causal inference and descriptive analysis are not competing methods; indeed, they are often complementary. In the example mentioned above, compelling evidence that wage dispersion increased in the 1980s inspired a search for causes of these changes. Causal inference is often more difficult than descriptive analysis, and consequently more controversial.

Most labor economists seem to share a common view of the importance of descriptive research, but there are differences in views regarding the role economic theory can or should play in causal modeling.
This division is illustrated by the debate over social experimentation (Burtless, 1995; Heckman and Smith, 1995), in contrasting approaches to studying the impact of immigration on the earnings of natives (Card, 1990; Borjas, Freeman and Katz, 1997), and in recent symposia illustrating alternative research styles (Angrist, 1995a; Keane and Wolpin, 1997). Research in a structuralist style relies heavily on economic theory to guide empirical work or to make predictions. Keane and Wolpin (1997, p. 111) describe structural work as trying to do one of two things: (a) recover the primitives of economic theory (parameters determining preferences and technology); (b) estimate decision rules derived from economic models. Given success in either of these endeavors, it is usually clear how to make causal statements and to generalize from the specific relationships and populations studied in any particular application.

An alternative to structural modeling, often called the quasi-experimental or simply the "experimentalist" approach, also uses economic theory to frame causal questions. But this approach puts front and center the problem of identifying the causal effects from specific events or situations. The problem of generalization of findings is often left to be tackled later, perhaps with the aid of economic theory or informal reasoning. Often this process involves the analysis of additional quasi-experiments, as in recent work on the returns to schooling (see, e.g., the papers surveyed by Card in this volume). In his methodological survey, Meyer (1995) describes quasi-experimental research as "an outburst of work in economics that adopts the language and conceptual framework of randomized experiments." Here, the ideal research design is explicitly taken to be a randomized trial and the observational study is offered as an attempt to approximate the force of evidence generated by an actual experiment. In either a structural or quasi-experimental framework, the researcher's task is to estimate features of the causal relationships of interest.

This chapter focuses on the empirical strategies commonly used to estimate features of the causal relationships that are of interest to labor economists. The chapter provides an overview of the methodological and practical issues that arise in implementing an empirical strategy. We use the term empirical strategy broadly, beginning with the statement of a causal question, and extending to identification strategies and econometric methods, selection of data sources, measurement issues, and sensitivity tests. The choice of topics was guided by our own experiences as empirical researchers and our research interests.
As far as econometric methods go, however, our overview is especially selective; for the most part we ignore structural modeling since that topic is well covered elsewhere.[1] Of course, there is considerable overlap between structural and quasi-experimental approaches to causal modeling, especially when it comes to data and measurement issues. The difference is primarily one of emphasis, because structural modeling generally relies on assumptions about exogenous variability in certain variables and quasi-experimental analyses require some theoretical assumptions.

The attention we devote to quasi-experimental methods is also motivated by skepticism about the credibility of empirical research in economics. For example, in a critique of the practice of modern econometrics, Lester Thurow (1983, pp. 106-107) argued:

"Economic theory almost never specifies what secondary variables (other than the primary ones under investigation) should be held constant in order to isolate the primary effects.... When we look at the impact of education on individual earnings, what else should be held constant: IQ, work effort, occupational choice, family background? Economic theory does not say. Yet the coefficients of the primary variables almost always depend on precisely what other variables are entered in the equation to "hold everything else constant.""

This view of applied research strikes us as being overly pessimistic, but we agree with the focus on omitted variables. In labor economics, at least, the current popularity of quasi-experiments stems precisely from this concern: because it is typically impossible to control adequately for all relevant variables, it is often desirable to seek situations where one has a reasonable presumption that the omitted variables are uncorrelated with the variables of interest. Such situations may arise if the researcher can use random assignment, or if the forces of nature or human institutions provide something close to random assignment.

The next section reviews four identification strategies that are commonly used to answer causal questions in contemporary labor economics. Five empirical examples - the effects of schooling, unions, immigration, military service, and class size - illustrate the methodological points throughout the chapter.
In keeping with our experimentalist perspective, we attempt to draw clear distinctions between variables that have causal effects, control variables, and outcome variables in each example.

[1] See, for example, Heckman and MaCurdy's (1986) Handbook of Econometrics chapter, which "outlines the econometric framework developed by labor economists who have built theoretically motivated models to explain the new data." (p. 1918). We also have little to say about descriptive analysis because descriptive statistics are commonly discussed in statistics courses and books (see, e.g., Tufte, 1992, or Tukey, 1977).

In Section 3 we turn to a discussion of secondary data sets and primary data collection strategies. The focus here is on data for the United States.[2] Section 3 also offers a brief review of issues that arise when conducting an original survey and suggestions for assembling administrative data sets. Because existing public-use data sets have already been extensively analyzed, primary data collection is likely to be a growth industry for labor economists in the future.

[2] Overviews of data sources for developing countries appear in Deaton's (1995) chapter in The Handbook of Development Economics, Grosh and Glewwe (1996, 1998), and Kremer (1997). We are not aware of a comprehensive survey of micro data sets for labor market research in Europe, though a few sources and studies are referenced in Westergard-Nielsen (1989).

Following the discussion of data sets, Section 4 discusses measurement issues, including a brief review of classical models for measurement error and some extensions. Since most of this theoretical material is covered elsewhere, including the Griliches (1986) chapter mentioned previously, our focus is on recent empirical examples. This section also presents a summary of empirical findings on the reliability of labor market data, and reviews the role of survey sampling weights and the allocation of missing values in empirical research.

2. Identification strategies for causal relationships

The object of science is the discovery of relations... of which the complex may be deduced from the simple.
John Pringle Nichol, 1840 (quoted in Lord Kelvin's class notes).

2.1 The range of causal questions

The most challenging empirical questions in economics involve "what if" statements about counterfactual outcomes. Classic examples of "what if" questions in labor market research concern the effects of career decisions like college attendance, union membership, and military service. Interest in these questions is motivated by immediate policy concerns, theoretical considerations, and problems facing individual decision makers. For example, policy makers would like to know whether military cutbacks will reduce the earnings of minority men who have traditionally seen military service as a major career opportunity. Additionally, many new high school graduates would like to know what the consequences of serving in the military are likely to be for them. Finally, the theory of on-the-job training generates predictions about the relationship between time spent serving in the military and civilian earnings.

Regardless of the motivation for studying the effects of career decisions, the causal relationships at the heart of these questions involve comparisons of counterfactual states of the world. Someone - the government, an individual decision maker, or an academic economist - would like to know what outcomes would have been observed if a variable were manipulated or changed in some way. Lewis's (1986) study of union wage effects gives a concise description of this type of inference problem (p. 2): "At any given date and set of working conditions, there is for each worker a pair of wage figures, one for unionized status and the other for nonunion status". Differences in these two potential outcomes define the causal effects of interest in Lewis's work, which uses regression to estimate the average gap between them.[3]

At first glance, the idea of unobserved potential outcomes seems straightforward, but in practice it is not always clear exactly how to define a counterfactual world. In the case of union status, for example, the counterfactual is likely to be ambiguous. Is the effect defined relative to a world where unionization rates are what they are now, a world where everyone is unionized, a world where everyone in the worker's firm or industry is unionized, or a world where no one is unionized? Simple micro-economic analysis suggests that the answers to these questions differ. This point is at the heart of Lewis's (1986) distinction between union wage gaps, which refers to causal effects on individuals, and wage gains, which refers to comparisons of equilibria in a world with and without unions.
In practice, however, the problem of ambiguous counterfactuals is typically resolved by focusing on the consequences of hypothetical manipulations in the world as is, i.e., assuming there are no general equilibrium effects.[4]

[3] See also Rubin (1974, 1977) and Holland (1986) for formal discussions of counterfactual outcomes in causal research.

Even if ambiguities in the definition of counterfactual states can be resolved, it is still difficult to learn about differences in counterfactual outcomes because the outcome of one scenario is all that is ever observed for any one unit of observation (e.g., a person, state, or firm). Given this basic difficulty, how do researchers learn about counterfactual states of the world in practice? In many fields, and especially in medical research, the prevailing view is that the best evidence about counterfactuals is generated by randomized trials because randomization ensures that outcomes in the control group really do capture the counterfactual for a treatment group. Thus, Federal guidelines for a new drug application require that efficacy and safety be assessed by randomly assigning the drug being studied or a placebo to treatment and control groups (Center for Drug Evaluation and Research, 1988). Leamer (1982) suggested that the absence of randomization is the main reason why econometric research often appears less convincing than research in other more experimental sciences. Randomized trials are certainly rarer in economics than in medical research, but labor economists are increasingly likely to use randomization to study the effects of labor market interventions (Passell, 1992). In fact, a recent survey of economists by Fuchs, Krueger, and Poterba (1998) finds that most labor economists place more credence in studies of the effect of government training programs on participants' income if the research design entails random assignment than if the research design is based on structural modeling.

Unfortunately, economists rarely have the opportunity to randomize variables like educational attainment, immigration, or minimum wages. Empirical researchers must therefore rely on observational studies that typically fail to generate the same force of evidence as a randomized experiment. But the object of an observational study, like an experimental study, can still be to make comparisons that provide evidence about causal effects.
Observational studies attempt to accomplish this by controlling for observable differences between comparison groups using regression or matching techniques, using pre-post comparisons on the same units of observation to reduce bias from unobserved differences, and by using instrumental variables as a source of quasi-experimental variation. Randomized trials form a conceptual benchmark for assessing the success or failure of observational study designs that make use of these ideas, even when it is clear that it may be impossible or at least impractical to study some questions using random assignment. In almost every observational study, it makes sense to ask whether the research design is a good "natural experiment."[5]

[4] Lewis's (1963) earlier book discussed causal effects in terms of industries and sectors, and made a distinction between "direct" and "indirect" effects of unions similar to the distinction between wage gaps and wage gains. Heckman, Lochner, and Taber (1998) discuss general equilibrium effects that arise in the evaluation of college tuition subsidies.

[5] This point is also made by Freeman (1989). The notion that experimentation is an ideal research design for Economics goes back at least to the Cowles Commission. See, for example, Girshick and Haavelmo (1947), who wrote (p. 79): "In economic theory... the total demand for the commodity may be considered a function of all prices and of total disposable income of all consumers. The ideal method of verifying this hypothesis and obtaining a picture of the demand function involved would be to conduct a large-scale experiment, imposing alternative prices and levels of income on the consumers and studying their reactions."

A sampling of causal questions that economists have studied without benefit of a randomized experiment appears in Table 2, which characterizes a few observational studies grouped according to the source of variation used to make causal inferences about a single "causing variable." The distinction between causing variables and control variables in Table 2 is one difference between the discussion in this chapter and traditional econometric texts, which tend to treat all variables symmetrically. The combination of a clearly labeled source of identifying variation in a causal variable and the use of a particular econometric technique to exploit this information is what we call an identification strategy. Studies were selected for Table 2 primarily because the source or type of variation that is being used to make causal statements is clearly labeled. The four approaches to identification described in the table are: Control for Confounding Variables, Fixed-effects and Differences-in-differences, Instrumental Variables, and Regression Discontinuity methods. This taxonomy provides an outline for the next section.

2.2. Identification in regression models

2.2.1 Control for confounding variables

Labor economists have long been concerned with the question of whether the observed positive association between schooling and earnings is a causal relationship. This question originates partly in the observation that people with more schooling appear to have other characteristics, such as wealthier parents, that are also associated with higher earnings. Also, the theory of human capital identifies unobserved earnings potential or "ability" as one of the principal determinants of educational attainment (see, e.g., Willis and Rosen, 1979). The most common identification strategy in research on schooling (and in economics in general) attempts to reduce bias in naive comparisons by using regression to control for variables that are confounded with (i.e., related to) schooling. The typical estimating equation in this context is,

(1)  $Y_i = X_i'\beta_r + \rho_r S_i + \varepsilon_i$,

where $Y_i$ is person i's log wage or earnings, $X_i$ is a $k \times 1$ vector of control variables, including measures of ability and family background, $S_i$ is years of educational attainment, and $\varepsilon_i$ is the regression error. The vector of population parameters is $[\beta_r' \; \rho_r]'$. The "r" subscript on the parameters signifies that these are regression coefficients. The question of causality concerns the interpretation of these coefficients. For example, they can always be viewed as providing the best (i.e., minimum-mean-squared-error) linear predictor of $Y_i$.[6] The best linear predictor need not have causal or behavioral significance; the resulting residual is uncorrelated with the regressors simply because the first-order conditions for the prediction problem are $E[\varepsilon_i X_i]=0$ and $E[\varepsilon_i S_i]=0$.

Regression estimates from five early studies of the relationship between schooling, ability, and earnings are summarized in Table 3. The first row reports estimates without ability controls while the second row reports estimates that include some kind of test score in the X-vector as a control for ability. Information about the X-variables is given in the rows labeled "ability variable" and "other controls". The first two studies, Ashenfelter and Mooney (1968) and Hansen, Weisbrod, and Scanlon (1970), use data on individuals at the extremes of the ability distribution (graduate students and military rejects), while the others use more representative samples.
Results from the last two studies, Griliches and Mason (1972) and Chamberlain (1978), are reported for models with and without family background controls.

The schooling coefficients in Table 3 are smaller than the coefficient estimates we are used to seeing in studies using more recent data (see, e.g., Card's survey in this volume). This is partly because the association between earnings and schooling has increased, partly because the samples used in the papers summarized in the table include only young men, and partly because the models used for estimation control for age and not potential experience (age - education - 6). The latter parameterization leads to larger coefficient estimates since, in a linear model, the schooling coefficient controlling for age is equal to the schooling coefficient controlling for experience minus the experience coefficient. The only specification in Table 3 that controls for potential experience is from Griliches (1977), which also generates the highest estimate in the table (.065). The corresponding estimate controlling for age is .022. The table also shows that controlling for ability and family background generally reduces the magnitude of schooling coefficients, implying that at least some of the association between earnings and schooling in these studies can be attributed to variables other than schooling.

[6] The best linear predictor is the solution to $\min_{b,c} E[(Y_i - X_i'b - cS_i)^2]$. See, e.g., White (1980), or Goldberger (1991).

What conditions must be met for regression estimates like those in Table 3 to have a causal interpretation? In this case, causality can be based on an underlying functional relationship that describes what a given individual would earn if he or she obtained different levels of education. This relationship may be person-specific, so we write

(2)  $Y_{Si} \equiv f_i(S)$

to denote the potential (or latent) earnings that person i would receive after obtaining S years of education. Note that the function $f_i(S)$ has an "i" subscript on it while S does not. This highlights the fact that although S is a variable, it is not a random variable. The function $f_i(S)$ tells us what i would earn for any value of schooling, S, and not just for the realized value, $S_i$. In other words, $f_i(S)$ answers "what if" questions. In the context of theoretical models of the relationship between human capital and earnings, the form of $f_i(S)$ may be determined by aspects of individual behavior and/or market forces.
With or without an explicit economic model for $f_i(S)$, however, we can think of this function as describing the earnings level of individual i if that person were assigned schooling level S (e.g., in an experiment).
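The idea of a person-specific potential earnings function can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of the chapter: the person-specific intercepts and slopes, the linear-in-schooling form, and the names `potential_log_earnings`, `realized_s`, and `observed_y` are all hypothetical. The point is only that $f_i(S)$ can be evaluated at any schooling level, while the data reveal it at a single realized value per person.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5  # a handful of hypothetical individuals

# Assumed person-specific parameters (purely illustrative).
a = rng.normal(loc=1.5, scale=0.3, size=n)    # individual intercepts
b = rng.normal(loc=0.08, scale=0.02, size=n)  # individual returns per year of schooling

def potential_log_earnings(i, s):
    """f_i(s): log earnings person i would receive at schooling level s.

    s is an argument, not a random variable: the function is defined for
    every schooling level, not only the one person i actually obtains.
    """
    return a[i] + b[i] * s

# Each person realizes exactly one schooling level (10 to 16 years here).
realized_s = rng.integers(10, 17, size=n)

# The data contain only f_i evaluated at the realized S_i.
observed_y = np.array([potential_log_earnings(i, realized_s[i]) for i in range(n)])

# A "what if" question for person 0: 16 versus 12 years of schooling.
effect_0 = potential_log_earnings(0, 16) - potential_log_earnings(0, 12)
print(observed_y, effect_0)
```

Only `observed_y` would be available to a researcher; `effect_0` compares two values of the same person's function, at most one of which is ever observed.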
Once the causal relationship of interest, $f_i(S)$, has been defined, it can be linked to the observed association between schooling and earnings. A convenient way to do this is with a linear model:

(3)  $f_i(S) = \beta_0 + \rho S + \eta_i$.

In addition to being linear, this equation says that the functional relationship of interest is the same for all individuals. Again, S is written without a subscript, because equation (3) tells us what person i would earn for any value of S and not just the realized value, $S_i$. The only individual-specific and random part of $f_i(S)$ is a mean-zero error component, $\eta_i$, which captures unobserved factors that determine earnings. In practice, regression estimates have a causal interpretation under weaker functional-form assumptions than this but we postpone a detailed discussion of this point until Section 2.3. Note that the earnings of someone with no schooling at all is just $\beta_0 + \eta_i$ in this model.

Substituting the observed value $S_i$ for S in equation (3), we have

(4)  $Y_i = \beta_0 + \rho S_i + \eta_i$.

This looks like equation (1) without covariates, except that equation (4) explicitly associates the regression coefficients with a causal relationship. The OLS estimate of $\rho$ in equation (4) has probability limit

(5)  $C(Y_i, S_i)/V(S_i) = \rho + C(S_i, \eta_i)/V(S_i)$.

The term $C(S_i, \eta_i)/V(S_i)$ is the coefficient from a regression of $\eta_i$ on $S_i$, and reflects any correlation between the realized $S_i$ and unobserved individual earnings potential, which in this case is the same as correlation with $\eta_i$. If educational attainment were randomly assigned, as in an experiment, then we would have $C(S_i, \eta_i)=0$ in the linear model. In practice, however, schooling is a consequence of individual decisions and institutional forces that are likely to generate correlation between $\eta_i$ and schooling. Consequently, it is not automatic that OLS provides a consistent estimate of the parameter of interest.[7]

[7] Econometric textbooks (e.g., Pindyck and Rubinfeld, 1991) sometimes refer to regression models for causal relationships as "true models," but this seems like potentially misleading terminology since non-behavioral descriptive regressions could also be described as being "true".

Regression strategies attempt to overcome this correlation in a very simple way: in addition to the functional form assumption for potential outcomes embodied in (3), the random part of individual earnings potential, $\eta_i$, is decomposed into a linear function of the k observable characteristics, $X_i$, and an error term, $\varepsilon_i$,

(6a)  $\eta_i = X_i'\beta + \varepsilon_i$,

where $\beta$ is a vector of population regression coefficients. This means that $\varepsilon_i$ and $X_i$ are uncorrelated by construction. The key identifying assumption is that the observable characteristics, $X_i$, are the only reason why $\eta_i$ and $S_i$ (equivalently, $f_i(S)$ and $S_i$) are correlated, so

(6b)  $E[S_i \varepsilon_i] = 0$.

This is the "selection on observables" assumption discussed by Barnow, Cain, and Goldberger (1981), where the regressor of interest is assumed to be determined independently of potential outcomes after accounting for a set of observable characteristics.

Continuing to maintain the selection-on-observables assumption, a consequence of (6a) and (6b) is that

(7)  $C(Y_i, S_i)/V(S_i) = \rho + \phi_{SX}'\beta$,

where $\phi_{SX}$ is a $k \times 1$ vector of coefficients from a regression of each element of $X_i$ on $S_i$. Equation (7) is the well-known "omitted variables bias" formula, which relates a bivariate regression coefficient to the coefficient on $S_i$ in a regression that includes additional covariates. If the omitted variables are positively related to earnings ($\beta > 0$) and positively correlated with schooling ($\phi_{SX} > 0$), then $C(Y_i, S_i)/V(S_i)$ is larger than the causal effect of schooling, $\rho$.

A second consequence of (6a) and (6b) is that the OLS estimate of $\rho_r$ in equation (1) is in fact consistent for the causal parameter, $\rho$. Note, however, that the way we have developed the problem of causal inference, $E[S_i \varepsilon_i]=0$ is an assumption about $\varepsilon_i$ and $S_i$, whereas $E[X_i \varepsilon_i]=0$ is a statement about covariates that is true by definition. This suggests that it is important to distinguish error terms that represent the random parts of models for potential outcomes from mechanical decompositions where the relationship between errors and regressors has no behavioral content. A key question in any regression study is whether the selection-on-observables assumption is plausible.
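The omitted-variables-bias formula in equation (7) can be checked numerically. The sketch below is a simulation under assumed values, not an analysis of any data discussed in the chapter: the parameter values, the single covariate `x`, and the helper `ols` are hypothetical choices made for illustration. Schooling is constructed so that selection on observables holds by design, so the regression that includes `x` should recover the assumed causal effect while the bivariate regression should not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
rho = 0.08   # assumed causal effect of a year of schooling
beta = 0.5   # assumed effect of the observable x on log earnings

x = rng.normal(size=n)                    # observable characteristic (e.g. a pre-determined score)
s = 12 + 2.0 * x + rng.normal(size=n)     # schooling depends on x and on independent noise
eps = rng.normal(scale=0.3, size=n)       # uncorrelated with s given x, as in (6b)
y = 1.0 + rho * s + beta * x + eps        # equations (4) and (6a) combined

def ols(dep, *regressors):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(dep)), *regressors])
    return np.linalg.lstsq(design, dep, rcond=None)[0]

short = ols(y, s)[1]                # bivariate coefficient, the sample analog of C(Y,S)/V(S)
_, rho_hat, beta_hat = ols(y, s, x) # regression that controls for x
phi = ols(x, s)[1]                  # coefficient from regressing x on s

print("short regression:", short)
print("long regression :", rho_hat)
print("rho_hat + phi * beta_hat:", rho_hat + phi * beta_hat)
```

In the sample, the bivariate coefficient equals `rho_hat + phi * beta_hat` up to floating-point error, which is the finite-sample counterpart of (7). With the values assumed here the bivariate coefficient is well above `rho` because `x` raises both schooling and earnings.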
The selection-on-observables assumption clearly makes sense when there is actual random assignment conditional on $X_i$. Even without random assignment, however, selection-on-observables might make sense if we know a lot about the process generating the regressor of interest. We might know, for example, that applicants to a particular college or university are screened using certain characteristics, but conditional on these characteristics all applicants are acceptable and chosen on a first-come/first-serve basis. This leads to a situation like the one described by Barnow, Cain, and Goldberger (1980, p. 47), where "Unbiasedness is attainable when the variables that determined the assignment are known, quantified, and included in the equation." Similarly, Angrist (1998) argued that because the military is known to screen applicants on the basis of observed characteristics, comparisons of veteran and nonveteran applicants that adjust for these characteristics have a causal interpretation. The case for selection-on-observables in a generic schooling equation is less clear cut, which is why so much attention has focused on the question of omitted-variables bias in OLS estimates of schooling coefficients.

Regression pitfalls

Schooling is not randomly assigned and, as in many other problems, we do not have detailed institutional knowledge about the process that actually determines assignment. The choice of covariates is therefore crucial. Obvious candidates include any variables that are correlated with both schooling and earnings. Test scores are good candidates because many educational institutions use tests to determine admissions and financial aid. On the other hand, it is doubtful that any particular test score is a perfect control for all the differences in earnings potential between more and less educated individuals. We see this in the fact that adding family background variables like parental income further reduces the size of schooling coefficients. A natural question about any regression control strategy is whether the estimates are highly sensitive to the inclusion of additional control variables.
While one should always be wary of drawing causal inferences from a regression with observational data, sensitivity of the regression results to changes in the set of control variables is an extra reason to wonder whether there might be unobserved covariates that would change the estimates even further.

The previous discussion suggests that Table 3 can be interpreted as showing that there is significant ability bias in OLS estimates of the causal effect of schooling on earnings. On the other hand, a number of concerns less obvious than omitted-variables bias suggest this conclusion may be premature. A theme of the Griliches and Chamberlain papers cited in the table is that the negative impact of ability measures on schooling coefficients is eliminated and even reversed once one accounts for two factors: measurement error in the regressor of interest, and the use of endogenous test score controls that are themselves affected by schooling.

A standard result in the analysis of measurement error is that if variables are measured with an additive error that is uncorrelated with correctly-measured values, this imparts an attenuation bias that shrinks OLS estimates towards zero (see, e.g., Griliches, 1986, Fuller, 1987, and Section 4, below). The proportionate reduction is one minus the ratio of the variance of correctly-measured values to the variance of measured values. Furthermore, the inclusion of control variables that are correlated with actual values and uncorrelated with the measurement error tends to aggravate this attenuation bias. The intuition for this result is that the residual variance of true values is reduced by the inclusion of additional control variables while the residual variance of the measurement error is left unchanged. Although studies of measurement error in education data suggest that only 10 percent of the variance in measured education is attributable to measurement error, it turns out that the downward bias in regression models with ability and other controls can still be substantial.[8]

A second complication raised in the early literature on regression estimates of the returns to schooling is that variables used to control for ability may be endogenous (see, e.g., Griliches and Mason, 1972, or Chamberlain, 1977).
If wages and test scores are both outcomes that are affected by schooling, then test scores cannot play the role of an exogenous, pre-determined control variable in a wage equation. To see this, consider a simple example where the causal relationship of interest is (4), and $C(S_i, \eta_i)=0$ so that a bivariate regression would in fact generate a consistent estimate of the causal effect. Suppose that schooling affects test scores as well as earnings, and that the effect on test scores can be expressed using the model

$$A_i = \gamma_0 + \gamma_1 S_i + \eta_{1i}. \tag{8}$$

This relationship can be interpreted as reflecting the fact that more formal schooling tends to improve test scores (so $\gamma_1>0$). We also assume that $C(S_i, \eta_{1i})=0$, so that OLS estimates of (8) would be consistent for $\gamma_1$. The question is what happens if we add the outcome variable, $A_i$, to the schooling equation in a mistaken (in this case) attempt to control for ability bias.

[8] For a detailed elaboration of this point, see Welch, 1975, or Griliches, 1977, who notes (p. 13): "Clearly, the more variables we put into the equation which are related to the systematic components of schooling, and the better we 'protect' ourselves against various possible biases, the worse we make the errors of measurement problem." We present some new evidence on attenuation and covariates in Section 4, below.

Endogeneity of $A_i$ in this context means that $\eta_i$ and $\eta_{1i}$ are correlated. Since people who do well on standardized tests probably earn more for reasons other than the fact that they have more schooling, it seems reasonable to assume that $C(\eta_i, \eta_{1i})>0$. In this case, the coefficient on $S_i$ in a regression of $Y_i$ on $S_i$ and $A_i$ leads to an inconsistent estimate of the effect of schooling. Evaluation of probability limits shows that the OLS estimate of the schooling coefficient in a model that includes $A_i$ converges to

$$\frac{C(Y_i, S_{\cdot Ai})}{V(S_{\cdot Ai})} = \rho - \gamma_1 \phi_{01}, \tag{9}$$

where $S_{\cdot Ai}$ is the residual from a regression of $S_i$ on $A_i$ and $\phi_{01}$ is the coefficient from a regression of $\eta_i$ on $\eta_{1i}$ (see the Appendix for details). Since $\gamma_1>0$ and $\phi_{01}>0$, controlling for the endogenous test score variable tends to make the estimate of the returns to schooling smaller, but this is not because of any omitted-variables bias in the equation of interest. Rather it is a consequence of the bias induced by conditioning on an outcome variable. [9]

[9] A similar problem may affect estimates of schooling coefficients in equations that control for occupation. Like test scores and other ability measures, occupation is itself a consequence of schooling that is probably correlated with unobserved earnings potential.

The problems of measurement error and endogenous regressors generate identification challenges that lead researchers to use methods beyond the simple regression-control framework. The most commonly employed strategies for dealing with these problems involve instrumental variables (IV), two-stage least squares (2SLS), and latent-variable models. We briefly mention some 2SLS and latent-variable estimates, but defer a detailed discussion of 2SLS and related IV strategies until Section 2.2.3.

The major practical problem in models of this type is to find valid instruments for schooling and ability. Panel B reports Griliches (1977) 2SLS estimates of equation (1) treating both schooling and IQ scores as endogenous. The instruments are family background measures and a second ability proxy. Chamberlain (1978) develops an alternate approach that uses panel data to identify the effects of endogenous schooling in a latent-variable model for unobserved ability. Both the Chamberlain (1978) and Griliches (1977) estimates are considerably larger than the corresponding OLS estimates, a finding which led these authors to conclude that the empirical case for a negative ability bias in schooling coefficients is much weaker than the OLS estimates suggest. [10]

2.2.2 Fixed effects and differences-in-differences

The main idea behind fixed-effects identification strategies is to use repeated observations on individuals (or families) to control for unobserved and unchanging characteristics that are related to both outcomes and causing variables. A classic field of application for fixed-effects models is the attempt to estimate the effect of union status. Suppose, for example, that we would like to know the effect of workers' union status on their wages. That is, for each worker, we imagine that there are two potential outcomes, $Y_{0i}$, denoting what the worker would earn if not a union member, and $Y_{1i}$, denoting what the worker would earn as a union member. This is just like $Y_{Si}$ in the schooling example, except that here "S" is the dichotomous variable, union status. The effect of union status on an individual worker is $Y_{1i}-Y_{0i}$, but this is never observed directly since only one potential outcome is ever observed for each individual at any one time. [11]
[10] Another strand of the literature on causal effects of schooling uses sibling data to control for family effects that are shared by siblings (early studies are by Gorseline, 1932 and Taubman, 1976; see also Griliches's (1979) survey). Here the problem of measurement error is paramount (see Section 2.2.2 and 4.1).

[11] This notation for counterfactual outcomes was used by Rubin (1974, 1977). Siegfried and Sweeney (1980) and Chamberlain (1980) use a similar notation to discuss the effect of a classroom intervention on test scores.

Most analyses of the union problem begin with a constant-coefficients regression model for potential outcomes, where

$$Y_{0i} = X_i'\beta + \varepsilon_i, \qquad Y_{1i} = Y_{0i} + \delta. \tag{10}$$

As in the schooling problem, $Y_{0i}$ has been decomposed into a linear function of observed covariates, $X_i'\beta$, and a residual, $\varepsilon_i$, that is uncorrelated with $X_i$ by construction. Using $U_i$ to indicate union members, this leads to the regression equation,

$$Y_i = X_i'\beta + U_i\delta + \varepsilon_i, \tag{11}$$

which describes the causal relationship of interest. Many researchers working in this framework have argued that union status is likely to be related to potential nonunion wages, $Y_{0i}$, even after conditioning on covariates, $X_i$ (see, e.g., Abowd and Farber, 1982; or chapters 4 and 5 in Lewis, 1986). This means that $U_i$ is correlated with $\varepsilon_i$, so OLS does not estimate the causal effect, $\delta$.

An alternative to OLS uses panel data sets such as matched CPS rotation groups, the Panel Study of Income Dynamics, or the National Longitudinal Surveys and exploits repeated observations on individuals to control for unobserved individual characteristics that are time-invariant. A well-known study in this genre is Freeman (1984). The following model, similar to many in the literature on union status, illustrates the fixed-effects approach. Modifying the previous notation to incorporate $t=1,\ldots,T$ observations on individuals, the fixed-effects solution for this problem begins by writing

$$Y_{0it} = X_{it}'\beta_t + \lambda\alpha_i + \xi_{it}, \tag{12}$$

where $\alpha_i$ is an unobserved variable for person $i$ that we could, in principle, include as a control if it were observed. Equation (12) is a regression decomposition with covariates $X_{it}$ and $\alpha_i$, so $\xi_{it}$ is uncorrelated with $X_{it}$ and $\alpha_i$ by construction ($X_{it}$ can include characteristics from different periods). The causal/regression model for panel data is now

$$Y_{it} = X_{it}'\beta_t + U_{it}\delta_t + \lambda\alpha_i + \xi_{it}, \tag{13}$$

where we have allowed the causal effect of interest to be time-varying. The identifying assumptions are that the coefficient $\lambda$ does not vary across periods and that

$$E[U_{is}\xi_{it}] = 0 \quad \text{for } s = 1, \ldots, T. \tag{14}$$

In other words, whatever the source of correlation is between $U_{it}$ and unobserved earnings potential, it can be described by an additive time-invariant covariate, $\alpha_i$, that has the same coefficient each period. Since differencing eliminates $\lambda\alpha_i$, OLS estimates of the differenced equation

$$Y_{it} - Y_{it-k} = X_{it}'\beta_t - X_{it-k}'\beta_{t-k} + U_{it}\delta_t - U_{it-k}\delta_{t-k} + (\xi_{it} - \xi_{it-k}) \tag{15}$$

are consistent for the parameters of interest.

Any transformation of the data that eliminates the unobserved $\alpha_i$ can be used to estimate the parameters of interest in this model. One of the most popular estimators in this case is the deviations-from-means or Analysis of Covariance (ANCOVA) estimator, which is most often used for models where $\beta_t$ and $\delta_t$ are assumed to be fixed. The analysis of covariance estimator is OLS applied to

$$Y_{it} - \bar{Y}_i = \beta'(X_{it} - \bar{X}_i) + \delta(U_{it} - \bar{U}_i) + (\xi_{it} - \bar{\xi}_i), \tag{16}$$

where overbars denote person-averages. Analysis of covariance is preferable to differencing on efficiency grounds in some cases; for models with normally distributed homoscedastic errors, ANCOVA is the maximum likelihood estimator.

An alternative econometric strategy for the estimation of models with individual effects uses repeated observations on cohort averages instead of repeated data on individuals. For details and examples see Ashenfelter (1984) or Deaton (1985).

Finally, note that while standard fixed-effects estimators can only be used to estimate the effects of time-varying regressors, Hausman and Taylor (1981) have developed a hybrid panel/IV procedure for models with time-invariant regressors (like schooling). It is also worth noting that even if the causing variable of interest is time-invariant, we can use standard fixed-effects estimators to estimate changes in the effect of a time-invariant variable. For example, the estimating equation for a model with fixed $U_i$ is

$$Y_{it} - Y_{it-k} = X_{it}'\beta_t - X_{it-k}'\beta_{t-k} + U_i(\delta_t - \delta_{t-k}) + (\xi_{it} - \xi_{it-k}), \tag{17}$$

so $(\delta_t - \delta_{t-k})$ is identified. Angrist (1995b) used this method to estimate changes in schooling coefficients in the West Bank and Gaza Strip even though schooling is approximately time-invariant.

Fixed-effects pitfalls

The use of panel data to eliminate bias from unobserved individual effects raises a number of econometric and statistical issues. Since this material is covered in Chamberlain's (1984) chapter in The Handbook of Econometrics, we limit our discussion to an overview of problems that have been of particular concern to labor economists.

First, analysis of covariance and differencing estimators are not consistent when the process determining $U_{it}$ involves lagged dependent variables. This issue comes up in the analysis of training programs because participants often experience a pre-program decline in earnings, a fact first noted by Ashenfelter (1978). If past earnings are observed, the simplest strategy in this case is simply to control for past earnings either by including lagged earnings as a regressor or in matched treatment-control comparisons (see, e.g., Dehejia and Wahba, 1995; Heckman, Ichimura, and Todd, 1997). In fact, the question of whether trainees and a candidate comparison group have similar lagged outcomes is sometimes seen as a litmus test for the legitimacy of the comparison group in the evaluation of training programs (see, e.g., Heckman and Hotz, 1989).

A problem arises in this context, however, when the process determining $U_{it}$ involves past outcomes and an unobserved covariate, $\alpha_i$. Ashenfelter and Card (1985) discuss an example involving the effect of training on the Social Security-taxable earnings of trainees under the Comprehensive Employment and Training Act (CETA). They propose a model of training status where individuals who enter CETA training in year $\tau$ do so because they have low $\alpha_i$ and their earnings were unusually low in year $\tau-1$. Suppose initially we ignore the fact that training status involves past earnings, and estimate an equation like (15). Ignoring other covariates, this amounts to comparing the earnings growth of trainees and controls.
But whatever the true program effect is, the growth in the earnings of CETA trainees from year $\tau-1$ to year $\tau+1$ will tend to be larger than the earnings growth in a candidate control group simply because of regression-to-the-mean. This generates a spurious positive training effect and the conventional differencing method breaks down. [12]

A natural strategy for dealing with this problem might seem to be to add $Y_{it-1}$ to the list of control variables, and then difference away the fixed effect in a model with $Y_{it-1}$ as regressor. The problem is that now any transformation that eliminates the fixed effect will leave at least one regressor, the lagged dependent variable, correlated with the errors in the transformed equation. Although the lagged dependent variable is not the regressor of interest, the fact that it is correlated with the error term in the transformed equation means that the estimate of the coefficient on $U_{i,\tau+1}$ is biased as well. A detailed description of this problem, and the solutions that have been proposed for it, raises technical issues beyond the scope of this chapter. A useful reference is Nickell, 1981, especially pages 1423-1424. See also Card and Sullivan's (1988) study of the effect of CETA training on the employment rates of trainees, which reports both fixed-effects estimates and matching estimates that control for lagged outcomes.

A second potential problem with fixed-effects estimators is that bias from measurement error is usually aggravated by transformations that eliminate the individual effects (see, e.g., Freeman, 1984; Griliches and Hausman, 1986). This fact provides an alternative explanation for why fixed-effects estimates often turn out to be smaller than estimates in levels.

Finally, perhaps the most important problem with this approach is that the assumption that omitted variables can be captured by an additive, time-invariant individual effect is arbitrary in the sense that it usually does not come from economic theory or from information about the relevant institutions. [13] On the other hand, the fixed-effects approach has a superficial plausibility ("whatever makes us special is timeless") and an identification payoff that is hard to beat.
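The regression-to-the-mean problem described above can be illustrated with a small simulation. This is a sketch under assumed, hypothetical parameters (unit-variance normal components, a selection cutoff of -1, and a true training effect of exactly zero); it is not the Ashenfelter and Card model, only a stylized version of selection on low earnings in year τ-1.

```python
import random
from statistics import mean


def spurious_training_effect(n=20000, cutoff=-1.0, seed=0):
    """Difference in earnings growth (trainees minus controls), true effect zero."""
    rng = random.Random(seed)
    trainee_growth, control_growth = [], []
    for _ in range(n):
        alpha = rng.gauss(0.0, 1.0)               # permanent earnings component
        y_before = alpha + rng.gauss(0.0, 1.0)    # earnings in year tau-1
        y_after = alpha + rng.gauss(0.0, 1.0)     # earnings in year tau+1, no program effect
        growth = y_after - y_before               # differencing removes alpha
        # Hypothetical selection rule: people train when pre-program
        # earnings are low (low alpha and/or a bad transitory draw).
        if y_before < cutoff:
            trainee_growth.append(growth)
        else:
            control_growth.append(growth)
    return mean(trainee_growth) - mean(control_growth)


estimate = spurious_training_effect()
print(f"differenced 'training effect' with a true effect of zero: {estimate:.2f}")
```

Differencing removes the permanent component, but trainees were selected partly on an unusually bad transitory draw, so their earnings rebound and the differenced comparison comes out clearly positive.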
Fixed-effects models also lend themselves to a variety of specification tests. See, for example, Ashenfelter and Card (1985), Chamberlain (1984), Griliches and Hausman (1986), Angrist and Newey (1991), and Jakubson (1991). Many of these studies also focus on the union example.

[12] Deviations-from-means estimators are also biased in this case.

[13] An exception is the literature on life-cycle labor supply (e.g., MaCurdy, 1981; Altonji, 1986).

The Differences-in-Differences (DD) model

Differences-in-differences strategies are simple panel-data methods applied to sets of group means in cases when certain groups are exposed to the causing variable of interest and others are not. This approach, which is transparent and often at least superficially plausible, is well-suited to estimating the effect of sharp changes in the economic environment or changes in government policy. The DD method has been used in hundreds of studies in economics, especially in the last two decades, but the basic idea has a long history. An early example in labor economics is Lester (1946), who used the differences-in-differences technique to study employment effects of minimum wages. [14]

[14] The DD method goes by different names in different fields. Psychologist Campbell (1969) calls it the "nonequivalent control-group pretest-posttest design."

The DD approach is explained here using Card's (1990) study of the effect of immigration on the employment of natives as an example. Some observers have argued that immigration is undesirable because low-skilled immigrants may displace low-skilled or less-educated US citizens in the labor market. Anecdotal evidence for this claim includes newspaper accounts of hostility between immigrants and natives in some cities, but the empirical evidence is inconclusive. See Friedberg and Hunt (1995) for a survey of research on this question.

As in our earlier examples, the object of research on immigration is to find some sort of comparison that provides a compelling answer to "what if" questions about the consequences of immigration. Card's study used a sudden large-scale migration from Cuba to Miami known as the Mariel Boatlift to make comparisons and answer counterfactual questions about the consequences of immigration. In particular, Card asks whether the Mariel immigration, which increased the Miami labor force by about 7 percent between May and September of 1980, reduced the employment or wages of non-immigrant groups.

An important component of this identification strategy is the selection of comparison cities that can be used to estimate what would have happened in the Miami labor market absent the Mariel immigration. The comparison cities Card used in the Mariel Boatlift study were Atlanta, Los Angeles, Houston, and Tampa-St. Petersburg. These cities were chosen because, like Miami, they have large Black and Hispanic populations and because discussions of the impact of immigrants often focus on the consequences for minorities. Most importantly, these cities appear to have employment trends similar to those in Miami at least since 1976. This is documented in Figure 1, which is similar to a figure in Card's (1989) working paper that did not appear in the published version of his study. The figure plots monthly observations on the log of employment in Miami and the four comparison cities from 1970 through 1998. The two series, which are from BLS establishment data, have been normalized by subtracting the 1970 value.

Table 4 illustrates DD estimation of the effect of Boatlift immigrants on unemployment rates, separately for whites and blacks. The first column reports unemployment rates in 1979, the second column reports unemployment rates in 1981, and the third column reports the 1981-1979 difference. The rows give numbers for Miami, the comparison cities, and the difference between them. For example, between 1979 and 1981, the unemployment rate for Blacks in Miami rose by about 1.3 percentage points, though this change is not significant. Unemployment rates in the comparison cities rose even more, by 2.3 percentage points. The difference in these two changes, -1.0 percentage points, is a DD estimate of the effect of the Mariel immigrants on the unemployment rate of Blacks in Miami. In this case, the estimated effect on the unemployment rate is actually negative, though not significantly different from zero.

The rationale for this double-differencing strategy can be explained in terms of restrictions on the conditional mean function for potential outcomes in the absence of immigration. As in the union example, let $Y_{0i}$ be $i$'s employment status in the absence of immigration and let $Y_{1i}$ be $i$'s employment status if the Mariel immigrants come to $i$'s city.
The unemployment rate in city $c$ in year $t$ is $E[Y_{0i} \mid c, t]$ with no immigration wave, and $E[Y_{1i} \mid c, t]$ if there is an immigration wave. In practice, we know that the Mariel immigration happened in Miami in 1980, so that the only values of $E[Y_{1i} \mid c, t]$ we get to see are for $c$ = Miami and $t > 1980$.
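The double-differencing arithmetic behind the Table 4 example for Blacks can be sketched as follows. The function name is hypothetical, and the unemployment-rate levels are illustrative values assumed here so that the changes match the 1.3 and 2.3 percentage-point figures quoted in the text; they should not be read as the published table entries.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the comparison group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Assumed unemployment rates (percent); only the changes are taken from the text.
miami_1979, miami_1981 = 8.3, 9.6
comparison_1979, comparison_1981 = 10.3, 12.6

dd = diff_in_diff(miami_1979, miami_1981, comparison_1979, comparison_1981)

# Differencing across cities first and then across years gives the same number.
dd_alt = (miami_1981 - comparison_1981) - (miami_1979 - comparison_1979)

print(f"DD estimate: {dd:.1f} percentage points")
```

The order of differencing does not matter, which is why the estimate can be read either as a comparison of changes or as a change in the Miami-comparison gap.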
The Mariel Boatlift study uses the comparison cities to estimate the counterfactual average, $E[Y_{0i} \mid c=\text{Miami}, t>1980]$, i.e., what the unemployment rate in Miami would have been if the Mariel immigrants had not come.

The DD method identifies causal effects by restricting the conditional mean function $E[Y_{0i} \mid c, t]$ in a particular way. Specifically, suppose that

$$E[Y_{0i} \mid c, t] = \beta_t + \gamma_c, \tag{18}$$

that is, in the absence of immigration, unemployment rates can be written as the sum of a year effect that is common to cities and a city effect that is fixed over time. The additive model pertains to $E[Y_{0i} \mid c, t]$ instead of $Y_{0i}$ directly because the latter is a zero/one variable. Suppose also that the effect of the Mariel immigration is simply to add a constant to $E[Y_{0i} \mid c, t]$, so that

$$E[Y_{1i} \mid c, t] = E[Y_{0i} \mid c, t] + \delta. \tag{19}$$

This means the employment status of individuals living in Miami and the comparison cities in 1979 and 1981 can be written as

$$Y_i = \beta_t + \gamma_c + \delta M_i + \varepsilon_i, \tag{20}$$

where $E[\varepsilon_i \mid c, t] = 0$ and $M_i$ is a dummy variable that equals 1 if $i$ was exposed to the Mariel immigration by living in Miami after 1980. Differencing unemployment rates across cities and years gives

$$\{E[Y_i \mid c=\text{Miami}, t=1981] - E[Y_i \mid c=\text{Comparison}, t=1981]\} - \{E[Y_i \mid c=\text{Miami}, t=1979] - E[Y_i \mid c=\text{Comparison}, t=1979]\} = \delta. \tag{21}$$

Note that $M_i$ in equation (20) is an interaction term equal to the product of a dummy indicating observations after 1980 and a dummy indicating residence in Miami. The DD estimate can therefore also be computed in a regression of stacked micro data for cities and years. The regressors consist of dummies for years, dummies for cities, and $M_i$. Similarly, a regression-adjusted version of the DD estimator adds a vector of individual characteristics, $X_i$, to equation (20):

$$Y_i = X_i'\beta_0 + \beta_t + \gamma_c + \delta M_i + \varepsilon_i,$$

where $\beta_0$ is now a vector of coefficients that includes a constant. Controlling for $X_i$ changes the estimate of $\delta$ only if $M_i$ and $X_i$ are correlated, conditional on city and year main effects.

DD Pitfalls

Like any other identification strategy, DD is not guaranteed to identify the causal effect of interest. Meyer (1995) and Campbell (1969) outline a range of threats to the causal interpretation of DD estimates. The key identifying assumption is clearly that interaction terms are zero in the absence of the intervention. In fact, it is easy to imagine that unemployment rates evolve differently across cities regardless of shocks like the Mariel immigration. One way to test this is to compare trends in outcomes before or after the event of interest. As noted above, the comparison cities in this case were chosen partly on the basis of Figure 1, which shows that the comparison cities exhibited a pattern of economic growth similar to that in Miami. Identification of causal effects using city/year comparisons clearly turns on the assumption that the two sets of cities would have had the same employment trends had the boatlift not occurred. We introduce some new evidence on this question in Section 2.4.

2.2.3. Instrumental Variables

Identification strategies based on instrumental variables can be thought of as a scheme for using exogenous field variation to approximate randomized trials. Again, we illustrate with an example where there is an underlying causal relationship of interest, in this case the effect of Vietnam-era military service on the earnings of veterans later in life. In the 1960s and early 1970s, young men were at risk of being drafted for military service. Policy makers, veterans groups, and economists have long been interested in what the consequences of this military service were for the men involved. A belief that military service is a burden helped to mobilize support for a range of veterans' programs and for ending the draft in 1973 (see, e.g., Taussig, 1974). Concerns about fairness also led to the institution of a draft lottery in 1970 that was used to determine priority for conscription in cohorts of 19-year-olds.
This lottery was used by Hearst, Newman, and Hulley (1986) to estimate the effects of military service on civilian mortality and by Angrist (1990) to construct IV estimates of the effects of military service on civilian earnings.

As in the union problem, the causal relationship of interest is based on the notion that there are two potential outcomes, $Y_{0i}$, denoting what someone from the Vietnam-era cohort would earn if they did not serve in the military, and $Y_{1i}$, denoting earnings as a veteran. Again, using a constant-effects model for potential outcomes, we can write

$$Y_{0i} = \beta + \eta_i, \qquad Y_{1i} = Y_{0i} + \delta, \tag{22}$$

where $\beta \equiv E[Y_{0i}]$. The constant effect $\delta$ is the parameter of interest. IV estimates can be interpreted under weaker assumptions than this, but we postpone a discussion of this point until Section 2.3. As in the union and schooling problems, $\eta_i$ is the random part of potential outcomes, but at this point there are no observed covariates in the model for $Y_{0i}$. Using $D_i$ to indicate veteran status, the causal relationship of interest can be written

$$Y_i = \beta + D_i\delta + \eta_i. \tag{23}$$

Also as in the union and schooling problems, there is a concern that since $D_i$ is not randomly assigned, a comparison of all veterans to all nonveterans would not identify the causal effect of interest. Suppose, for example, that individuals with low civilian earnings potential are more likely to serve in the military, either because they want to or because they are less adept at obtaining deferments. Then the regression coefficient in (23), which is also the difference in means by veteran status, is biased downwards:

$$E[Y_i \mid D_i=1] - E[Y_i \mid D_i=0] = \delta + \{E[\eta_i \mid D_i=1] - E[\eta_i \mid D_i=0]\} < \delta. \tag{24}$$

IV methods can eliminate this sort of bias if the researcher has access to an instrumental variable, $Z_i$, that is correlated with $D_i$ but otherwise independent of potential outcomes. A natural instrument is draft-eligibility status, since this was determined by a lottery over birthdays. In particular, in each year from 1970 to 1972, random sequence numbers (RSNs) were randomly assigned to each birth date in cohorts of 19-year-olds. Men with lottery numbers below an eligibility ceiling were eligible for the draft, while men with numbers above the ceiling could not be drafted. In practice, many draft-eligible men were still exempted from service for health or other reasons, while many men who were draft-exempt nevertheless volunteered for service. So veteran status was not completely determined by randomized draft-eligibility; eligibility and veteran status are merely correlated.

For white men who were at risk of being drafted in the 1970-71 draft lotteries, draft-eligibility is clearly associated with lower earnings in years after the lottery. This can be seen in Table 5, which reports the effect of randomized draft-eligibility status on Social Security earnings in column (3). Column (1) shows average annual earnings for purposes of comparison. These data are the FICA-taxable earnings of men with earnings covered by OASDI; for details see the appendix to Angrist (1990). For men born in 1950, there are significant negative effects of eligibility status on earnings in 1970, when these men were being drafted, and in 1981, ten years later. In contrast, there is no evidence of an association between eligibility status and earnings in 1969, the year the lottery drawing for men born in 1950 was held but before anyone born in 1950 was actually drafted. Similarly, for men born in 1951, there are large negative eligibility effects in 1971 and 1981, but no evidence of an effect in 1970, before anyone born in 1951 was actually drafted. The timing of these effects suggests that the negative association between draft-eligibility status and earnings is caused by the military service of draft-eligible men.

Because eligibility status was randomly assigned, the claim that the estimates in column (3) represent the effect of draft-eligibility on earnings seems uncontroversial. How do we go from the effect of draft-eligibility to the effect of veteran status? The identifying assumption in this case is that $Z_i$ is independent of potential earnings, which in this case means that $Z_i$ is uncorrelated with $\eta_i$. It follows immediately that $\delta = C(Y_i, Z_i)/C(D_i, Z_i)$.
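To spell out that last step, the covariance-ratio formula can be derived in one line from equation (23), using only the assumption that $Z_i$ is uncorrelated with $\eta_i$ and the requirement that the instrument is correlated with veteran status:

```latex
\begin{aligned}
C(Y_i, Z_i) &= C(\beta + D_i\delta + \eta_i,\; Z_i) \\
            &= \delta\, C(D_i, Z_i) + C(\eta_i, Z_i) \\
            &= \delta\, C(D_i, Z_i)
               \qquad \text{since } C(\eta_i, Z_i) = 0, \\[4pt]
\Longrightarrow\quad
\delta &= \frac{C(Y_i, Z_i)}{C(D_i, Z_i)}
               \qquad \text{provided } C(D_i, Z_i) \neq 0 .
\end{aligned}
```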
The intuition here is that only part of the variation in $D_i$, namely the part that is associated with $Z_i$, is used to identify the parameter of interest ($\delta$). Because $Z_i$ is a binary variable, we also have

$$\delta = \frac{E[Y_i \mid Z_i=1] - E[Y_i \mid Z_i=0]}{E[D_i \mid Z_i=1] - E[D_i \mid Z_i=0]}. \tag{25}$$

The sample analog of (25) is the Wald (1940) estimator that was originally applied to measurement error problems. [15] Note that we could have arrived at (25) directly, i.e., without reference to the $C(Y_i, Z_i)/C(D_i, Z_i)$ formula, because the independence of $Z_i$ and potential outcomes implies $E[\eta_i \mid Z_i]=0$. In this case, the Wald estimator is simply the difference in mean earnings between draft-eligible and ineligible men, divided by the difference in the probability of serving in the military between draft-eligible and ineligible men.

The only information required to go from draft-eligibility effects to veteran-status effects is the denominator of the Wald estimator, which is the effect of draft-eligibility on the probability of serving in the military. This information, which comes from the Survey of Income and Program Participation (SIPP), appears in column (4) of Table 5. [16] For earnings in 1981, long after most Vietnam-era servicemen were discharged from the military, the Wald estimates of the effect of military service amount to about 16 percent of earnings. Effects for men while in the service are much larger, which is not surprising since military pay during the conscription era was extremely low.

An important feature of the Wald/IV estimator is that the identifying assumptions are easy to assess and interpret. The basic claim justifying a causal interpretation of the estimator is that the only reason why $E[Y_i \mid Z_i]$ varies with $Z_i$ is because $E[D_i \mid Z_i]$ varies with $Z_i$. A simple way to check is to look for an association between $Z_i$ and personal characteristics that should not be affected by $D_i$, such as age, race, sex, or any other characteristic that was determined before $D_i$ was determined. Another useful check is to look for an association between the instrument and outcomes in samples where there is no reason for such a relationship. If it really is true that the only reason why draft-eligibility affects earnings is veteran status, then in samples where eligibility status is unrelated to veteran status, draft-eligibility effects on earnings should be zero. This idea is illustrated in Table 5, which reports estimates for men born in 1953.
Although there was a lottery drawing which assigned RSNs to the 1953 cohort in February of 1972, no one born in 1953 was actually drafted (the draft officially ended in July 1973). This is reflected in the first-stage relationship between draft-eligibility and veteran status for men born in 1953 (with eligibility defined using the 1952 RSN cutoff of 95), which shows an insignificant difference in the probability of serving by eligibility status. In fact, there is no significant relationship between $Y_i$ and $Z_i$ for this cohort. Evidence of a relationship between $Z_i$ and $Y_i$ would cast doubt on the claim that the only reason for draft-eligibility effects is the military service of the men who were draft-eligible. We discuss other specification checks of this type in Section 2.4.

[15] The relationship between IV with binary instruments and Wald estimators was first noted by Durbin (1954).

[16] In this case, the denominator of the Wald estimates does not come from the same data set as the numerator since the Social Security Administration has no information on veteran status. As long as the information used to estimate the numerator and denominator are representative of the same population, the resulting two-sample estimate will be consistent. The econometrics behind this two-sample approach to IV are discussed briefly in Section 3.4, below.

So far the discussion of IV has allowed for only three variables: the outcome, the endogenous regressor, and the instrument. In many cases, the assumption that $E[Z_i\eta_i]=0$ is more plausible after controlling for a vector of covariates, $X_i$. Decomposing the random part of potential outcomes in (22) into a linear function of $k$ control variables and an error term so that $\eta_i = X_i'\beta + \varepsilon_i$ as before, the resulting estimating equation is

$$Y_i = X_i'\beta + D_i\delta + \varepsilon_i. \tag{26}$$

Note that since $\varepsilon_i$ is defined as the residual from a regression of $\eta_i$ on $X_i$, it is uncorrelated with $X_i$ by construction. In contrast with $\delta$, which has a causal interpretation, the coefficient vector $\beta$ is not meant to capture the causal effect of the X-variables. As in the discussion of regression, we make a clear distinction between control variables and causing variables.

Equations like (26) are typically estimated using 2SLS, i.e., by substituting the fitted values from a first-stage regression of $D_i$ on $X_i$ and $Z_i$. In some applications, more than one instrument is available to estimate the single causal effect, $\delta$. 2SLS accommodates this situation by including all the instruments in the first-stage equation. The combination of multiple instruments to produce a single estimate makes the most sense in a constant-coefficients framework. The assumption of instrument validity and constant coefficients can also be tested in this case (see, e.g., Hansen, 1982; Newey, 1985).
In a more general setting with heterogeneous potential outcomes, different instruments estimate different weighted averages of the difference $Y_{1i}-Y_{0i}$ (Imbens and Angrist, 1994). We return to this point in Section 2.3.
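As a recap of the single binary instrument case, the Wald estimator in equation (25) and its covariance-ratio form can be sketched in a few lines. The eight observations below are made up purely for illustration (they are not draft-lottery data), and the function names are hypothetical.

```python
from statistics import mean


def wald_estimate(y, d, z):
    """Difference in mean outcomes by instrument status, divided by the
    difference in mean treatment by instrument status (binary z)."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    reduced_form = mean(y1) - mean(y0)   # effect of the instrument on the outcome
    first_stage = mean(d1) - mean(d0)    # effect of the instrument on treatment
    return reduced_form / first_stage


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (w - mb) for x, w in zip(a, b)) / len(a)


# Made-up data: z = eligibility, d = served, y = earnings (arbitrary units).
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [8.0, 9.0, 7.0, 12.0, 9.0, 11.0, 12.0, 10.0]

wald = wald_estimate(y, d, z)
cov_ratio = cov(y, z) / cov(d, z)
print(f"Wald estimate: {wald:.2f}, covariance ratio: {cov_ratio:.2f}")
```

In this toy sample the reduced form is -1.5 and the first stage is 0.5, so both expressions give -3.0; with a binary instrument the two formulas are algebraically identical.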
IV Pitfalls

The most important IV pitfall is the validity of instruments, i.e., the possibility that $\eta_i$ and $Z_i$ are correlated. Suppose, for example, that $Z_i$ is related to the vector of control variables, $X_i$, and we do not account for this in the estimation. The Wald/IV estimator in that case has probability limit

$$\delta + \beta'\,\frac{E[X_i \mid Z_i=1] - E[X_i \mid Z_i=0]}{E[D_i \mid Z_i=1] - E[D_i \mid Z_i=0]}.$$

This is a version of the omitted-variables bias formula for IV. The formula captures the fact that "a little omitted variables bias can go a long way" in an IV setting, because the association between $X_i$ and $Z_i$ gets multiplied by $\{E[D_i \mid Z_i=1] - E[D_i \mid Z_i=0]\}^{-1}$. In the draft lottery case, for example, any draft-eligibility effects on omitted variables get multiplied by about 1/.15 = 6.7.

A second important point about bias in instrumental variables estimates is that random assignment alone does not guarantee a valid instrument. Suppose, for example, that in addition to being more likely to serve in the military, men with low draft-lottery numbers were more likely to stay in college so as to extend a draft deferment. This fact will create a relationship between potential earnings and $Z_i$ even for nonveterans, in which case IV yields biased estimates of the causal effect of veteran status. Random assignment of $Z_i$ does not rule out this sort of bias since draft-eligibility can in principle have consequences in addition to influencing the probability of being a veteran. In other words, while the randomization of $Z_i$ ensures that the reduced-form relationship between $Y_i$ and $Z_i$ represents the causal effect of draft eligibility on earnings, it does not guarantee that the only reason for this relationship is $D_i$. The distinction between the assumed random assignment of an instrument and the assumption that a single causal mechanism explains effects on outcomes is discussed in greater detail by Angrist, Imbens, and Rubin (1996).

Finally, the use of 2SLS to combine many different instruments can lead to finite-sample bias.
The standard inference framework uses asymptotic theory, i.e., inference is based on approximations that are increasingly accurate as sample sizes grow. Typically, inferences about OLS coefficient estimates also use asymptotic theory since the relevant finite-sample theory assumes normally distributed errors. A key difference between IV and OLS estimators, however, is that even without normality OLS provides an unbiased estimate of population regression coefficients (provided the regression function is linear; see, e.g., Goldberger, 1991, Chapter 13). In contrast, IV estimators are consistent but not unbiased. This means that under repeated sampling with a fixed sample size, IV estimates may systematically deviate from the corresponding population parameter. [17] Moreover, this bias tends to pull IV estimates towards the corresponding OLS estimates, giving a misleading impression of similarity between the two sets of estimates (see, e.g., Sawa, 1969).

How bad is the finite-sample bias in an IV estimate likely to be? In practice, this largely turns on the number of instruments relative to the sample size, and the strength of the first-stage relationship. Other things equal, more instruments, smaller samples, and weaker instruments each mean more bias (see, e.g., Buse, 1992). The fact that IV estimates can be noticeably biased even with very large data sets was highlighted by Bound, Jaeger, and Baker (1995), focusing on Angrist and Krueger's (1991) compulsory schooling study. This study uses hundreds of thousands of observations from Census data to implement an instrumental variables strategy for estimating the returns to schooling. The instruments are quarter-of-birth dummies since children born earlier in the year enter school at an older age and are therefore allowed to drop out of school (typically on their 16th birthday) after having completed less schooling. Some of the 2SLS estimates in Angrist and Krueger (1991) use many quarter-of-birth/state-of-birth interaction terms in addition to quarter-of-birth main effects as instruments. Since the underlying first-stage relationship in these particular models is not very strong, there is potential for substantial bias towards the OLS estimates in these specifications.
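The pull towards OLS can be seen in a small simulation. The sketch below takes the extreme, hypothetical case of 40 mutually exclusive group dummies used as instruments when the true first stage is exactly zero and the true causal effect is zero; all parameter values and function names are assumptions made for illustration. With a full set of group dummies as instruments, the first-stage fitted value is simply the group mean of the regressor, so no matrix algebra is needed.

```python
import random
from statistics import mean


def ols_slope(y, d):
    """Bivariate OLS slope of y on d (with a constant)."""
    y_bar, d_bar = mean(y), mean(d)
    num = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    den = sum((di - d_bar) ** 2 for di in d)
    return num / den


def tsls_group_dummies(y, d, groups):
    """2SLS of y on d using a full set of group dummies as instruments."""
    sums, counts = {}, {}
    for di, g in zip(d, groups):
        sums[g] = sums.get(g, 0.0) + di
        counts[g] = counts.get(g, 0) + 1
    d_hat = [sums[g] / counts[g] for g in groups]   # first-stage fitted values
    return ols_slope(y, d_hat)                      # second stage on fitted values


def simulate(n_groups=40, group_size=25, reps=200, seed=0):
    rng = random.Random(seed)
    groups = [g for g in range(n_groups) for _ in range(group_size)]
    tsls, ols = [], []
    for _ in range(reps):
        # Regressor is pure noise with respect to the instruments.
        d = [rng.gauss(0.0, 1.0) for _ in groups]
        # True causal effect is zero; the error (di + noise) is correlated
        # with d, so the OLS slope converges to 1 rather than 0.
        y = [di + rng.gauss(0.0, 1.0) for di in d]
        tsls.append(tsls_group_dummies(y, d, groups))
        ols.append(ols_slope(y, d))
    return mean(tsls), mean(ols)


mean_tsls, mean_ols = simulate()
print(f"true effect 0.0, mean OLS {mean_ols:.2f}, mean 2SLS {mean_tsls:.2f}")
```

In this design the average 2SLS estimate sits close to the biased OLS value rather than the true effect of zero, even though each simulated sample has 1,000 observations.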
Bound, Jaeger, and Baker (1995) discuss the question of how strong a first-stage relationship has to be in order to minimize the potential for bias. They suggest using the F-statistic for the joint significance of the excluded instruments in the first-stage equation as a diagnostic. This is clearly sensible, since, if the instruments are so weak that the relationship between instruments and endogenous regressors cannot be detected with a reasonably high level of confidence, then the instruments should probably be abandoned. On the other hand, Hall, Rudebusch, and Wilcox (1996) point out that this sort of selection procedure also has the potential to induce a bias from pre-testing that can in some cases aggravate the bias instead of reducing it.

A simple alternative (or complement) to screening on the first-stage F is to use estimators that are approximately unbiased. One such estimator is Limited Information Maximum Likelihood (LIML), which has no integral moments but is nevertheless median-unbiased. This means that the sampling distribution is centered at the population parameter.[18] In fact, any just-identified 2SLS estimator is also median-unbiased since 2SLS and LIML are identical for just-identified models. The class of median-unbiased instrumental variables estimators therefore includes the Wald estimator discussed in the previous section. Other approximately unbiased estimators are based on procedures that estimate the first-stage and second-stage relationship in separate data sets. This includes Two-Sample and Split-Sample IV (Angrist and Krueger, 1992, 1995), and an IV estimator that uses a set of leave-one-out first-stage estimates called Jackknife Instrumental Variables (Angrist, Imbens, and Krueger, 1998).[19] An earlier literature discussed combination estimators that are approximately unbiased (see, e.g., Sawa, 1973). Recently, Chamberlain and Imbens (1996) introduced a Bayesian IV estimator that also avoids bias.

A final and related point is that the reduced form OLS regression of the dependent variable on exogenous covariates and instruments is unbiased in a sample of any size, regardless of the power of the instrument (assuming the reduced form is linear). This is important because the reduced form effects of the instrument on the dependent variable are proportional to the coefficient on the endogenous regressor in the equation of interest. The existence of a causal relationship between the endogenous regressor and dependent variable can therefore be gauged through the reduced form without fear of finite-sample bias even if the instruments are weak.

[17] A similar problem arises with Generalized Method of Moments estimation of models for covariance structures (see Altonji and Segal, 1996).

[18] Anderson, Kunitomo, and Sawa (1982, p. 1026) report this in a Monte Carlo study: "To summarize, the most important conclusion from the study of LIML and 2SLS estimators is that the 2SLS estimator can be badly biased and in that sense its use is risky. The LIML estimator, on the other hand, has a little more variability with a slight chance of extreme values, but its distribution is centered at the parameter value." Similar Monte Carlo results and a variety of analytic justifications for the approximate unbiasedness of LIML appear in Bekker (1994), Donald and Newey (1997), Staiger and Stock (1997), and Angrist, Imbens, and Krueger (1998).

[19] A SAS program that computes Split-Sample and Jackknife IV is available at http://www.wws.princeton.edu/faculty/krueger.html.

2.2.4 Regression-discontinuity designs

The Latin motto Marshall placed on the title page of his Principles of Economics is, "Natura non facit saltum," which means: "Nature does not make jumps." Marshall argues that most economic behavior evolves gradually enough to be modeled or explained. The notion that human behavior is typically orderly or smooth is at the heart of a research strategy called the regression-discontinuity (RD) design. RD methods use some sort of parametric or semi-parametric model to control for smooth or gradually evolving trends, inferring causality when the variable of interest changes abruptly for non-behavioral or arbitrary reasons. There are a number of ways to implement this idea in practice. We focus here on an approach that can be viewed as a hybrid regression-control/IV identification strategy. This is distinct from conventional IV strategies because the instruments are derived explicitly from nonlinearities or discontinuities in the relationship between the regressor of interest and a control variable. Recent applications of the RD idea include van der Klaauw's (1996) study of financial aid awards; Angrist and Lavy's (1998) study of class size; and Hahn, Todd, and van der Klaauw's (1998) study of anti-discrimination laws.

The RD idea originated with Campbell (1969), who discussed the (theoretical) problem of how to identify the causal effect of a treatment that is assigned as a deterministic function of an observed covariate which is also related to the outcomes of interest. Campbell used the example of estimating the effect of National Merit scholarships on applicants' later academic achievement.
He argued that if there is a threshold value of past achievement that determines whether an award is made, then one can control for any smooth function of past achievement and still estimate the effect of the award at the point of discontinuity. This is done by matching discontinuities or nonlinearities in the relationship between outcomes and past achievement to discontinuities or nonlinearities in the relationship between awards and past achievement.[20] van der Klaauw (1996) pointed out the link between Campbell's suggestion and IV, and used this idea to estimate the effect of financial aid awards on college enrollment.[21]

Angrist and Lavy (1998) used RD to estimate the effects of class size on pupil test scores in Israeli public schools, where class size is officially capped at 40. They refer to the cap of 40 as "Maimonides' Rule," after the 12th Century Talmudic scholar Maimonides, who first proposed it. According to Maimonides' Rule, class size increases one-for-one with enrollment until 40 pupils are enrolled, but when 41 students are enrolled, there will be a sharp drop in class size, to an average of 20.5 pupils. Similarly, when 80 pupils are enrolled, the average class size will again be 40, but when 81 pupils are enrolled the average class size drops to 27. Thus, Maimonides' Rule generates a discontinuity in the relationship between grade enrollment and average class size at integer multiples of 40.

The class size function derived from Maimonides' Rule can be stated formally as follows. Let $b_s$ denote beginning-of-the-year enrollment in school $s$ in a given grade, and let $z_s$ denote the size assigned to classes in school $s$, as predicted by applying Maimonides' Rule to that grade. Assuming cohorts are divided into classes of equal size, the predicted class size for all classes in the grade is

$$z_s = \frac{b_s}{\operatorname{int}\left((b_s - 1)/40\right) + 1}.$$

This function is plotted in Figure 2a for the population of Israeli fifth graders in 1991, along with actual fifth grade class sizes. The x-axis shows September enrollment and the y-axis shows either predicted class size or the average actual class size in all schools with that enrollment.

[20] Goldberger (1972) discusses a similar idea in the context of compensatory education programs.

[21] Campbell's (1969) discussion of RD focused mostly on what he called a "sharp design", where the regressor of interest is a discontinuous but deterministic function of another variable. In the sharp design there is no need to instrument; the regressor of interest is entered directly. This is in contrast with what Campbell called a "fuzzy design", where the function is not deterministic. Campbell did not propose an estimator for the fuzzy design, though his student Trochim (1984) developed an IV-like procedure for that case. The discussion here covers the fuzzy design only since the sharp design can be viewed as a special case.

Maimonides' Rule does not predict actual
class size perfectly because other factors affect class size as well, but average class sizes clearly display a sawtooth pattern induced by the Rule.

In addition to exhibiting a strong association with average class size, Maimonides' Rule is also correlated with average test scores. This is shown in Figure 2b, which plots average reading test scores and average values of $z_s$ by enrollment size, in enrollment intervals of 10. The figure shows that test scores are generally higher in schools with larger enrollments and, therefore, larger predicted class sizes. Most importantly, however, average scores by enrollment size exhibit a sawtooth pattern that is, at least in part, the mirror image of the class size function. This is especially clear in Figure 2c, which plots average scores by enrollment after running auxiliary regressions to remove a linear trend in enrollment and the effects of pupils' socioeconomic background.[22] RD interprets the up and down pattern in the conditional expectation of test scores given enrollment as reflecting the causal effect of changes in class size that are induced by exogenous changes in enrollment. This interpretation is plausible because Maimonides' Rule is known to have this pattern, while it seems likely that other mechanisms linking enrollment and test scores will be smoother.

[22] The figure plots the residuals from regressions of $y_{is}$ and $z_s$ on $b_s$ and the proportion of low-income pupils in the school.

Figure 2b makes it clear that Maimonides' Rule is not a valid instrument for class size without controlling for enrollment because predicted class size increases with enrollment and test scores increase with enrollment. The RD idea is to use the discontinuities (jumps) in predicted class size to estimate the effect of interest while controlling for smooth enrollment effects. Angrist and Lavy implement this by using $z_s$ as an instrument while controlling for smooth effects of enrollment using parametric enrollment trends. Consider a causal model that connects the score of pupil $i$ in school $s$ with class size plus effects of the variable used to construct Maimonides' Rule:

$$y_{is} = X_s'\beta + n_{is}\delta + \varepsilon_{is}, \tag{27}$$

where $n_{is}$ is the size of $i$'s class, and $X_s$ is a vector of school characteristics, including functions of grade enrollment, $b_s$. As before, we imagine that this function tells us what test scores would be if class size were manipulated to be other than the observed size, $n_{is}$. The first-stage equation for 2SLS estimation of (27) is

$$n_{is} = X_s'\pi_0 + z_s\pi_1 + v_{is}. \tag{28}$$

A simple example is a model that simply includes $b_s$ linearly to control for enrollment effects not attributable to changing class size, along with a regressor measuring the proportion of low-income students in the school.[23] The resulting 2SLS estimate of $\delta$ in standard deviation units is -.037 (with a standard error of .009), meaning just over a one-third standard deviation decline in test scores for a 10 pupil increase in class size.

Since RD is an IV estimator, we do not have a separate section for pitfalls. As before, the most important issue is instrument validity and the choice of control variables. The choice of controls is even more important in RD than conventional IV, however, since the instrument is actually a function of one of the control variables. In the Angrist and Lavy application, for example, identification of $\delta$ clearly turns on the ability to distinguish $z_s$ from $X_s$ since $z_s$ does not vary within schools. This suggests that RD depends more on functional form assumptions than other IV procedures, though Hahn, Todd, and van der Klaauw (1998) consider ways to weaken this dependence.

2.3 Consequences of heterogeneity and nonlinearity

The discussion so far involves a highly stylized description of the world, wherein causal effects are the same for everyone, and, if the causing variable takes on more than two values, the effects are linear. Although some economic models can be used to justify these assumptions, there is no reason to believe this is true in general. On the other hand, these strong assumptions provide a useful starting place because they may provide a good approximation of reality, and because they focus attention on causality issues. If the estimates of a linear, constant-coefficient model are biased for the causal effect of interest, then the estimates are only more difficult to interpret in a general setting.
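Before these assumptions are relaxed, the class-size function used as the RD instrument in Section 2.2.4 can be made concrete with a short sketch. The function name `maimonides_class_size` and the `cap` argument are illustrative choices, not names used by Angrist and Lavy; the code simply evaluates predicted class size as enrollment divided by int((enrollment - 1)/40) + 1 and reproduces the enrollment thresholds quoted in that section.

```python
def maimonides_class_size(enrollment: int, cap: int = 40) -> float:
    """Predicted class size when a cohort is split into equal classes of at most `cap`."""
    if enrollment < 1:
        raise ValueError("enrollment must be a positive integer")
    n_classes = (enrollment - 1) // cap + 1   # int((b - 1) / cap) + 1
    return enrollment / n_classes

# Predicted class size jumps down each time enrollment passes a multiple of 40.
for b in (40, 41, 80, 81, 120, 121):
    print(b, maimonides_class_size(b))
```

The values at 41 and 81 pupils (20.5 and 27) match the figures given in the description of the Rule.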
[23] In practice, Angrist and Lavy estimated (27) and (28) using class-level averages and not micro data.

The cost of these simplifying assumptions is that they gloss over the fact that even when a set of estimates has a causal interpretation, they are generated by variation for a particular group of individuals over a limited range of variation in the causing variable. There is a tradition in Psychology of distinguishing between the question of internal validity, i.e., whether an empirical relationship has a causal interpretation in the setting where it is observed, and the question of external validity, i.e., whether a set of internally valid estimates has predictive value for other groups or values of the response variable than those observed in a given study.[24] Constant-coefficient and linear models make it harder to discuss the two types of validity separately, since external validity is automatic in a constant-coefficients-linear setting. Taken literally, for example, the constant-effects model says that the economic consequences of military service are the same for high-school dropouts and college graduates. Similarly, the linear model says the economic value of a year of schooling is the same whether the year is second grade or the last year of college. We therefore discuss the interpretation of traditional estimators when constant-effects and linearity assumptions are relaxed.

2.3.1 Regression and the conditional expectation function

Returning to the schooling example of Section 2.2.1, the causal relationship of interest is $f_i(S)$, which describes the effect of schooling on earnings. In the absence of any further assumptions, the average causal response function is $E[f_i(S)]$, with average derivative $E[f_i'(S)]$. Earlier, we assumed $f_i'(S)$ is equal to a constant, $\rho$, in which case averaging is not needed. In practice, however, the derivative may be heterogeneous; that is, it may vary with $i$ or with $i$'s characteristics, $X_i$. In economics, models for heterogeneous treatment effects are commonly called "random coefficient" models (see, e.g., Bjorklund and Moffitt, 1987 and Heckman and Robb, 1985 for discussions of such models). The derivative also might be non-constant (i.e., vary with $S$). In either case, it makes sense to focus on the average response function or its average derivative. The principal statistical tool for doing this is the Conditional Expectation Function (CEF) of $Y_i$ given $S_i$, i.e., $E[Y_i \mid S_i = S]$ or $E[Y_i \mid X_i, S_i = S]$, viewed as a function of $S$.

[24] See, e.g., Campbell and Stanley (1963) and Meyer (1995).
To see the connection between the CEF and the average causal response, consider first the difference in average earnings between people with $S$ years of schooling and people with $S-1$ years of schooling:

$$E[Y_i \mid S_i = S] - E[Y_i \mid S_i = S-1] = E[f_i(S) - f_i(S-1) \mid S_i = S] + \{E[f_i(S-1) \mid S_i = S] - E[f_i(S-1) \mid S_i = S-1]\}.$$

The first term in this decomposition is the average causal effect of going from $S-1$ to $S$ years of schooling for those who actually have $S$ years of education. The counterfactual average $E[f_i(S-1) \mid S_i = S]$ is never observed, however. The second term reflects the fact that the average earnings of those with $S-1$ years of schooling do not necessarily provide a good answer to the "what if" question for those with $S$ years of schooling. This term is the counterpart of regression-style "omitted variables bias" for this more general model.

In this setting, the selection-on-observables assumption asserts that conditioning on a set of observed characteristics, $X_i$, serves to eliminate the omitted variables bias in naive comparisons. That is,

$$E[f_i(S-1) \mid X_i, S_i = S] = E[f_i(S-1) \mid X_i, S_i = S-1] \quad \text{for all } S, \tag{29}$$

so that conditional on $X_i$, the CEF and average causal response function are the same:

$$E[Y_i \mid X_i, S_i = S] = E[f_i(S) \mid X_i].$$

In this case, the conditional-on-X comparison does estimate the causal effect of schooling:

$$E[Y_i \mid X_i, S_i = S] - E[Y_i \mid X_i, S_i = S-1] = E[f_i(S) - f_i(S-1) \mid X_i].$$

This is analogous to the notion that adding $X_i$ to a regression eliminates omitted variables bias in OLS estimates of the returns to schooling.

The preceding discussion provides sufficient conditions for the CEF to have a causal interpretation. We next consider the relationship between regression parameters and the CEF. One interpretation of regression is that the population OLS slope vector provides the minimum mean squared error (MMSE) linear approximation to the CEF. This feature of regression is discussed in Goldberger's (1991) econometrics text (see especially Section 5.5).[25] A related property is the fact that regression coefficients have an "average derivative" interpretation. In multivariate regression models, however, this interpretation is complicated by the fact that the OLS slope vector is actually a matrix-weighted average of the gradient of the CEF. Matrix-weighted averages are difficult to interpret except in special cases (see Chamberlain and Leamer, 1976).[26]

One interesting special case where the OLS slope vector can be readily interpreted is when there is a single regressor of interest and the CEF of this regressor given all other regressors is linear, so that

$$E[S_i \mid X_i] = X_i'\pi, \tag{30}$$

where $\pi$ is a conformable vector of coefficients. This assumption is satisfied in the schooling regression, for example, in a model where all X-variables are discrete and the parameterization allows a separate effect for each possible value of $X_i$. This is not unrealistic in applications with large data sets; see, for example, Angrist and Krueger (1991) and Angrist (1998). In this case, the population regression coefficient from a regression of $Y_i$ on $X_i$ and $S_i$ can be written

$$\rho_r = \frac{E[(S_i - E[S_i \mid X_i])\,Y_i]}{E[(S_i - E[S_i \mid X_i])\,S_i]} = \frac{E[(S_i - E[S_i \mid X_i])\,E[Y_i \mid X_i, S_i]]}{E[(S_i - E[S_i \mid X_i])\,S_i]}, \tag{31}$$

which is derived by iterating expectations over $X_i$ and $S_i$.

Maintaining assumption (30), i.e., that $E[S_i \mid X_i]$ is linear, first consider the case where $E[Y_i \mid X_i, S_i]$ is linear in $S_i$ but not $X_i$. Then we can write $\rho_X \equiv E[Y_i \mid X_i, S_i = S] - E[Y_i \mid X_i, S_i = S-1]$ for all $S$, which means

$$E[Y_i \mid X_i, S_i] = E[Y_i \mid X_i, S_i = 0] + S_i\,\rho_X. \tag{32}$$

In other words, the CEF is linear in schooling, but the schooling coefficient is not constant and depends on $X_i$.

[25] Proof that OLS gives the MMSE linear approximation of the CEF: The vector of population regression coefficients for regressor vector $W_i$ solves $\min_b E(Y_i - W_i'b)^2$. But $(Y_i - W_i'b)^2 = [(Y_i - E[Y_i \mid W_i]) + (E[Y_i \mid W_i] - W_i'b)]^2$ and $E[(Y_i - E[Y_i \mid W_i])(E[Y_i \mid W_i] - W_i'b)] = 0$, so $\min_b E(E[Y_i \mid W_i] - W_i'b)^2$ has the same solution.

[26] The population slope vector is $E[W_iW_i']^{-1}E[W_iY_i] = E[W_iW_i']^{-1}E[W_iE(Y_i \mid W_i)]$. Linearizing the CEF, we have $E(Y_i \mid W_i) = E(Y_i \mid W_i = 0) + W_i'\nabla E(Y_i \mid \tilde{W}_i)$, where $\nabla E(Y_i \mid W_i)$ is the gradient of the conditional expectation function, and $\tilde{W}_i$ is a random variable that lies between $W_i$ and zero. So the slope vector is $E[W_iW_i']^{-1}E[(W_iW_i')\nabla E(Y_i \mid \tilde{W}_i)]$, which is a matrix-weighted average of the gradient with weights $(W_iW_i')$.
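The approximation property in footnote 25 has an exact sample counterpart that is easy to check: regressing individual outcomes on schooling gives the same coefficients as regressing the sample CEF (cell means of the outcome at each schooling level) on schooling, weighting by cell size. The sketch below uses simulated data; the data-generating process and the helper name `ols` are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
s = rng.integers(8, 21, size=n)                        # hypothetical years of schooling, 8..20
y = 1.0 + 0.9 * np.log(s) + rng.standard_normal(n)     # nonlinear CEF plus noise

def ols(x, y, w=None):
    """(Weighted) least-squares intercept and slope of y on x."""
    w = np.ones_like(y, dtype=float) if w is None else w
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    XtW = X.T * w                                      # apply weights to each observation
    return np.linalg.solve(XtW @ X, XtW @ y)

# Regression on individual observations.
b_micro = ols(s.astype(float), y)

# Regression of the sample CEF (cell means) on S, weighted by cell size.
levels, counts = np.unique(s, return_counts=True)
cef = np.array([y[s == v].mean() for v in levels])
b_cef = ols(levels.astype(float), cef, w=counts.astype(float))

print("micro-data OLS:   ", b_micro)
print("weighted CEF OLS: ", b_cef)
```

The two coefficient vectors agree up to floating-point error even though the simulated CEF is not linear, which is the sense in which regression fits the CEF rather than the individual data points.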
Substituting (32) into (31), we have

$$\rho_r = \frac{E[(S_i - E[S_i \mid X_i])^2\,\rho_X]}{E[(S_i - E[S_i \mid X_i])^2]} = \frac{E[\sigma_S^2(X_i)\,\rho_X]}{E[\sigma_S^2(X_i)]}, \tag{33}$$

where $\sigma_S^2(X_i) \equiv E[(S_i - E[S_i \mid X_i])^2 \mid X_i]$ is the variance of $S_i$ given $X_i$. So in this case, regression provides a variance-weighted average of the slope at each $X_i$. Values of $X_i$ that get the most weight are those where the conditional variance of schooling is largest.

What if the CEF varies with both $X_i$ and $S_i$? Let

$$\rho_{SX} \equiv E[Y_i \mid X_i, S_i = S] - E[Y_i \mid X_i, S_i = S-1],$$

where the $\rho_{SX}$ notation reflects variation with both $S$ and $X_i$. Then the coefficient on $S_i$ in a regression of $Y_i$ on $X_i$ and $S_i$ can be written

$$\rho_r = \frac{E\left[\sum_{S=1}^{\bar{s}} \rho_{SX}\,\mu_{SX}\right]}{E\left[\sum_{S=1}^{\bar{s}} \mu_{SX}\right]}, \tag{34}$$

where

$$\mu_{SX} = \left(E[S_i \mid X_i, S_i \ge S] - E[S_i \mid X_i, S_i < S]\right)\left(P[S_i \ge S \mid X_i]\,(1 - P[S_i \ge S \mid X_i])\right) \ge 0,$$

and $S$ takes on values in the set $\{0, 1, \ldots, \bar{s}\}$. This result, which is proved in the appendix, is a generalization of the formula for bivariate regression coefficients given by Yitzhaki (1996).[27]

[27] Yitzhaki gives examples and describes the OLS weighting function for a model with a single continuously distributed regressor in detail. For Normally distributed regressors, the weighting function is the Normal density function, so that OLS provides a density-weighted average of the sort discussed by Powell, Stock, and Stoker (1989). For an alternative non-parametric interpretation of OLS coefficients see Stoker (1986).

The weighting formula in (34) has a sum and an expectation. The sum averages $\rho_{SX}$ for all schooling increments, given a particular value of $X_i$ (this averaging matters if the CEF is nonlinear). The expectation then averages this sum in the distribution of $X_i$ (this averaging matters if the response function is heterogeneous).

The formula for the weights, $\mu_{SX}$, can be used to characterize the OLS slope vector. First, for any particular $X_i$, weight is given to $\rho_{SX}$ for each $S$ in proportion to the change in the conditional mean of $S_i$, as $S_i$ falls above or below $S$. More weight is also given to points in the domain of $f_i(S)$ that are close to the conditional median of $S_i$ given $X_i$ since this is where $P[S_i \ge S \mid X_i](1 - P[S_i \ge S \mid X_i])$ is maximized. Second, as in the linear case discussed above, weight is also given in proportion to the conditional variance of $S_i$ given $X_i$, except now this variance is defined separately for each $S$ using dummies for the event that $S_i \ge S$. Note also that the OLS estimate contains no information about the returns to schooling for values of $X_i$ where $P[S_i \ge S \mid X_i]$ equals 0 or 1. This includes values of $X_i$ where $S_i$ does not vary across observations, because $P[S_i \ge S \mid X_i] = 1$ if $P[S_i = S \mid X_i] = 1$.

The weighting function is illustrated in Figure 3 using data from the 1990 Census. The top panel plots an estimate of the earnings-schooling CEF, i.e., average log weekly wages against years of schooling for men with 8-20 years of schooling, adjusted for covariates. In other words, the plot shows $E\{E[Y_i \mid X_i, S_i = S]\}$, plotted against $S$. Years of schooling are not recorded in the 1990 Census and were therefore imputed from categorical schooling variables as described in the appendix. The X-variables are race (white, nonwhite), age (40-49), and state of birth. The covariates in this case are similar to those used in some of the specifications in the Angrist and Krueger (1991) study of the returns to schooling, although the data underlying this figure are more recent.

The dotted line in the figure plots the change in $E\{E[Y_i \mid X_i, S_i = S]\}$ with $S$. This is the covariate-adjusted difference in average log weekly wages at each schooling increment,

$$\rho_S \equiv E\{E[Y_i \mid X_i, S_i = S] - E[Y_i \mid X_i, S_i = S-1]\} = \sum_X \rho_{SX}\,P(X_i = X).$$

For example, the first point on the dotted line is an estimate of $\rho_9$, which is the average difference in earnings between those with 9 years of schooling and those with 8 years of schooling, adjusting for differences in the distribution of $X_i$ between the two schooling groups.[28] The returns measured in this way are remarkably stable until 13 years of schooling, but quite variable after that and sometimes even negative. The more lightly shaded line in the figure is the OLS regression line obtained from fitting equation (1) with a saturated model for $X_i$ (in other words, the model includes a full set of dummies, $d_{iX}$, which equal one when $X_i = X$ for every value $X$; the OLS estimate of $\rho$ in this case is .094). This parameterization satisfies assumption (30), i.e., $E[S_i \mid X_i]$ is linear.
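The variance-weighting result in (33) holds exactly in a sample when the model for the covariates is saturated, and this can be checked numerically. The sketch below is illustrative only: the four covariate cells, the cell-specific returns, and the cell-specific dispersion of schooling are all made-up values. It compares the OLS coefficient on schooling, with a full set of cell dummies, to the average of within-cell slopes weighted by cell size times the within-cell variance of schooling.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cells, n = 4, 4000
x = rng.integers(0, n_cells, size=n)                   # discrete covariate cell
spread = np.array([0.5, 1.0, 2.0, 3.0])[x]             # sd of S given X differs by cell
s = 12 + x + spread * rng.standard_normal(n)           # schooling
returns = np.array([0.15, 0.10, 0.06, 0.02])           # hypothetical cell-specific returns
y = 0.3 * x + returns[x] * s + 0.5 * rng.standard_normal(n)

# OLS of Y on S and a full set of cell dummies (saturated model for X).
dummies = (x[:, None] == np.arange(n_cells)).astype(float)
rho_ols = np.linalg.lstsq(np.column_stack([s, dummies]), y, rcond=None)[0][0]

# Sample analog of (33): variance-weighted average of within-cell slopes.
num = 0.0
den = 0.0
for c in range(n_cells):
    sc, yc = s[x == c], y[x == c]
    dev = sc - sc.mean()
    slope_c = (dev @ yc) / (dev @ dev)                 # within-cell OLS slope
    num += (dev @ dev) * slope_c                       # weight = n_c * Var(S | X = c)
    den += dev @ dev
rho_weighted = num / den

print(f"OLS coefficient on S:       {rho_ols:.4f}")
print(f"variance-weighted average:  {rho_weighted:.4f}")
print(f"unweighted mean of returns: {returns.mean():.4f}")
```

Because the cells with the most dispersed schooling were assigned the lowest returns in this design, the OLS coefficient should lie well below the unweighted mean of the cell-specific returns.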
The figure illustrates the sense in which OLS captures the average return. The OLS weighting function for each value of $S_i$ is plotted in the lower panel, along with the histogram of schooling.[29] Like the distribution of schooling itself, the OLS weighting scheme puts the most weight on values between 12 and 16. It is interesting to note, however, that while the histogram of schooling is bimodal, the weighting function is smoother and unimodal. Moreover, the population average of $\rho_S$, i.e., the weighted average of the covariate-adjusted return using the schooling histogram, $\sum_S \rho_S\,P(S_i = S)$, is .144, which is considerably larger than the OLS estimate. This is because about half of the sample has 12-13 years of schooling, where the returns are .136 and .148. The OLS weighting function gives more weight than the histogram to other schooling values, like 14, 15, and 17, where the returns are small and even negative.

[28] The unadjusted difference in average wages is $\{E[Y_i \mid S_i = S] - E[Y_i \mid S_i = S-1]\}$, which equals $\{E[E(Y_i \mid X_i, S_i = S) \mid S_i = S] - E[E(Y_i \mid X_i, S_i = S-1) \mid S_i = S-1]\}$.

2.3.2 Matching instead of regression

The previous section shows how regression produces a weighted average of covariate-specific effects for each value of the causing variable. The empirical consequences of the OLS weighting scheme in any particular application depend on the distribution of regressors and the amount of heterogeneity in the causal effect of interest. Matching methods provide an alternative estimation strategy that affords more control over the weighting scheme used to produce average causal effects. Matching methods also have the advantage of making the comparisons that are used for statistical identification transparent.

Matching is most practical in cases where the causing variable takes on two values, as in the union status and military service examples discussed previously. Again, we use the example of estimating the effect of military service to illustrate this technique. Angrist (1998) reported matching and regression estimates of the effects of voluntary military service on civilian earnings. As in the Vietnam study, the potential outcomes are $Y_{0i}$, denoting what someone would earn if they did not serve in the military, and $Y_{1i}$, denoting earnings as a veteran. Since $Y_{1i} - Y_{0i}$ is not constant, and we never observe both potential outcomes for any one person, it makes sense to focus on average effects.
One possibility is the "average treatment effect," $E[Y_{1i} - Y_{0i}]$, but this is not usually the first choice in studies of this kind since people who serve in the military tend to have personal characteristics that differ, on average, from those of people who didn't serve. The manpower policy innovations that are typically contemplated affect those individuals who either would now serve or who might be expected to serve in the future. For example, between 1989 and 1992, the size of the military declined sharply because of increasing enlistment standards. Policy makers would like to know whether the people who would have served under the old rules but are unable to enlist under the new rules were hurt by the lost opportunity for service. This sort of reasoning leads researchers to try to estimate the "effect of treatment on the treated," which is $E[Y_{1i} - Y_{0i} \mid D_i = 1]$ in our notation.[30]

[29] Since the regression model has covariates, the weights vary with $X_i$ as well as for each schooling increment. The average weighting function plotted in the figure is $\sum_X \mu_{SX}\,P(X_i = X)$.

As in the study of Vietnam veterans, simply comparing the earnings of veterans and nonveterans is unlikely to provide a good estimate of the effect of military service on veterans. The comparison by veteran status is

$$E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] = E[Y_{1i} - Y_{0i} \mid D_i = 1] + \{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]\}.$$

This is the average causal effect of military service on veterans, $E[Y_{1i} - Y_{0i} \mid D_i = 1]$, plus a bias term attributable to the fact that the earnings of nonveterans are not necessarily representative of what veterans would have earned had they not served in the military. For example, veterans may have higher earnings simply because they must have higher test scores and be high school graduates to meet military screening rules.

The bias term in naive comparisons goes away if $D_i$ is randomly assigned because then $D_i$ will be independent of $Y_{0i}$ and $Y_{1i}$. Since voluntary military service is not randomly assigned (and there is no longer a draft lottery), Angrist (1998) used matching and regression techniques to control for observed differences between veterans and nonveterans who applied to get into the all-volunteer forces between 1979 and 1982. The motivation for a control strategy in this case is the fact that the military screens applicants to the armed forces primarily on the basis of age, schooling, and test scores, characteristics that are observed in the Angrist (1998) data.
Identification in this case is based on the claim that after conditioning on all of the observed characteristics that are known to affect veteran status, veterans and nonveterans are comparable in the sense that

$$E[Y_{0i} \mid X_i, D_i = 1] = E[Y_{0i} \mid X_i, D_i = 0]. \tag{35}$$

This assumption seems plausible for two reasons. First, the nonveterans who provide observations on $Y_{0i}$ did in fact apply to get into the military. Second, selection for military service from the pool of applicants is based almost entirely on variables that are observed and included in the X-variables. Variation in veteran status conditional on $X_i$ comes solely from the fact that some qualified applicants nevertheless fail to enlist at the last minute. Of course, the considerations that lead a qualified applicant to "drop out" of the enlistment process could be related to earnings potential, so assumption (35) is clearly not guaranteed.

Given assumption (35), the effect of treatment on the treated can be constructed as follows:

$$\begin{aligned} E[Y_{1i} - Y_{0i} \mid D_i = 1] &= E\{E[Y_{1i} \mid X_i, D_i = 1] - E[Y_{0i} \mid X_i, D_i = 1] \mid D_i = 1\} \\ &= E\{E[Y_{1i} \mid X_i, D_i = 1] - E[Y_{0i} \mid X_i, D_i = 0] \mid D_i = 1\} \\ &= E[\delta_X \mid D_i = 1], \end{aligned} \tag{36}$$

where

$$\delta_X \equiv E[Y_i \mid X_i, D_i = 1] - E[Y_i \mid X_i, D_i = 0].$$

Here $\delta_X$ is a random variable that represents the set of differences in mean earnings by veteran status corresponding to each value taken on by $X_i$. This is analogous to $\rho_X$ that was defined for the schooling problem. Note, however, that since $D_i$ is binary, the response function is automatically linear in $D_i$.

The matching estimator in Angrist (1998) uses the fact that $X_i$ is discrete to construct the sample analog of (36), which can also be written

$$E[Y_{1i} - Y_{0i} \mid D_i = 1] = \sum_X \delta_X\,P(X_i = X \mid D_i = 1), \tag{37}$$

where $P(X_i = X \mid D_i = 1)$ is the probability mass function for $X_i$ given $D_i = 1$ and the summation is over the values of $X_i$.[31] In this case, $X_i$ takes on values determined by all possible combinations of year of birth, AFQT test-score group,[32] year of application to the military, and educational attainment at the time of application.

[30] Heckman and Robb (1985) make this point about the effect of subsidized training programs.

[31] This matching estimator is discussed by Rubin (1977) and used by Card and Sullivan (1988) to estimate the effect of subsidized training on employment.

[32] This is the Armed Forces Qualification Test, used by the military to screen applicants.
Naive comparisons clearly overestimate the benefit of military service. This can be seen in Table 6, which reports differences-in-means, matching, and regression estimates of the effect of voluntary military service on the 1988-91 Social Security-taxable earnings of men who applied to join the military between 1979 and 1982. The matching estimates were constructed from the sample analog of (37), i.e., from covariate-value-specific differences in earnings, $\delta_X$, weighted to form a single estimate using the distribution of covariates among veterans. Although white veterans earn \$1,233 more than nonveterans, this difference becomes negative once the adjustment for differences in covariates is made. Similarly, while non-white veterans earn \$2,449 more than nonveterans, controlling for covariates reduces this to \$840.

Table 6 also shows regression estimates of the effect of voluntary service, controlling for exactly the same covariates used in the matching estimates. These are estimates of $\delta_r$ in the equation

$$Y_i = \sum_X d_{iX}\,\beta_X + \delta_r D_i + \varepsilon_i, \tag{38}$$

where $\beta_X$ is a regression-effect for $X_i = X$ and $\delta_r$ is the regression parameter. This corresponds to a saturated model for $X_i$. Despite the fact that the matching and regression estimates control for the same variables, the regression estimates are significantly larger than the matching estimates for both whites and nonwhites.[33]

The reason the regression estimates are larger than the matching estimates is that the two estimation strategies use different weighting schemes. While the matching estimator combines covariate-value-specific estimates, $\delta_X$, to produce an estimate of the effect of treatment on the treated, regression produces a variance-weighted average of these effects.
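To see this, note that since $D_i$ is binary and $E[D_i \mid X_i]$ is linear, formula (33) from the previous section implies

$$\delta_r = \frac{E[(D_i - E[D_i \mid X_i])^2\,\delta_X]}{E[(D_i - E[D_i \mid X_i])^2]} = \frac{E[\sigma_D^2(X_i)\,\delta_X]}{E[\sigma_D^2(X_i)]}.$$

But in this case, $\sigma_D^2(X_i) = P(D_i = 1 \mid X_i)(1 - P(D_i = 1 \mid X_i))$, so

$$\delta_r = \frac{\sum_X \delta_X\,[P(D_i = 1 \mid X_i = X)(1 - P(D_i = 1 \mid X_i = X))]\,P(X_i = X)}{\sum_X [P(D_i = 1 \mid X_i = X)(1 - P(D_i = 1 \mid X_i = X))]\,P(X_i = X)}.$$

[33] The formula for the covariance of regression and matching estimates is derived in Angrist (1998, p. 274).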
In other words, regression weights each covariate-specific treatment effect by $[P(D_i = 1 \mid X_i = X)(1 - P(D_i = 1 \mid X_i = X))]\,P(X_i = X)$. In contrast, the matching estimator, (37), can be written

$$E[Y_{1i} - Y_{0i} \mid D_i = 1] = \frac{\sum_X \delta_X\,P(D_i = 1 \mid X_i = X)\,P(X_i = X)}{\sum_X P(D_i = 1 \mid X_i = X)\,P(X_i = X)},$$

because

$$P(X_i = X \mid D_i = 1) = \frac{P(D_i = 1 \mid X_i = X)\,P(X_i = X)}{P(D_i = 1)}.$$

The weights underlying $E[Y_{1i} - Y_{0i} \mid D_i = 1]$ are proportional to the probability of veteran status at each value of the covariates. So the men most likely to serve get the most weight in estimates of the effect of treatment on the treated. In contrast, regression estimation weights each of the underlying treatment effects by the conditional variance of treatment status, which in this case is maximized when $P(D_i = 1 \mid X_i = X) = 1/2$. Of course, the difference in weighting schemes is of no importance if the effect of interest does not vary with $X_i$. But Figure 4, which plots X-specific estimates ($\delta_X$) of the effect of veteran status on average 1988-91 earnings against $P[D_i = 1 \mid X_i = X]$, shows that the men who were most likely to serve in the military benefit least from their service. This fact leads matching estimates of the effect of military service to be smaller than regression estimates based on the same vector of controls.

2.3.3 Matching using the propensity score

It is easy to construct a matching estimator based on (37) when, as in Angrist (1998), the conditioning variables are discrete and the sample has many observations at almost every set of values taken on by the vector of explanatory variables. What about situations where $X_i$ is continuous, so that exact matching is not practical? Problems involving more finely distributed X-variables are often solved by aggregating values to make coarser groupings or by pairing observations that have similar, though not necessarily identical values. See Cochran (1965), Rubin (1973), or Rosenbaum (1995, Chapter 3) for discussions of this approach. More recently, Deaton and Paxson (1998) used nonparametric methods to accommodate continuous-valued control variables in a matching estimator.
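For the discrete-covariate case, the contrast between the matching estimator (37) and the saturated regression (38) can be reproduced in a few lines. The sketch below uses simulated data, not the Angrist (1998) sample: the three covariate cells, the enlistment probabilities `p_x`, and the cell-specific effects `delta_x` are hypothetical, chosen so that the effect is smallest where the probability of treatment is highest.

```python
import numpy as np

rng = np.random.default_rng(3)
p_x = np.array([0.1, 0.5, 0.9])                # hypothetical P(D=1 | X=x)
delta_x = np.array([3000.0, 1000.0, -500.0])   # hypothetical effects, falling in P(D=1 | X)
n = 6000
x = rng.integers(0, 3, size=n)
d = (rng.random(n) < p_x[x]).astype(float)
y = 10000.0 + 2000.0 * x + delta_x[x] * d + 1000.0 * rng.standard_normal(n)

# Cell-level sample quantities.
cells = np.arange(3)
n_x = np.array([(x == c).sum() for c in cells])            # cell sizes
n1_x = np.array([d[x == c].sum() for c in cells])          # treated per cell
p_hat = n1_x / n_x                                         # estimated P(D=1 | X)
d_hat = np.array([y[(x == c) & (d == 1)].mean() - y[(x == c) & (d == 0)].mean()
                  for c in cells])                         # cell-specific differences

# Matching estimator (37): weight by the covariate distribution among the treated.
matching = (d_hat * n1_x).sum() / n1_x.sum()

# Regression (38): Y on D and a full set of cell dummies.
dummies = (x[:, None] == cells).astype(float)
delta_r = np.linalg.lstsq(np.column_stack([d, dummies]), y, rcond=None)[0][0]

# Regression weights: conditional variance of D times cell size.
w = p_hat * (1.0 - p_hat) * n_x
regression_formula = (d_hat * w).sum() / w.sum()

print(f"matching estimate:           {matching:.1f}")
print(f"regression estimate:         {delta_r:.1f}")
print(f"variance-weighted delta_X's: {regression_formula:.1f}")
```

The regression coefficient coincides with the variance-weighted average of the cell-specific differences, and in this design it exceeds the matching estimate because the cell with the highest treatment probability has the smallest effect.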
The problem of how to aggregate the X-variables also motivates a matching method first developed in a series of papers by Rosenbaum and Rubin (1983, 1984, 1985). These papers show that full control for observed covariates can be obtained by controlling solely for a particular function of $X_i$, called the propensity score, which is simply the conditional probability of treatment, $p(X_i) = P(D_i=1 \mid X_i)$. The formal result underlying this approach says that if conditioning on $X_i$ eliminates selection bias,

$$E[Y_{0i} \mid X_i, D_i=1] = E[Y_{0i} \mid X_i, D_i=0],$$

then it must also be true that conditioning on $p(X_i)$ eliminates selection bias:

$$E[Y_{0i} \mid p(X_i), D_i=1] = E[Y_{0i} \mid p(X_i), D_i=0].$$

This leads to the following modification of (36):

$$E[Y_{1i} - Y_{0i} \mid D_i=1] = E\{E[Y_{1i} \mid X_i, D_i=1] - E[Y_{0i} \mid X_i, D_i=0] \mid D_i=1\} = E\{E[Y_{1i} \mid p(X_i), D_i=1] - E[Y_{0i} \mid p(X_i), D_i=0] \mid D_i=1\}.$$

Of course, to make this expression into an estimator, the propensity score $p(X_i)$ must first be estimated. The practical value of this result is that in some cases, it may be easier to estimate $p(X_i)$ and then condition on the estimates of $p(X_i)$ than to condition on $X_i$ directly. For example, even if $X_i$ is continuous, $p(X_i)$ may have some "flat spots", or we may have some prior information about $p(X_i)$. The propensity score approach is also conceptually appealing because it focuses attention on variables that are related to the regressor of interest. Although $Y_i$ may vary with $X_i$ in complicated ways, this is only of concern for values of $X_i$ where $p(X_i)$ varies as well.

An example using the propensity score in labor economics is Dehejia and Wahba's (1995) reanalysis of the National Supported Work (NSW) training program studied by Lalonde (1986). The NSW provided training to different groups of "hard-to-employ" men and women in a randomized demonstration project. Lalonde's study uses observational control groups from the Current Population Survey (CPS) and the Panel Study of Income Dynamics (PSID) to look at whether econometric methods are likely to generate conclusions similar to those found in the experimental study. One hurdle facing the non-experimental investigator
attempting to construct a control group for trainees is how to control for lagged earnings. As we noted earlier, controlling for lagged earnings is important since participants in government training programs are often observed to experience a decline in earnings before entering the program (see, e.g., Ashenfelter and Card, 1985, and the Heckman, Lalonde, and Smith chapter on training in this volume). Lalonde (1986) found that non-experimental methods based on regression models, including models with fixed effects and control for lagged earnings, fail to replicate the NSW experimental findings.

Using the same observational control groups as Lalonde (1986) did, Dehejia and Wahba (1995) control for lagged earnings and other covariates by first estimating a logit model that relates participation in the program to the covariates and two lags of earnings. Following an example by Rosenbaum and Rubin (1984), they then divide the sample into quintiles on the basis of fitted values from this logit, i.e., based on estimates of the propensity score. The overall estimate of the effect of treatment on the treated is the difference between average trainee and average control earnings in each quintile, weighted by the number of trainees in the quintile. The estimates produced using this method are similar to those based on the experimental random assignment (and apparently more reliable than regression estimates). It should be clear, however, that use of propensity score methods requires a number of decisions about how to model and control for the score. There is little in the way of formal statistical theory to guide this process, and the question of whether propensity score methods are better than other methods remains open. See Heckman, Ichimura, and Todd (1997) for further empirical evidence, and Hahn (1998) for recent theoretical results on efficiency considerations in these models.

2.3.4 Interpreting instrumental variables estimates

The discussion of IV in Section 2.2.3 used the example of veteran status, with two potential outcomes and a constant causal effect, $Y_{1i} - Y_{0i} = \delta$.
What is the interpretation of an IV estimate when the constant-effects assumption is relaxed? We first discuss this for a model where the causing variable is binary, as in the veteran status example, turning afterwards to a more general model. As before, the discussion is initially limited to
the Wald estimator since this is an important and easily-analyzed IV estimator. Without the constant-effects assumption, we can write the observed outcome, $Y_i$, in terms of potential outcomes as

$$Y_i = Y_{0i} + (Y_{1i} - Y_{0i})D_i = \beta_0 + \delta_i D_i + \eta_i \tag{39}$$

where $\beta_0 \equiv E[Y_{0i}]$ and $\delta_i \equiv Y_{1i} - Y_{0i}$ is the heterogeneous causal effect. The expression after the second equals sign is a "random-coefficients" version of the causal model in Section 2.2.3 (see equation 23). To facilitate the discussion of IV, we also introduce some notation for the first-stage relationship between the causing variable, $D_i$, and the binary instrument, $Z_i$. To allow for as much heterogeneity as possible, the first stage equation is written in a manner similar to (39):

$$D_i = D_{0i} + (D_{1i} - D_{0i})Z_i = \pi_0 + \pi_{1i} Z_i + \nu_i \tag{40}$$

where $\pi_0 \equiv E[D_{0i}]$ and $\pi_{1i} \equiv (D_{1i} - D_{0i})$ is the causal effect of the instrument on $D_i$. In the draft lottery example, $D_{0i}$ tells us whether $i$ would serve in the military if not draft-eligible and $D_{1i}$ tells us whether $i$ would serve when draft-eligible. The effect of draft-eligibility on $D_i$ is the difference between these two potential treatment assignments.

The principal identifying assumption in this setup is that the vector of potential outcomes and potential treatment assignments is jointly independent of the instrument. Formally,

$$\{Y_{1i}, Y_{0i}, D_{1i}, D_{0i}\} \perp\!\!\!\perp Z_i,$$

where "$\perp\!\!\!\perp$" is notation for statistical independence (see, e.g., Dawid, 1979, or Rosenbaum and Rubin, 1983).$^{34}$ In the lottery example, $Z_i$ is clearly independent of $\{D_{0i}, D_{1i}\}$ since $Z_i$ was randomly assigned. As noted in Section 2.2.3, however, independence of $\{Y_{0i}, Y_{1i}\}$ and $Z_i$ is not guaranteed by randomization since $Y_{0i}$ and $Y_{1i}$ refer to potential outcomes under alternative assignments of veteran status and not $Z_i$ itself. Even though $Z_i$ was randomly assigned, so the relationship between $Z_i$ and $Y_i$ is clearly causal, in principle there might be reasons other than veteran status for an effect of draft-eligibility on earnings. The independence assumption, which is similar to the assumption that $Z_i$ and $\eta_i$ are uncorrelated in the constant-effects model, rules this possibility out.

$^{34}$ The independence assumption using random-coefficients notation is $\{\delta_i, \eta_i, \pi_{1i}, \nu_i\} \perp\!\!\!\perp Z_i$.

A second assumption that is useful here, and one that does not arise in a constant-effects setting, is that either $\pi_{1i} \geq 0$ for all $i$ or $\pi_{1i} \leq 0$ for all $i$. This monotonicity assumption, introduced by Imbens and Angrist (1994), means that while the instrument may have no effect on some people, it must be the case that the instrument acts in only one direction, either $D_{1i} \geq D_{0i}$ or $D_{1i} \leq D_{0i}$ for all $i$. In what follows, we assume $D_{1i} \geq D_{0i}$ for all $i$. In the draft-lottery example, this means that although draft-eligibility may have had no effect on the probability of military service for some men, there is no one who was actually kept out of the military by being draft-eligible. Without monotonicity, instrumental variables estimators are not guaranteed to estimate a weighted average of the underlying causal effects, $Y_{1i} - Y_{0i}$.

Given independence and monotonicity, the Wald estimator in this example can be interpreted as the effect of veteran status on those whose treatment status was changed by the instrument. This parameter is called the local average treatment effect (LATE; Imbens and Angrist, 1994), and can be written as follows:

$$\frac{E[Y_i \mid Z_i=1] - E[Y_i \mid Z_i=0]}{E[D_i \mid Z_i=1] - E[D_i \mid Z_i=0]} = E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}] = E[\delta_i \mid \pi_{1i} > 0].$$

Thus, IV estimates of effects of military service using the draft lottery estimate the effect of military service on men who served because they were draft-eligible, but would not otherwise have served.$^{35}$ This obviously excludes volunteers and men who were exempted from military service for medical reasons, but it includes men for whom the draft policy was binding. Much of the debate over compulsory military service focused on draftees, so LATE is clearly a parameter of policy interest in the Vietnam context.

The LATE parameter can be linked to the parameters in traditional econometric models for causal effects.
One commonly used specification for dummy endogenous regressors like veteran status is a latent-index model (see, e.g., Heckman, 1978), where

$$D_i = 1 \text{ if } \gamma_0 + \gamma_1 Z_i > \nu_i \text{ and } 0 \text{ otherwise},$$

and $\nu_i$ is a random factor assumed to be independent of $Z_i$. This specification can be motivated by comparisons of utilities and costs under alternative choices. In the notation of equation (40), the latent-index model characterizes potential treatment assignments as:

$$D_{0i} = 1 \text{ if } [\gamma_0 > \nu_i] \quad \text{and} \quad D_{1i} = 1 \text{ if } [\gamma_0 + \gamma_1 > \nu_i].$$

Note that in this model, monotonicity is automatically satisfied since $\gamma_1$ is a constant. Assuming $\gamma_1 > 0$,

$$E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}] = E[Y_{1i} - Y_{0i} \mid \gamma_0 + \gamma_1 > \nu_i > \gamma_0],$$

which is a function of the structural first-stage parameters, $\gamma_0$ and $\gamma_1$. The LATE parameter is representative of a larger group the larger is the first-stage parameter, $\gamma_1$.

LATE can also be compared with the effect of treatment on the treated for this problem, which depends on the same first-stage parameters and the marginal distribution of $Z_i$. Note that in the latent-index specification, $D_i = 1$ in one of two ways: either $\gamma_0 > \nu_i$, in which case the instrument doesn't matter, or $\gamma_0 + \gamma_1 > \nu_i > \gamma_0$ and $Z_i = 1$. Since these two possibilities partition the group with $D_i = 1$, we can write

$$\begin{aligned}
E[Y_{1i} - Y_{0i} \mid D_i=1] &= P(D_i=1)^{-1} \times \{E[Y_{1i} - Y_{0i} \mid \gamma_0 + \gamma_1 > \nu_i > \gamma_0, Z_i=1]\,P(\gamma_0 + \gamma_1 > \nu_i > \gamma_0, Z_i=1) \\
&\qquad + E[Y_{1i} - Y_{0i} \mid \gamma_0 > \nu_i]\,P(\gamma_0 > \nu_i)\} \\
&= P(D_i=1)^{-1} \times \{E[Y_{1i} - Y_{0i} \mid \gamma_0 + \gamma_1 > \nu_i > \gamma_0]\,P(\gamma_0 + \gamma_1 > \nu_i > \gamma_0)\,P(Z_i=1) \\
&\qquad + E[Y_{1i} - Y_{0i} \mid \gamma_0 > \nu_i]\,P(\gamma_0 > \nu_i)\}.
\end{aligned}$$

This shows that the effect on the treated is a weighted average of LATE and the effect on men whose treatment status is unaffected by the instrument.$^{36}$ Note, however, that although LATE equals the Wald estimator, the effect on the treated is not identified in this case without additional assumptions (see, e.g., Angrist and Imbens, 1991).

$^{35}$ Proof of the LATE result: $E[Y_i \mid Z_i=1] = E[Y_{0i} + (Y_{1i} - Y_{0i})D_i \mid Z_i=1]$, which equals $E[Y_{0i} + (Y_{1i} - Y_{0i})D_{1i}]$ by independence. Likewise $E[Y_i \mid Z_i=0] = E[Y_{0i} + (Y_{1i} - Y_{0i})D_{0i}]$, so the numerator of the Wald estimator is $E[(Y_{1i} - Y_{0i})(D_{1i} - D_{0i})]$. Monotonicity means $D_{1i} - D_{0i}$ equals one or zero, so $E[(Y_{1i} - Y_{0i})(D_{1i} - D_{0i})] = E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}]\,P[D_{1i} > D_{0i}]$. A similar argument shows $E[D_i \mid Z_i=1] - E[D_i \mid Z_i=0] = E[D_{1i} - D_{0i}] = P[D_{1i} > D_{0i}]$.

$^{36}$ Note that $P[\gamma_0 + \gamma_1 > \nu_i > \gamma_0]\,P[Z_i=1] + P[\gamma_0 > \nu_i] = (E[D_i \mid Z_i=1] - E[D_i \mid Z_i=0])\,P(Z_i=1) + E[D_i \mid Z_i=0] = P[D_i=1]$, so the weights sum to one. In the special case where $P[\gamma_0 > \nu_i] = 0$ for everyone, LATE and the effect of treatment on the treated are the same.
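The LATE result and the weighted-average decomposition of the effect on the treated can be illustrated with a small numerical sketch in Python. Everything in it is an assumption made for illustration: the three groups, their sizes, untreated outcomes, and effects are invented, and the group labels ("never_taker", "complier", "always_taker") are shorthand of ours for people with $D_{1i} = D_{0i} = 0$, $D_{1i} > D_{0i}$, and $D_{1i} = D_{0i} = 1$. Each person appears once with $Z_i = 0$ and once with $Z_i = 1$, so that independence holds exactly and $P(Z_i=1) = 1/2$.

```python
import numpy as np

# Hypothetical population (all numbers invented): for each group, the number of
# people, potential treatments D0/D1, untreated outcome Y0, and causal effect.
types = {
    "never_taker":  dict(n=50, d0=0, d1=0, y0=10.0, delta=1.0),
    "complier":     dict(n=30, d0=0, d1=1, y0=12.0, delta=-2.0),
    "always_taker": dict(n=20, d0=1, d1=1, y0=15.0, delta=3.0),
}

def build_population(types):
    """Return arrays (z, d, y, delta); everyone appears once with Z=0 and once with Z=1."""
    rows = []
    for t in types.values():
        for z in (0, 1):
            d = t["d1"] if z == 1 else t["d0"]
            y = t["y0"] + t["delta"] * d
            rows += [(z, d, y, t["delta"])] * t["n"]
    return np.array(rows, dtype=float).T

z, d, y, delta = build_population(types)

# Wald estimator: reduced-form difference divided by first-stage difference.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())

# LATE is the average effect for people whose treatment is changed by Z.
late = types["complier"]["delta"]

# Effect of treatment on the treated, computed directly from the population.
effect_on_treated = delta[d == 1].mean()

# The same quantity as a weighted average of LATE and the always-taker effect.
total = sum(t["n"] for t in types.values())
share = {k: t["n"] / total for k, t in types.items()}
p_z1 = (z == 1).mean()
p_d1 = share["always_taker"] + share["complier"] * p_z1
decomposition = (late * share["complier"] * p_z1
                 + types["always_taker"]["delta"] * share["always_taker"]) / p_d1

print(wald, late, effect_on_treated, decomposition)
```

In this invented population the Wald estimator recovers the effect for the group moved by the instrument, while the effect on the treated differs because it also averages in people who are treated regardless of the instrument.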
Interpreting IV estimates with cardinal variables

So far the discussion of IV has focused on models with a binary regressor. What does the Wald estimator estimate when the regressor takes on more than two values, like schooling? As in the discussion of regression in Section 2.2.1, suppose the causal relationship of interest is characterized by a function that describes exactly what a given individual would earn if they obtained different levels of education. This relationship is person-specific, so we write $f_i(S)$ to denote the earnings or wage that $i$ would receive after obtaining $S$ years of education. The observed earnings level is $Y_i = f_i(S_i)$. Again, it is useful to have a general notation for the first-stage relationship between $S_i$ and $Z_i$:

$$S_i = S_{0i} + (S_{1i} - S_{0i})Z_i = \phi_0 + \phi_{1i} Z_i + \nu_i \tag{41}$$

where $S_{0i}$ is the schooling $i$ would get if $Z_i=0$, $S_{1i}$ is the schooling $i$ would get if $Z_i=1$, and $\phi_0 \equiv E[S_{0i}]$. In random-coefficients notation, the causal effect of $Z_i$ on $S_i$ is $\phi_{1i} \equiv S_{1i} - S_{0i}$.

To make this concrete, suppose the instrument is a dummy for being born in the second, third, or fourth quarter of the year, as for the Wald estimate in Angrist and Krueger (1991, Table 3). Since compulsory attendance laws allow people to drop out of school on their birthday (typically the 16th) and most children enter school in September of the year they turn 6, pupils born later in the year are kept in school longer than those born earlier. In this example, $S_{0i}$ is the schooling $i$ would get if born in the first quarter and $S_{1i}$ is the schooling $i$ would get if born in a later quarter.

Now the independence assumption is $\{f_i(S), S_{1i}, S_{0i}\} \perp\!\!\!\perp Z_i$ and the monotonicity assumption is $S_{1i} \geq S_{0i}$. This means the instrument is independent of what an individual could earn with schooling level $S$, and independent of the random elements in the first stage.$^{37}$ Using the independence assumption and equation (41) to substitute for $S_i$, the Wald estimator can be written:

$$\frac{E[f_i(S_i) \mid Z_i=1] - E[f_i(S_i) \mid Z_i=0]}{E[S_i \mid Z_i=1] - E[S_i \mid Z_i=0]} = \frac{E[f_i(S_{1i}) - f_i(S_{0i})]}{E[S_{1i} - S_{0i}]} = E\left\{\omega_i \left[\frac{f_i(S_{1i}) - f_i(S_{0i})}{S_{1i} - S_{0i}}\right]\right\} \tag{42}$$

where $\omega_i = (S_{1i} - S_{0i})/E[S_{1i} - S_{0i}]$. This is a weighted average arc slope of $f_i(S)$ on the interval $[S_{0i}, S_{1i}]$.
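As a quick check on (42), the following Python sketch evaluates both sides of the equation for a handful of hypothetical people. The schooling levels and the concave form $f_i(S) = a_i S - b_i S^2$ with the coefficients shown are invented for illustration only; they are not taken from any of the studies cited here. Under independence, the conditional means given $Z_i$ equal the population means of the corresponding potential values, which is what the code uses.

```python
import numpy as np

# Hypothetical people (numbers invented): schooling without the instrument (s0),
# schooling with it (s1), and coefficients of a concave earnings function
# f_i(S) = a_i * S - b_i * S**2.
people = np.array([
    # s0,  s1,   a,     b
    [10., 11., 0.12, 0.002],
    [11., 11., 0.10, 0.001],   # schooling unaffected by the instrument
    [12., 14., 0.15, 0.003],
    [16., 16., 0.08, 0.001],   # schooling unaffected by the instrument
])
s0, s1, a, b = people.T

def f(s):
    """Person-specific earnings function, evaluated elementwise."""
    return a * s - b * s ** 2

# Left side of (42): under independence, E[Y|Z=1] = E[f_i(S_1i)] and
# E[Y|Z=0] = E[f_i(S_0i)], and similarly for schooling.
wald = (f(s1).mean() - f(s0).mean()) / (s1.mean() - s0.mean())

# Right side of (42): weighted average of arc slopes. People whose schooling
# does not move have weight zero, so they are dropped before dividing.
gap = s1 - s0
movers = gap > 0
omega = gap / gap.mean()
arc_slope = (f(s1)[movers] - f(s0)[movers]) / gap[movers]
weighted_slope = (omega[movers] * arc_slope).sum() / len(gap)

print(wald, weighted_slope)
```

The person whose schooling rises by two years receives twice the weight of the person whose schooling rises by one year, and the two people unaffected by the instrument do not contribute at all.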
We can simplify further using the fact that $f_i(S_{1i}) = f_i(S_{0i}) + f_i'(S_i^*)(S_{1i} - S_{0i})$, for some $S_i^*$ in the interval $[S_{0i}, S_{1i}]$.$^{38}$ Now we can write the Wald estimator as an average derivative:

$$\frac{E[f_i(S_{1i}) - f_i(S_{0i})]}{E[S_{1i} - S_{0i}]} = \frac{E[(S_{1i} - S_{0i})\,f_i'(S_i^*)]}{E[S_{1i} - S_{0i}]} = E[\omega_i f_i'(S_i^*)]. \tag{43}$$

Given the monotonicity assumption, $\omega_i$ is positive for everyone, so the Wald estimator is a weighted average of individual-specific slopes at a point in the interval $[S_{0i}, S_{1i}]$. The weight each person gets is proportional to the size of the causal effect of the instrument on him or her. The range of variation in $f_i(S)$ summarized by this average is always between $S_{0i}$ and $S_{1i}$.

$^{37}$ For example, if $f_i(S) = \beta_0 + \rho_i S + \eta_i$, then we assume $\{\rho_i, \eta_i, \phi_{1i}, \nu_i\}$ are independent of $Z_i$.

Angrist, Imbens, and Graddy (1995) note that the Wald estimator can be characterized more precisely in a number of important special cases. First, suppose that the effect of the instrument is the same for everybody, i.e., $\phi_{1i}$ is constant. Then we obtain the average derivative $E[f_i'(S_i^*)]$, and no weighting is involved. If $f_i(S)$ is linear in $S$, as in Section 2.2.1, but with a random coefficient, $\rho_i$, then the Wald estimator is a weighted average of the random coefficient: $E[(S_{1i} - S_{0i})\rho_i]/E[S_{1i} - S_{0i}]$. If $\phi_{1i}$ is constant and $f_i(S)$ is linear, then the Wald estimator is the population average slope, $E[\rho_i]$.

Another interesting special case is when $f_i(S)$ is a quadratic function of $S$, as in Lang (1993) and Card's (1995) parameterization of a structural human-capital earnings function. The quadratic function captures the notion that returns to schooling decline as schooling increases. Note that for a quadratic function, the point of linearization is always $S_i^* = (S_{1i} + S_{0i})/2$. The Wald estimator is therefore

$$E[\omega_i f_i'((S_{1i} + S_{0i})/2)],$$

i.e., a weighted average of individual slopes at the midpoint of the interval $[S_{0i}, S_{1i}]$ for each person. The fact that the weights are proportional to $S_{1i} - S_{0i}$ sometimes has economic significance. In the Card and Lang models, for example, the first-stage effect, $S_{1i} - S_{0i}$, is assumed to be proportional to individual discount rates.
Since people with higher discount rates get less schooling and the schooling-earnings relationship has been assumed to be concave, this tends to make the Wald estimate higher than the population average return. Lang (1993) called this phenomenon "discount rate bias."

$^{38}$ Here we assume that $f_i(S)$ is continuously differentiable with domain equal to a subset of the real line.

In some applications, it is interesting to characterize the range of variation captured by the Wald estimator further. Returning to (42), which describes the estimator as a weighted average of slopes in the interval $[S_{0i}, S_{1i}]$, it seems natural to ask which values $S$ are most likely to be covered by this interval. For example, does $[S_{0i}, S_{1i}]$ usually cover 12 years of education, or is it more likely to cover 16 years? The probability $S \in [S_{0i}, S_{1i}]$ is $P[S_{1i} \geq S \geq S_{0i}]$. Because $S_i$ is discrete, it is easier to work with $P[S_{1i} > S \geq S_{0i}]$, since this can be expressed as

$$P[S_{1i} > S \geq S_{0i}] = P[S_{1i} > S] - P[S_{0i} > S] = P[S_i \leq S \mid Z_i=0] - P[S_i \leq S \mid Z_i=1]. \tag{44}$$

This is the difference in the cumulative distribution function (CDF) of schooling with the instrument switched off and on. The schooling values where the CDF-gap is largest are those most likely to be covered by the interval $[S_{0i}, S_{1i}]$, and therefore most often represented in the Wald/weighted average.

Angrist and Imbens (1995) used equation (44) to interpret the Wald estimates of the returns to schooling reported by Angrist and Krueger (1991).$^{39}$ They report a Wald estimate based on first quarter/fourth quarter differences in log weekly wages and years of schooling using data on men born 1930-39 in the 1980 Census. Their Wald estimate is .089, and the corresponding OLS estimate is .07. The first quarter/fourth quarter difference in CDFs is plotted in Figure 5. The difference is largest in the 8-14 years-of-schooling range. This is not surprising since compulsory attendance laws mainly affect high school students, i.e., those with 8-12 years of education. The CDF gap for men with more than 12 years of schooling may be caused by men who are compelled to complete high school but then attended college later.

Finally, we note that the discussion of IV in heterogeneous and nonlinear models so far has ignored covariates.
2SLS estimates in heterogeneous-outcomes models with covariates can be interpreted in much the same way as regression estimates of models with covariates were interpreted above. That is, IV estimates in models with covariates can be thought of as producing a weighted average of covariate-specific Wald estimates as long as the model for covariates is saturated and $E[S_i \mid X_i, Z_i]$ is used as an instrument. In other cases it seems reasonable to assume that some sort of approximate weighted average is being generated, but we are unaware of a precise causal interpretation that fits all cases.$^{40}$

$^{39}$ See also Kling (1998) for a similar analysis of instrumental variables estimates using distance to college as an instrument for schooling.

2.4 Refutability

Causality can never be proved by associations in non-experimental data. But sometimes the lack of association between variables for a particular group, or the occurrence of an association between the "causing variable" and outcome variable for a group thought to be unaffected by the treatment, can cast doubt on, or even refute, a causal interpretation. R.A. Fisher (quoted in Cochran, 1965) argued that the case for causality is stronger when the causal model has many implications that appear to hold. For this reason, he suggested that scientific theories be made "complicated," in the sense that they yield many testable implications.

A research design is more likely to be successful at assessing causality if possibilities for checking collateral implications of causal processes are "built in." At one level, this involves estimating less restrictive models. A good example is Freeman's (1984) panel data study of union status, which looks separately at workers who join unions and leave unions. If unions truly raise wages of their members, then workers who move from nonunion to union jobs should experience a raise, and workers who move from union to nonunion jobs should experience a pay cut. Although a less restrictive model may yield imprecise estimates or be subject to different biases which render the results difficult to interpret (e.g., different unobserved variables may cause workers to join and exit union jobs), a causal story is strengthened if the results of estimating a less restrictive model are consistent with the story.
In addition to these considerations of robustness, a causal model will often yield testable predictions for sub-populations in which the "treatment effect" should not be observed, either because the sub-population is thought to be immune to the treatment or did not receive the treatment. Perhaps the best-known example of this type of analysis is Bound's (1989) study of the effect of Disability Insurance (DI) benefits on the labor force participation rate of older men. Earlier studies (e.g., Parsons, 1980) established an inverse relationship between the participation rate and the DI benefit-wage replacement ratio. But because the replacement ratio is a decreasing function of a worker's past earnings, Bound argued that this association may reflect patterns of labor force participation rather than a causal response to DI benefits.$^{41}$

$^{40}$ A recent effort in this direction is Abadie (1998), who presents conditions under which 2SLS estimates can be interpreted as the best linear predictor for an underlying causal relationship. He also introduces a new IV estimator that always has this property for models with a single binary instrument.

To test the causal interpretation of earlier work, Bound performed two types of analyses. First, he estimated essentially the same econometric model of the relationship between employment and potential DI benefits that had been estimated previously, except he estimated the model for a sub-sample of older men who had never applied for DI. Because one would not expect DI benefits to provide a strong work disincentive for this sub-sample, there should be a much weaker relationship, or no relationship at all, if the causal interpretation of DI benefit coefficients is correct. Instead, he found that DI benefits had about the same effect in this sample as in a sample that included men who actually applied for and received DI benefits, suggesting that a causal interpretation of the effect of DI benefits was not warranted.

Second, Bound examined the labor force behavior of men who applied for DI but were turned down. He reasoned that because men in this subsample were less severely disabled than men who received DI, the labor force participation rate of this subsample provided a "natural 'control' group" (p. 482) for predicting the upper bound of the labor force participation rate of DI recipients had they been denied DI benefits. Because half of the presumably healthier rejected DI applicants did not work even without receiving benefits, Bound concluded that most DI recipients did not work because they were disabled, not because DI benefits induced them to leave the labor force.
Notions of "refutability" also carry over to IV models. In Angrist and Krueger (1991) we were concerned that quarter of birth, which was the instrument for schooling, might have influenced educational attainment through some mechanism other than the interaction of school start age and compulsory schooling laws. To test this threat to a causal interpretation of the IV estimates, we examined whether quarter of birth influenced schooling or earnings for college graduates, who presumably were unaffected by compulsory schooling laws. Although quarter of birth had an effect on these outcomes for college graduates, the effect was weak and had a different pattern than that found for the less-than-college group, suggesting that compulsory schooling was responsible for the effects of quarter of birth in the less-than-college sample.

$^{41}$ Welch (1977) provides a closely related criticism of work on Unemployment Insurance benefits.

Tests of refutability may have flaws. It is possible that a subpopulation that is believed unaffected by the intervention is indirectly affected by it. For example, Parsons (1991) argues that rejected DI applicants are a misleading control group because they may exit the labor force to strengthen a possible appeal of their rejected application or a future re-application for DI benefits.$^{42}$ Likewise, some students who complete high school because of compulsory schooling may be induced to go on to college as a result, invalidating our 1991 test of refutability. An understanding of the institutions underlying the program being evaluated is necessary to assess tests of refutability, as well as to identify subpopulations that are immune from the intervention according to the causal story but still subject to possible confounding effects.

Lastly, there has been much recent interest in evaluating entire research designs, as in Lalonde's (1986) landmark study comparing experimental and non-experimental research methods. Only rarely, however, have experiments been conducted that can be used to validate non-experimental research strategies. Nonetheless, non-experimental research designs can still be assessed by comparing "pre-treatment" trends for the treatment and comparison group (e.g., Ashenfelter and Card, 1985, and Heckman and Hotz, 1989) or by looking for effects where there should be none (e.g., Bound, 1989). We provide another illustration of this point with some new evidence on the differences-in-differences approach used in Card's (1990) immigration study.
In the summer of 1994, tens of thousands of Cubans boarded boats destined for Miami in an attempt to emigrate to the United States in a second Mariel Boatlift that promised to be almost as large as the first one, which occurred in the summer of 1980. Wishing to avoid the political fallout that accompanied the earlier boatlift, the Clinton Administration interceded and ordered the Navy to divert the would-be immigrants to a base in Guantanamo Bay. Only a small fraction of the Cuban emigres ever reached the shores of Miami. Hence, we call this event, "The Mariel Boatlift That Didn't Happen."

$^{42}$ Bound (1989) considered and rejected these threats to his control group. Also see Bound's (1991) response to Parsons (1991).

Had the migrants been allowed to reach the United States, there is little doubt that researchers would have used this "natural experiment" to extend Card's (1990) influential study of the earlier influx of Cuban immigrants. Nonetheless, we can use this "non-event" to explore Card's research design. In particular, we can ask whether Miami's and the comparison cities' experiences were in fact similar absent the large wave of immigrants to Miami. Figure 1, which we referred to earlier in the discussion of Card's paper, shows that nonagricultural employment growth in Miami tracks that of the four comparison cities rather well in the year before and the few years after the summer of 1994. (A vertical bar indicates the date of the thwarted boatlift.)

To provide a more detailed analysis by ethnic group, we followed Card and calculated unemployment rates for Whites, Blacks and Hispanics in Miami and the four comparison cities using data from the CPS Outgoing Rotation Groups. These results are reported in Table 7. The Miami unemployment data are imprecise and variable, but still indicate a large increase in unemployment in 1994, the year the immigrants were diverted to Guantanamo Bay. On the other hand, 1994 was the first year the CPS redesign was implemented (see Section 3.1). We therefore take 1993 as the "pre" period and 1995 as the "post" period for a difference-in-difference comparison.

For Whites and Hispanics, the unemployment rate fell in Miami and fell even more in the comparison cities between the pre and post periods, though the difference between these two changes is not significant. This is consistent with a causal interpretation of Card's (1990) results, which attributes the difference-in-differences to the effect of immigration.
For Blacks, however, the unemployment rate rose by 3.6 percentage points in Miami between 1993 and 1995, while it fell by 2.7 points in the comparison cities. The 6.3 point difference-in-differences estimate is on the margin of statistical significance (t=1.70), and would have made it look like the immigrant
flow had a negative impact on Blacks in Miami in a DD study. Since there was no immigration shock in 1994, this illustrates that different labor market trends can generate spurious findings in research of this type.

3. Data Collection Strategies

Table 1 documents that labor economists use many different types of data sets. The renewed emphasis on quasi-experiments in empirical research places a premium on finding data sets for a particular population and time period containing certain key variables. Often this type of analysis requires large samples, because only part of the variation in the variables of interest is used in the estimation. Familiarity with data sets is as necessary for modern labor economics as is familiarity with economic theory or econometrics. Knowledge of the populations covered by the main surveys, the design of the surveys, the response rate, the variables collected, the size of the samples, the frequency of the surveys, and any changes in the surveys over time is essential for successfully implementing an empirical strategy and for evaluating others' empirical research. This section provides an overview of the most commonly used data sets and data collection strategies in labor economics.

3.1 Secondary Data Sets

The most commonly used secondary data sets in labor economics are the National Longitudinal Surveys (NLS), the Current Population Survey (CPS), the Panel Study of Income Dynamics (PSID), and the Decennial Censuses. Table 8 summarizes several features of the main secondary data sets used by labor economists. Below we provide a more detailed discussion of the "big three" micro data sets in labor economics: the NLS, CPS and PSID, and then discuss other aspects of secondary data sets.

Perhaps because of its easy-to-use CD-ROM format and the breadth of its questionnaire, the National Longitudinal Surveys are popular in applied work. The NLS actually consists of six age-by-gender data sets: a cohort of 5,020 "older men" (age 45-59 in 1966); a cohort of 5,083 mature women (age 30-44 in 1967); a
cohort of 5,225 young men (age 14-24 in 1966); a cohort of 5,159 young women (age 14-24 in 1968); a cohort of 12,686 "youth" known as the NLSY (age 14-22 in 1979); and a cohort of 7,035 children of respondents in the NLSY (age 0-20 in 1986).$^{42}$ Sampled individuals are interviewed annually. All but the older men and young men surveys continue today.

The CPS is an ongoing survey of more than 50,000 households that is conducted each month by the Census Bureau for the Bureau of Labor Statistics (BLS).$^{43}$ Sampled households are included in the survey for four consecutive months, out of the sample for 8 months, and then included for a final four consecutive months. Thus, the survey has a "rotation group" design, with new rotation groups joining or exiting the sample each month. The resulting data are used by the Bureau of Labor Statistics to calculate the unemployment rate and other labor force statistics each month. The CPS has a hierarchical household-family-person record structure which enables household-level and family-level analyses, as well as individual-level analyses. The design of the CPS has been copied by statistical agencies in several other countries and is used there to calculate labor force statistics.

In the U.S., regular and one-time supplements are included in the survey to collect information on worker displacement, contingent work, school enrollment, smoking, voting, and other important behaviors. In addition, annual income data from several sources are collected each March. A great strength of the CPS is that the survey began in the 1940s, so a long time-series of data is available; on the other hand, there have been several changes that affect the comparability of the data over time, and micro data are only available to researchers for years since 1964. In addition, because of its rotation group design, continuing households can be linked from one month to the next, or between years; however, individuals who move out of sampled households are not tracked, and it is possible that individuals who move into a sampled household may be mismatched to other individuals' earlier records. High attrition rates are a particular problem in the linked CPS for young workers. Unless a very large sample size is required, it is often preferable to use a data set that was designed to track respondents longitudinally, instead of a linked CPS.

$^{42}$ See NLS Users' Guide 1995 for further information.

$^{43}$ See Polivka (1996) for an analysis of recent changes in the CPS, and for a list of supplements.

The PSID is a national probability sample that originally consisted of 5,000 families in 1968.$^{44}$ The original families, and new households that have grown out of those in the original sample, have been followed each year since. Consequently, the PSID provides a unique data set for studying family-related issues. The number of individuals covered by the PSID increased from 18,000 in 1968 to a cumulative total exceeding 40,000 in 1996, and the number of families increased to nearly 8,000. Brown, Duncan, and Stafford (1996) note that the "central focus of the data is economic and demographic, with substantial detail on income sources and amounts, employment, family composition changes and residential location." The PSID is also one of the few data sets that contains information on consumption and wealth. A recent paper by Fitzgerald, Gottschalk and Moffitt (1998) finds that, despite attrition of nearly half the sample since 1968, the PSID has remained roughly representative through 1989.$^{45}$

The accessibility of secondary data sets is changing rapidly. The ICPSR remains a major collector and distributor of data sets and codebooks. In addition, CPS data can be obtained directly from the Bureau of Labor Statistics. Increasingly, data collection agencies are making their data directly available to researchers via the internet. In 1996, for example, the Census Bureau made the recent March Current Population Surveys, which include supplemental information on annual income and demographic characteristics, available over the internet. Because the March CPS contains annual income data, many researchers have matched these data from one year to the next.

Because secondary data sets are typically collected for a broad range of purposes or for a purpose other than that intended by the researcher, they often lack information required for a particular project.
For example, the PSID would be ideal for a longitudinal study of the impact of personal computers on pay, except it lacks information on the use of personal computers. In other situations, the data collector may omit survey items from public-use files to preserve respondent confidentiality.

$^{44}$ This paragraph is based on Brown, Duncan, and Stafford (1996).

$^{45}$ See also Becketti, Gould, Lillard and Welch (1988) for evidence on the representativeness of the PSID.

Nonetheless, several large public-use surveys enable researchers to add questions, or will provide customized extracts with variables that are not on the public-use file. For example, Vroman (1991) added supplemental questions to the CPS on the utilization of unemployment insurance benefits. The cost of adding the 7 questions was $100,000.$^{46}$ From time to time, survey organizations also solicit researchers' advice on new questions or new modules to add to on-going surveys. Since 1993, for example, the PSID has held an open competition among researchers to add supplemental questions to the PSID.

3.1.1 Historical Comparability in the CPS and Census

Statistical agencies are often faced with a tradeoff between adjusting questions to make them more relevant for the modern economy and maintaining historical comparability. Often it seems that statistical agencies place insufficient weight on historical consistency. For example, after 50 years of measuring education by the highest grade of school individuals attended and completed, the Census Bureau switched to measuring educational attainment by the highest degree attained in the 1990 Census. The CPS followed suit in 1992. This is a subtle change in the education data, but it is important for labor economists to be aware of, and could potentially affect estimates of the economic return to education (see Park, 1994 and Jaeger, 1993). Because many statistics are most informative in comparison to their values in earlier years, it is important that statistical agencies place weight on historical comparability even though the concepts being measured may have changed over time.

Fortunately, the Bureau of Labor Statistics and the Census Bureau typically introduce a major change in a questionnaire after studying the likely effects of the change on the survey results.
[46] Because of concern that the additional questions might affect future responses, the supplement was only asked of individuals who were in their final rotation in the sample. The supplement was added to the survey in the months of May, August, and November 1989 and February 1990. The sample size was 2,859 eligible unemployed individuals.

Because some changes have a major impact on certain variables (or on certain populations), it is important that analysts be aware of changes in on-going surveys, and of their likely effects. For example, a major redesign of the CPS was introduced in January 1994, after eight years of study. The redesigned CPS illustrates the importance of being aware of questionnaire changes, as well as the difficulty of estimating the likely impact of such changes. The redesigned CPS is conducted with computer-assisted interviewing technology, which facilitates more complicated skip patterns, more narrowly tailored questions, and dependent interviewing (in which respondents' answers to an earlier month's question are integrated into the current month's question). In addition, the redesign changed the way key labor force variables were collected. Most importantly, individuals who are not working are now probed more thoroughly for activities that they may have done to search for work. In the older survey, interviewers were instructed to ask a respondent who "appears to be a homemaker" whether she was keeping house most of last week or doing something else. The new question is gender neutral.

Another major change concerns the earnings questions. Prior to the redesign, the CPS asked respondents for their usual weekly wage and usual weekly hours.[47] The ratio of these two variables gives the implied hourly wage. The redesigned CPS first asks respondents for the easiest way they could report their total earnings on their main job (e.g., hourly, weekly, annually, or on some other basis), and then collects usual earnings on that basis.

To gauge the impact of the survey redesign on responses, in 1992 and 1993 the BLS and Census Bureau conducted an overlap survey in which a separate sample of households was given the redesigned CPS, while the regular sample was still given the old CPS questionnaire. Then, for the first five months of 1994, this overlap sample was given the old CPS, while the regular sample was given the new one. Overlap samples can be extremely informative, but they are also difficult to implement properly.
In this instance, the overlap sample was drawn with different procedures than the regular CPS sample, and there appear to be systematic differences between the two samples which complicate comparisons. Taking account of these difficulties, Polivka (1996) and Polivka and Miller (1995) estimate that the redesign had an insignificant effect on the unemployment rate, although it appears to have raised the employment-to-population ratio of women by 1.6 percent, raised the proportion of self-employed women by 20 percent, increased the proportion of all workers who are classified as part-time by 10 percent, and decreased the fraction of discouraged workers (i.e., those out of the labor force who have given up searching for work because they believe no jobs are available for them) by 50 percent. Polivka (1997) addresses the effect of the redesign on the derived hourly wage rate. She finds that the redesign causes about a 5 percent increase in the average earnings of college graduates relative to those who failed to complete high school, and about a 2 percent increase in the male-female gap. If researchers are not aware of the potential changes in measurement brought about by the redesigned CPS, they could spuriously attribute shifts in employment or wages to economic forces rather than to changes in the questionnaire and survey technology.

[47] The old CPS also collected hourly earnings for workers who indicated they were paid hourly.

Three other changes in the CPS are especially noteworthy. First, beginning in 1980 the Annual Demographic Supplement of the March CPS was expanded to ask a more probing set of income questions. The impact of these changes can be estimated because the 1979 March CPS administered the old (pre-1980) questionnaire to five of the eight rotation groups in the sample, and administered the new, more detailed questionnaire to the other three rotation groups.[48] Second, as noted above, the education question (which is on the "control card" rather than the basic monthly questionnaire) was switched from the number of years of school completed to the highest degree attained in 1992 (see Park, 1994 and Jaeger, 1993). Third, the "top code" of the income and earnings questions, that is, the highest level of income allowed to be reported in the public-use file, has changed over time, which obviously may have implications for studies of income inequality.
[48] See Krueger (1990a) for an analysis of the effect of the change in the questionnaire on responses to the question on workers' compensation benefits. The new questionnaire seems to have detected 20 percent more workers' compensation recipients. See Coder and Scoon-Rogers (1996) for a comparison of CPS and SIPP income measures.
3.2 Primary data collection and survey methods

It is becoming increasingly common for labor economists to be involved in collecting their own data. Labor economists' involvement in the design and collection of original data sets takes many forms. First, it should be noted that labor economists have long played a major role in the design and collection of some of the major public-use data files, including the PSID and NLS.

Second, researchers have turned to collecting smaller, customized data to estimate specific quantities or describe certain economic phenomena. Some of Richard Freeman's research illustrates this approach. Freeman and Hall (1986) conducted a survey to estimate the number of homeless people in the U.S., which came very close to the official Census Bureau estimate in 1990. Borjas, Freeman and Lang (1991) conducted a survey of border crossing behavior of illegal aliens to estimate the number of illegal aliens in the U.S. Freeman (1990) conducted a survey of inner-city youths in Boston, which in part is a follow-up on the survey conducted by Freeman and Holzer (1986). Often, data collected in these surveys are combined with secondary data files to derive national estimates.

Third, some surveys have been conducted to probe the sensitivity of results in large-scale secondary data sets, or to probe the sensitivity of responses to question wording or order. For example, Farber and Krueger (1993) conducted a survey of 102 households in which non-union respondents were asked two different questions concerning their likelihood of joining a union, with the order of the questions randomly interchanged. The two questions, which are listed below, were included in earlier surveys conducted by the Canadian Federation of Labor (CFL) and the American Federation of Labor-Congress of Industrial Organizations (AFL-CIO), and had been analyzed by Riddell (1992). Based on comparing responses to these questions, Riddell concluded that American workers have a higher "frustrated demand" for unions.
CFL Q.: Thinking about your own needs, and your current employment situation and expectations, would you say that it is very likely, somewhat likely, not very likely, or not likely at all that you would consider joining or associating yourself with a union or a professional association in the future?
AFL Q.: If an election were held tomorrow to decide whether your workplace would be unionized or not, do you think you would definitely vote for a union, probably vote for a union, probably vote against a union, or definitely vote against a union?

In their small-scale survey, Farber and Krueger (1993) found that the responses to the CFL question were extremely sensitive to the questions that preceded them. If the AFL question was asked first, 55% of nonunion members answered the CFL question affirmatively, but if the CFL question was asked first, 26% of nonunion members answered affirmatively to the CFL question.[49] Thus, the Farber and Krueger results suggest a good deal of caution in interpreting the CFL-style question, especially across countries.

Finally, and of most interest for our purposes, researchers have conducted special-purpose surveys to evaluate certain natural experiments. Probably the best known example of this type of survey is Card and Krueger's (1994) survey of fast food restaurants in New Jersey and Pennsylvania. Other examples include: Ashenfelter and Krueger's (1994) survey of twins; Behrman, Rosenzweig and Taubman's (1996) survey of twins; Mincer and Higuchi's (1988) survey of turnover at Japanese plants in the U.S. and their self-identified competitors; and Freeman and Kleiner's (1990) survey of companies undergoing a union drive and their competitors.

Several excellent volumes have been written on the design and implementation of surveys, and a detailed overview of this material is beyond the scope of this paper.[50] But a few points that may be of special interest to labor economists are outlined below. Customized surveys seem especially appropriate for rare populations, which are likely to be underrepresented or not easily identified in public-use data sets. Examples include identical twins, illegal aliens, homeless people, and disabled people.

To conduct a survey, one must obviously have a questionnaire. Preparing a questionnaire can be a time-consuming and difficult endeavor.
Survey researchers often find that answers to questions, even factual economic questions, are sensitive to the wording and ordering of questions. Fortunately, one does not have to begin writing a questionnaire from scratch. Survey questionnaires typically are not copyright protected. Because many economists are familiar with existing questionnaires used in the major secondary data sets (e.g., the CPS), and because a great deal of effort typically goes into designing and testing these questionnaires, it is often advisable to copy as many questions as possible verbatim from existing questionnaires when formulating a new questionnaire. Aside from the credibility gained from replicating questions from well known surveys, another advantage of duplicating others' questions is that the results from the sampled population can be compared directly to the population as a whole with the secondary survey. Furthermore, if data from a customized survey are pooled together with data from a secondary survey, it is essential that the questions be comparable.

[49] The t-ratio for the difference between the proportions is 3.3.

[50] See, for example, Groves (1989), Sudman and Bradburn (1991), and Singer and Presser (1989).

One promising recent development in questionnaire design involves "follow-up brackets" (also known as "unfolding" brackets). This technique offers bracketed categories to respondents who initially refuse or are unable to provide an exact value to an open-ended question. Juster and Smith (1997) find that follow-up brackets reduced nonresponse to wealth questions in the Health and Retirement Survey (HRS) and Asset and Health Dynamics among the Oldest Old Survey (AHEAD). See Hurd, et al. (1998) for experimental evidence of anchoring in responses based on the sequence of unfolding brackets for consumption and savings data in the AHEAD survey. Follow-up brackets have also been used to measure wealth in the PSID. The use of follow-up brackets would seem particularly useful for hard-to-measure quantities, such as income, wealth, saving and consumption.

Lastly, power calculations should guide the determination of sample size prior to the start of a survey. For example, suppose the goal of the survey is to estimate a 95% confidence interval for a mean.
With random sampling, the expected sample size ($n$) required to obtain a confidence interval of width $2W$ is: $n = 8\sigma^2/W^2$, where $\sigma^2$ is the population variance of the variable in question. Although the variance generally will not be known prior to conducting the survey, an estimate from other surveys can be used for the power calculation.
Also notice that in the case of a binary variable (i.e., if the goal is to estimate a proportion, $p$), the variance is $p(1-p)$, so in the worst-case scenario the variance is $.25 = .5 \times .5$. It should also be noted that in complex sample designs involving clustering and stratification, more observations are usually needed than in simple random samples to attain a given level of precision.

3.3 Administrative data and record linkage

Administrative data, i.e., data produced as a by-product of some administrative function, often provide inexpensive large samples. The proliferation of computerized record keeping in the last decade should increase the number of administrative data sets available in the future. Examples of widely used administrative data bases include social security earnings records (Ashenfelter and Card, 1985, Vroman, 1990, Angrist, 1990), unemployment insurance payroll and benefit records (Anderson, 1993, Katz and Meyer, 1990, Jacobson, LaLonde, and Sullivan, 1994, Card and Krueger, 1998), workers' compensation insurance records (Meyer, Viscusi and Durbin, 1995, and Krueger, 1990b), company personnel records (Medoff and Abraham, 1980, Lazear, 1992, Baker, Gibbs and Holmstrom, 1994), and college records (Bowen and Bok, 1998).

An advantage of administrative data is that they often contain enormous samples or even an entire population. Another advantage is that administrative data often contain the actual information used to make economic decisions. Thus, administrative data may be particularly useful for identifying causal effects from discrete thresholds in administrative decision making, or for implementing strategies that control for selection on observed characteristics.

A frequent limitation of administrative data, however, is that they may not provide a representative sample of the relevant population. For example, companies that are willing to make their personnel records available are probably not representative of all companies.
In some cases administrative data have even been obtained as a by-product of court cases or collected by parties with a vested interest in the outcome of the research, in which case there is additional reason to be concerned about the representativeness of the samples.
Another common limitation of administrative data is that they are not generated with research purposes in mind, so they may lack key variables used in economic analyses. For example, social security earnings records lack data on individuals' education. As a consequence, it is common for researchers to link survey data to administrative data, or to link across administrative data sets. Often these links are based on social security numbers or the individuals' names. Examples of linked data sets include: the Continuous Longitudinal Manpower Survey (CLMS), which is a link between social security records and the 1976 CPS; the 1973 Exact Match file, which contains CPS, IRS, and social security data; and the Longitudinal Employer-Employee Data Set (LEEDS). All of these linked data sets are now dated, but they can still be used for some important historical studies (e.g., Chay, 1996). More recently, the Census Bureau has been engaged in a project to link Census data to the Survey of Manufacturers.

It is also possible to petition government agencies to release administrative data. Although the Internal Revenue Service severely limits disclosure of federal administrative records collected for tax purposes, state data are often accessible and even federal data can still be linked and released under some circumstances. For example, Angrist (1998) linked military personnel records to Social Security Administration (SSA) data. The HRS has also successfully linked SSA data to survey-based data. Furthermore, many states provide fairly free access to UI payroll tax data to researchers for the purpose of linking data.[51] There is also a literature on data release schemes for administrative records that preserve confidentiality and meet legal requirements (see, e.g., Duncan and Pearson, 1991).

3.4 Combining samples

Although in some cases individual records can be linked between different data sources, an alternative linkage strategy exploits the fact that many of the estimators used in empirical research can be constructed from separate sets of first and second moments.
So, in principle, individual records with a full complement of variables are not always needed to carry out a multivariate analysis. It is sometimes enough to have all the moments required, even though these moments may be drawn from more than one sample. In practice, this makes it possible to undertake empirical projects even if the required data are not available in any single source. Recent versions of the multiple-sample approach to empirical work include the two-sample instrumental variables estimators developed by Arellano and Meghir (1992) and Angrist and Krueger (1992, 1995), and used by Lusardi (1996), Jappelli, Pischke, and Souleles (1998), and Kling (1998). The use of two samples to estimate regression coefficients dates back at least to Durbin (1953), who discussed the problem of how to update OLS estimates with information from a new sample. Maddala (1971) discussed a similar problem using a maximum likelihood framework. This idea was recently revived by Imbens and Lancaster (1994), who address the problem of how to use macroeconomic data in micro-econometric models. Deaton (1985) focuses on estimating panel data models with aggregate data on cohorts.

[51] An example is Krueger and Kruse (1996), which links New Jersey unemployment insurance payroll tax data to a data set the authors collected in a survey of disabled individuals.

4. Measurement Issues

In his classic volume on the accuracy of economic measurement, Oskar Morgenstern (1950) quotes the famed mathematician Norbert Wiener as remarking, "Economics is a one or two digit science." The fact that the focus of most empirical research has moved from aggregate time-series data to micro-level cross-sectional and longitudinal survey data in recent years only magnifies the importance of measurement error, because (random) errors tend to average out in aggregate data. Consequently, a good deal of attention has been paid to the extent and impact of "noisy" data in the last decade, and much has been learned.

Measurement error can arise for several reasons. In survey data, a common source of measurement error is that respondents give faulty answers to the questions posed to them.[52] For example, some respondents may intentionally exaggerate their income or educational attainment to impress the interviewer, or they may shield some of their income from the interviewer because they are concerned the data may somehow fall into the hands of the IRS, or they may unintentionally forget to report some income, or they may misinterpret the question, and so on. Even in surveys like SIPP, which is specifically designed to measure participation in public programs like UI and AFDC, respondents appear to under-report program participation by 20 to 40 percent (see Marquis, Moore and Bogen, 1996). It should also be stressed that in many situations, even if all respondents correctly answer the interviewers' questions, the observed data need not correspond to the concept that researchers would like to measure. For example, in principle, human capital should be measured by individuals' acquired knowledge or skills; in practice it is measured by years of schooling.[53] For these reasons, it is probably best to think of data as being routinely mismeasured.

[52] Even well-trained economists can make errors of this sort. Harvard's Dean of Faculty Henry Rosovsky (1991, p. 40) gives the following account of a meeting he had with an enraged economics professor who complained about his salary: "After a quick calculation, this quantitatively oriented economist concluded that his raise was all of 1 percent: an insult and an outrage. I had the malicious pleasure of correcting his mistaken calculation. The raise was 6 percent: he did not know his own salary and had used the wrong base."

[53] Measurement error arising from the mismatch between theory and practice also occurs in administrative data. In fact, this may be a more severe problem in administrative data than in survey data.

Although few economists consider measurement error the most exciting research topic in economics, it can be of much greater practical significance than several hot issues. Topel (1991), for example, provides evidence that failure to correct for measurement error greatly affects the estimated return to job tenure in panel data models. Fortunately, the direction of biases caused by measurement error can often be predicted. Moreover, in many situations the extent of measurement error can be estimated, and the parameters of interest can be corrected for biases caused by measurement error.

4.1 Measurement Error Models

4.1.1 The Classical Model

Suppose we have data on variables denoted $X_i$ and $Y_i$ for a sample of individuals. For example, $X_i$ could be years of schooling and $Y_i$ log earnings. The variables $X_i$ and $Y_i$ may or may not equal the correctly-measured variables the researcher would like to have data on, which we denote $X_i^*$ and $Y_i^*$. The error in measuring the variables is simply the deviation between the observed variable and the correctly-measured variable: for example, $e_i = X_i - X_i^*$, where $e_i$ is the measurement error in $X_i$. Considerations of measurement error usually start with the assumption of "classical" measurement errors.[54] Under the classical assumptions, $e_i$ is assumed to have the properties $C(e_i, X_i^*) = E(e_i) = 0$. That is, the measurement error is just mean-zero "white noise". Classical measurement error is not a necessary feature of measurement error; rather, these assumptions are best viewed as a convenient starting point.

[54] References for the effect of measurement error include Duncan and Hill (1985), Griliches (1986), Fuller (1987), and Bound and Krueger (1991).

What are the implications of classical measurement error? First, consider a situation in which the dependent variable is measured with error. Specifically, suppose that $Y_i = Y_i^* + u_i$, where $Y_i$ is the observed dependent variable, $Y_i^*$ is the correctly-measured, desired, or "true" value of the dependent variable, and $u_i$ is classical measurement error. If $Y_i$ is regressed on one or more correctly-measured explanatory variables, the expected value of the coefficient estimates is not affected by the presence of the measurement error. Classical measurement error in the dependent variable leads to less precise estimates, because the errors will inflate the standard error of the regression, but does not bias the coefficient estimates.[55]

[55] If the measurement error in the dependent variable is not classical, then the regression coefficients will be biased. The bias will equal the coefficients from a hypothetical regression of the measurement error on the explanatory variables.

Now consider the more interesting case of measurement error in an explanatory variable. For simplicity, we focus on a bivariate regression, with mean zero variables so we can suppress the intercept. Suppose $Y_i^*$ is regressed on the observed variable $X_i$, instead of on the correctly-measured variable $X_i^*$. The population regression of $Y_i^*$ on $X_i^*$ is:

(45) $Y_i^* = X_i^* \beta + \varepsilon_i$,

while if we make the additional assumption that the measurement error ($e_i$) and the equation error ($\varepsilon_i$) are uncorrelated, the population regression of $Y_i^*$ on $X_i$ is:

(46) $Y_i^* = X_i \lambda\beta + \tilde{\varepsilon}_i$,

where $\lambda = C(X^*, X) / V(X)$. If $X_i$ is measured with classical measurement error, then $C(X^*, X) = V(X^*)$ and $V(X) = V(X^*) + V(e)$, so the regression coefficient is necessarily attenuated, with the proportional "attenuation bias" equal to $(1-\lambda) < 1$.[56] The quantity $\lambda$ is often called the "reliability ratio". If data on both $X_i^*$ and $X_i$ were available, the reliability ratio could be estimated from a regression of $X_i^*$ on $X_i$. A higher reliability ratio implies that the observed variability in $X_i$ contains less noise.

[56] Notice these are descriptions of population regressions. The estimated regression coefficient is asymptotically biased by a factor $(1-\lambda)$, though the bias may differ in a finite sample. If the conditional expectation of $Y$ is linear in $X$, such as in the case of normal errors, the expected value of the bias is $(1-\lambda)$ in a finite sample.

Although classical measurement error models provide a convenient starting place, in some important situations classical measurement error is impossible. If $X_i$ is a binary variable, for example, then it must be the case that measurement errors in $X_i$ are dependent on the values of $X_i^*$. This is because a dummy variable can only be misclassified in one of two ways (a true 1 can be classified as a 0, and a true 0 can be classified as a 1), so only two values of the error are possible and the error automatically depends on the true value of the variable. An analogous situation arises with variables whose range is limited. Aigner (1972) shows that random misclassification of a binary variable still biases a bivariate regression coefficient toward 0 even though the resulting measurement error is not classical. But, in general, if measurement error in $X_i$ is not classical, the bias factor could be greater than or less than one, depending on the correlation between the measurement error and the true variable. Note, however, that regardless of whether or not the classical measurement error assumptions are met, the proportional bias $(1-\lambda)$ is still given by one minus the regression coefficient from a regression of $X_i^*$ on $X_i$.[57]

[57] This result requires the previously mentioned assumption that $e_i$ and $\varepsilon_i$ be uncorrelated. It may also be the case that the measurement error is not mean zero. Statistical agencies often refer to such phenomena as "non-sampling error" (see, e.g., McCarthy, 1979). Such non-sampling errors may arise if the questionnaire used to solicit information does not pertain to the economic concept of interest, or if respondents systematically under- or over-report their answers even if the questions do accurately reflect the relevant economic concepts. An important implication of non-sampling error is that aggregate totals will be biased.

Another important special case of non-classical measurement error occurs when a group average is used as a "proxy-variable" for an individual-level variable in micro data. For example, average wages in an industry or county might be substituted for individual wage rates on the right-hand side of an equation if micro wage data are missing. Although this leads to measurement error, since the proxy-variable replaces a desired regressor, asymptotically there is no measurement-error bias in a bivariate regression in this case. One way to see this is to note that the coefficient from a regression of, say, $X_i$ on $E[X_i \mid \text{industry } j]$ has a probability limit of 1.

So far the discussion has considered the case of a bivariate regression with just one explanatory variable. As noted in Section 2, adding additional regressors will typically exacerbate the impact of measurement error on the coefficient of the mismeasured variable because the inclusion of additional independent variables absorbs some of the signal in $X_i$, and thereby reduces the residual signal-to-noise ratio. Assuming that the other explanatory variables are measured without error, the reliability ratio conditional on other explanatory variables becomes $\lambda' = (\lambda - R^2)/(1 - R^2)$, where $R^2$ is the coefficient of determination from a regression of the mismeasured $X_i$ on the other explanatory variables. If the measurement error is classical, then $\lambda' \le \lambda$. And even if the measurement error is not classical, it still remains true that when there are covariates in equation (45), the proportional bias is given by one minus the coefficient on $X_i$ in a regression of $X_i^*$ on $X_i$ and the covariates. Note, however, that in models with covariates it no longer need be the case that the use of aggregate proxy variables generates no asymptotic bias.

An additional feature of measurement error important for applied work is that, for reasons similar to those raised in the discussion of models with covariates, attenuation bias due to classical measurement error is generally exacerbated in panel data models.
In particular, if the independent variable is expressed in first differences and if we assume that $X_i^*$ and $e_i$ are variance stationary, the reliability ratio is:

(47) $\lambda = V(X_i^*) / \{ V(X_i^*) + V(e_i)[(1-\tau)/(1-r)] \}$,

where $r$ is the coefficient of first-order serial correlation in $X_i^*$ and $\tau$ is the first-order serial correlation in the measurement error. If the (positive) serial correlation in $X_i^*$ exceeds the (positive) serial correlation in the measurement error, attenuation bias is greater in first-differenced data than in cross-sectional data (Griliches and Hausman, 1986). Classical measurement errors are usually assumed to be serially uncorrelated ($\tau = 0$), in which case the attenuation bias is greater in a first-differenced regression than in a levels regression. The intuition for this is that some of the signal in $X_i$ cancels out in the first-difference regression because of serial correlation in $X_i^*$, while the effect of independent measurement errors is amplified because errors can occur in the first or second period. A similar situation arises if differences are taken over dimensions of the data other than time, such as between twins or siblings.

Finally, note that if an explanatory variable is a function of a mismeasured dependent variable, the measurement errors in the dependent and independent variables are automatically correlated. Borjas (1980) notes that this situation often arises in labor supply equations where the dependent variable is hours worked and the independent variable is average hourly earnings, derived by dividing weekly or annual earnings by hours worked. In this situation, measurement error in $Y_i$ will induce a negative bias when $(Y_i^* + u_i)$ is regressed on $X_i^*/(Y_i^* + u_i)$. In other situations, both the dependent and independent variables may have the same noisy measure in the denominator, such as when the variables are scaled to be per capita (common in the economic growth literature). If the true regression parameter were 0, this would bias the estimated coefficient toward 1. The extent of bias in these situations is naturally related to the extent of the measurement error in the variable that appears on both the right-hand and left-hand side of the equation.

4.1.2 Instrumental Variables and Measurement Error

One of the earliest uses of IV was as a technique to overcome errors-in-variables problems. For example, in his classic work on the permanent income hypothesis, Friedman (1957) argued that annual income is a noisy measure of permanent income. The grouped estimator he used to overcome measurement errors in permanent income can be thought of as IV.
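The attenuation results of Section 4.1.1 and the errors-in-variables use of IV can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function name `simulate_attenuation`, the normal distributions, every parameter value, and the choice of a second independent noisy report as the instrument are hypothetical and do not come from the studies cited. With classical, serially uncorrelated errors, OLS in levels should approach $\lambda\beta$, OLS in first differences should approach the smaller value implied by equation (47), and IV should approach $\beta$.

```python
import numpy as np


def simulate_attenuation(n=200_000, beta=1.0, var_signal=1.0,
                         var_noise=0.5, r=0.7, seed=0):
    """Compare OLS in levels, OLS in first differences, and IV under classical error.

    All parameter values are illustrative assumptions, not estimates from the text.
    """
    rng = np.random.default_rng(seed)

    # True regressor in two periods: stationary variance, serial correlation r.
    x_star_1 = rng.normal(0.0, np.sqrt(var_signal), n)
    x_star_2 = r * x_star_1 + rng.normal(0.0, np.sqrt(var_signal * (1 - r**2)), n)

    # Classical errors: mean zero, unrelated to X*, serially uncorrelated (tau = 0).
    x_1 = x_star_1 + rng.normal(0.0, np.sqrt(var_noise), n)
    x_2 = x_star_2 + rng.normal(0.0, np.sqrt(var_noise), n)

    # Outcomes depend on the true regressor plus an independent equation error.
    y_1 = beta * x_star_1 + rng.normal(0.0, 1.0, n)
    y_2 = beta * x_star_2 + rng.normal(0.0, 1.0, n)

    # Hypothetical instrument: a second report of X* with its own independent error.
    z_1 = x_star_1 + rng.normal(0.0, np.sqrt(var_noise), n)

    def cov(a, b):
        return np.cov(a, b)[0, 1]

    dx, dy = x_2 - x_1, y_2 - y_1
    return {
        "ols_levels": cov(x_1, y_1) / cov(x_1, x_1),
        "ols_first_diff": cov(dx, dy) / cov(dx, dx),
        "iv_levels": cov(z_1, y_1) / cov(z_1, x_1),
        # Population reliability ratios implied by the chosen parameters.
        "lambda_levels": var_signal / (var_signal + var_noise),
        "lambda_first_diff": var_signal / (var_signal + var_noise / (1 - r)),
    }


if __name__ == "__main__":
    for name, value in simulate_attenuation().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed values the levels reliability ratio is 2/3 and the first-difference reliability ratio is 0.375, so the simulated OLS slopes should lie close to those numbers while the IV slope lies close to 1.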
It is now well known that IV yields consistent parameter estimates even if the endogenous regressor is measured with classical error, assuming that a valid instrument exists. Indeed, one explanation why IV estimates of the return to schooling frequently exceed OLS estimates is that measurement error attenuates the OLS estimates (e.g., Griliches, 1977). In a recent paper, Kane, Rouse and Staiger (1997) emphasize that IV can yield inconsistent parameter estimates if the endogenous regressor is measured with non-classical measurement error.[58] Specifically, they show that if the mismeasured endogenous regressor, $X_i$, is a dummy variable, the measurement error will be correlated with the instrument, and typically bias the magnitude of IV coefficients upward.[59] The probability limit of the IV estimate in this case is:

(48) $\dfrac{\beta}{1 - P(X_i = 0 \mid X_i^* = 1) - P(X_i = 1 \mid X_i^* = 0)}$.

Intuitively, the parameter of interest is inflated because it is divided by one minus the sum of the probabilities of the two types of errors that can be made in measuring $X_i$ (observations that are 1's can be classified as 0's, and observations that are 0's can be classified as 1's). The reason IV tends to overestimate the parameter of interest is that if $X_i$ is a binary variable, the value of the measurement error is automatically dependent on the true value of $X_i^*$, and therefore must be correlated with the instrumental variable because the instrumental variable is correlated with $X_i^*$. Combining this result with the earlier discussion of attenuation bias, it should be clear that if the regressor is a binary variable (in a bivariate regression), the probability limits of the OLS and IV estimators bound the coefficient of interest, assuming the specifications are otherwise appropriate. In the more general case of non-classical measurement error in a continuous explanatory variable, IV estimates can be attenuated or inflated, as in the case of OLS.

[58] A similar point has been made by James Heckman in an unpublished comment on Ashenfelter and Krueger (1994).

[59] The exception is if $X_i$ is so poorly measured that it is negatively correlated with $X_i^*$.

4.2 The Extent of Measurement Error in Labor Data

Mellow and Sider (1983) provide one of the first systematic studies of the properties of measurement error in survey data. They examined two sources of data: (1) employee-reported data from the January 1977 CPS linked to employer-reported data on the same variables for sampled employees; (2) an exact match between employees and employers in the 1980 Employment Opportunity Pilot Project (EOPP). Mellow and Sider focus on the extent of agreement between employer and employee reported data, rather than the reliability of the CPS data per se. For example, they find that 92.3% of employers and employees reported the same one-digit industry, while at the three-digit-industry level, the rate of agreement fell to 71.1%. For wages, they find that the employer-reported data exceeded the employee-reported data by about 5%. The mean union rate was slightly higher in the employer-reported data than in the employee-reported data. They also found that estimates of micro-level human capital regressions yielded qualitatively similar results whether employee-reported or employer-reported data are used. This similarity could result from the occurrence of roughly equal amounts of noise in the employer and employee reported data.

Several other studies have estimated reliability ratios for key variables of interest to labor economists. Two approaches to estimating reliability ratios have typically been used. First, if the researcher is willing to call one source of data the truth, then $\lambda$ can be estimated directly as the ratio of the variances: $V(X_i^*)/V(X_i)$. Second, if two measures of the same variable are available (denoted $X_{1i}$ and $X_{2i}$), and if the errors in these variables are uncorrelated with each other and uncorrelated with the true value, then the covariance between $X_{1i}$ and $X_{2i}$ provides an estimate of $V(X_i^*)$. The reliability ratio $\lambda$ can then be estimated by using the variance of either measure as the denominator or by using the geometric average of the two variances as the denominator. The former can be calculated as the slope coefficient from a regression of one measure on the other, and the latter can be calculated as the correlation coefficient between the two measures.
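The two-measure approach can be sketched in a few lines. This is a minimal illustration rather than the procedure used in any of the studies cited: the function names `reliability_from_two_measures` and `simulate_two_reports`, the normal distributions, and the variances are all hypothetical. It assumes, as in the text, that the two errors are uncorrelated with each other and with the true value, so that the covariance of the two reports estimates $V(X_i^*)$.

```python
import numpy as np


def reliability_from_two_measures(x1, x2):
    """Estimate reliability ratios from two noisy reports of the same variable."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    c = np.cov(x1, x2)      # 2x2 sample covariance matrix
    cov12 = c[0, 1]         # estimates V(X*) under the stated assumptions
    return {
        "signal_variance": cov12,
        # Slope from regressing x2 on x1: reliability of x1 (x1 is the regressor).
        "reliability_x1": cov12 / c[0, 0],
        # Slope from regressing x1 on x2: reliability of x2.
        "reliability_x2": cov12 / c[1, 1],
        # Geometric average of the two variances in the denominator.
        "correlation": cov12 / np.sqrt(c[0, 0] * c[1, 1]),
    }


def simulate_two_reports(n=100_000, var_signal=1.0, var_e1=0.25,
                         var_e2=0.5, seed=0):
    """Generate two reports with independent classical errors (assumed values)."""
    rng = np.random.default_rng(seed)
    x_star = rng.normal(0.0, np.sqrt(var_signal), n)
    x1 = x_star + rng.normal(0.0, np.sqrt(var_e1), n)
    x2 = x_star + rng.normal(0.0, np.sqrt(var_e2), n)
    return x1, x2


if __name__ == "__main__":
    x1, x2 = simulate_two_reports()
    for name, value in reliability_from_two_measures(x1, x2).items():
        print(f"{name}: {value:.3f}")
```

With the assumed variances, the population reliability ratios are 0.8 for the first report and 2/3 for the second, and the correlation between the reports is the geometric mean of the two. Because the reports differ in error variance, the two regression slopes differ, which is why the choice of explanatory variable matters.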
If a regression approach is used, the variable that corresponds most closely to the data source that is usually used in analysis should be the explanatory variable (because the two sources may have different error variances).

An example of two mismeasured reports on a single variable is respondents' reports of their parents' education in Ashenfelter and Krueger's (1994) twins study. Each adult twin was asked to report the highest grade of education attained by his or her mother and father. Because each member of a pair of twins has the same parents, the responses should be the same, and there is no reason to prefer one twin's response over the other's. Differences between the two responses for the same pair of twins represent measurement error on the part of at least one twin. The correlation between the twins' reports of their father's education is .86, and the correlation between reports of their mother's education is .84. These figures probably overestimate the reliability of the parental education data because the reporting errors are likely to be positively correlated; if a parent misrepresented his education to one twin, he is likely to have similarly misrepresented his education to the other twin as well.

Table 9 summarizes selected estimates of the reliability ratio for self-reported log earnings, hours worked, and years of schooling, three of the most commonly studied variables in labor economics. These estimates provide an indication of the extent of attenuation bias when these variables appear as explanatory variables. All of the estimates of the reliability of earnings data in the table are derived by comparing employees' reported earnings data with their employers' personnel records or tax reports. The estimates from the PSID validation study are based on data from a single plant, which probably reduces the variance of correctly-measured variables compared to their variance in the population. This in turn reduces the estimated reliability ratio if reporting errors have the same distribution in the plant as in the population.

Estimates of $\lambda$ for cross-sectional earnings range from .70 to .80 for men; $\lambda$ is somewhat higher for women. The estimated reliability falls to about 0.60 when the earnings data are expressed as year-to-year changes. The decline in the reliability of the earnings data is not as great if four-year changes are used instead of annual changes, reflecting the fact that there is greater variance in the signal in earnings over longer time periods.
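The pattern for changes can be reproduced with a small calculation. The sketch below rests on assumptions that are not taken from the validation studies: the true variable follows a stationary AR(1) process with autocorrelation `rho`, and the reporting errors are classical and serially uncorrelated. Under those assumptions the reliability of a k-period change is $\lambda(1-\rho^k)/(1-\lambda\rho^k)$, where $\lambda$ is the reliability in levels. The parameter values are hypothetical.

```python
def reliability_of_change(lam, rho, k):
    """Reliability ratio of a k-period change.

    lam : reliability ratio of the variable in levels
    rho : autocorrelation of the true variable (assumed stationary AR(1))
    k   : number of periods over which the change is taken
    Reporting errors are assumed classical and serially uncorrelated.
    """
    signal_share = lam * (1.0 - rho ** k)
    return signal_share / (1.0 - lam * rho ** k)

# Hypothetical values: levels reliability 0.8, annual autocorrelation 0.8.
one_year = reliability_of_change(0.8, 0.8, 1)
four_year = reliability_of_change(0.8, 0.8, 4)
```

Because the true values become less correlated as the interval lengthens, more signal survives differencing over four years than over one, while the variance of the differenced error is the same in both cases.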
Interestingly, the PSID validation study also suggests that hours data are considerably less reliable than earnings data.

The reliability of self-reported education has been estimated by comparing the same individual's reports of his own education at different points in time, or by comparing different siblings' reports of the same individual's education. The estimates of the reliability of education are in the neighborhood of .90. Because education is often an explanatory variable of interest in a cross-sectional wage equation, measurement error can be expected to reduce the estimated return to a year of education by about 10 percent (assuming there are no other covariates). The table also indicates that if differences in educational attainment between pairs of twins or siblings are used to estimate the return to schooling (e.g., Taubman, 1976; Behrman, Hrubec, Taubman, and Wales, 1980; Ashenfelter and Krueger, 1994; and Ashenfelter and Zimmerman, 1997), then the effect of measurement error is greatly exacerbated. This is because schooling levels are highly correlated between twins, while measurement error is magnified because reporting errors appear to be uncorrelated between twins. This situation is analogous to the effect of measurement error in panel data models discussed above.

To further explore the extent of measurement error in labor data, we re-analyzed the CPS data originally used by Mellow and Sider (1983). Figure 6 presents a scatter diagram of the employer-reported log hourly wage against the employee-reported log hourly wage.60 Although most points cluster around the 45 degree line, there are clearly some outliers. Some of the large outliers probably result from random coding errors, such as a misplaced decimal point.

Researchers have employed a variety of "trimming" techniques to try to minimize the effects of observations that may have been misreported. An interesting study of historical data by Stigler (1977) asks whether statistical methods that downweight outliers would have reduced the bias in estimates of physical constants in 20 early scientific data sets. These constants, such as the speed of light or parallax of the sun, have since been determined with certainty. Of the 11 estimators that he evaluated, Stigler found that the unadjusted sample mean, or a 10 percent "winsorized mean," provided estimates that were closest to the correct parameters.
The 10 percent winsorized mean sets the values of observations in the bottom or top decile equal to the value of the observation at the 10th or 90th percentile, and simply calculates the mean for this "adjusted" sample.

60 Earnings in the data analyzed by Mellow and Sider were calculated in a manner similar to that used in the redesigned CPS. First, households and firms were asked for the basis on which the employee was paid, and then earnings were collected on that basis. Usual weekly hours were also collected. The household data may have been reported by the worker or by a proxy respondent.

In a similar vein, we used Mellow and Sider's linked employer-employee CPS data to explore the effect of various methods for trimming outliers. The analysis here is less clear cut than in Stigler's paper because the true values are not known (i.e., we are not sure the employer-reported data are the "true" data), but we can still compare the reliability of the employee and employer reported data using various trimming methods. The first column of Table 10 reports the difference in means between the employee and employer responses for the wage and hours data. The differences are small and statistically insignificant. Column 2 reports the correlation between the employee report and the employer report, while column 3 reports the slope coefficient from a bivariate regression of the employer report on the employee report. The regression coefficient in column 3 probably provides the most robust measure of the reliability of the data. Columns 4 and 5 report the variances of the employee and employer data.

Results in Panel A are based on the full sample without any trimming. Panel B presents results for a 1 percent and a 10 percent "winsorized" sample. We also report results for a 1 percent and 10 percent truncated sample, which drops from the sample observations in either tail of the distribution. Whereas the winsorized sample rolls back extreme values (defined as the bottom or top X percent) but retains them in the sample, the truncated sample simply drops the extreme observations from the sample.61 In Panel B only the employee-reported data have been trimmed, because that is all that researchers typically observe. In Panel C, we trim both the employee and employer reported data.

For hours, the unadjusted data have reliability ratios around .80. Interestingly, the reliability of the hours data is considerably higher in Mellow and Sider's data than in the PSID validation study.
This may result because the PSID validation study was confined to one plant (which restricted true hours variability compared to the entire workforce), or because there is a difference between the reliability of log weekly hours and annual hours.

61 Loosely speaking, winsorizing the data is desirable if the extreme values are exaggerated versions of the true values, but the true values still lie in the tails. Truncating the sample is more desirable if the extremes are mistakes that bear no resemblance to the true values.
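The distinction between winsorizing and truncating can be made concrete with a short sketch. The function names `winsorize` and `truncation_mask` are hypothetical, the percentile cutoffs use NumPy's default linear interpolation, and the input data are artificial; this is not the code used to produce Table 10.

```python
import numpy as np

def winsorize(x, pct):
    """Roll values in the bottom and top pct percent back to the cutoffs."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [pct, 100 - pct])
    return np.clip(x, lo, hi)

def truncation_mask(x, pct):
    """Boolean mask that is False for observations in either pct percent tail."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [pct, 100 - pct])
    return (x >= lo) & (x <= hi)

# Artificial data: the values 1, 2, ..., 100.
x = np.arange(1.0, 101.0)
w = winsorize(x, 10)         # same length; the tails are set to the cutoffs
keep = truncation_mask(x, 10)
t = x[keep]                  # shorter; the tails are dropped
```

Returning a mask rather than the truncated values makes it straightforward to drop the matching rows of other variables, which is what is needed when only one of two linked reports is trimmed.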
The reliability ratio is lower for the wage data than the hours data in the CPS sample. For hours and wages, the correlation coefficients change little when the samples are adjusted (either by winsorizing or truncating the sample), but the slope coefficients are considerably larger in the adjusted data and exceed 1.0 in the 10 percent winsorized samples. When both the employer and employee data are trimmed, the reliability of the wage data improves considerably, while the reliability of the hours data is not much affected. These results suggest that extreme wage values are likely to be mistakes. Overall, this brief exploration suggests that a small amount of trimming could be beneficial. In a study of the effect of UI benefits on consumption, Gruber (1997) recommends winsorizing the extreme 1 percent of observations on the dependent variable (consumption), to reduce residual variability. A similar practice seems justifiable for earnings as well.

The estimates in Table 9 or 10 could be used to "inflate" regression coefficients for the effect of measurement error bias, provided that there are no covariates in the equation. Typically, however, regressions include covariates. Consequently, in Table 11 we use Mellow and Sider's CPS sample to regress the employer-reported data on the employee-reported data and several commonly used covariates (education, marital status, race, sex, experience and veteran status). For comparison, the first two columns present the correlation coefficient and the slope coefficient from a bivariate regression of the employer on the employee data. The third column reports the coefficient on the employee-reported variable from a multiple regression which specifies the employer-reported variable as the dependent variable, and the corresponding employee-reported variable as an explanatory variable along with other commonly used explanatory variables; this column provides the appropriate estimates of attenuation bias for a multiple regression which includes the same set of explanatory variables as included in the table.
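A simulation shows why the reliability ratio falls once covariates are added. Everything in the sketch below is an assumption made for illustration: a single covariate `z` explains half of the variance of the true wage, and the two reports carry independent classical errors with variance 0.5. With these values the bivariate reliability is 2/2.5 = 0.8, while the reliability conditional on `z` is 1/1.5, or about 0.67, because the covariate absorbs signal but not noise.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients (intercept first) from regressing y on the regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical data-generating process.
rng = np.random.default_rng(1)
n = 200_000
z = rng.normal(size=n)                      # covariate, variance 1
true_wage = z + rng.normal(size=n)          # true value, variance 2
employee = true_wage + rng.normal(scale=np.sqrt(0.5), size=n)
employer = true_wage + rng.normal(scale=np.sqrt(0.5), size=n)

# Slope on the employee report without and with the covariate.
bivariate = ols(employer, employee)[1]
multivariate = ols(employer, employee, z)[1]
```

The more strongly the covariates predict the true value, the larger the gap between the two slopes, which is consistent with wages (strongly related to human capital controls) being affected more than hours.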
Notice that the reliability of the wage data falls from .77 to .66 once standard human capital controls are included. By contrast, the reliability of the hours data is not very much affected by the presence of control variables, because hours are only weakly correlated with the controls.

Table 11 also reports estimates of the reliability of reported union coverage status, industry and occupation. Assuming the employer-reported data are correct, the bivariate regression suggests that union status has a reliability ratio of .84.62 Interestingly, this is unchanged when covariates are included. To convert the industry and occupation dummy variables into a one-dimensional variable, we assigned each industry and occupation the wage premium associated with employment in that sector based on Krueger and Summers (1987). The occupation data seem especially noisy, with an estimated reliability ratio of .75 conditional on the covariates.

62 The assumption that the employer union data are correct, which is most likely incorrect, is made because union status is a dummy variable, so measurement errors will be correlated with true union status. If union status is correctly reported by employers, the regression coefficient nonetheless provides a consistent estimate of the attenuation bias. Additionally, note that the reliability of data on union status depends on the true fraction of workers who are covered by a union contract. Since union coverage as a fraction of the workforce has declined over time, the reliability ratio might be even lower today. As an extreme example, note that even if the true union coverage rate falls to zero, the measured rate will exceed zero because some (probably around 3 percent) nonunion workers will be erroneously classified as covered by a union. See Freeman (1984), Jakubson (1986) and Card (1996) for analyses of the effect of measurement error in union status in longitudinal data.

Earlier we mentioned that classical measurement error has a greater effect if variables are expressed as changes. Although we cannot examine longitudinal changes with Mellow and Sider's data, a dramatic illustration of the effect of measurement error on industry and occupation changes is provided by the 1994 CPS redesign. The redesigned CPS prompts respondents who were interviewed the previous month with the name of the employer that they reported working for the previous month, and then asks whether they still work for that employer. If respondents answer "no," they are asked an independent set of industry and occupation questions. If they answer "yes," they are asked if the usual activities and duties on their job changed since last month. If they report that their activities and duties were unchanged, they are then asked to verify the previous month's description of their occupation and activities. Lastly, if they answer that their activities and duties changed, they are asked an independent set of questions on occupation, activities, and class of worker. Based on pre-tests of the redesigned CPS in 1991, Rothgeb and Cohany (1992) find that the proportion of workers who appear to change three-digit occupations from one month to the next falls from 39 percent in the old version of the CPS to 7 percent in the redesigned version.63 The proportion who change three-digit industry between adjacent months falls from 23 percent to 5 percent. These large changes in the gross industry and occupation flows obviously change one's impression of the labor market.64

63 It is also possible that dependent interviewing reduces occupational changes because some respondents find it easier to complete the interview by reporting that they did not change employers even if they did. Although this is possible, Rothgeb and Cohany point out that asking independent occupation and industry questions of individuals who report changing employers could result in spurious industry and occupation changes. In addition, the large number of mismatches between employer and employee reported occupation and industry data in Mellow and Sider's data set is consistent with a finding of grossly overestimated gross industry and occupation flows.
64 See also Poterba and Summers (1986), who estimate the measurement error in employment-status transitions.

4.3 Weighting and Allocated Values

Many data sets use complicated sampling designs and come with sampling weights that reflect the design. Researchers are often confronted with the question of whether to employ sample weights in their statistical analyses to adjust for nonrandom sampling. For example, if the sampling design uses stratified sampling by state, with smaller states sampled at a higher rate than larger states, then observations from small states should get less weight if national statistics are to be representative. In addition to providing sample weights for this purpose, the Census Bureau also "allocates" answers for individuals who do not respond to a question in one of their surveys. Missing data are allocated by inserting information for a randomly chosen person who is matched to the person with missing data on the basis of major demographic characteristics. Consequently, there are no "missing values" on Census Bureau micro data files. But researchers may decide to include or exclude observations with allocated responses since information that has been allocated is identified with "allocation flags." Unfortunately, although there is a large literature on weighting and survey nonresponse, this literature has not produced any easy answers that apply to all data sets and research questions (see, for example, Rubin, 1983; Dickens, 1985; Lillard, Smith and Welch, 1986; Deaton, 1995, 1997; or Groves, 1998).65

65 But see DuMouchel and Duncan (1983), who note that if the object of regression is a MMSE linear approximation to the CEF then estimates from non-random samples should be weighted.

Two data sets where both weighting and allocation issues come up are the CPS and the 1990 Census Public Use Micro Sample (PUMS), neither of which is a simple random sample. The CPS uses a complicated multi-stage probability sample that over-samples some states, and recently over-samples Hispanics in the March survey (see, e.g., Bureau of the Census, 1992). The 1990 PUMS also deviates from random sampling because of over-sampling of small areas and Native Americans (Bureau of the Census, 1996).66 And even random samples may fail to be representative by chance, or because some sampled households are not actually interviewed. The sampling weights included with CPS and PUMS micro data are meant to correct for features of the sample design, as well as deviations from random sampling due to chance or nonresponse that affect the age, sex, Hispanic origin, or race make-up of the sample. Missing data for respondents in these data sets are also allocated. And in the CPS, if someone fails to answer a monthly supplement (e.g., the March income supplement), then the entire record is allocated by drawing a randomly matched "donor record" from someone who did respond.

66 The 1980 PUMS are simple random samples. The CPS was stratified but self-weighting (i.e., all observations were equally likely to be sampled) until January 1978.

To assess the consequences of weighting and allocation for one important area of research, we estimated a standard human capital earnings function with data from the 1990 March CPS and 1990 5 percent PUMS for the four permutations of weighting or not weighting, and including or excluding observations with allocated responses. The samples consist of white and black men age 40 to 49 with at least 8 years of education.67 Regression results and mean log weekly earnings are summarized in Table 12. In both data sets, the estimated regression coefficients are remarkably similar regardless of whether the equation is estimated by OLS or weighted least squares to adjust for sample weights, and regardless of whether the observations with allocated values are excluded or included in the sample. Moreover, except for potential experience, the regression coefficients are quite similar if they are estimated with either the Census or CPS sample. One notable difference between the data sets, however, is that mean log earnings are about 6 log points higher in the Census than the CPS for this age group.

67 In addition, to make the samples comparable, the Census sample excludes men who were on active duty in the military, and the CPS sample excludes the Hispanic oversample and the men in the armed forces. The education variable in both data sets was converted to linear years of schooling based on highest degree attained.
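The comparison of weighted and unweighted estimates can be sketched on simulated data. The design below is hypothetical and is not the CPS or PUMS design: two equally sized strata are drawn, but observations from the second stratum carry a sampling weight four times as large, mean schooling differs across strata, and the log wage equation (slope 0.08) is the same in both. The helper `wls` is an illustrative implementation of weighted least squares, not a library routine.

```python
import numpy as np

def wls(y, X, w):
    """Weighted least squares: minimizes sum_i w_i * (y_i - X_i b)^2."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

# Hypothetical stratified sample: stratum 1 is over-sampled (weight 1),
# stratum 2 is under-sampled (weight 4).
rng = np.random.default_rng(2)
n1, n2 = 50_000, 50_000
school = np.concatenate([rng.normal(12.0, 2.0, n1), rng.normal(13.0, 2.0, n2)])
weight = np.concatenate([np.full(n1, 1.0), np.full(n2, 4.0)])
log_wage = 5.0 + 0.08 * school + rng.normal(0.0, 0.5, school.size)

X = np.column_stack([np.ones(school.size), school])
b_ols = wls(log_wage, X, np.ones(school.size))   # unweighted
b_wls = wls(log_wage, X, weight)                 # weighted by sampling weight

# Means are sensitive to the weights even when slopes are not.
mean_unweighted = school.mean()
mean_weighted = np.average(school, weights=weight)
```

In this sketch the slope is essentially the same with or without weights because the regression function does not vary across strata, whereas the mean of schooling moves toward the under-sampled stratum once weights are applied. Weighting would matter more for the slope if the relationship itself differed across strata.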
The results in Table 12 suggest that estimates of a human capital earnings function using CPS and Census data are remarkably robust to whether or not the sample is weighted to account for the sample design, and whether or not observations with allocated values are included in the sample. At least for this application, nonrandom sampling and the allocation of missing values are not very important.68 It should be noted, however, that the Census Bureau surveys analyzed here are relatively close to random samples, and that the sample strata involve covariates that are included in the regression models. Some of the data sets discussed earlier, most notably the NLSY and the PSID, include large non-random sub-samples that more extensively select or over-sample certain groups using a wider range of characteristics, including racial minorities, low-income respondents, or military personnel. When working with these data it is important to check whether the use of a non-representative sample affects empirical results. Moreover, since researchers often compare results across samples, weighting may be desirable if this helps reduce the likelihood that differences in sample design generate different results.

68 Of course, the standard errors of the estimates should reflect the sample design and account for changes in variability due to allocation. But for samples of this size, the standard errors are extraordinarily small, so adjusting them for these features of the data is probably of second-order importance.

5. Summary

This chapter attempts to provide an overview of the empirical strategies used in modern labor economics. The first step is to specify a causal question, which we think of as comparing actual and counterfactual states. The next step is to devise a strategy that can, in principle, answer the question. A critical issue in this context is how the causal effect of interest is identified by the statistical analysis. In particular, why does the explanatory variable of interest vary when other variables are held constant? Who is implicitly being compared to whom? Does the source of variation used to identify the key parameters provide plausible "counterfactuals"? And can the identification strategy be tested in a situation in which the causal variable is not expected to have an effect? Finally, implementation of the empirical strategy requires appropriate data, and careful attention to the many measurement problems that are likely to arise along the way.
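Appendix

A. Derivation of equation (9) in the text

The model is

\[
Y_i = \beta + \rho S_i + \eta_i, \qquad E[S_i \eta_i] = 0,
\]
\[
A_i = \gamma_0 + \gamma_1 S_i + \eta_{1i}, \qquad E[S_i \eta_{1i}] = 0.
\]

The coefficient on $S_i$ in a regression of $Y_i$ on $S_i$ and $A_i$ is $C(Y_i, S_{\cdot Ai})/V(S_{\cdot Ai})$, where $S_{\cdot Ai} = S_i - \pi_0 - \pi_1 A_i$ and $\pi_1 = \gamma_1 V(S_i)/V(A_i)$. The denominator is

\[
V(S_{\cdot Ai}) = V(S_i) - \pi_1^2 V(A_i) = [V(S_i)/V(A_i)]\,[V(A_i) - \gamma_1^2 V(S_i)] = [V(S_i)/V(A_i)]\,V(\eta_{1i}),
\]

so that

\[
\frac{C(Y_i, S_{\cdot Ai})}{V(S_{\cdot Ai})}
= \rho + \frac{C(\eta_i,\, S_i - \pi_0 - \pi_1 A_i)}{V(S_{\cdot Ai})}
= \rho - \frac{\pi_1 C(\eta_i, A_i)}{V(S_{\cdot Ai})}
= \rho - \frac{\pi_1 C(\eta_i, \eta_{1i})}{V(S_{\cdot Ai})}
= \rho - \gamma_1 \frac{C(\eta_i, \eta_{1i})}{V(\eta_{1i})}
= \rho - \gamma_1 \phi_1 .
\]

B. Derivation of equation (34) in the text

To economize on notation, we use $E[Y \mid X, j]$ as shorthand for $E[Y_i \mid X_i, S_i = j]$. Repeating equation (31) in the text without "i" subscripts:

\[
\rho_r = \frac{E[Y(S - E[S \mid X])]}{E[S(S - E[S \mid X])]} = \frac{E[E(Y \mid S, X)(S - E[S \mid X])]}{E[S(S - E[S \mid X])]} . \tag{A.1}
\]

Now write

\[
E[Y \mid X, S] = E[Y \mid X, 0] + \sum_{j=1}^{S} \{E[Y \mid X, j] - E[Y \mid X, j-1]\} = E[Y \mid X, S=0] + \sum_{j=1}^{S} \rho_{jX}, \tag{A.2}
\]

where $\rho_{jX} \equiv E[Y \mid X, j] - E[Y \mid X, j-1]$.

We first simplify the numerator of $\rho_r$. Substituting A.2 into A.1:

\[
E[E(Y \mid X, S)(S - E[S \mid X])] = E\Big\{\Big(\sum_{j=1}^{S} \rho_{jX}\Big)(S - E[S \mid X])\Big\} .
\]

Working with the inner expectation:

\[
= E\Big\{E\Big[\sum_{j=1}^{S} \rho_{jX}(S - E[S \mid X]) \,\Big|\, X\Big]\Big\},
\]

where

\[
E\Big[\sum_{j=1}^{S} \rho_{jX}(S - E[S \mid X]) \,\Big|\, X\Big] = \sum_{s=1}^{\bar{S}} \sum_{j=1}^{s} \rho_{jX}(s - E[S \mid X])\, p_{sX},
\qquad p_{sX} = P(S = s \mid X),
\]

with $\bar{S}$ the largest value taken by $S$. Reversing the order of summation, this equals

\[
\sum_{j=1}^{\bar{S}} \rho_{jX} \sum_{s=j}^{\bar{S}} (s - E[S \mid X])\, p_{sX} = \sum_{j=1}^{\bar{S}} \rho_{jX}\, \mu_{jX},
\qquad \text{where } \mu_{jX} = \sum_{s=j}^{\bar{S}} (s - E[S \mid X])\, p_{sX} .
\]

Now, simplifying,

\[
\mu_{jX} = \sum_{s=j}^{\bar{S}} s\, p_{sX} - \sum_{s=j}^{\bar{S}} E[S \mid X]\, p_{sX} = (E[S \mid X, S \ge j] - E[S \mid X])\, P(S \ge j \mid X) .
\]

Since $E[S \mid X] = E[S \mid X, S \ge j]\,P(S \ge j \mid X) + E[S \mid X, S < j]\,(1 - P(S \ge j \mid X))$,

\[
\mu_{jX} = (E[S \mid S \ge j, X] - E[S \mid S < j, X])\, P(S \ge j \mid X)\,(1 - P(S \ge j \mid X)) .
\]

So we have shown

\[
E[Y(S - E[S \mid X])] = E\Big[\sum_{j=1}^{\bar{S}} \rho_{jX}\, \mu_{jX}\Big] .
\]

A similar argument for the denominator shows

\[
E[S(S - E[S \mid X])] = E\Big[\sum_{j=1}^{\bar{S}} \mu_{jX}\Big] .
\]

Substitute $s$ for $j$ to get equation (34) using the notation in the text.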
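The decomposition in part B can be checked numerically. The sketch below is a simplified illustration rather than part of the derivation: it drops the covariates (a single cell of X), and the schooling distribution `p` and conditional means `ey` are hypothetical numbers. It computes the regression coefficient directly and as the weighted average of the increments, and also checks the closed form for the weights.

```python
import numpy as np

# Hypothetical population: S takes the values 0..4.
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])    # P(S = s)
ey = np.array([1.0, 1.3, 1.4, 2.0, 2.1])   # E[Y | S = s]
s = np.arange(len(p))

es = np.sum(s * p)                          # E[S]

# Regression coefficient: E[Y(S - E[S])] / E[S(S - E[S])].
rho_r = np.sum(ey * (s - es) * p) / np.sum(s * (s - es) * p)

# Increments rho_j = E[Y | j] - E[Y | j-1] for j = 1..4.
rho_j = np.diff(ey)

# Weights mu_j = sum over s >= j of (s - E[S]) P(S = s).
mu_j = np.array([np.sum((s[j:] - es) * p[j:]) for j in range(1, len(p))])
weighted_average = np.sum(rho_j * mu_j) / np.sum(mu_j)

def mu_closed_form(j):
    """(E[S | S >= j] - E[S | S < j]) * P(S >= j) * (1 - P(S >= j))."""
    upper = p[j:].sum()
    e_hi = np.sum(s[j:] * p[j:]) / upper
    e_lo = np.sum(s[:j] * p[:j]) / (1.0 - upper)
    return (e_hi - e_lo) * upper * (1.0 - upper)

mu_check = np.array([mu_closed_form(j) for j in range(1, len(p))])
```

With these numbers the weights are largest for the increments in the middle of the schooling distribution, which is where the product P(S >= j)(1 - P(S >= j)) peaks.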
C. Schooling in the 1990 Census

Years of schooling was coded from the 1990 Census categorical schooling variables as follows:

Educational attainment
  5th, 6th, 7th, or 8th grade
  9th grade
  10th grade
  11th grade or 12th grade, no diploma
  High school graduate, diploma or GED
  Some college, but no degree
  Completed associate degree in college, occupational program
  Completed associate degree in college, academic program
  Completed bachelor's degree, not attending school
  Completed bachelor's degree, but now enrolled
  Completed master's degree
  Completed professional degree
  Completed doctorate
Columns: Labor Economics Articles 1965-69, 1970-74, 1975-79, 1980-83, 1994-97; All Fields 1994-97

Theory Only  14  19  23  29  21  44
Micro data  11  27  45  46  66  28
  Panel  1  6  21  18  31  12
  Experiment  2  2  2  3
  Cross-Section  10  21  21  26  25  9
  Micro data set
    PSID  6  7  7  2
    NLS  3  10  6  11  2
    CPS  1  5  6  8  2
    SEO  4  4  1
    Census  3  5  2  5  1
    All other micro data sets  8  14  18  27  38  21
Time Series  42  27  18  16  6  19
Census Tract  3  2  4  3
State  7  6  3  3  2  2
Other aggregate cross-section  14  16  8  4  6  6
Secondary Data Analysis  14  3  3  4  2  2
Total Number of Articles  106  191  257  205  197  993

Notes: Figures for 1965-83 are from Stafford (1986). Figures for 1994-97 are based on authors' analysis, and pertain to the first half of 1997. Following Stafford, articles are drawn from 8 leading economics journals.
Table 4
Differences-in-Differences Estimates of the Effect of Immigration on Unemployment

                                       Year
                              1979      1981      1981-79
Group                         (1)       (2)       (3)

Whites
(1) Miami                     5.1       3.9       -1.2
                              (1.1)     (.9)      (1.4)
(2) Comparison Cities         4.4       4.3       -.1
                              (.3)      (.3)      (.4)
(3) Difference                .7        -.4       -1.1
    Miami-Comparison          (1.1)     (.95)     (1.5)

Blacks
(4) Miami                     8.3       9.6       1.3
                              (1.7)     (1.8)     (2.5)
(5) Comparison Cities         10.3      12.6      2.3
                              (.8)      (.9)      (1.2)
(6) Difference                -2.0      -3.0      -1.0
    Miami-Comparison          (1.9)     (2.0)     (2.8)

Notes: Adapted from Card (1990), Tables 3 and 6. Standard errors are shown in parentheses.
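The differences-in-differences entries in Table 4 follow from simple arithmetic on the four unemployment rates for each group. The short sketch below reproduces them from the 1979 and 1981 rates reported in the table; the dictionary layout and function name are illustrative choices only.

```python
# Unemployment rates (1979, 1981) from Table 4.
rates = {
    "whites": {"miami": (5.1, 3.9), "comparison": (4.4, 4.3)},
    "blacks": {"miami": (8.3, 9.6), "comparison": (10.3, 12.6)},
}

def diff_in_diff(group):
    """Change in Miami minus change in the comparison cities."""
    (m79, m81), (c79, c81) = group["miami"], group["comparison"]
    return (m81 - m79) - (c81 - c79)

# Rounded to one decimal place, as in the table.
did = {name: round(diff_in_diff(g), 1) for name, g in rates.items()}
```

The same numbers are obtained by differencing the Miami-minus-comparison gap across years, which is how row (3) and row (6) of the table are laid out.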
Table 5
IV Estimates of the Effects of Military Service on White Men

                        Earnings                Veteran Status        Wald Estimate
Earnings                         Eligibility              Eligibility   of Veteran
year              Mean           Effect         Mean      Effect        Effect
                  (1)            (2)            (3)       (4)           (5)

A. Men born 1950
1981              16,461         -435.8         .267      .159          -2,741
                                 (40.5)                   (.040)        (1,324)
1970              2,758          -233.8                                 -1,470
                                 (39.7)                                 (250)
1969              2,299          -2.0
                                 (34.5)

B. Men born 1951
1981              16,049         -358.3         .197      .136          -2,635
                                 (203.6)        (.013)    (.043)        (1,497)
1971              2,947          -298.2                                 -2,193
                                 (41.7)                                 (307)
1970              2,379          -44.8
                                 (36.7)

C. Men born 1953 (no one drafted)
1981              14,762         34.3           .130      .043          no first stage
                                 (199.0)                  (.037)
1972              3,989          -56.5
                                 (54.8)
1971              2,803          2.1
                                 (42.9)

Note: Adapted from Tables 2 and 3 in Angrist (1990), and unpublished author tabulations. Standard errors are shown in parentheses. Earnings data are from Social Security administrative records. Figures are in nominal dollars. Veteran status data are from the Survey of Program Participation. There are about 13,500 observations with earnings in each cohort.
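The Wald estimates in column (5) of Table 5 are the ratio of the eligibility effect on earnings in column (2) to the eligibility effect on veteran status in column (4). The sketch below reproduces the four reported point estimates from the table entries; the function name `wald_estimate` is an illustrative choice.

```python
def wald_estimate(reduced_form, first_stage):
    """Ratio of the effect of the instrument on the outcome to its effect on treatment."""
    return reduced_form / first_stage

# (earnings eligibility effect, veteran-status eligibility effect) from Table 5.
wald = {
    ("born 1950", 1981): wald_estimate(-435.8, 0.159),
    ("born 1950", 1970): wald_estimate(-233.8, 0.159),
    ("born 1951", 1981): wald_estimate(-358.3, 0.136),
    ("born 1951", 1971): wald_estimate(-298.2, 0.136),
}
rounded = {key: round(value) for key, value in wald.items()}
```

No estimate is formed for men born in 1953 because, as the table indicates, eligibility has no first-stage effect on veteran status for that cohort, so the ratio would divide by a number that is statistically indistinguishable from zero.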
Table 6
Matching and regression estimates of the effects of voluntary military service

             Average        Differences in    Matching       Regression     Regression
             earnings in    means by          estimate of    estimate of    minus
Race         1988-1991      veteran status    veteran effect veteran effect Matching
             (1)            (2)               (3)            (4)            (5)

Whites       14,537         1,233.4           -197.2         -88.8          108.4
                            (60.3)            (70.5)         (62.5)         (28.5)
Nonwhites    11,664         2,449.1           839.7          1,074.4        234.7
                            (47.4)            (62.7)         (50.7)         (32.5)

Notes: Adapted from Tables II and V in Angrist (1998). Standard errors are reported in parentheses. The table shows estimates of the effect of voluntary military service on the 1988-91 Social Security-taxable earnings of men who applied to enter the armed forces between 1979 and 1982. The matching and regression estimates control for applicants' year of birth, education at the time of application, and AFQT score. There are 128,968 whites and 175,262 nonwhites in the sample.
Table 10

                          Mean Employee                      Employee    Employer
Sample                    Minus Employer      r        β     Variance    Variance

A. Unadjusted Data
  ln wage                      0.017        0.65     0.77      0.355       0.430
  ln hours                    -0.043        0.78     0.87      0.195       0.182

B. Employee Data Winsorized or Truncated
 1% Winsorized Sample
  ln wage                      0.021        0.68     0.88      0.278       0.430
  ln hours                    -0.044        0.77     0.91      0.164       0.182
 10% Winsorized Sample
  ln wage                      0.034        0.68     1.04      0.188       0.430
  ln hours                    -0.069        0.72     1.28      0.064       0.182
 1% Truncated Sample
  ln wage                      0.023        0.68     0.91      0.243       0.413
  ln hours                    -0.041        0.75     0.87      0.134       0.155
 10% Truncated Sample
  ln wage                      0.021        0.60     0.94      0.126       0.307
  ln hours                    -0.030        0.62     0.96      0.033       0.072

C. Both Employee and Employer Data Winsorized or Truncated
 1% Winsorized Sample
  ln wage                      0.025        0.8      0.86      0.278       0.305
  ln hours                    -0.04         0.78     0.85      0.164       0.155
 10% Winsorized Sample
  ln wage                      0.028        0.88     0.92      0.188       0.199
  ln hours                    -0.024        0.84     0.85      0.064       0.059
 1% Truncated Sample
  ln wage                      0.032        0.88     0.92      0.230       0.250
  ln hours                    -0.036        0.76     0.81      0.109       0.125
 10% Truncated Sample
  ln wage                      0.024        0.91     0.94      0.119       0.125
  ln hours                    -0.012        0.8      0.83      0.027       0.028
Notes to Table 10: r is the correlation coefficient between the employee- and employer-reported values. β is the slope coefficient from a regression of the employer-reported value on the employee-reported value. Sample size is 3,856 for unadjusted wage data and 3,974 for unadjusted hours data. In the 1% winsorized sample, the bottom and top 1% of observations were rolled back to the value corresponding to the 1st or 99th percentile cutoff; in the truncated sample these observations were deleted from the sample.
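The adjustments described in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the data are made up, and the percentile cutoffs use a simple tail-count convention, since the notes do not say which percentile rule was used. The sketch winsorizes or truncates the employee reports and then computes r and the slope β from a regression of the employer report on the employee report.

```python
import math


def tail_cutoffs(values, pct):
    """Return (low, high) cutoffs leaving pct percent of observations in each tail.

    Assumption: a simple tail-count convention; other percentile rules exist.
    """
    ordered = sorted(values)
    m = int(len(ordered) * pct / 100)  # number of observations in each tail
    return ordered[m], ordered[len(ordered) - 1 - m]


def winsorize(values, pct):
    """Roll values beyond the cutoffs back to the cutoff values."""
    low, high = tail_cutoffs(values, pct)
    return [min(max(v, low), high) for v in values]


def truncate_pairs(employee, employer, pct):
    """Delete pairs whose employee report lies outside the cutoffs."""
    low, high = tail_cutoffs(employee, pct)
    kept = [(a, b) for a, b in zip(employee, employer) if low <= a <= high]
    return [a for a, _ in kept], [b for _, b in kept]


def corr_and_slope(x, y):
    """Return (r, beta): the correlation of x and y, and the slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy), sxy / sxx


# Made-up data: the last employee report is a gross outlier.
employee = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
employer = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

print("unadjusted:", corr_and_slope(employee, employer))
print("winsorized:", corr_and_slope(winsorize(employee, 10), employer))
print("truncated: ", corr_and_slope(*truncate_pairs(employee, employer, 10)))
```

Winsorizing keeps every observation and only caps the extreme values, while truncating shrinks the sample, which is why the two adjustments can give different slopes and variances.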
Table 11
Estimates of Reliability Ratios from Mellow and Sider's CPS Data Set

                                     Bivariate        Multivariate
Variable                             r        β             β

ln wage, unadjusted                0.65     0.77          0.66
ln wage, 1% truncated*             0.68     0.91          0.85
ln wage, 1% winsorized*            0.68     0.88          0.79
ln hours, unadjusted               0.78     0.87          0.86
ln hours, 1% truncated*            0.75     0.87          0.85
ln hours, 1% winsorized*           0.77     0.91          0.90
union                              0.84     0.84          0.84
2-digit industry premium           0.93     0.93          0.92
1-digit industry premium           0.91     0.92          0.90
1-digit occupation premium         0.84     0.84          0.75

Notes: r is the correlation coefficient between the employee- and employer-reported values. β is the coefficient from a regression of the employer-reported value on the employee-reported value. In the multiple regression, covariates include: highest grade of school completed, high school diploma, college diploma dummy, marital status, nonwhite, female, potential work experience, potential work experience squared, and veteran status. Sample size varies from 3,806 (for industry) to 4,087 (for occupation).
* Only the employee data were truncated or winsorized.
Table 12
Weighting and allocation in the Census and CPS

                                        1990 Census                            March 1990 CPS
Covariate                   (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)

Log wages mean             6.405     6.415     6.425     6.437     6.340     6.348     6.351     6.357
                            .746      .747    (.723)      .721      .732      .734      .717      .723
Education                 .10932    .10828    .10920    .10813    .10839    .11139    .10950    .11314
                        (.00047)  (.00047)  (.00049)  (.00049)  (.00442)  (.00438)  (.00459)  (.00459)
White                       .208      .213      .199      .202      .194      .219      .196      .211
                          (.003)    (.003)    (.004)    (.003)    (.030)    (.027)    (.031)    (.029)
Married                     .386      .387      .381      .382      .386      .387      .343      .362
                          (.004)    (.003)    (.004)    (.004)    (.031)    (.029)    (.032)    (.031)
Widowed                     .181      .165      .190      .171      .110      .200      .077      .075
                          (.013)    (.013)    (.014)    (.014)    (.108)    (.105)    (.117)    (.115)
Divorced or separated       .193      .187      .202      .196      .167      .135      .141      .123
                          (.004)    (.004)    (.005)    (.004)    (.037)    (.035)    (.039)    (.037)
Hispanic                   -.142     -.151     -.138     -.145     -.125     -.179     -.107     -.155
                          (.005)    (.005)    (.005)    (.005)    (.040)    (.048)    (.041)    (.049)
Veteran                    -.012     -.014     -.018     -.021    -.0001     -.012     -.002     -.015
                          (.002)    (.002)    (.002)    (.002)    (.016)    (.016)    (.017)    (.017)
Potential experience        .040      .041      .041      .042     .0005     -.002      .013      .013
                          (.002)    (.002)    (.002)    (.002)    (.021)    (.022)    (.022)    (.023)
Pot. experience            -.055     -.055     -.057     -.057      .024      .035      .003      .008
  squared*100             (.004)    (.004)    (.005)    (.005)    (.043)    (.043)    (.045)    (.045)

Allocated                    yes       yes        no        no       yes       yes        no        no
Weighted                      no       yes        no       yes        no       yes        no       yes
N                        603,763   603,731   527,095   527,071     7,134     7,134     6,361     6,361

Notes: The table reports OLS estimates of wage equations with the indicated covariates. The samples include black and white men aged 40-49 with at least 8 years of schooling. The Census sample excludes active-duty military personnel and the CPS sample excludes military personnel and the Hispanic over-sample. The CPS schooling variable is highest year completed while the census variable is imputed as described in the appendix.
Figure 2. Illustration of regression-discontinuity method for estimating the effect of class size on pupils' test scores. Data are from Angrist and Lavy (1998).
Panels: A. Average Class Size and Predicted Class Size (series: Actual Class Size; Predicted Class Size (Maimonides' Rule)). B. Average Reading Scores and Average Predicted Class Size. C. Regression-Adjusted Reading Scores and Predicted Class Size. Horizontal axis: Enrollment in Grade.
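The predicted class size plotted in Figure 2 can be illustrated with a short sketch. The caption does not state the formula, so the version below is an assumption based on the rule as described in Angrist and Lavy (1998): a cap of 40 pupils per class, with a cohort split into equal-sized classes whenever enrollment exceeds a multiple of the cap. The function name and the `cap` parameter are hypothetical.

```python
def predicted_class_size(enrollment, cap=40):
    """Predicted class size when a cohort is split into the fewest classes
    that respect the cap (assumed to be 40 pupils per class)."""
    if enrollment < 1:
        raise ValueError("enrollment must be a positive integer")
    n_classes = (enrollment - 1) // cap + 1  # a new class opens at 41, 81, 121, ...
    return enrollment / n_classes


# Predicted class size drops sharply just past each multiple of the cap.
for e in (39, 40, 41, 80, 81, 120, 121):
    print(e, predicted_class_size(e))
```

The sharp drops at enrollments of 41, 81, and 121 are the discontinuities that the regression-discontinuity comparison in the figure relies on.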
Figure 3. Panel A shows the conditional expectation function (CEF) of log weekly earnings given schooling, adjusted for covariates as described in the text. Also plotted is the average change in the CEF and the OLS regression line. Panel B shows the schooling histogram and OLS weighting function. Data are for men aged 40-49 in the 1990 Census.
Panels: A. Conditional expectation function and OLS regression line (series: average adjusted CEF; average change in CEF; regression line). B. Schooling histogram and OLS weighting function (series: schooling histogram; normalized weighting function). Horizontal axis: Years of schooling (S).
Figure 4. Effects of voluntary military service on earnings in 1988-91, plotted by race and probability of service, conditional on covariates. The earnings data are from Social Security administrative records.
Horizontal axis: Probability of service, conditional on covariates.
Figure 5. First quarter-fourth quarter difference in schooling CDFs, for men born 1930-39 in the 1980 Census. The dotted lines are 95% confidence intervals.
Horizontal axis: Years of schooling.
References

Abadie, Alberto (1998) "Semiparametric Estimation of Instrumental Variable Models for Causal Effects," MIT Department of Economics, mimeo.
Abowd, John M. and Henry S. Farber (1982) "Job Queues and the Union Status of Workers," Industrial and Labor Relations Review, 35:354-367.
Aigner, Dennis J. (1972) "Regression With a Binary Independent Variable Subject to Errors of Observation," Journal of Econometrics, 1(1):49-59.
Altonji, Joseph G. (1986) "Intertemporal Substitution in Labor Supply: Evidence from Micro Data," Journal of Political Economy, 94(3):S176-S215.
Altonji, Joseph G. and Lewis M. Segal (1996) "Small-Sample Bias in GMM Estimation of Covariance Structures," Journal of Business and Economic Statistics, 14(3):353-66.
Anderson, Patricia M. (1993) "Linear Adjustment Costs and Seasonal Labor Demand: Evidence from Retail Trade Firms," Quarterly Journal of Economics, 108(4):1015-42.
Anderson, Patricia M. and Bruce D. Meyer (1994) "The Extent and Consequences of Job Turnover," Brookings Papers on Economic Activity: Microeconomics, 0(0):177-236.
Anderson, T. W., Naoto Kunitomo and Takamitsu Sawa (1982) "Evaluation of the Distribution Function of the Limited Information Maximum Likelihood Estimator," Econometrica, 50:1009-1027.
Angrist, Joshua D. (1990) "Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records," American Economic Review, 80:313-335.
Angrist, Joshua D. (1995a) "Introduction to the JBES Symposium on Program and Policy Evaluation," Journal of Business and Economic Statistics, 13(2):133-36.
Angrist, Joshua D. (1995b) "The Economic Returns to Schooling in the West Bank and Gaza Strip," American Economic Review, 85(5):1065-87.
Angrist, Joshua D. (1998) "Estimating the Labor Market Impact of Voluntary Military Service Using Social Security Data on Military Applicants," Econometrica, 66(2):249-88.
Angrist, Joshua D. and William N. Evans (1998) "Children and their Parents' Labor Supply: Evidence from Exogenous Variation in Family Size," American Economic Review, forthcoming.
Angrist, Joshua D. and Guido W. Imbens (1991) "Sources of Identifying Information In Evaluation Models," NBER Technical Working Paper 117, December.
Angrist, Joshua D. and Guido W. Imbens (1995) "Two-Stage Least Squares Estimates of Average Causal Effects in Models with Variable Treatment Intensity," Journal of the American Statistical Association, 90(430):431-42.
Angrist, Joshua D., Guido W. Imbens and Kathryn Graddy (1995) "Non-Parametric Demand Analysis with an Application to the Demand for Fish," NBER Technical Working Paper No. 178.
Angrist, Joshua D., Guido W. Imbens and Alan B. Krueger (1998) "Jackknife Instrumental Variables Estimation," Journal of Applied Econometrics, forthcoming.
Angrist, Joshua D., Guido W. Imbens and Donald B. Rubin (1996) "Identification of Causal Effects Using Instrumental Variables," Journal of the American Statistical Association, 91(434):444-55.
Angrist, Joshua D. and Alan B. Krueger (1991) "Does Compulsory School Attendance Affect Schooling and Earnings?" Quarterly Journal of Economics, 106:979-1014.
Angrist, Joshua D. and Alan B. Krueger (1992) "The Effect of Age at School Entry on Educational Attainment: An Application of Instrumental Variables with Moments from Two Samples," Journal of the American Statistical Association, 87(418):328-36.
Angrist, Joshua D. and Alan B. Krueger (1995) "Split-Sample Instrumental Variables Estimates of the Returns to Schooling," Journal of Business and Economic Statistics, 13(2):225-35.
Angrist, Joshua D. and Victor Lavy (1998) "Using Maimonides' Rule to Estimate the Effects of Class Size on Scholastic Achievement," Quarterly Journal of Economics, forthcoming.
Angrist, Joshua D. and Whitney K. Newey (1991) "Over-Identification Tests in Earnings Functions with Fixed Effects," Journal of Business and Economic Statistics, 9(3):317-23.
Arellano, Manuel and Costas Meghir (1992) "Female Labour Supply and On-the-Job Search: An Empirical Model Estimated Using Complementary Data Sets," Review of Economic Studies, 59(3):537-59.
Ashenfelter, Orley A. (1978) "Estimating the Effect of Training Programs on Earnings," Review of Economics and Statistics, 60(1):47-57.
Ashenfelter, Orley A. (1984) "Macroeconomic Analyses and Microeconomic Analyses of Labor Supply," Carnegie-Rochester Series on Public Policy, 21(0):117-55.
Ashenfelter, Orley A. and David E. Card (1985) "Using the Longitudinal Structure of Earnings to Estimate the Effect of Training Programs," Review of Economics and Statistics, 67(4):648-60.
Ashenfelter, Orley A. and Alan B. Krueger (1994) "Estimates of the Economic Return to Schooling From a New Sample of Twins," American Economic Review, 84(5):1157-73.
Ashenfelter, Orley A. and Joseph D. Mooney (1968) "Graduate Education, Ability, and Earnings," Review of Economics and Statistics, 50(1):78-86.
Ashenfelter, Orley A. and David J. Zimmerman (1997) "Estimates of the Returns to Schooling from Sibling Data: Fathers, Sons, and Brothers," Review of Economics and Statistics, 79(1):1-9.
Baker, George, Michael Gibbs and Bengt Holmstrom (1994) "The Internal Economics of the Firm: Evidence from Personnel Data," Quarterly Journal of Economics, 109(4):881-919.
Barnow, Burt S., Glen G. Cain and Arthur Goldberger (1981) "Selection on Observables," Evaluation Studies Review Annual, 5:43-59.
Becketti, Sean, William Gould, Lee Lillard, and Finis Welch (1988) "The Panel Study of Income Dynamics after Fourteen Years: An Evaluation," Journal of Labor Economics, 6(4):472-92.
Behrman, Jere, Zdenek Hrubec, Paul Taubman and Terence Wales (1980) Socioeconomic Success: A Study of the Effects of Genetic Endowments, Family Environment, and Schooling. Amsterdam: North-Holland.
Behrman, Jere R., Mark R. Rosenzweig and Paul Taubman (1994) "Endowments and the Allocation of Schooling in the Family and in the Marriage Market: The Twins Experiment," Journal of Political Economy, 102(6):1131-74.
Behrman, Jere R., Mark R. Rosenzweig and Paul Taubman (1996) "College Choice and Wages: Estimates Using Data on Female Twins," Review of Economics and Statistics, 78(4):672-85.
Bekker, Paul A. (1994) "Alternative Approximations to the Distributions of Instrumental Variables Estimators," Econometrica, 62(3):657-81.
Bielby, William, Robert Hauser and David Featherman (1977) "Response Errors of Non-Black Males in Models of the Stratification Process," in D.J. Aigner and A.S. Goldberger, eds., Latent Variables in Socioeconomic Models. Amsterdam: North-Holland, 227-51.
Bjorklund, Anders and Robert Moffitt (1987) "The Estimation of Wage Gains and Welfare Gains in Self-selection Models," Review of Economics and Statistics, 69(1):42-49.
Borjas, George J. (1980) "The Relationship between Wages and Weekly Hours of Work: The Role of Division Bias," Journal of Human Resources, 15(3):409-23.
Borjas, George J., Richard B. Freeman and Lawrence F. Katz (1997) "How Much Do Immigration and Trade Affect Labor Market Outcomes?" Brookings Papers on Economic Activity, 10(1):1-67.
Borjas, George J., Richard B. Freeman and Kevin Lang (1991) "Undocumented Mexican-Born Workers in the United States: How Many, How Permanent?" in: John M. Abowd and Richard B. Freeman, eds., Immigration, Trade, and the Labor Market. A National Bureau of Economic Research Project Report. Chicago: University of Chicago Press.
Bound, John (1989) "The Health and Earnings of Rejected Disability Insurance Applicants," American Economic Review, 79(3):482-503.
Bound, John (1991) "The Health and Earnings of Rejected Disability Insurance Applicants: Reply," American Economic Review, 81(5):1427-34.
Bound, John, David Jaeger and Regina Baker (1995) "Problems with Instrumental Variables Estimation when the Correlation Between the Instruments and the Endogenous Explanatory Variable is Weak," Journal of the American Statistical Association, 90(430):443-50.
Bound, John and Alan B. Krueger (1991) "The Extent of Measurement Error in Longitudinal Earnings Data: Do Two Wrongs Make a Right?" Journal of Labor Economics, 9(1):1-24.
Bound, John, et al. (1994) "Evidence on the Validity of Cross-Sectional and Longitudinal Labor Market Data," Journal of Labor Economics, 12(3):345-68.
Bowen, William G. and Derek Bok (1998) The Shape of the River: Long-Term Consequences of Considering Race in College and University Admissions. Princeton: Princeton University Press.
Bronars, Stephen G. and Jeff Grogger (1994) "The Economic Consequences of Unwed Motherhood: Using Twins as a Natural Experiment," American Economic Review, 84(5):1141-1156.
Brown, Charles, Greg J. Duncan and Frank P. Stafford (1996) "Data Watch: The Panel Study of Income Dynamics," Journal of Economic Perspectives, 10(2):155-68.
Burtless, Gary (1995) "The Case for Randomized Field Trials in Economic and Policy Research," Journal of Economic Perspectives, 9(2):63-84.
Buse, A. (1992) "The Bias of Instrumental Variable Estimators," Econometrica, 60(1):173-80.
Campbell, Donald T. (1969) "Reforms as Experiments," American Psychologist, XXIV:409-429.
Campbell, Donald T. and J.C. Stanley (1963) Experimental and Quasi-Experimental Designs for Research. Chicago: Rand-McNally.
Card, David E. (1989) "The Impact of the Mariel Boatlift on the Miami Labor Market," Princeton University Industrial Relations Section Working Paper No. 253.
Card, David E. (1990) "The Impact of the Mariel Boatlift on the Miami Labor Market," Industrial and Labor Relations Review, 43:245-57.
Card, David E. (1995) "Earnings, Schooling and Ability Revisited," in Solomon W. Polachek, ed., Research in Labor Economics. Greenwich, CT: JAI Press.
Card, David E. (1996) "The Effect of Unions on the Structure of Wages: A Longitudinal Analysis," Econometrica, 64(4):957-79.
Card, David E. (1998) "The Causal Effect of Schooling on Earnings," in this volume.
Card, David E. and Alan B. Krueger (1994) "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania," American Economic Review, 84(4):772-84.
Card, David E. and Alan B. Krueger (1998) "A Reanalysis of the Effect of the New Jersey Minimum Wage Increase on the Fast-Food Industry with Representative Payroll Data," NBER Working Paper No. 6386.
Card, David E. and Daniel Sullivan (1988) "Measuring the Effect of Subsidized Training on Movements In and Out of Employment," Econometrica, 56(3):497-530.
Center for Drug Evaluation and Research (1988) "Guideline for the Format and Content of the Clinical and Statistical Sections of a New Drug Application," U.S. Food and Drug Administration, Department of Health and Human Services, Washington, DC: USGPO.
Chamberlain, Gary (1977) "Education, Income, and Ability Revisited," Journal of Econometrics, 5(2):241-257.
Chamberlain, Gary (1978) "Omitted Variables Bias in Panel Data: Estimating the Returns to Schooling," Annales De L'INSEE, 30-31:49-82.
Chamberlain, Gary (1980) "Discussion," American Economic Review, 70(2):47-49.
Chamberlain, Gary (1984) "Panel Data," in: Zvi Griliches and Michael D. Intriligator, eds., Handbook of Econometrics. Amsterdam: North-Holland.
Chamberlain, Gary and Guido W. Imbens (1996) "Hierarchical Bayes Models with Many Instrumental Variables," Harvard University Department of Economics, Discussion Paper No. 1781, December.
Chamberlain, Gary and Edward E. Leamer (1976) "Matrix Weighted Averages and Posterior Bounds," Journal of the Royal Statistical Society, Series B, 38:73-84.
Chay, Kenneth Y. (1996) An Empirical Analysis of Black Economic Progress over Time. Princeton University Department of Economics, Ph.D. Thesis.
Cochran, William G. (1965) "The Planning of Observational Studies of Human Populations (with Discussion)," Journal of the Royal Statistical Society, Series A, 128:234-266.
Coder, John and Lydia Scoon-Rogers (1996) "Evaluating the Quality of Income Data Collected in the Annual Supplement to the March Current Population Survey and the Survey of Income and Program Participation," Census Working Paper No. 215.
Dawid, A. P. (1979) "Conditional Independence in Statistical Theory," Journal of the Royal Statistical Society, Series B, 41:1-31.
Deaton, Angus (1985) "Panel Data from a Time Series of Cross-Sections," Journal of Econometrics, 30:109-26.
Deaton, Angus (1995) "Data and Econometric Tools for Development Analysis," in Hollis Chenery and T.N. Srinivasan, eds., Handbook of Development Economics. Amsterdam: North-Holland.
Deaton, Angus (1997) The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Baltimore, MD: Johns Hopkins University Press.
Deaton, Angus and Christina Paxson (1998) "Economies of Scale, Household Size, and the Demand for Food," Journal of Political Economy, in press.
Dehejia, Rajeev H. and Sadek Wahba (1995) "Causal Effects in Nonexperimental Studies: Re-evaluating the Evaluation of Training Programs," Harvard University Department of Economics, mimeo.
Dickens, William T. (1985) "Error Components in Grouped Data: Why It's Never Worth Weighting," NBER Technical Working Paper No. 043.
Dominitz, Jeff and Charles F. Manski (1997) "Using Expectations Data to Study Subjective Income Expectations," Journal of the American Statistical Association, 92:855-867.
Donald, Steven and Whitney K. Newey (1997) "Choosing the Number of Instruments," MIT Department of Economics, mimeo.
DuMouchel, William H. and Greg Duncan (1983) "Using Sample Survey Weights in Multiple Regression Analyses of Stratified Samples," Journal of the American Statistical Association, 78:535-543.
Duncan, Greg J. and Daniel H. Hill (1985) "An Investigation of the Extent and Consequences of Measurement Error in Labor-Economic Survey Data," Journal of Labor Economics, 3(4):508-32.
Duncan, Greg T. and Robert W. Pearson (1991) "Enhancing Access to Microdata while Protecting Confidentiality: Prospects for the Future," Statistical Science, 6(3):219-239.
Durbin, J. (1953) "A Note on Regression When there is Extraneous Information About One of the Coefficients," Journal of the American Statistical Association, 48:799-808.
Durbin, J. (1954) "Errors in Variables," Review of the International Statistical Institute, 22:23-32.
Farber, Henry S. and Alan B. Krueger (1993) "Union Membership in the United States: The Decline Continues," Princeton University Industrial Relations Section Working Paper No. 306.
Fitzgerald, John, Peter Gottschalk and Robert Moffitt (1998) "An Analysis of Sample Attrition in Panel Data: The Michigan Panel Study of Income Dynamics," Journal of Human Resources, forthcoming.
Freeman, Richard B. (1984) "Longitudinal Analyses of the Effects of Trade Unions," Journal of Labor Economics, 2:1-26.
Freeman, Richard B. (1989) Labor Markets in Action. Cambridge: Harvard University Press.
Freeman, Richard B. (1990) "Employment and Earnings of Disadvantaged Young Men in a Labor Shortage Economy," NBER Working Paper No. 3444.
Freeman, Richard B. and Brian Hall (1986) "Permanent Homelessness in America?," NBER Working Paper No. 2013.
Freeman, Richard B. and Harry J. Holzer (1986) "The Black Youth Employment Crisis: Summary of Findings," in: Richard B. Freeman and Harry J. Holzer, eds., The Black Youth Employment Crisis. National Bureau of Economic Research Project Report. Chicago: University of Chicago Press.
Freeman, Richard B. and Morris M. Kleiner (1990) "The Impact of New Unionization on Wages and Working Conditions," Journal of Labor Economics, 8(1):S8-S25.
Friedberg, Rachel M. and Jennifer Hunt (1995) "The Impact of Immigrants on Host Country Wages, Employment and Growth," Journal of Economic Perspectives, 9(2):23-44.
Friedman, Milton (1957) A Theory of the Consumption Function. Princeton: Princeton University Press.
Fuchs, Victor, Alan B. Krueger and James M. Poterba (1998) "Why Do Economists Disagree About Policy? The Roles of Beliefs About Parameters and Values," Journal of Economic Perspectives, forthcoming.
Fuller, Wayne A. (1987) Measurement Error Models. New York: Wiley.
Girshick, M. A. and Trygve Haavelmo (1947) "Statistical Analysis of the Demand for Food: Examples of Simultaneous Estimation of Structural Equations," Econometrica, 15(2):79-110.
Goldberger, Arthur S. (1972) "Selection Bias in Evaluating Treatment Effects: Some Formal Illustrations," University of Wisconsin Institute for Research on Poverty Discussion Paper 123-72.
Goldberger, Arthur S. (1991) A Course in Econometrics. Cambridge: Harvard University Press.
Gorseline, Donald E. (1932) The Effect of Schooling Upon Income. Bloomington, IN: University of Indiana.
Griliches, Zvi (1977) "Estimating the Returns to Schooling: Some Econometric Problems," Econometrica, 45(1):1-22.
Griliches, Zvi (1979) "Sibling Models and Data in Economics: Beginnings of a Survey," Journal of Political Economy, 87(5):S37-S64.
Griliches, Zvi (1986) "Economic Data Issues," in: Zvi Griliches and Michael D. Intriligator, eds., Handbook of Econometrics. Amsterdam: North-Holland.
Griliches, Zvi and Jerry A. Hausman (1986) "Errors in Variables in Panel Data," Journal of Econometrics, 31(1):93-118.
Griliches, Zvi and William M. Mason (1972) "Education, Income, and Ability," Journal of Political Economy, 80(3):S74-S103.
Grosh, Margaret E. and Paul Glewwe (1996) "Household Survey Data from Developing Countries: Progress and Prospects," American Economic Review, 86(2):15-19.
Grosh, Margaret E. and Paul Glewwe (1998) "Data Watch: The World Bank's Living Standards Measurement Study Household Surveys," Journal of Economic Perspectives, 12(1):187-96.
Groves, Robert M. (1989) Survey Errors and Survey Costs. New York: Wiley.
Groves, Robert M. (1998) Nonresponse in Household Interview Surveys. New York: Wiley.
Gruber, Jonathan (1997) "The Consumption Smoothing Benefits of Unemployment Insurance," American Economic Review, 87(1):192-205.
Hahn, Jinyong (1998) "On the Role of the Propensity Score in the Efficient Estimation of Average Treatment Effects," Econometrica, 66:315-332.
Hahn, Jinyong, Petra Todd, and Wilbert van der Klaauw (1998) "Estimation of Treatment Effects with a Quasi-Experimental Regression-Discontinuity Design: with Application to Evaluating the Effect of Federal Antidiscrimination Laws on Minority Employment in Small U.S. Firms," University of Pennsylvania Department of Economics, mimeo.
Hall, Alastair R., Glenn D. Rudebusch, and David W. Wilcox (1996) "Judging Instrument Relevance in Instrumental Variables Estimation," International Economic Review, 37(2):283-98.
Hansen, Lars Peter (1982) "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50(4):1029-54.
Hansen, W. Lee, Burton A. Weisbrod and William J. Scanlon (1970) "Schooling and Earnings of Low Achievers," American Economic Review, 60(3):409-18.
Hausman, Jerry A. and William E. Taylor (1981) "Panel Data and Unobservable Individual Effects," Econometrica, 49(6):1377-98.
Hearst, Norman, Thomas Newman and Steven Hulley (1986) "Delayed Effects of the Military Draft on Mortality: A Randomized Natural Experiment," New England Journal of Medicine, 314:620-24.
Heckman, James J. (1978) "Dummy Endogenous Variables in a Simultaneous Equations System," Econometrica, 46(4):931-59.
Heckman, James J. and V. Joseph Hotz (1989) "Choosing among Alternative Nonexperimental Methods for Estimating the Impact of Social Programs: The Case of Manpower Training," Journal of the American Statistical Association, 84(408):862-74.
Heckman, James J., Hidehiko Ichimura and Petra E. Todd (1997) "Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme," Review of Economic Studies, 64(4):605-54.
Heckman, James J., Robert J. Lalonde and Jeffrey A. Smith (1998) "Economic Analysis of Training Programs," in this volume.
Heckman, James J., Lance Lochner and Christopher Taber (1998) "Tax Policy and Human-Capital Formation," American Economic Review, 88(2):293-97.
Heckman, James J. and Thomas E. MaCurdy (1986) "Labor Econometrics," in: Orley Ashenfelter and Richard Layard, eds., Handbook of Labor Economics. Amsterdam: North-Holland.
Heckman, James J. and Brook S. Payner (1989) "Determining the Impact of Antidiscrimination Policy on the Economic Status of Blacks: A Study of South Carolina," American Economic Review, 79(1):138-77.
Heckman, James J. and Richard Robb, Jr. (1985) "Alternative Methods for Evaluating the Impact of Interventions," in James J. Heckman and Burton Singer, eds., Longitudinal Analysis of Labor Market Data. Econometric Society Monographs Series, No. 10. Cambridge: Cambridge University Press.
Heckman, James J. and Jeffrey A. Smith (1995) "Assessing the Case for Social Experiments," Journal of Economic Perspectives, 9(2):85-110.
Holland, Paul W. (1986) "Statistics and Causal Inference," Journal of the American Statistical Association, 81:945-970.
Hurd, Michael, et al. (1998) "Consumption and Savings Balances of the Elderly: Experimental Evidence on Survey Response Bias," forthcoming in D. Wise (ed.), Frontiers in the Economics of Aging, Chicago: University of Chicago Press, pp. 353-87.
Imbens, Guido W. and Joshua D. Angrist (1994) "Identification and Estimation of Local Average Treatment Effects," Econometrica, 62(2):467-75.
Imbens, Guido W. and Tony Lancaster (1994) "Combining Micro and Macro Data in Microeconometric Models," Review of Economic Studies, 61(4):655-80.
Imbens, Guido W., Donald B. Rubin, and Bruce I. Sacerdote (1997) "Estimating Income Effects: Evidence from a Survey of Lottery Players," UCLA Economics Department, mimeo.
Imbens, Guido W. and Wilbert van der Klaauw (1995) "The Cost of Conscription in the Netherlands," Journal of Business and Economic Statistics, 13(2):207-15.
Jacobson, Louis S., Robert J. Lalonde and Daniel G. Sullivan (1994) "Earnings Losses of Displaced Workers," American Economic Review, 83(4):685-709.
Jaeger, David (1993) "The New Current Population Survey Education Variable: A Recommendation," University of Michigan Population Studies Center Research Report No. 93-289.
Jakubson, George (1986) "Measurement Error in Binary Explanatory Variables in Panel Data Models: Why Do Cross Section and Panel Estimates of the Union Wage Effect Differ?" Princeton University Industrial Relations Section Working Paper No. 209.
Jakubson, George (1991) "Estimation and Testing of the Union Wage Effect Using Panel Data," Review of Economic Studies, 58(5):971-91.
Jappelli, Tullio, Jorn-Steffen Pischke and Nicholas Souleles (1998) "Testing for Liquidity Constraints in Euler Equations with Complementary Data Sources," Review of Economics and Statistics, 80:251-262.
Juhn, Chinhui, Kevin M. Murphy, and Brooks Pierce (1993) "Wage Inequality and the Rise in Returns to Skill," Journal of Political Economy, 101(3):410-42.
Juster, F. Thomas and James P. Smith (1997) "Improving the Quality of Economic Data: Lessons from the HRS and AHEAD," Journal of the American Statistical Association, 92(440):1268-78.
Kane, Thomas J., Cecilia Elena Rouse, and Douglas Staiger (1997) "Estimating Returns to Schooling When Schooling is Misreported" (unpublished).
Katz, Lawrence F. and Bruce Meyer (1990) "Unemployment Insurance, Recall Expectations and Unemployment Outcomes," Quarterly Journal of Economics, 105(4):973-1002.
Katz, Lawrence F. and Kevin M. Murphy (1992) "Changes in Relative Wages, 1963-1987: Supply and Demand Factors," Quarterly Journal of Economics, 107(1):35-78.
Keane, Michael P. and Kenneth Wolpin (1997) "Introduction to the JBES Special Issue on Structural Estimation in Applied Microeconomics," Journal of Business and Economic Statistics, 15(2):111-4.
Kling, Jeffrey (1998) "Interpreting Instrumental Variables Estimates of the Returns to Schooling," in Identifying Causal Effects of Public Policies. MIT Department of Economics, Ph.D. Thesis.
Kremer, Michael (1997) "Development Data Sets," MIT Department of Economics, mimeo.
Krueger, Alan B. (1990a) "Incentive Effects of Workers' Compensation Insurance," Journal of Public Economics, 41:73-99.
Krueger, Alan B. (1990b) "Workers' Compensation Insurance and the Duration of Workplace Injuries," NBER Working Paper No. 3253.
Krueger, Alan B. and Douglas Kruse (1996) "Labor Market Effects of Spinal Cord Injuries in the Dawn of the Computer Age," Princeton University Industrial Relations Section Working Paper No. 349.
Krueger, Alan B. and Jorn-Steffen Pischke (1992) "The Effect of Social Security on Labor Supply: A Cohort Analysis of the Notch Generation," Journal of Labor Economics, 10(2):412-437.
Krueger, Alan B. and Lawrence H. Summers (1987) "Efficiency Wages and the Inter-industry Wage Structure," Econometrica, 56(2):259-93.
Lalonde, Robert J. (1986) "Evaluating the Econometric Evaluations of Training Programs Using Experimental Data," American Economic Review, 76(4):602-620.
Lang, Kevin (1993) "Ability Bias, Discount Rate Bias and the Return to Education," Boston University Department of Economics, mimeo.
Lazear, Edward P. (1992) "The Job as a Concept," in William J. Bruns, Jr., ed., Performance Measurement, Evaluation, and Incentives. Boston: Harvard Business School Press.
Leamer, Edward E. (1982) "Let's Take the Con Out of Econometrics," American Economic Review, 73(1):31-43.
Lester, Richard A. (1946) "Shortcomings of Marginal Analysis for Wage-Employment Problems," American Economic Review, 36:63-82.
Levy, Frank (1987) Dollars and Dreams: The Changing American Income Distribution. NY: Russell Sage Foundation.
Lewis, H. Gregg (1963) Unionism and Relative Wages in the United States; an Empirical Inquiry. Chicago: University of Chicago Press.
Lewis, H. Gregg (1986) Union Relative Wage Effects. Chicago: University of Chicago Press.
Lillard, Lee, James P. Smith and Finis Welch (1986) "What Do We Really Know about Wages? The Importance of Nonreporting and Census Imputation," Journal of Political Economy, 94(3):489-506.
Lusardi, Ann Maria (1996) "Permanent Income, Current Income and Consumption: Evidence from Two Panel Data Sets," Journal of Business and Economic Statistics, 14(1):81-90.
MaCurdy, Thomas E. (1981) "An Empirical Model of Labor Supply in a Life-Cycle Setting," Journal of Political Economy, 89(6):1059-85.
Marquis, K.H., J.C. Moore and K. Bogen (1996) "An Experiment to Reduce Measurement Error in the SIPP: Preliminary Results," Bureau of the Census, mimeo.
Marshall, Alfred (1982) Principles of Economics. Philadelphia: Porcupine Press.
McCarthy, P. J. (1979) "Some Sources of Error in Labor Force Estimates from the Current Population Survey," in National Commission on Employment and Unemployment Statistics, Counting the Labor Force, Appendix, Volume II, Washington DC: US Government Printing Office.
Medoff, James L. and Katharine G. Abraham (1980) "Experience, Performance, and Earnings," Quarterly Journal of Economics, 95(4):703-736.
Mellow, Wesley and Hal Sider (1983) "Accuracy of Response in Labor Market Surveys: Evidence and Implications," Journal of Labor Economics, 1(4):331-44.
Meyer, Bruce D. (1995) "Natural and Quasi-experiments in Economics," Journal of Business and Economic Statistics, 13(2):151-61.
Meyer, Bruce D., W. Kip Viscusi and David L. Durbin (1995) "Workers' Compensation and Injury Duration: Evidence from a Natural Experiment," American Economic Review, 85:322-40.
Mincer, Jacob and Yoshio Higuchi (1988) "Wage Structures and Labor Turnover in the U.S. and in Japan," Journal of the Japanese and International Economy, 2(2):97-133.
Morgenstern, Oskar (1950) On the Accuracy of Economic Observations. Princeton: Princeton University Press.
Murphy, Kevin M. and F. Welch (1992) "The Structure of Wages," Quarterly Journal of Economics, 107(1):285-326.
Newey, Whitney K. (1985) "Generalized Method of Moments Estimation and Testing," Journal of Econometrics, 29(3):229-56.
Nickell, Stephen J. (1981) "Biases in Dynamic Models with Fixed Effects," Econometrica, 49(6):1417-26.
NLS Handbook (1995) Columbus, Ohio: Center for Human Resource Research, The Ohio State University.
Park, Jin Huem (1994) "Returns to Schooling: A Peculiar Deviation from Linearity," Princeton University Industrial Relations Section Working Paper No. 339.
Parsons, Donald O. (1980) "The Decline in Male Labor Force Participation," Journal of Political Economy, 88(1):117-34.
Parsons, Donald O. (1991) "The Health and Earnings of Rejected Disability Insurance Applicants: Comment," American Economic Review, 81(5):1419-26.
Passell, P. (1992) "Putting the Science in Social Science," New York Times.
Pindyck, Robert S. and Daniel L. Rubinfeld (1991) Econometric Models and Economic Forecasts. New York: McGraw-Hill.
Polivka, Anne (1996) "Data Watch: The Redesigned Current Population Survey," Journal of Economic Perspectives, 10(3):169-81.
Polivka, Anne (1997) "Using Earnings Data from the Current Population Survey After the Redesign," Bureau of Labor Statistics Working Paper No. 306.
Polivka, Anne and Stephen Miller (1995) "The CPS After the Redesign: Refocusing the Economic Lens," Bureau of Labor Statistics, mimeo.
Poterba, James M. and Lawrence H. Summers (1986) "Reporting Errors and Labor Market Dynamics," Econometrica, 54(6):1319-38.
Powell, James L., James H. Stock and Thomas M. Stoker (1989) "Semiparametric Estimation of Index Coefficients," Econometrica, 57(6):1403-30.
Riddell, W. Craig (1992) "Unionization in Canada and the United States: A Tale of Two Countries," University of British Columbia Department of Economics, mimeo.
Rosenbaum, Paul R. (1995) Observational Studies. New York: Springer-Verlag.
Rosenbaum, Paul R. and Donald B. Rubin (1983) "The Central Role of the Propensity Score in Observational Studies for Causal Effects," Biometrika, 70:41-55.
Rosenbaum, Paul R. and Donald B. Rubin (1984) "Reducing Bias in Observational Studies Using Subclassification on the Propensity Score," Journal of the American Statistical Association, 79:516-524.
Rosenbaum, Paul R. and Donald B. Rubin (1985) "Constructing a Control Group using Multi-variate Matching Methods that include the Propensity Score," American Statistician, 39:33-38.
Rosenzweig, Mark R. and Kenneth I. Wolpin (1980) "Testing the Quantity-Quality Model of Fertility: The Use of Twins as a Natural Experiment," Econometrica, 48(1):227-240.
Rosovsky, Henry (1990) The University: An Owner's Manual. New York: W.W. Norton and Company.
Rothgeb, Jennifer M. and Sharon R. Cohany (1992) "The Revised CPS Questionnaire: Differences between the Current and the Proposed Questionnaires," presented at the annual meeting of the American Statistical Association.
Rubin, Donald B. (1973) "Matching to Remove Bias in Observational Studies," Biometrics, 29(1):159-83.
Rubin, Donald B. (1974) "Estimating Causal Effects of Treatments in Randomized and Non-randomized Studies," Journal of Educational Psychology, 66:688-701.
Rubin, Donald B. (1977) "Assignment to a Treatment Group on the Basis of a Covariate," Journal of Educational Statistics, 2:1-26.
Rubin, Donald B. (1983) "Imputing Income in the CPS: Comments on 'Measures of Aggregate Labor Cost in the United States,'" in Jack E. Triplett, ed., The Measurement of Labor Cost. Chicago: University of Chicago Press.
Sawa, Takamitsu (1969) "The Exact Sampling Distribution of Ordinary Least Squares and Two-Stage Least Squares Estimators," Journal of the American Statistical Association, 64(327):923-937.
Sawa, Takamitsu (1973) "An Almost Unbiased Estimator in Simultaneous Equations Systems," International Economic Review, 14(1):97-106.
Siegel, Paul and Robert Hodge (1968) "A Causal Approach to the Study of Measurement Error," in Hubert Blalock and Ann Blalock, eds., Methodology in Social Research. New York: McGraw-Hill, 28-59.
Siegfried, John J. and George H. Sweeney (1980) "Bias in Economics Education Research from Random and Voluntary Selection into Experimental and Control Groups," American Economic Review, 70(2):29-34.
Singer, Eleanor and Stanley Presser (1989) Survey Research Methods. Chicago: University of Chicago Press.
Solon, Gary R. (1985) "Work Incentive Effects of Taxing Unemployment Benefits," Econometrica, 53(2):295-306.
Stafford, Frank (1986) "Forestalling the Demise of Empirical Economics: The Role of Microdata in Labor Economics Research," in Orley Ashenfelter and Richard Layard, eds., Handbook of Labor Economics. Amsterdam: North-Holland.
Staiger, Douglas and James H. Stock (1997) "Instrumental Variables Regression with Weak Instruments," Econometrica, 65(3):557-86.
Stigler, Stephen M. (1977) "Do Robust Estimators Work with Real Data?" Annals of Statistics, 5(6):1055-1098.
Stoker, Thomas M. (1986) "Aggregation, Efficiency, and Cross-Section Regression," Econometrica, 54(1):171-88.
Sudman, Seymour and Norman Bradburn (1991) Asking Questions: A Practical Guide to Survey Design. San Francisco: Jossey-Bass Publishers.
Taubman, Paul (1976) "Earnings, Education, Genetics, and Environment," Journal of Human Resources, 11(Fall):447-461.
Taussig, Michael K. (1974) Those Who Served: Report of the Twentieth Century Fund Task Force on Policies Towards Veterans. New York: The Twentieth Century Fund.
Thurow, Lester C. (1983) Dangerous Currents: The State of Economics. New York: Random House.
Topel, Robert H. (1991) "Specific Capital, Mobility, and Wages: Wages Rise with Job Seniority," Journal of Political Economy, 99(1):145-76.
Trochim, William K. (1984) Research Design for Program Evaluation: The Regression-Discontinuity Approach. Beverly Hills: Sage.
Tufte, Edward R. (1992) The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.
Tukey, John W. (1977) Exploratory Data Analysis. Reading, Mass: Addison-Wesley Publishing Company.
U.S. Bureau of the Census (1992) Current Population Survey, March 1992 Technical Documentation. Washington, D.C.: Bureau of the Census.
U.S. Bureau of the Census (1996) Census of Population and Housing, 1990 United States: Public Use Microdata Sample: 5 Percent Sample. Washington, D.C.: US Department of Commerce, Third ICPSR release.
van der Klaauw, Wilbert (1996) "A Regression-Discontinuity Evaluation of the Effect of Financial Aid Offers on College Enrollment," New York University Department of Economics, manuscript.
Vroman, Wayne (1990) "Black Men's Relative Earnings: Are the Gains Illusory?" Industrial and Labor Relations Review, 44(1):83-98.
Vroman, Wayne (1991) "The Decline in Unemployment Insurance Claims Activity in the 1980s," UI Occasional Paper No. 91-2, U.S. DOL, Employment and Training Administration.
Wald, A. (1940) "The Fitting of Straight Lines if Both Variables are Subject to Error," Annals of Mathematical Statistics, 11:284-300.
Welch, Finis (1975) "Human Capital Theory: Education, Discrimination, and Life-Cycles," American Economic Review, 65:63-73.
Welch, Finis (1977) "What Have We Learned from Empirical Studies of Unemployment Insurance?" Industrial and Labor Relations Review, 30:451-61.
Westergard-Nielsen, Niels (1989) "Empirical Studies of the European Labour Market Using Microeconomic Data Sets: Introduction," European Economic Review, 33(2/3):389-94.
White, Halbert (1980) "Using Least Squares to Approximate Unknown Regression Functions," International Economic Review, 21(1):149-70.
Willis, Robert J. and Sherwin Rosen (1979) "Education and Self-Selection," Journal of Political Economy, 87(5):S7-S36.
Yitzhaki, Shlomo (1996) "On Using Linear Regressions in Welfare Economics," Journal of Business and Economic Statistics, 14:478-486.