Is There A Tradeoff between Employer-Provided Health Insurance and Wages?


Lye Zhu, Southern Methodist University

October 2005

Abstract

Though most of the literature in health insurance and the labor market assumes a tradeoff between employer-provided health insurance and wages, its empirical validity has not been established. Employing Current Population Survey 2004 data, this paper assesses the tradeoff hypothesis in a distributional analysis framework using stochastic dominance tests. In addition, it contributes to the previous literature by incorporating an indirect effect of health insurance on wages into the analysis. Health insurance not only directly affects wages, but also indirectly by improving individual productivity. The results confirm the existence of a tradeoff for full-time working wives, and explain why the previous literature fails to do so.

JEL: C12, C14, D31, I11, J3

Keywords: Employer-Provided Health Insurance, Tradeoff, Distributional Analysis, Stochastic Dominance Tests

I am grateful to Professor Daniel Millimet for his profound supervision and continuous help. Thanks to Professor Nathan Balke, Professor Tom Fomby and Professor Yi Deng for helpful comments and advice. All errors are mine. Corresponding address: Lye Zhu, Department of Economics, Southern Methodist University, Dallas, TX.
I. Introduction

In contrast to most other developed countries, health insurance in the US is both provided and financed predominantly by employers, especially for working-aged individuals. Current Population Survey 2004 data show that 63% of the American adult population is covered by employer-provided health insurance (HI)¹ and this percentage changes only slightly over time. The magnitude of employer-provided health insurance, coupled with the institutions and rules for health insurance provision, has made health insurance an important parameter of labor market decisions both for individuals and firms.

Theory predicts that there is a tradeoff between employer-provided health insurance and wages. Therefore, most studies of health insurance effects on both labor force participation and job choice as well as on other labor issues are based on this assumption. However, the link has not been empirically affirmed. The focus of the recent empirical studies of the hypothesis tests has centered on problems with the data and the endogeneity of HI. However, even using "good" data and applying methods, such as panel data or instrumental variable (IV) models, to solve the endogeneity problem, researchers still cannot obtain strong results in support of the tradeoff hypothesis. Most find a positive relationship; a few obtain results that support the tradeoff hypothesis, but have certain shortcomings in their studies which weaken the results.

In addition to addressing these two difficulties, this paper also deals with two other possible shortcomings of the existing literature that may explain the inconsistent results. First, the basic linear regression model employed in most studies may be misspecified. Second, the effect of HI on wages may be heterogeneous.

In terms of model specification, this paper not only considers the direct effect of HI on wages, but also incorporates the indirect effect of HI into the analysis. HI may increase individual productivity, and therefore the returns to other determinants of productivity (such as education), by enhancing individual health or boosting working morale.
Therefore, HI enters the wage equation not only from the direct tradeoff point of view, but also from the productivity-enhancing point of view. The direct effect of HI can be reflected by the coefficient on HI in a wage equation. The indirect effect can be captured by the coefficients on the interactions between HI and individual or job characteristics in a wage equation. These two effects, the direct tradeoff effect and the indirect productivity-enhancing effect, work in opposite directions. The total effect of HI on wages depends on the magnitudes of these two effects. Since the coefficients on the interactions turn out to be significantly different from zero, and the interactions are highly correlated with other regressors, the previous literature, which omits the interaction terms, generates biased results.

¹ Employer-provided health insurance is denoted as HI in this paper. Therefore, HI in this paper does not refer to whether individuals have insurance or not, but refers to whether individuals have employer-provided health insurance or not. Similarly, NOHI refers to individuals without employer-provided health insurance. It will be specified if health insurance status is in use.

Another possible reason that the previous literature cannot find a significant tradeoff is the heterogeneity problem. If individual heterogeneity exists, the estimated tradeoff can be statistically insignificant when using regression analysis, which is based on the conditional mean, because the tradeoff will differ among individuals and may offset. To deal with individual heterogeneity, the tradeoff hypothesis is tested in this paper using a distributional analysis. The distributional analysis combined with tests of stochastic dominance (SD) differs from previous empirical regression analyses. It utilizes all available information to find and test for uniform rankings of distributions for different groups of people. Moreover, this approach allows one to easily assess potential heterogeneity in the magnitudes of the tradeoff across the distribution of wages.

Specifically, the distributional analysis and stochastic dominance tests are performed in two cases: when HI is exogenous and when it is endogenous. In the first case, individuals are classified according to their HI status (HI or NOHI). Distributions of the following outcomes are obtained for each sample: the total wage, wages explained by the constant and residuals, and wages explained by individual and job characteristics. In addition, the hypothetical wages of the NOHI sample using the HI regression coefficients are obtained. Then, the corresponding HI and NOHI distributions are compared by stochastic dominance tests. Wage differentials explained by the constant and residuals correspond to the direct effect in the typical regression analysis.
By introducing the hypothetical wage, the wage differential explained by characteristics is further decomposed into two gaps: the characteristic gap, which is solely due to differences in attributes, and the compensation gap, which is due to different wage returns for the same characteristics across the two groups. The compensation gap corresponds to the indirect effect of HI, and is interpreted as improved individual productivity through better health or higher morale.

Since most of the previous literature predicts that HI is endogenous in the wage equation, it is worthwhile to deal with this endogeneity and thus the selection bias problem. When HI is endogenous, following the same procedure as above, individuals are divided into two groups according to a binary instrument of HI. According to Abadie (2002), for the two groups divided by a binary IV, the difference between the cumulative distribution functions (CDF) is proportional to the difference between the CDFs of the compliers in each group under certain assumptions. Here, compliers refer to those individuals who comply with the IV, i.e., whose HI status changes according to the IV. Thus, we can test for stochastic dominance among the compliers by testing for dominance across the two groups divided by a binary IV. The instruments, borrowed from Olson (2002), are tested for their validity. The endogeneity of HI is also tested using the valid instruments.

The results in this paper, based on Current Population Survey (CPS) 2004 data on full-time working wives, not only suggest the existence of a tradeoff, but also suggest the existence of an indirect effect of HI on wages. This explains why previous studies fail to establish a tradeoff.

The paper is organized as follows: Part II is the literature review. Part III introduces the methods used in this paper. Part IV describes the data. Part V provides the results. Part VI concludes.

II. Literature Review

A. Theoretical Foundation of the Tradeoff Hypothesis

The theory of the tradeoff between wages and health insurance is that health insurance is a fringe benefit employers provide to employees as compensation. In a competitive product market, economic theory suggests that what matters to profit-maximizing firms is the value of the total compensation package that they must offer to attract labor services. If the compensation level is too low, there will not be enough labor attracted to the firm; if the compensation level is too high, the firm cannot survive in the market. Thus, firms' compensation level to employees will be similar to that offered by other firms which face the same labor pool.
As a result, if we assume that the other benefits offered by firms do not change, to remain competitive, firms will reduce wages by $1 for each dollar increase in health insurance benefits. Individuals will then choose among firms offering different wage/health insurance combinations according to their own preferences. Under this scenario, there exists a tradeoff between wages and health insurance. As a benchmark, the predicted tradeoff is \(\partial W / \partial HI = -1\). The actual tradeoff would differ from this in certain situations. For example, the tradeoff should be less than 1 in absolute value if health insurance reduces job turnover and job turnover is a cost to the firm.
As another example, since health insurance costs are not taxable income for firms, but a cost before tax, the actual tradeoff should also reflect the tax factor \((1 - t)\), where \(t\) is the tax rate.

The figures in Currie and Madrian (1999) can illustrate the above theory. In Figure 1 (A), firm 1 and firm 2 are two firms facing a competitive product market and the same labor pool. Employee 1 and Employee 2 need the same total compensation level, but have different preferences over wage compensation and health insurance compensation. We can see there is a tradeoff between wages and health insurance: if the wage level is high, health insurance will be low (Employee 2); on the other hand, a lower wage level is combined with a higher health insurance level (Employee 1). Thus, if all firms face the same tradeoff between wages and benefits in total compensation, the wage/health insurance bundles that are observed in the market will reflect the sorting of employees across firms on the basis of their heterogeneous preferences for health insurance. This framework is the motivation for much of the literature on the tradeoff between wages and employer-provided health insurance or other fringe benefits.

B. Empirical Tests of the Tradeoff Hypothesis

The empirical implementation of the wage-health insurance tradeoff pictured in Figure 1 (A) has typically been the estimation of

\[ y_i = \alpha + X_i \beta + HI_i \gamma + \mu_i \tag{1} \]

where \(y_i\) is the labor market outcome of interest (here, log hourly wages) for individual \(i\), \(X_i\) is a vector of individual and/or job characteristics for individual \(i\), \(HI_i\) is either the availability or value of health insurance coverage for individual \(i\), and \(\mu_i\) is the individual disturbance. Conditional on \(X\) and in the absence of tax considerations, the theory predicts \(\gamma = -1\) if \(y\) represents hourly wages and HI is appropriately measured in dollars.

The empirical validity of Equation (1) with respect to wages, however, has been difficult to establish. The typical estimates of \(\gamma\) are either positive or insignificant.
The literature has thus focused not on the magnitude of the wage-health insurance tradeoff, but on the reasons why economists cannot find evidence that there is one.
One main point that researchers focus on is the lack of a suitable dataset. In order to estimate Equation (1), data are required on both compensation and fringe benefit expenditures, that is, wage and health insurance levels. The firm-level datasets that include information on benefits expenditures are usually aggregated at the firm level, but typically do not include the types of human capital variables that might allow researchers to control for the productivity of the workforce. The problem created by these omitted variables is illustrated in Figure 1 (B). Employee 1 has higher abilities and thus earns more wages as well as fringe benefits, such as health insurance. Employee 2 has relatively low abilities and thus earns a lower wage and lower fringe benefits. Thus, if total compensation increases with average worker productivity and both benefits and other consumption goods are normal, a regression using such firm-level data will yield a positive relationship between wages and benefits.

Some researchers thus merge in average employer expenditures by industry from a firm-level dataset to individual-level datasets. Even so, such methods still usually lead to a positive relationship. For example, Leibowitz (1983) uses the RAND Health Insurance Study to estimate the wage/fringe benefit tradeoff, but still finds a positive (although insignificant) effect of employer health insurance expenditures on wages.

Researchers then deduce that the reason that they cannot find a tradeoff is that productivity is determined by both observed human capital variables and unobserved (to the econometrician) ability. This can be shown as

\[ y_i = \alpha + X_i \beta + HI_i \gamma + \delta_i + \tilde{\mu}_i \tag{2} \]

where \(\delta_i\) is some unobservable variable or fixed effect, \(\tilde{\mu}_i\) is the individual disturbance, and other variables are the same as in Equation (1). Thus, if \(\delta\) is omitted and is correlated with health insurance, the estimation of \(\gamma\) will be biased. This implies that even conditional on observed human capital variables, some firms employ higher ability workers and pay a higher level of total compensation.
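The omitted-variable problem in Equation (2) can be illustrated with a small simulation. This is a sketch and not part of the paper: the data-generating process, sample size, coefficient values and the helper name `ols` are all assumptions chosen for illustration. Wages are generated with a true tradeoff of \(\gamma = -1\), but unobserved ability \(\delta\) raises both wages and health insurance, so a regression that omits \(\delta\) recovers a positive coefficient on HI.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients of y on X (X must already contain a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)                  # observed characteristic (hypothetical)
delta = rng.normal(size=n)              # unobserved ability
hi = 1.0 * delta + rng.normal(size=n)   # HI value rises with ability (assumed)
gamma_true = -1.0                       # benchmark tradeoff
y = 1.0 + 0.5 * x + gamma_true * hi + 3.0 * delta + rng.normal(size=n)

const = np.ones(n)
short = ols(y, np.column_stack([const, x, hi]))         # omits delta
full = ols(y, np.column_stack([const, x, hi, delta]))   # infeasible in practice

gamma_short, gamma_long = short[2], full[2]
print(f"gamma when ability is omitted:    {gamma_short:.2f}")
print(f"gamma when ability is controlled: {gamma_long:.2f}")
```

In this assumed design the bias term is cov(HI, 3δ)/var(HI) = 1.5, which is large enough to flip the sign of the estimated tradeoff.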
As shown in Figure 1 (B), if this higher level of compensation is allocated to both wages and benefits, a positive relationship between wages and fringe benefits will be estimated despite using observable human capital controls.

Various approaches have been taken to circumvent this problem of omitted variable bias. One common method is the differencing method, i.e., purging the unobserved variable by differencing Equation (2), either across different years for the same person, or across different job classifications within a firm. Buchmueller and Lettau (1997) use an employer-level dataset that tracks compensation and benefit expenditures for various jobs within the firm over a 4-year period. They purge the unobserved productivity differences by differencing Equation (2) over time, essentially examining the impact of the growth in health insurance expenditures over time on changes in wages over time. Even so, they find no evidence of a tradeoff between health insurance and wages.

Olson (1992), Miller (1995) and Ryan (1997) use panel datasets of workers to estimate the effect of changes in health insurance coverage on changes in wages. However, the problem is that the majority of changes in health insurance coverage are generated by job changes, and the unobserved job characteristics that also impact compensation are unlikely to be constant following a job change. The study by Olson is less subject to this criticism as his sample of displaced workers is exogenously selected by the closing of a plant or similar event.

Gruber (1994) exploits a different source of variation, the changes of laws in many states in the 1970s, which required employers who offered health insurance to treat pregnancy and childbirth the same as any other health issue. He finds that wages for those groups most likely to benefit from the law fell in direct proportion to the anticipated cost of the benefit. Overall, his results are consistent with a full shifting of employer health insurance costs onto wages.

A recent development in the literature to deal with the endogeneity problem is the IV method. Olson (2002), using Current Population Survey (CPS) data, models the wages of married women employed full-time in the labor market. He uses husband's union status, husband's firm size, and husband's health insurance coverage through his job as instruments for wife's own employer-provided health insurance benefits. The estimates suggest that wives with own employer-provided health insurance accept a wage about 20% lower than what they would have received working in a job without benefits. However, Olson does not fully test the validity of the instruments, especially the orthogonality of the instruments with respect to wages.

III. Methodology

To test the tradeoff hypothesis, this paper not only considers the direct effect of HI on wages, but also incorporates the indirect effect of HI into the wage equation. In addition, this paper deals with the individual heterogeneity problem and analyzes the tradeoff in a distributional framework. Finally, to solve the endogeneity due to omitted variables, the distributional analysis relies on the instrumental variables framework introduced by Abadie (2002).
A. Productivity Effect of HI

If we consider health as one kind of human capital, health insurance can be treated as an investment in this human capital. Employees typically start with a large health endowment that must be continuously replenished as it depreciates. If employees have employer-provided health insurance, they will have guaranteed health care, and their health risk can be alleviated. Therefore, HI can improve employees' health, which can then raise their productivity. Although employees with NOHI may still have other types of health insurance, HI is usually more generous and covers higher health care spending. Moreover, employees' productivity can also be enhanced by higher morale due to better health care. Thus, HI enters the wage equation not only from the direct tradeoff point of view, but also from the productivity-enhancing point of view.

Figure 1 (C) illustrates this productivity effect of HI. When employee 1 obtains better health insurance from her employer, her productivity improves and therefore the firm's profit increases. On the graph, this higher profit can be reflected by an isoprofit curve, which is higher than the original isocost curve for the firm. In a competitive market, the firm will pay higher wages to employee 1 to reflect the increase in profit. Therefore, the actual wage of employee 1 is not W2 but W3. Movement from A to B reflects the direct tradeoff between HI and wages; movement from B to C reflects the indirect effect of HI on wages. These two effects work in opposite directions. The total effect of HI on wages is reflected by the movement from A to C, which involves only an insignificant change in wages, since the above two effects offset each other.

In the regression analysis, this indirect effect of HI can be captured by adding the interaction terms between HI and other characteristic variables into Equation (2):

\[ y_i = \alpha + X_i \beta + HI_i \gamma + (HI_i \cdot X_i)\,\theta + \mu_i \tag{3} \]

where the vector \(\theta\), the coefficients on the interaction terms, measures the different returns to the same characteristics for individuals with HI and with NOHI.
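A short simulation can make the offsetting of the two effects in Equation (3) concrete. This is an illustrative sketch under assumed values, not the paper's estimation: the single characteristic `x`, the parameter values and the exogenous HI dummy are all hypothetical. The direct effect is set to \(\gamma = -0.4\) and the interaction coefficient to \(\theta = 0.1\), so that at the mean of `x` (4.0) the two effects cancel.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(loc=4.0, scale=1.0, size=n)    # e.g. an education index (assumed)
hi = (rng.random(n) < 0.6).astype(float)      # HI dummy, exogenous in this sketch
alpha, beta, gamma, theta = 1.0, 0.08, -0.4, 0.1
y = alpha + beta * x + gamma * hi + theta * hi * x + rng.normal(scale=0.5, size=n)

const = np.ones(n)
X_short = np.column_stack([const, x, hi])           # Equation (1)-style model
X_full = np.column_stack([const, x, hi, hi * x])    # Equation (3)

b_short, *_ = np.linalg.lstsq(X_short, y, rcond=None)
b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

print(f"HI coefficient without interaction: {b_short[2]:.3f}")
print(f"HI coefficient with interaction:    {b_full[2]:.3f}")
print(f"interaction coefficient (theta):    {b_full[3]:.3f}")
```

In this design the regression without the interaction term reports a coefficient on HI close to zero, because it averages the negative direct effect and the positive productivity effect, while the interacted model separates them.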
As before, the coefficient on HI, \(\gamma\), captures the direct effect of HI. The previous regression analysis ignores the indirect effect and thus may result in biased estimation if \(\theta \neq 0\), since the interaction terms are correlated with HI.

B. Distributional Analysis Assuming Exogeneity of HI
To assess the degree of heterogeneity, this paper tests the tradeoff hypothesis using distributional analysis by comparing the wage distributions of wives with HI and wives with NOHI. This comparison between two distributions can be formalized using stochastic dominance (SD). Several tests for SD have been proposed in the literature; the approach herein is based on a generalized Kolmogorov-Smirnov test.

Suppose \(F_1\) and \(F_0\) are CDFs for two groups of individuals, group 1 and group 0. Then the following definitions apply:

\(F_1\) First Order Dominates (FSD) \(F_0\) if and only if \(F_1(y) \le F_0(y)\), \(\forall y \in Y\), with strict inequality for some \(y\);

\(F_1\) Second Order Dominates (SSD) \(F_0\) if and only if \(\int_{-\infty}^{y} F_1(t)\,dt \le \int_{-\infty}^{y} F_0(t)\,dt\), \(\forall y \in Y\), with strict inequality for some \(y\).

The tests characterize the relationship between the distributions. Therefore, if the NOHI wage CDF dominates the HI wage CDF, there exists a tradeoff between employer-provided health insurance and wages.

To test for dominance, define the empirical CDFs for \(Y\) with HI and with NOHI respectively as

\[ \hat{F}_{HI,N}(y) = \frac{1}{N} \sum_{i=1}^{N} I(Y_{HI,i} \le y) \tag{4} \]

\[ \hat{F}_{NOHI,M}(y) = \frac{1}{M} \sum_{j=1}^{M} I(Y_{NOHI,j} \le y) \tag{5} \]

where \(N\) (\(M\)) is the size of the sample with HI (NOHI). Now define the following functions of the joint distribution

\[ d = \sqrt{\frac{MN}{M+N}}\; \min \sup_{y}\, [F_{NOHI,M}(y) - F_{HI,N}(y)] \tag{6} \]

\[ s = \sqrt{\frac{MN}{M+N}}\; \min \sup_{y} \int_{-\infty}^{y} [F_{NOHI,M}(t) - F_{HI,N}(t)]\,dt \tag{7} \]

where the min is taken over \(F_{NOHI} - F_{HI}\) and \(F_{HI} - F_{NOHI}\) (in effect performing two tests). The tests for FSD and SSD are based on the empirical counterparts of \(d\) and \(s\) (\(\hat{d}\) and \(\hat{s}\)) using the empirical CDFs.

The test for FSD requires:

(i) computing the values of \(\hat{F}_{NOHI,M}(y_q)\) and \(\hat{F}_{HI,N}(y_q)\) for \(y_q\), \(q = 1, \ldots, Q\), where \(Q\) denotes the number of points in the support \(Y\) utilized (\(Q = 500\) in the application),

(ii) computing the differences \(d_{1q} = \hat{F}_{HI,N}(y_q) - \hat{F}_{NOHI,M}(y_q)\) and \(d_{2q} = \hat{F}_{NOHI,M}(y_q) - \hat{F}_{HI,N}(y_q)\),

(iii) finding \(\hat{d} = \min\{\max\{d_{1q}\}, \max\{d_{2q}\}\}\).

If \(\hat{d} \le 0\) (to a degree of statistical certainty), then the null hypothesis of first order dominance is not rejected. Furthermore, if \(\hat{d} \le 0\) and \(\max\{d_{1q}\} < 0\), then \(Y_{HI}\) FSD \(Y_{NOHI}\), as the value of the CDF for distribution \(Y_{NOHI}\) is at least as great as the corresponding value for distribution \(Y_{HI}\) at all \(y_q\), \(q = 1, \ldots, Q\). On the other hand, if \(\hat{d} \le 0\) and \(\max\{d_{2q}\} < 0\), then \(Y_{NOHI}\) FSD \(Y_{HI}\).

The analogous test for SSD requires:

(i) computing the values of \(\hat{F}_{NOHI,M}(y_q)\) and \(\hat{F}_{HI,N}(y_q)\) for \(y_q\), \(q = 1, \ldots, Q\), where \(Q\) denotes the number of points in the support \(Y\) utilized (\(Q = 500\) in the application),

(ii) computing the differences \(d_{1q} = \hat{F}_{HI,N}(y_q) - \hat{F}_{NOHI,M}(y_q)\) and \(d_{2q} = \hat{F}_{NOHI,M}(y_q) - \hat{F}_{HI,N}(y_q)\),

(iii) calculating the sums \(s_{1q} = \sum_{j=1}^{q} d_{1j}\) and \(s_{2q} = \sum_{j=1}^{q} d_{2j}\), \(q = 1, \ldots, Q\),

(iv) finding \(\hat{s} = \min\{\max\{s_{1q}\}, \max\{s_{2q}\}\}\).

If \(\hat{s} \le 0\) (to a degree of statistical certainty), then the null hypothesis of second order dominance is not rejected. Moreover, if \(\hat{s} \le 0\) and \(\max\{s_{1q}\} < 0\), then \(Y_{HI}\) SSD \(Y_{NOHI}\), as the cumulative value of the CDF for distribution \(Y_{NOHI}\) exceeds the corresponding value for distribution \(Y_{HI}\) at all \(y_q\), \(q = 1, \ldots, Q\); otherwise, if \(\max\{s_{2q}\} < 0\), then \(Y_{NOHI}\) SSD \(Y_{HI}\).

This paper approximates the empirical distribution of the test statistics using bootstrap techniques. For each of 1000 bootstrap samples, \(\hat{d}\) and \(\hat{s}\) are computed. Thus, whether the empirical distributions are characterized by FSD or by SSD can be reported. The bootstrap reported probabilities represent the critical levels associated with the non-rejection region.
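The grid-based steps above can be sketched in a few lines of NumPy. This is an illustration rather than the author's code: the function names, the evenly spaced grid over the pooled support, and the bootstrap scheme (resampling each group separately with replacement and reporting the share of replications with a non-positive statistic) are assumptions, and the demonstration data are simulated.

```python
import numpy as np

def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at every point of `grid`."""
    ordered = np.sort(sample)
    return np.searchsorted(ordered, grid, side="right") / ordered.size

def sd_statistics(y_hi, y_nohi, n_points=500):
    """Return (d_hat, s_hat, max_d1, max_d2, max_s1, max_s2) on a common grid."""
    lo = min(y_hi.min(), y_nohi.min())
    hi = max(y_hi.max(), y_nohi.max())
    grid = np.linspace(lo, hi, n_points)           # step (i): Q grid points
    d1 = ecdf(y_hi, grid) - ecdf(y_nohi, grid)     # step (ii): F_HI - F_NOHI
    d2 = -d1                                       #            F_NOHI - F_HI
    s1, s2 = np.cumsum(d1), np.cumsum(d2)          # running sums for SSD
    n, m = y_hi.size, y_nohi.size
    scale = np.sqrt(n * m / (n + m))               # scaling factor in (6) and (7)
    d_hat = scale * min(d1.max(), d2.max())
    s_hat = scale * min(s1.max(), s2.max())
    return d_hat, s_hat, d1.max(), d2.max(), s1.max(), s2.max()

def bootstrap_probabilities(y_hi, y_nohi, n_boot=1000, seed=0):
    """Share of bootstrap replications with d* <= 0 and with s* <= 0."""
    rng = np.random.default_rng(seed)
    d_star = np.empty(n_boot)
    s_star = np.empty(n_boot)
    for b in range(n_boot):
        hi_b = rng.choice(y_hi, size=y_hi.size, replace=True)
        nohi_b = rng.choice(y_nohi, size=y_nohi.size, replace=True)
        d_star[b], s_star[b], *_ = sd_statistics(hi_b, nohi_b)
    return float(np.mean(d_star <= 0)), float(np.mean(s_star <= 0))

# Simulated log wages (hypothetical): the NOHI group is shifted to the right.
rng = np.random.default_rng(42)
y_hi = rng.normal(2.6, 0.5, size=400)
y_nohi = rng.normal(2.9, 0.5, size=300)
d_hat, s_hat, *_ = sd_statistics(y_hi, y_nohi)
p_fsd, p_ssd = bootstrap_probabilities(y_hi, y_nohi, n_boot=200)
print(f"d_hat={d_hat:.3f}, s_hat={s_hat:.3f}, Pr(d*<=0)={p_fsd:.2f}, Pr(s*<=0)={p_ssd:.2f}")
```

Because the grid spans the pooled support, both empirical CDFs equal one at the last grid point, so the sign conditions are the informative part of the output; the scaling factor does not affect them.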
To this point, \(Y_{HI}\) and \(Y_{NOHI}\) have represented two unconditional variables. However, the magnitudes of wages are decided not only by whether the employee has HI or NOHI, but also by other individual and job characteristics. Therefore, we need to separate the HI effect on wages from the effects of other variables to correctly measure the tradeoff. From Equation (3), the predicted wages for the HI and NOHI groups are, respectively,

\[ \hat{y}_{k,i} = \hat{\alpha}_k + X_{k,i}\hat{\beta}_k + \hat{\varepsilon}_{k,i}, \quad k = HI, NOHI \tag{8} \]

The partial wage denotes the wage part explained by the estimated intercept and residuals for HI and NOHI,

\[ y^{d}_{k,i} = \hat{\alpha}_k + \hat{\varepsilon}_{k,i}, \quad k = HI, NOHI \tag{9} \]

The CDFs of this partial wage for the HI and NOHI groups, denoted as \(F^{d}_{NOHI}\) and \(F^{d}_{HI}\), can be compared by dominance tests. The difference between the two distributions corresponds to the direct effect in the regression analysis, and can be interpreted as the tradeoff between HI and wages if HI is exogenous.

The part left in wage equation (8) stands for wages explained by characteristic variables, which is

\[ y^{ch}_{k,i} = X_{k,i}\hat{\beta}_k, \quad k = HI, NOHI \tag{10} \]

The corresponding CDFs, denoted as \(F^{ch}_{NOHI}\) and \(F^{ch}_{HI}\), can also be compared by dominance tests. However, the difference between the two distributions reflects differences in \(X\) and differences in returns. We can decompose this part using a method similar to the standard Blinder-Oaxaca decomposition. For a regression, the standard Blinder-Oaxaca decomposition is in the form of the average value. Here, the author extends the standard Oaxaca decomposition to the distributional analysis. A hypothetical wage CDF for individuals with NOHI can be calculated using the estimated coefficients from the wage equation of individuals with HI, i.e., \(F^{h}(X_{NOHI}\hat{\beta}_{HI})\) (shortened as \(F^{h}_{NOHI/HI}\)). With this, a decomposition similar to the standard Oaxaca-Blinder decomposition can be performed,

\[ F^{ch}_{HI} - F^{ch}_{NOHI} = [F^{ch}_{HI} - F^{h}_{NOHI/HI}] + [F^{h}_{NOHI/HI} - F^{ch}_{NOHI}] \tag{11} \]

The first part of the expression on the right hand side is the difference in the earnings distributions explained by different characteristics of individuals with HI and NOHI. It can be called the characteristics gap. The second part is the difference due to the estimated parameters or wage structure. It is labeled the unexplained part of the wage difference. In this paper, we call it the compensation gap, the wage return differences between HI and NOHI individuals with the same characteristics. It reflects the indirect effect of HI on wages.

In general, FSD and SSD are tested for 5 pairs of CDFs: the original data, the partial wages, the predicted wage explained by the characteristic variables, and the hypothetical wages. This paper also presents graphs of the horizontal difference between any two CDFs in consideration. Specifically, at every quantile, the wage difference between two CDFs can be calculated and plotted against the cumulative probabilities. By this, the pattern of the difference across the entire wage distributions can be analyzed.

C. Distributional Analysis When HI Is Endogenous

Since most of the previous literature predicts that HI is endogenous in the wage equation, and since the wage equation may not be correctly specified, it is worthwhile to test the endogeneity of HI. The reason is that if we divide the data according to an endogenous variable, for example, HI, regression and distributional analysis will suffer from selection bias. However, if we can find instruments for HI, the tradeoff between HI and wages can be put into the instrumental variables environment developed by Abadie (2002).

Let \(Y_i(0)\) be the potential outcome for individual \(i\) without treatment, or with NOHI. Let \(Y_i(1)\) be the potential outcome for the same individual with HI. Let \(Z_i\) be a binary instrument for HI. Denote by \(HI_i(0)\) the value that \(HI_i\) would have taken if \(Z_i = 0\); \(HI_i(1)\) has the same meaning for \(Z_i = 1\). In practice, for any particular individual we cannot observe both \(HI_i(0)\) and \(HI_i(1)\). Instead the realized treatment \(HI_i = HI_i(1) Z_i + HI_i(0)(1 - Z_i)\) is observed. Similarly, only \(Y_i = Y_i(1) HI_i + Y_i(0)(1 - HI_i)\) is observed.
Under the assumptions of:

(1) independence of the instrument: \((Y(0), Y(1), HI(0), HI(1))\) is independent of \(Z\);

(2) first stage: \(0 < P(Z = 1) < 1\) and \(P(HI(1) = 1) > P(HI(0) = 1)\);

(3) monotonicity: \(P(HI(1) \ge HI(0)) = 1\),

Abadie has the following lemma for individuals whose treatment status is affected by variation in the instrument, \(HI(0) = 0\) and \(HI(1) = 1\) (the subpopulation of compliers):

\[ F^{C}_{1}(y) - F^{C}_{0}(y) = \frac{E[I(Y \le y) \mid Z = 1] - E[I(Y \le y) \mid Z = 0]}{E[HI \mid Z = 1] - E[HI \mid Z = 0]} = K\,(F_1(y) - F_0(y)), \tag{12} \]

where \(F^{C}_{1}(y) = E[I(Y(1) \le y) \mid HI(1) = 1, HI(0) = 0]\) and \(F^{C}_{0}(y) = E[I(Y(0) \le y) \mid HI(1) = 1, HI(0) = 0]\) denote the distributions of compliers; \(F_1(y) = E[I(Y \le y) \mid Z = 1]\) and \(F_0(y) = E[I(Y \le y) \mid Z = 0]\) are the conditional distributions given \(Z = 1\) and \(Z = 0\); and \(K = (E[HI \mid Z = 1] - E[HI \mid Z = 0])^{-1} < \infty\). Under assumption (2), we know that \(K > 0\).

This lemma states that for a binary instrument, the difference between the two distributions of compliers is proportional to the difference between the two distributions categorized by the instrument. So, if we use the instrument to solve the endogeneity problem and divide the individuals into two groups according to the binary instrument, comparing \(F_1\) and \(F_0\) provides the sign of the difference between \(F^{C}_{1}\) and \(F^{C}_{0}\). This is different from the comparison of the original distributions, \(F_{HI}\) and \(F_{NOHI}\). Moreover, since the IV results apply only to the distribution of compliers, different instruments will generate different results.

The instruments used in this paper are borrowed from Olson (2002): husband's firm size, husband's health insurance, and husband's union status. Olson (2002) qualitatively analyzes the economic validity of these three instruments, but does not statistically test their validity. For a variable to be the right instrument, it must be correlated with the endogenous variable, and orthogonal to the error process.

The relevance of the instrumental variable can be tested by examining the first stage regression. The test statistics relate to the explanatory power of the instrument in the regression. A statistic commonly used is the R-squared of the first stage regression with the included exogenous variables partialled out. Alternatively, this may be expressed as the F-test of the joint significance of all the instruments in the first stage regression.

The orthogonality of the instruments can be tested using two stage least squares (2SLS) or generalized method of moments (GMM) techniques.
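The two relevance statistics described above, the partial R-squared and the joint F-test of the instruments in the first stage, can be computed from restricted and unrestricted residual sums of squares. The sketch below is illustrative only: the function names and the simulated data are assumptions. The 0.68 share for the first instrument echoes the husbands' HI share reported in the data section, while every other probability and coefficient is hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def first_stage_relevance(hi, X_exog, Z):
    """Return (F statistic, partial R-squared) for the excluded instruments Z."""
    n = hi.size
    X_full = np.column_stack([X_exog, Z])
    rss_r = rss(hi, X_exog)     # restricted: instruments left out
    rss_u = rss(hi, X_full)     # unrestricted: instruments included
    q = Z.shape[1]              # number of instruments being tested
    k = X_full.shape[1]         # regressors in the unrestricted first stage
    f_stat = ((rss_r - rss_u) / q) / (rss_u / (n - k))
    partial_r2 = (rss_r - rss_u) / rss_r
    return f_stat, partial_r2

# Simulated couples (hypothetical data-generating process)
rng = np.random.default_rng(3)
n = 1287
age = rng.normal(40, 8, size=n)
X_exog = np.column_stack([np.ones(n), age])
# Stand-ins for husband's HI, husband's firm size, husband's union status
Z = (rng.random((n, 3)) < np.array([0.68, 0.50, 0.15])).astype(float)
p_wife = 0.80 - 0.40 * Z[:, 0] + 0.05 * Z[:, 1] + 0.05 * Z[:, 2]
wife_hi = (rng.random(n) < p_wife).astype(float)

f_stat, partial_r2 = first_stage_relevance(wife_hi, X_exog, Z)
print(f"first-stage F = {f_stat:.1f}, partial R-squared = {partial_r2:.3f}")
```

The first stage here is a linear probability model, chosen only to keep the sketch short; it is not a claim about the specification used in the paper.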
If the disturbance is homoskedastic, the GMM estimator is equivalent to the 2SLS estimator. If it is heteroskedastic, the 2SLS estimator is inefficient but consistent, whereas the standard estimated covariance matrix is inconsistent. This paper tests the heteroskedasticity of the disturbance in the wife's wage equation using the White/Koenker \(nR^2\) statistic and the Pagan-Hall general statistic.

In the context of GMM, the orthogonality of the instruments may be tested via the commonly employed Hansen's J statistic. This statistic is the value of the GMM objective function, evaluated at the efficient GMM estimator. The J statistic is distributed as \(\chi^2\) with degrees of freedom equal to the number of overidentifying restrictions. A rejection of the null hypothesis implies that the instruments do not satisfy the orthogonality conditions required for their employment. This may be either because they are not truly exogenous, or because they are being incorrectly excluded from the regression.

Since the model is overidentified, testing a subset of the overidentifying restrictions is possible. In this context, the C test allows us to test a subset of the original set of orthogonality conditions. The statistic is computed as the difference between two J statistics: that for the (restricted, fully efficient) regression using the entire set of overidentifying restrictions, versus that for the (unrestricted, inefficient but consistent) regression using a smaller set of restrictions, in which a subset of instruments is removed from the set. The C statistic, distributed \(\chi^2\) with degrees of freedom equal to the loss of overidentifying restrictions, has the null hypothesis that the specified variables are proper instruments.

Using the valid instruments, this paper then tests whether the wife's HI is exogenous in the wage equation. Both the Durbin-Wu-Hausman (DWH) and the Wu-Hausman tests are applied. For the Durbin-Wu-Hausman test, a quadratic form in the differences between the two coefficient vectors (the OLS estimator, which is fully efficient under the null but inconsistent if the null is not true, and the IV estimator, which is consistent under both the null and the alternative hypotheses), scaled by the precision matrix, gives rise to a test statistic for the null hypothesis that the OLS estimator is consistent and fully efficient. The test statistic is distributed as \(\chi^2\) with degrees of freedom equal to the number of regressors being tested for endogeneity, which equals 1 in this paper. The asymptotically equivalent Wu-Hausman test is an F-test of the significance of the first stage residuals in the auxiliary second stage regression of 2SLS.
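The regression form of the Wu-Hausman test can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the simulated design and the parameter `rho` (which controls how strongly HI is tied to the wage error) are assumptions. The first-stage residuals are added to the wage equation, and the F statistic tests whether their coefficient is zero.

```python
import numpy as np

def residuals(y, X):
    """OLS residuals of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def wu_hausman_f(y, X_exog, hi, Z):
    """F statistic for the first-stage residuals in the augmented wage equation."""
    n = y.size
    v_hat = residuals(hi, np.column_stack([X_exog, Z]))   # first-stage residuals
    X_r = np.column_stack([X_exog, hi])                   # wage equation
    X_u = np.column_stack([X_exog, hi, v_hat])            # augmented with v_hat
    e_r, e_u = residuals(y, X_r), residuals(y, X_u)
    rss_r, rss_u = float(e_r @ e_r), float(e_u @ e_u)
    return (rss_r - rss_u) / (rss_u / (n - X_u.shape[1]))

def simulate(rho, n=5000, seed=0):
    """Hypothetical wage data; rho = 0 makes HI exogenous."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = (rng.random(n) < 0.5).astype(float)     # binary instrument
    v = rng.normal(size=n)                      # unobservable that drives HI
    hi = (0.8 * z + v > 0.4).astype(float)
    u = rho * v + rng.normal(size=n)            # wage equation error
    y = 1.0 + 0.5 * x - 0.2 * hi + u
    X_exog = np.column_stack([np.ones(n), x])
    return y, X_exog, hi, z.reshape(-1, 1)

f_endog = wu_hausman_f(*simulate(rho=0.8))
f_exog = wu_hausman_f(*simulate(rho=0.0))
print(f"Wu-Hausman F, endogenous HI: {f_endog:.1f}")
print(f"Wu-Hausman F, exogenous HI:  {f_exog:.1f}")
```

With one regressor tested for endogeneity, the statistic is compared with an F distribution with 1 numerator degree of freedom; a large value rejects the null that HI is exogenous.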
One advantage of the Wu-Hausman F-statistic over the Durbin-Wu-Hausman test is that with certain normality assumptions, it is a finite sample test exactly distributed as F.

IV. Data

The data used in this paper come from the 2004 Current Population Survey (CPS) March Supplement dataset, and the 2004 CPS January Basic dataset.
The CPS is a monthly survey of a probability sample of housing units. The Annual Demographic Survey, or March CPS Supplement, is the primary source of detailed information on income and work experience in the United States. The labor force and work experience data from this survey are used to profile the U.S. labor market and to make employment projections. More importantly, the March CPS Supplement provides rich data on individual health insurance. The CPS has a variable HI, which indicates whether or not the individual is covered by a health insurance plan provided through a current or former employer or union. However, the 2004 March Supplement does not include data on tenure, i.e., how long the worker has been in the current job; only the 2004 January Basic dataset has this information. So, this paper first merges these two datasets. This step drops many observations, due to the structure of the CPS. 2 Then, married individuals are chosen to match the husband and wife. From this couple dataset, the observations are further restricted to the following: individuals aged 25-60, employed full time, in the private sector, whose main earnings are wages and salary. Observations with incomplete information 3 are dropped, as well as individuals with hourly wages less than two dollars. After these restrictions, the sample size is 1287 couples.

2 The CPS is a monthly survey of a probability sample of housing units. It does not, however, survey a completely new set of housing units each month. Rather, the sample is divided into eight representative subsamples called rotation groups, with housing units in each rotation group being interviewed for four consecutive months, followed by an 8-month break, and then by another four months of interviews. Thus, CPS sample housing units are each eligible for 8 different monthly interviews, and rotation groups are referred to in CPS parlance by their month in sample, or MIS. In any given monthly sample, approximately one-eighth of sample units will be interviewed for the first time (MIS=1), one-eighth for the second time, and so on. One-eighth of the sample will be leaving the sample permanently (MIS=8), and one-eighth of the sample will be leaving for the next eight months before being reinterviewed (MIS=4). These latter two rotation groups, MIS=8 and MIS=4, are referred to as the outgoing rotation groups. So, 75% of the CPS sample is common from month to month (any two consecutive months), while 50% of the CPS sample is common from one year to the next for the same month. However, because of nonresponse, mortality, migration and recording errors, there may still be some errors after we match the two surveys using the household number and individual line number (which identifies the individual in the household). Madrian and Lefgren (1999) test different matching methods and provide a better matching strategy. First, match the two datasets using the household identifier (H_IDNUM), the individual line number within the household (A_LINENO), and the household number (H_HHNUM), which equals 1 in the initial interview and increases by 1 if the household is replaced by another in the next interview. After matching on H_IDNUM, A_LINENO and H_HHNUM, impose additional merge criteria on gender, race and age. If gender and race differ in the two surveys for the same individual, or if the difference in age between time t+1 and time t is less than 1 or greater than 3, we delete these observations.

3 These are mainly individuals without health insurance status information, without tenure information, or without union membership information.

The following is a brief introduction to the variables used in this paper. The logarithm of the current main job hourly wage is the dependent variable. This paper chooses current main job earnings instead of total earnings because many part-time individuals currently work several jobs, but insurance coverage mainly comes from the main job. The CPS only provides the total wages and the earnings from the other jobs. So, subtracting the latter from the former yields the wages from the main job. The CPS also provides the hours worked per week for the main job and the weeks usually worked per year. Thus, we can obtain the current main job hourly wage, and use its logarithm as the dependent variable.

HI status is the target variable: HI = 1 if the individual has HI and 0 if not. The label HI means that the individual has employer-provided health insurance, and NOHI means that the individual does not.

The individual characteristics are race 4, age, age squared, education, geographical location, number of kids under 18, and husband's yearly earnings. All of these variables are categorical except age and husband's yearly earnings, which are continuous. Race has three categories: white, black, and other. Education is categorized into 7 groups: no diploma, high school diploma, some college, bachelor's, master's, professional school (including MD), and doctoral degree. The CPS classifies the states into four regions: Northeast, Midwest, South and West. The number of kids under 18 is classified into two categories: no kids under age 18 and some kids under age 18.

The job characteristics are major industry 5, major occupation, tenure, tenure squared, current job firm size, and union membership. The CPS has 14 major industry codes and our dataset covers 13. The CPS has 11 major occupation codes, and ours includes 10; the excluded one is armed forces. Tenure information comes from the January Basic dataset, and is calculated as the number of years the individual has been employed in the current job. Firm size equals one if the number of employees exceeds 100, and zero otherwise. Lastly, union membership equals one if the individual is a union member or the current job is covered by a union, and zero otherwise.

Husband's HI, husband's firm size, and husband's union membership are potential instruments for wife's HI status and are defined similarly to the wife's variables.

Table 1 provides summary statistics. From the table we can see that wife's log hourly wage ranges from 0.80 to 5.93, the mean is . Furthermore, it shows that the percentage of wives having HI is relatively low (0.59) compared to the percentage of husbands (0.68).
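The construction of the dependent variable described above can be sketched in Python. This is a minimal illustration rather than the author's code: the function name, the argument names, and the assumption that both earnings figures are annual totals are hypothetical. Only the subtraction of other-job earnings, the division by hours and weeks, the two-dollar cutoff, and the logarithm come from the text.

```python
import math


def log_main_job_hourly_wage(total_wages, other_job_earnings,
                             hours_per_week, weeks_per_year):
    """Return the log hourly wage of the main job, or None if the
    observation would be dropped.

    Assumption (not stated in the paper): both earnings figures are
    annual totals, so annual hours = hours_per_week * weeks_per_year.
    """
    # Main-job wages = total wages minus earnings from the other jobs.
    main_job_wages = total_wages - other_job_earnings
    annual_hours = hours_per_week * weeks_per_year
    if main_job_wages <= 0 or annual_hours <= 0:
        return None  # incomplete or unusable information
    hourly_wage = main_job_wages / annual_hours
    # The paper drops individuals with hourly wages below two dollars.
    if hourly_wage < 2.0:
        return None
    return math.log(hourly_wage)


# Hypothetical worker: 42,000 total, 2,000 from other jobs, 40 h x 50 weeks.
example = log_main_job_hourly_wage(42000.0, 2000.0, 40.0, 50.0)
```

Returning `None` for dropped observations keeps the sample restriction and the wage construction in one place; a real pipeline would more likely filter rows in a data frame.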
Table 2 provides the correlations between wife's HI status and its instruments: husband's HI, husband's firm size and husband's union membership. This gives us a rough view of the relationships. From the table we have the following findings. First, all three instruments are negatively correlated with wife's HI. This confirms that these three variables are potential instruments. Second, husband's HI is strongly correlated with wife's HI; husband's firm size and union membership are less correlated. This is reasonable since the latter two variables are only indirectly correlated with wife's HI: they are correlated with wife's HI through their correlation with husband's HI.

4 Race turns out to be insignificant in the wage equation, so I run two sets of regressions: with and without the race variables. Since the results are similar, I only provide the results without this variable.

5 Industry also turns out to be insignificant in the wage equation, so I run two sets of regressions: with and without the major industry code. Since the results are similar, I only provide the results without this variable.

V. Results

A. Regression Results

To begin, tests for heteroskedasticity are provided, using the following regressors: HI, age, age squared, education, geographical location, number of kids under 18, major occupation, tenure, tenure squared and husband's yearly earnings. 6 The White/Koenker nR² test statistic is equal to ( p = ). The Pagan-Hall general test statistic 7 is equal to ( p = ). Therefore, the tests reject the null hypothesis that the disturbance is homoskedastic. Thus, in the following, only the GMM results are presented. 8

The instrument relevance and orthogonality test results are shown in Table 3. The results are listed for different combinations of the three instruments: husband's firm size (Fsize), husband's health insurance status (HHI), and husband's union membership (Union). In general, unlike Olson (2002), this paper finds that husband's firm size and husband's HI status are valid instruments, but husband's union membership is not.

Specifically, for the relevance test, the second column of Table 3 is the partial R-squared. The third column reports the F-statistic; the p-values are in brackets. We can see that the partial R-squared is between 0.01 and . The most relevant instrument is husband's health insurance, for which the partial R-squared is 0.07, followed by husband's firm size, whose partial R-squared is 0.02, and husband's union status, whose partial R-squared is . The F-statistics are large, and all the p-values equal . These results are consistent with our previous findings in Table 2.

For the orthogonality tests, Table 3 reports Hansen's J statistics for the GMM model. The p-values are in brackets. We can see that the Fsize and HHI combination has the lowest J, which is 0.60 ( p = ).
Thus, we do not reject the hypothesis that Fsize and HHI are orthogonal to the disturbance and that they are correctly excluded from the regression equation. In contrast, models with instruments combined with Union have very high J statistics; the p-values are below 0.1. This suggests that the hypothesis that Union is orthogonal to the disturbance should be rejected.

6 Industry and race are not statistically significant, so they are dropped from the regression.

7 The instruments used for this statistic are husband's firm size and husband's health insurance status, which the subsequent test results show to be valid instruments.

8 The author also tests instrument validity for the 2SLS model, and finds that the results do not differ significantly.

Table 3 also reports the C-statistic 9 of the sub-instrument tests given the valid instruments. However, since the three instruments are highly correlated, we only have the C-statistics for Fsize, HHI and Union given all three instruments. We can see that Fsize has the lowest C, followed by HHI and Union. Specifically, the C-statistic for Fsize is 0.41 ( p = ); for HHI the C-statistic is 1.31 ( p = ); for Union it is 5.57 ( p = ). Therefore, the C-statistics also suggest that Fsize and HHI pass the orthogonality test, but Union does not.

Moreover, Table 3 lists the HI coefficients of the IV regressions. The coefficients are negative, which indicates a tradeoff between employer-provided health insurance and wages. However, they are not statistically significant, except when using Union as an instrument.

After testing the validity of the instruments, Table 4 reports the endogeneity test results using the valid instruments. The Wu-Hausman test statistics for Fsize, HHI and both IVs are 2.19, 1.69 and 2.57, respectively; the p-values are 0.14, 0.19 and 0.11, respectively. The Durbin-Wu-Hausman tests generate similar results. Thus, we can neither strongly reject nor accept the null hypothesis that the HI variable is exogenous, since the related p-values are all between 0.10 and 0.20, the margin between rejection and acceptance. 10 So, in the following, results assuming exogeneity and results assuming endogeneity are both provided.

Finally, OLS and IV regression results are provided in Table 5, comparing the regressions with and without the interactions. For the regressions without interactions, column 2 of Table 5 is the HI coefficient of the OLS regression, with the p-value in brackets. Column 3 reports the HI coefficient of the GMM model, with the p-value in brackets.
For the regressions with interaction terms, columns 4 and 6 report the OLS and the GMM 11 coefficients, respectively, with p-values in brackets; columns 5 and 7 are the F-statistics of all the interaction terms. In the regressions with interactions, not only is HI instrumented, but also all the interactions, since they are also endogenous. Rows in Table 5 are for the OLS model, the model with Fsize as the instrument, and the model with HHI as the instrument, respectively.

9 The C-statistic for a single instrument is 0, since there are no overidentifying restrictions for one instrument of one endogenous variable.

10 The author also runs the Hausman test, plus Durbin's flavor and Wu's flavor of the Hausman test. However, for these Hausman tests, the null hypothesis is that the differences between the coefficients of the IV and OLS regressions are not systematic. So, strictly speaking, these are not endogeneity tests, but rather tests of the difference between two regression methods: OLS vs. IV. So, although the tests suggest that the differences are not systematic (all the p-values are 1), we still cannot say that HI is exogenous.

11 The author also estimates 2SLS models with linear, probit and logit first stages, respectively, but they do not generate fundamentally different results.

From the table, we have the following findings. First, the HI coefficients differ greatly depending on whether or not we include the indirect effect of HI. For example, for the OLS regressions, the coefficient on HI excluding the interactions is 0.09, while with the interactions, it is . Though it is not statistically significant, the sign has changed from positive to negative. For the IV regressions, the HI coefficient is much bigger in absolute value for the model with interactions than for the model without. For example, the HI coefficients in the IV regression with Fsize as the instrument are and respectively for the non-interaction and interaction regressions. They are and for HHI as the instrument.

Second, for the regressions including the indirect effect of HI, the coefficients on the interactions are statistically different from zero for some models. For example, in the IV model with HHI as the instrument, the coefficients on the interaction terms are statistically different from zero. Therefore, HI affects wages not only through the direct tradeoff, but also by changing the returns to individual characteristics.

Third, although the coefficients on HI have the expected sign, they are not statistically significant. Only the HI coefficient of the OLS regression without interaction terms is statistically significant at the 10% level. All the other coefficients are not statistically significant. These results lead us to the distributional analysis.

B. Distributional Analysis Results

This part reports the distributional analysis results when HI is exogenous and when it is endogenous. Figures 2-4 provide the CDF comparisons graphically. Figure 2 shows the CDF comparisons assuming that HI is exogenous in wife's wage equation, so that individuals are divided into HI and NOHI groups by the HI variable. Figures 3 and 4 show the CDF comparisons when wife's HI is endogenous in her wage equation. Here, individuals are divided into HI and NOHI groups by binary instruments.
Figure 3 shows the results when using husband's firm size as the instrument. Figure 4 shows the results when using husband's health insurance as the instrument. For each figure, graphs in the first column are CDFs; graphs in the second column are wage differentials of the CDFs at different quantiles; graphs in the third column are accumulated CDFs. The graphs in the first row are for total wages; graphs in the second row are for the partial wages, which reflect the direct effect of HI; and graphs in the third row are for the predicted wages explained by individual characteristics ($X\hat{\beta}$), as well as the hypothetical wages for individuals with NOHI using the estimated coefficients for individuals with HI.

Corresponding to these graphs, Table 6 also provides the quantitative levels of these CDFs and differentials at the mean and at quantiles 10, 25, 50, 75 and 90. The three rows are results for the following models, respectively: without instrument, husband's firm size as the instrument, and husband's health insurance status as the instrument. The three columns are results for total wages and wage differentials; wages explained by the constant and residuals, along with wage differentials; and wages explained by observable characteristics ($X\hat{\beta}$), along with the Oaxaca decomposition results.

Table 7 provides the stochastic dominance test results corresponding to the graphs in Figures 2-4. The three rows are results for the following models, respectively: without instruments, husband's firm size as the instrument, and husband's health insurance status as the instrument. For each model, it also provides in detail the stochastic dominance test results for the total wage, for the partial wage, and for the wage explained by the characteristic variables, which is subcategorized into the $X\hat{\beta}$ gap, the $X$ gap and the $\hat{\beta}$ gap. The $X\hat{\beta}$ gap stands for the total gap explained by the characteristic variables; the $X$ gap is the wage gap solely due to the characteristic differences; the $\hat{\beta}$ gap is the gap due to the compensation difference, the indirect effect of HI.

FSD and SSD test results are based on 1000 bootstrap repetitions. The statistics $d_{1,MAX}$, $d_{2,MAX}$, $d$, $\Pr(d_1^* \le 0)$, $\Pr(d_2^* \le 0)$ and $\Pr(d^* \le 0)$ are provided for the FSD test; the statistics $s_{1,MAX}$, $s_{2,MAX}$, $s$, $\Pr(s_1^* \le 0)$, $\Pr(s_2^* \le 0)$ and $\Pr(s^* \le 0)$ are provided for the SSD test. The subscript 1 stands for HI minus NOHI; the subscript 2 stands for NOHI minus HI. The superscript $*$ stands for the results for each bootstrap. Furthermore, the larger $\Pr(d^* \le 0)$ is, the more likely it is that there is FSD; and the larger $\Pr(s^* \le 0)$ is, the more likely it is that there is SSD.
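The FSD and SSD statistics described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the function names, the evenly spaced evaluation grid, the Riemann-sum integration, and the omission of any sample-size scaling factor are all hypothetical choices. The paper specifies only the max-difference statistics, the HI-minus-NOHI and NOHI-minus-HI subscripts, and the bootstrap shares.

```python
import random


def ecdf_on_grid(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of an ascending grid."""
    ordered = sorted(sample)
    n = len(ordered)
    values, j = [], 0
    for g in grid:
        while j < n and ordered[j] <= g:
            j += 1
        values.append(j / n)
    return values


def dominance_stats(hi, nohi, grid_size=200):
    """Max-difference statistics; subscript 1 = HI minus NOHI, 2 = NOHI minus HI."""
    lo, top = min(hi + nohi), max(hi + nohi)
    step = (top - lo) / (grid_size - 1)
    # End the grid exactly at the pooled maximum so both CDFs reach 1 there.
    grid = [lo + k * step for k in range(grid_size - 1)] + [top]
    diff = [a - b for a, b in zip(ecdf_on_grid(hi, grid), ecdf_on_grid(nohi, grid))]
    integ, running = [], 0.0
    for x in diff:  # running Riemann sum approximates the integrated CDF difference
        running += x * step
        integ.append(running)
    d1, d2 = max(diff), max(-x for x in diff)
    s1, s2 = max(integ), max(-x for x in integ)
    return {"d1": d1, "d2": d2, "d": min(d1, d2),
            "s1": s1, "s2": s2, "s": min(s1, s2)}


def bootstrap_dominance(hi, nohi, reps=1000, seed=0):
    """Share of bootstrap replications in which each statistic is <= 0."""
    rng = random.Random(seed)
    counts = {key: 0 for key in ("d1", "d2", "d", "s1", "s2", "s")}
    for _ in range(reps):
        hi_star = [rng.choice(hi) for _ in hi]        # resample with replacement
        nohi_star = [rng.choice(nohi) for _ in nohi]
        stats = dominance_stats(hi_star, nohi_star)
        for key in counts:
            if stats[key] <= 0:
                counts[key] += 1
    return {key: c / reps for key, c in counts.items()}


# Hypothetical log wages: the HI group is shifted far to the right of NOHI.
nohi_wages = [1.0 + 0.01 * k for k in range(100)]
hi_wages = [w + 2.0 for w in nohi_wages]
stats = dominance_stats(hi_wages, nohi_wages)
probs = bootstrap_dominance(hi_wages, nohi_wages, reps=50, seed=1)
```

In this sketch `d1 <= 0` means the HI CDF never lies above the NOHI CDF, so HI first-order dominates NOHI, and `d2 <= 0` means the reverse. The entries of `probs` play the role of the bootstrap probabilities reported in Table 7.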
The following are findings from the graphs in Figures 2-4 and Tables 6-7.

B.1. Total Wages

From the graphs in the first row of Figures 2-4, we can see that the CDFs of total wages do not reflect the tradeoff if we do not use an instrument. In fact, the CDFs of HI even dominate the CDFs of NOHI. Using the IV approach, the situation reverses. The first column in Table 6 clearly shows this. The total mean wage difference is 0.26 when not using an instrument, but ends up at and 0.07, respectively, when using husband's firm size and health insurance as instruments. Table 7 lists the stochastic dominance results. Except that NOHI second-order dominates HI when using husband's firm size as the instrument, the other relationships using instruments are not statistically significant. This finding explains why previous researchers could not find the tradeoff when they did not use proper instruments.

B.2. Partial Wages

From the second-row graphs of partial wages in Figures 2-4, we can see that there exists a tradeoff between employer-provided health insurance and wages, which is the direct effect of HI on wages. When assuming the exogeneity of HI, the tradeoff is very small. From Table 6 we can see that this mean wage difference explained by the constant and residuals is 0.02 in absolute value. When using husband's firm size as the instrument, the difference is 0.24 in absolute value. When using husband's health insurance status as the instrument, the difference is 0.51 in absolute value. However, these numbers depend on the magnitude of K, so they do not reflect the true tradeoff. Nevertheless, their signs indicate that there is a tradeoff between employer-provided health insurance and wages for compliers.

The stochastic dominance tests confirm the above findings from the graphs. Table 7 suggests that for the partial wages, there is no first- or second-order stochastic dominance between the HI and NOHI groups when not using any instrument. For example, $\Pr(d^* \le 0)$ is and $\Pr(s^* \le 0)$ is . When using the instruments, NOHI not only second-order stochastically dominates HI, but also first-order dominates it. For example, $\Pr(d^* \le 0)$ is and $\Pr(s^* \le 0)$ is 0.993, and $d_{2,MAX}$ is when using husband's firm size as the instrument. When using husband's health insurance status as the instrument, these three statistics are 0.999, and respectively.
Table 6 also suggests that the total wages are explained half by the characteristic variables and half by the constant and residuals. This is consistent with the literature, since the R-squared for wage equations is at most 0.5.

B.3. Characteristic / Compensation Gaps
For the wage explained by the characteristic variables, the compensation gap, which reflects the indirect effect of HI, is positive and much larger than the characteristic gap. From the third-row graphs in Figures 2-4, we can see that the hypothetical CDFs of the NOHI group using the HI regression coefficients are very close to the CDFs of the HI group, and far away from the CDFs of the NOHI group. The last three columns of Table 6 illustrate this quantitatively. Except for the model without an instrument, the numbers in the compensation gap column are positive and almost equal to the numbers in the total gap column; sometimes they are even larger. However, the numbers in the characteristic gap column are much smaller. This finding tells us that the characteristic differences between HI and NOHI individuals are not that big, at least for the characteristics we can observe. For example, when using husband's firm size as the instrument, the mean wage difference due to the compensation difference is 0.20, while the total gap is 0.09 and the gap due to the characteristic difference is . With husband's health insurance status as the instrument, the mean wage difference due to the compensation difference is 0.50, the total gap is 0.47, and the gap due to the characteristic difference is .

The test results in Table 7 confirm this finding. We can see from Table 7 that there are FSDs and SSDs for the $\hat{\beta}$ gap (the compensation gap, or the indirect effect of HI), but no obvious FSDs for the $X$ gap, i.e. the characteristic gap, when using instruments. For example, when using husband's firm size as the instrument, $\Pr(d^* \le 0)$ for the $\hat{\beta}$ gap is 0.995, $\Pr(s^* \le 0)$ is and $d_{1,MAX}$ is equal to . However, these three statistics for the $X$ gap are respectively 0.384, and . With husband's health insurance status as the instrument, the three statistics for the $\hat{\beta}$ gap are 0.999, and respectively. They are respectively 0.044, and for the $X$ gap. This finding means that the wage returns are higher for individuals with employer-provided health insurance than for those without.
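The split of the explained wage gap into a characteristic gap and a compensation gap can be sketched at the mean as follows. This is a hedged illustration: the function names, the use of group means, the HI-minus-NOHI sign convention, and the example numbers are assumptions, not values or code from the paper. The only element taken from the text is that the hypothetical wage applies the HI coefficients to NOHI characteristics.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def oaxaca_gaps(xbar_hi, xbar_nohi, beta_hi, beta_nohi):
    """Decompose the explained mean wage gap (HI minus NOHI).

    hypothetical   = NOHI characteristics priced at HI coefficients
    characteristic = gap due to different characteristics (X gap)
    compensation   = gap due to different returns (beta-hat gap)
    """
    explained_hi = dot(xbar_hi, beta_hi)
    explained_nohi = dot(xbar_nohi, beta_nohi)
    hypothetical = dot(xbar_nohi, beta_hi)
    return {
        "total": explained_hi - explained_nohi,
        "characteristic": explained_hi - hypothetical,
        "compensation": hypothetical - explained_nohi,
    }


# Hypothetical means (years of education, years of tenure) and coefficients.
gaps = oaxaca_gaps(
    xbar_hi=[14.0, 6.0], xbar_nohi=[13.0, 5.0],
    beta_hi=[0.10, 0.02], beta_nohi=[0.08, 0.01],
)
```

By construction the characteristic and compensation gaps sum to the total gap, so a compensation gap larger than the total gap implies a negative characteristic gap.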
In other words, the characteristics are not too different between the HI and NOHI groups, at least for the characteristics we can observe, but the wage return to the same characteristics is very different for the two groups. Individuals with employer-provided health insurance not only get health insurance from their companies, they also get higher pay. One possible explanation is that HI improves individual productivity through investment in health; another explanation is that employees' morale is higher