Assessing health efficiency across countries with a two-step and bootstrap analysis *




António Afonso # $ and Miguel St. Aubyn #

February 2007

Abstract

We estimate a semi-parametric model of the health production process using a two-stage approach for OECD countries. By regressing data envelopment analysis output efficiency scores on non-discretionary variables, using both Tobit analysis and single and double bootstrap procedures, we show that inefficiency is strongly related to GDP per head, the education level, and health behaviour such as obesity and smoking habits. The bootstrapping procedure corrects likely biased DEA output scores, taking into account that environmental variables are correlated to output and input variables.

JEL: C14, C61, H52, I11
Keywords: technical efficiency, health, DEA, bootstrap, semi-parametric

* The opinions expressed herein are those of the authors and do not necessarily reflect those of the authors' employers.
# UECE Research Unit on Complexity and Economics; Department of Economics, ISEG/TULisbon, Technical University of Lisbon, R. Miguel Lupi 20, 1249-078 Lisbon, Portugal, emails: aafonso@iseg.utl.pt, mstaubyn@iseg.utl.pt. UECE is supported by FCT (Fundação para a Ciência e a Tecnologia, Portugal), financed by ERDF and Portuguese funds.
$ European Central Bank, Directorate General Economics, Kaiserstraße 29, D-60311 Frankfurt am Main, Germany, email: antonio.afonso@ecb.int.

Contents
1. Introduction
2. Motivation and literature
3. Analytical methodology
3.1. DEA framework
3.2. Non-discretionary inputs and the DEA/Tobit two-steps procedure
3.3. Non-discretionary inputs and bootstrap
4. Empirical analysis
4.1. Data and indicators
4.2. Principal component analysis
4.3. DEA efficiency results
4.4. Explaining inefficiency: the role of non-discretionary inputs
5. Conclusion
Appendix 1. Single and double bootstrap procedures
Appendix 2. Potential years of life not lost
References
Annex. Data and sources
Tables and figures

1. Introduction

In this paper we systematically compare the output from the health system of a set of OECD countries with resources employed (doctors, nurses, beds and diagnostic technology equipment). Using data envelopment analysis (DEA), we derive a theoretical production frontier for health. In the most favourable case, a country is operating on the frontier, and is considered efficient. However, most countries are found to perform below the frontier, and an estimate of the distance of each country from that border line is provided: the so-called efficiency score. Moreover, by estimating a semi-parametric model of the health production process using a two-stage approach, we show that inefficiency in the health sector is strongly related to variables that are, at least in the short to medium run, beyond the control of governments. These are GDP per capita, the education level, and unhealthy lifestyles such as obesity and smoking habits.

In methodological terms, a two-stage approach has become increasingly popular when DEA is used to assess efficiency of decision-making units (DMUs). The most usual two-stage approach has been recently criticised in statistical terms. [1] The fact that DEA output scores are likely to be biased, and that the environmental variables are correlated to output and input variables, recommends the use of bootstrapping techniques, which are well suited for the type of modelling we apply here. Therefore, we employ both a more usual DEA/Tobit approach and the single and double bootstrap procedures suggested by Simar and Wilson (2007). Our paper is one of the first applications of this very recent technique. [2] Our results following this procedure are compared to the ones arising from the more traditional one.

The paper is organised as follows. In section two we provide motivation and briefly review some of the literature and previous results on health provision efficiency. Section three outlines the methodological approach used in the paper and in section four we present and discuss the results of our efficiency analysis. Section five provides the conclusions.

[1] See Simar and Wilson (2000, 2007).
[2] See Afonso and St. Aubyn (2006) for an application to the education system.

2. Motivation and literature

Health is one of the most important services provided by governments in almost every country. According to OECD (2005), OECD countries spent an average of 8.7 per cent of GDP in 2003 on health institutions, of which 6.3 per cent of GDP were from public sources. In a general sense, health provision is efficient if its producers make the best possible use of available inputs, and the sole fact that health inputs weigh heavily on the public purse would call for a careful efficiency analysis. A health system not being efficient would mean either that results (or "outputs") could be increased without spending more, or else that expense could actually be reduced without affecting the outputs, provided that more efficiency is assured. Research results presented here indicate that there are cases where considerable improvements can be made in this respect.

The fact of health spending being predominantly public is particularly true in OECD countries. Table 1 summarises some relevant data for thirty OECD countries concerning health spending. For instance, public expenditure as a share of total spending averaged 72.5 per cent in 2003, ranging from 44.4 per cent in the USA to 90.1 per cent in the Czech Republic. For the EU15, average total spending was 8.8 per cent of GDP in 2003, which is close to the OECD value, slightly up from the 8.1 per cent ratio observed in 1995. On the other hand, average public expenditure as a share of total expenditure in health was, in 2003, lower in the EU15 than in the OECD, the corresponding ratios being equal to 69.9 and 72.5 per cent, respectively. Furthermore, data reported in Table 1 show that total per capita health spending is very diverse across OECD countries. Indeed, the country that spends the most on health in per capita terms, the USA, spends more than twice the OECD average and eleven times more than the country that spends the least, Turkey, even though the per capita GDP ratio between those two countries is roughly five and a half.

[Insert Table 1 here]

Moreover, the relevance of assessing the quality of public spending and redirecting it to more growth-enhancing items is stressed, for instance, in EC (2004) as being an important goal for governments to pursue. Internationally, there is a shift in the focus of the analysis from the amount of public resources used by a government, to services delivered, and also to achieved outcomes and their quality (see OECD, 2003). In our research, we measure and compare health output across countries using precisely the abovementioned type of quality measures: we resort to cross-nationally comparable evidence on health variables, as reported in OECD (2005).

Previous research on the international comparative performance of the public sector in general and of health outcomes in particular, including Afonso, Schuknecht and Tanzi (2005) for public expenditure in the OECD, and Gupta and Verhoeven (2001) for education and health in Africa, has already suggested that important inefficiencies are at work. These studies use free disposal hull (FDH) analysis with inputs measured in monetary terms. Spinks and Hollingsworth (2005) assess health efficiency for OECD countries using DEA-based Malmquist indexes. They report a mean value of 0.961 for an OECD dataset, suggesting that overall, member countries have moved slightly away from the frontier, implying a decrease in technical efficiency, between 1995 and 2000. Using both FDH and DEA analysis, Afonso and St. Aubyn (2005) studied efficiency in providing health and education in OECD countries using physically measured inputs and concluded that if all countries were efficient, input usage could be reduced by about 13 per cent without affecting output. Using a more extended sample, Evans et al. (2000) evaluate the efficiency of health expenditure in 191 countries using a parametric methodology.

In this paper, we estimate semi-parametric models of the health production process using a two-stage approach.
In a first stage, we determine the output efficiency score for each country, using the mathematical programming approach known as DEA, relating health inputs to outputs. In a second stage, these scores are explained using regression analysis. Here, we show that non-discretionary factors are indeed highly correlated to inefficiency, i.e. they are significant "environmental variables", using DEA jargon. [3] They are, however, of a fundamentally different nature from input variables, in so far as their values cannot be changed in a meaningful spell of time by the DMU, here a country.

3. Analytical methodology

3.1. DEA framework

DEA, which assumes the existence of a convex production frontier, allows the calculation of technical efficiency measures that can be either input or output oriented. The purpose of an output-oriented study is to evaluate by how much output quantities can be proportionally increased without changing the input quantities used. This is the perspective taken in this paper. Note, however, that one could also try to assess by how much input quantities can be reduced without varying the output. Both output and input-oriented models will identify the same set of efficient/inefficient producers or DMUs. [4]

The linear programming problem to be solved, output oriented and assuming variable returns to scale, is sketched below. Suppose there are p inputs and q outputs for n DMUs. For the i-th DMU, $y_i$ is the column vector of the outputs and $x_i$ is the column vector of the inputs. We can also define X as the (p × n) input matrix and Y as the (q × n) output matrix. The DEA model is then specified with the following mathematical programming problem, for a given i-th DMU:

$$
\begin{aligned}
\max_{\delta_i,\,\lambda}\quad & \delta_i \\
\text{s. to}\quad & \delta_i y_i \le Y\lambda \\
& x_i \ge X\lambda \\
& \mathbf{1}_n'\lambda = 1 \\
& \lambda \ge 0.
\end{aligned}
\qquad (1)
$$

[3] Throughout the paper we use interchangeably the terms non-discretionary, exogenous and environmental when qualifying variables or factors not initially considered in the DEA programme.
[4] See Farrell's (1957) seminal work, popularised by Charnes, Cooper and Rhodes (1978). Coelli, Rao, O'Donnell and Battese (2005) and Thanassoulis (2001) offer good introductions to the DEA methodology.
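As a minimal sketch, programme (1) can be solved one DMU at a time with a general-purpose linear programming solver. The Python code below is an illustration only and is not the authors' implementation; the function name `dea_output_vrs` and the four-DMU data set are hypothetical, and SciPy's `linprog` is assumed to be available.

```python
import numpy as np
from scipy.optimize import linprog


def dea_output_vrs(X, Y):
    """Output-oriented, variable-returns-to-scale DEA scores.

    X: (p, n) input matrix, Y: (q, n) output matrix, one column per DMU.
    Returns n scores delta_i >= 1 (a score of 1 means "on the frontier").
    """
    p, n = X.shape
    q = Y.shape[0]
    scores = np.empty(n)
    for i in range(n):
        # Decision vector: [delta, lambda_1, ..., lambda_n].
        # linprog minimises, so maximising delta means minimising -delta.
        c = np.r_[-1.0, np.zeros(n)]
        # Output constraints: delta * y_i - Y @ lam <= 0.
        A_out = np.hstack([Y[:, [i]], -Y])
        # Input constraints: X @ lam <= x_i.
        A_in = np.hstack([np.zeros((p, 1)), X])
        A_ub = np.vstack([A_out, A_in])
        b_ub = np.r_[np.zeros(q), X[:, i]]
        # Convexity constraint (variable returns to scale): lambdas sum to 1.
        A_eq = np.r_[0.0, np.ones(n)].reshape(1, -1)
        bounds = [(None, None)] + [(0, None)] * n
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=bounds, method="highs")
        scores[i] = res.x[0]
    return scores


# Made-up data: one input, one output, four hypothetical DMUs.
X = np.array([[1.0, 2.0, 3.0, 2.0]])
Y = np.array([[1.0, 3.0, 4.0, 1.5]])
delta = dea_output_vrs(X, Y)
print(delta)
```

In this toy data set the first three units lie on the frontier, while the fourth uses the same input as the second unit but produces half its output, so its score is 2.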

In problem (1), $\delta_i$ is a scalar satisfying $\delta_i \ge 1$; more specifically, it is the efficiency score that measures technical efficiency of the i-th unit as the distance to the efficiency frontier, the latter being defined as a linear combination of best practice observations. With $\delta_i > 1$, the decision unit is inside the frontier (i.e. it is inefficient), while $\delta_i = 1$ implies that the decision unit is on the frontier (i.e. it is efficient). The vector $\lambda$ is a (n × 1) vector of constants that measures the weights used to compute the location of an inefficient DMU if it were to become efficient.

3.2. Non-discretionary inputs and the DEA/Tobit two-steps procedure

The standard DEA models, such as the one described in (1), incorporate only discretionary inputs, those whose quantities can be changed at the DMU's will, and do not take into account the presence of environmental variables or factors, also known as non-discretionary inputs. However, socio-economic differences may play a relevant role in determining heterogeneity across DMUs (be they schools, hospitals or countries' achievements in an international comparison) and influence outcomes. As far as health is concerned, these exogenous socio-economic factors can include, for instance, household wealth, eating habits and education level. As non-discretionary and discretionary inputs jointly contribute to each DMU's outputs, there are in the literature several proposals on how to deal with this issue, usually implying the use of two-stage and even three-stage models. [5]

Let $z_i$ be a (1 × r) vector of non-discretionary inputs. In a typical two-stage approach, the following regression is estimated:

$$\hat{\delta}_i = z_i\beta + \varepsilon_i, \qquad (2)$$

where $\hat{\delta}_i$ is the efficiency score that resulted from stage one, i.e. from solving (1), and $\beta$ is a (r × 1) vector of parameters to be estimated in step two, one for each non-discretionary input considered. The fact that $\hat{\delta}_i \ge 1$ has led many researchers to estimate (2) using censored regression techniques (Tobit), although others have used OLS. [6]

3.3. Non-discretionary inputs and bootstrap

The two-stage DEA/Tobit method is likely to be biased in small samples for two reasons. Firstly, the fact that output scores are jointly estimated by DEA implies that the error term $\varepsilon_i$ in equation (2) is serially correlated. Secondly, the non-discretionary variables $z_i$ are correlated to the error term $\varepsilon_i$. This derives from the fact that non-discretionary inputs are correlated to the outputs, and therefore to estimated efficiency scores. To surmount this, Simar and Wilson (2007) propose two alternatives based on bootstrap methods. [7] Similarly to the DEA/Tobit procedure, the efficiency score depends linearly on the environmental variables, but the error term is a truncated, and not a censored, normal random variable.

The first bootstrap method ("algorithm 1") implies the estimation of the efficiency scores using DEA, as in the DEA/Tobit analysis. However, the influence of non-discretionary inputs on efficiency is estimated by means of a truncated linear regression. Bootstrapping then assesses coefficient significance. We have considered 2000 bootstrap estimates for that effect.

The scores derived from DEA are biased towards 1 in small samples. Simar and Wilson's (2007) second bootstrap procedure, "algorithm 2", includes a parametric bootstrap in the first stage problem, so that bias-corrected estimates for the efficiency scores are produced. These corrected scores replace the original DEA ones, and estimation of environment effects proceeds as in algorithm 1.

[5] See Ruggiero (2004) and Simar and Wilson (2007) for an overview.
[6] See Simar and Wilson (2007) for an extensive list of published examples of the two-step approach.
[7] See Appendix 1, where the method is set out in more detail. We implemented these algorithms in Matlab. Programmes and functions are available on request.
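The logic of the first bootstrap method can be sketched as follows: fit a truncated normal regression of the scores on the environmental variables, then repeatedly redraw scores from the fitted truncated model and re-estimate. The Python code below is a simplified illustration in the spirit of algorithm 1, not the authors' Matlab code. All function names, the percentile intervals, the exclusion of scores equal to one and the synthetic data are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_truncreg(delta, Z, lower=1.0):
    """ML fit of delta = Z @ beta + eps, with delta left-truncated at `lower`."""
    k = Z.shape[1]

    def negloglik(theta):
        beta, sigma = theta[:k], np.exp(theta[k])
        mu = Z @ beta
        # Log-density of a normal variable conditional on delta >= lower.
        ll = (norm.logpdf((delta - mu) / sigma) - np.log(sigma)
              - norm.logsf((lower - mu) / sigma))
        return -ll.sum()

    beta0 = np.linalg.lstsq(Z, delta, rcond=None)[0]   # OLS starting values
    sigma0 = np.std(delta - Z @ beta0)
    res = minimize(negloglik, np.r_[beta0, np.log(sigma0)], method="BFGS")
    return res.x[:k], np.exp(res.x[k])


def draw_left_truncated(mu, sigma, rng, lower=1.0):
    """Draw mu + eps, eps ~ N(0, sigma^2), conditional on mu + eps >= lower."""
    a = norm.cdf((lower - mu) / sigma)
    u = rng.uniform(size=np.shape(mu))
    return mu + sigma * norm.ppf(a + u * (1.0 - a))   # inverse-CDF sampling


def bootstrap_truncreg(delta, Z, n_boot=2000, seed=0):
    """Parametric bootstrap of the truncated regression coefficients."""
    rng = np.random.default_rng(seed)
    keep = delta > 1.0                 # assumption: frontier units are left out
    d, Zk = delta[keep], Z[keep]
    beta, sigma = fit_truncreg(d, Zk)
    mu = Zk @ beta
    draws = np.empty((n_boot, Zk.shape[1]))
    for b in range(n_boot):
        d_star = draw_left_truncated(mu, sigma, rng)
        draws[b], _ = fit_truncreg(d_star, Zk)
    ci = np.percentile(draws, [2.5, 97.5], axis=0)   # simple percentile bounds
    return beta, sigma, ci


# Synthetic check with made-up parameters (not the paper's data).
rng = np.random.default_rng(1)
Z = np.column_stack([np.ones(800), rng.uniform(size=800)])
true_beta, true_sigma = np.array([1.5, 0.4]), 0.25
delta = draw_left_truncated(Z @ true_beta, true_sigma, rng)
beta_hat, sigma_hat, ci = bootstrap_truncreg(delta, Z, n_boot=30)
```

A coefficient would be judged significant in this sketch when its bootstrap interval excludes zero. The small `n_boot` in the example only keeps the run short; the text above uses 2000 replications.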

4. Empirical analysis

4.1. Data and indicators

OECD (2005) is our chosen health database for OECD countries. [8] Typical input variables include medical technology indicators and health employment. Output is to be measured by indicators such as life expectancy and infant mortality, in order to assess potential years of added life. It is of course difficult to measure something as complex as the health status of a population. We have not innovated here, and took two usual measures of health attainment, infant mortality and life expectancy. [9]

Efficiency measurement techniques used in this paper imply that outputs are measured in such a way that "more is better". This is clearly not the case with infant mortality. Recall that the Infant Mortality Rate (IMR) is equal to:

(Number of children who died before 12 months)/(Number of born children) × 1000.

We have calculated an Infant Survival Rate, ISR,

$$ISR = \frac{1000 - IMR}{IMR}, \qquad (3)$$

which has two nice properties: it is directly interpretable as the ratio of children that survived the first year to the number of children that died; and, of course, it increases with a better health status.

We have considered a third output measure, which we call Potential Years of Life Not Lost, PYLNL. This variable was computed on the basis of the indicator Potential Years of Life Lost, PYLL, reported by OECD (2005). This last variable, PYLL, equals the number of life years lost due to all causes before the age of 70 and that could be, a priori, prevented. Therefore, for our subsequent DEA analysis, and similarly to the Infant Mortality Rate, a transformation had to be done in order to provide an increasing monotonic relation between the variable, number of years not lost, and health status. Our transformed variable is:

$$PYLNL = \lambda - PYLL, \qquad (4)$$

where $\lambda$ = 3 618 010 is an estimate of the number of potential years of life for a population under 70 years. [10]

Therefore, our frontier model for health is based upon three output variables:
- the infant survival rate,
- life expectancy,
- and potential years of life not lost.

We compare physically measured inputs to outcomes. Quantitative inputs are the number of practising physicians, practising nurses, acute care beds per thousand inhabitants and high-tech diagnostic medical equipment, specifically magnetic resonance imagers (MRI). [11] Table 2 reports the relevant statistics for the set of OECD countries.

[Insert Table 2 here]

From Table 2 one notices that practising nurses per one thousand persons, in the period 2000-2003, ranged from 1.6 in Korea to 14.7 in Ireland. For the same period there was also a wide range of practising physicians per one thousand persons, from 1.4-1.5 in Turkey and in Korea to 4.3-4.4 in Italy and in Greece. Additionally, the number of MRI units per million persons ranged from 0.2 in Mexico to 32.2 in Japan, and the hospital acute care beds per one thousand persons ranged from 1.0 in Mexico to 9.1 in Japan. Table 2 also shows that for the period 2000-2003 life expectancy at birth ranged from 68.4 years in Turkey to 81.5 in Japan, and infant mortality ranged from 2.4 in Iceland to 36.3 in Turkey. In addition, the potential years of life not lost per 100 000 population was 73 per cent above the average in Hungary and 29 per cent below average in Japan.

4.2. Principal component analysis

In order to get around the difficulties that may be posed to the DEA approach when there is a significant number of inputs and/or outputs, we used principal component analysis (PCA) to aggregate some of the indicators. The use of PCA reduces the dimensionality of multivariate data, which is what we have regarding health status and the health care resources used. The idea of PCA is to describe the variation of a multivariate data set through linear combinations of the original variables (see, for instance, Everitt and Dunn, 2001). Generally, we are interested in seeing if the first few components portray most of the variation of the original data set, for instance, 80 per cent or 90 per cent, without much loss of information. In a nutshell, the principal components are uncorrelated linear combinations of the original variables, which are then ranked by their variances in descending order. This provides a more parsimonious representation of the data set and avoids too many DMUs being labelled efficient by default in the DEA computations.

Usually one applies PCA by imposing that the original variables are normalized to have zero mean. This means that the computed principal component scores also have zero mean, and therefore some of the results from PCA are negative. Since DEA inputs and outputs need to be strictly positive, PCA results will be increased by the most negative value in absolute value plus one, in order to ensure strictly positive data (see, for instance, Adler and Golany, 2001).

[8] The data and the sources used in the paper are presented in the Annex.
[9] These health measures, or similar ones, have been used in other studies on health and public expenditure efficiency; see Afonso, Schuknecht and Tanzi (2004), and Gupta and Verhoeven (2001).
[10] See details in Appendix 2.
[11] A commonly used indicator of medical technology; see, for instance, Retzlaff-Roberts et al. (2004).
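The two data transformations described in this section, the infant survival rate in (3) and the shift that makes principal component scores strictly positive, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code. The function names are hypothetical, the variables are standardised to unit variance, the shift is applied component by component, and the indicator matrix is random placeholder data.

```python
import numpy as np


def infant_survival_rate(imr):
    """Equation (3): survivors per death, from infant mortality per 1000."""
    imr = np.asarray(imr, dtype=float)
    return (1000.0 - imr) / imr


def positive_pc_scores(data):
    """Principal-component scores shifted to be strictly positive.

    data: (n_units, n_vars) array. Variables are standardised first
    (zero mean, unit variance), which is an assumption of this sketch.
    Returns (shifted scores, share of variance per component).
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.cov(z, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)
    order = np.argsort(eigval)[::-1]            # largest variance first
    eigval, eigvec = eigval[order], eigvec[:, order]
    scores = z @ eigvec                         # zero-mean component scores
    # Add |most negative value| + 1 per component, so the minimum becomes 1.
    shifted = scores - scores.min(axis=0) + 1.0
    return shifted, eigval / eigval.sum()


# Infant mortality extremes quoted from Table 2 (Iceland, Turkey).
isr = infant_survival_rate([2.4, 36.3])

# Placeholder data: 21 hypothetical units, 4 hypothetical input indicators.
rng = np.random.default_rng(0)
indicators = rng.normal(loc=5.0, scale=2.0, size=(21, 4))
shifted, share = positive_pc_scores(indicators)
# Number of leading components needed to reach 80 per cent of the variation.
n_keep = int(np.searchsorted(np.cumsum(share), 0.80) + 1)
```

With the two infant mortality rates above, the survival rate is roughly 415.7 for Iceland and 26.5 for Turkey, so the transformed variable does increase with better health status.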

We applied PCA to the four input variables: doctors, nurses, beds and MRI units. The results of such analysis (see Table 3) led us to use the first three principal components as the three input measures, which explain around 88 per cent of the variation of the four variables. This also implies that we only take into account the components whose associated eigenvalues are above 0.7, a rule suggested by Jolliffe (1972). Applying PCA also to the set of our selected output variables (life expectancy, infant survival rate and potential number of years of life not lost), we selected the first principal component as the output measure since it accounts for around 84 per cent of the variation of the three variables (see Table 3).

[Insert Table 3 here]

We report in Table 4 the abovementioned principal components, to be used in the subsequent section in DEA computations.

[Insert Table 4 here]

4.3. DEA efficiency results

In Table 5 we report results for the standard DEA variable-returns-to-scale technical efficiency output scores and peers of each of the considered countries. The specification used includes as inputs the first three components of the PCA performed on the base variables doctors, nurses, beds and MRI units. As output we use the first component of the PCA applied to the base variables infant survival rate, life expectancy, and potential years of life not lost, as explained in the previous section.

[Insert Table 5 here]

It is possible to observe in Table 5 that seven countries would be located on the theoretical production possibility frontier with the standard DEA approach: Canada, Finland, Japan, Korea, Spain, Sweden and the USA. [12] Canada, Finland, Japan, Spain and Sweden are located on the efficient frontier because they perform quite well in the output indicator, getting above average results. On the other hand, Korea and the USA are generally below average regarding the use of resources in all the first three components selected. Another set of three countries is located at the opposite end: Hungary, the Slovak Republic and Poland. DEA analysis indicates that their output could be substantially increased if they were to become located on the efficiency frontier. On average and as a conservative estimate, countries could have increased their results by 40 per cent using the same resources.

4.4. Explaining inefficiency: the role of non-discretionary inputs

Using the DEA efficiency scores computed in the previous subsection, we now evaluate the importance of non-discretionary inputs. We present results both from Tobit regressions and bootstrap algorithms. Even if Tobit results are possibly biased, it is not clear that bootstrap estimates are necessarily more reliable. In fact, the latter are based on a set of assumptions concerning the data generation process and the perturbation term distribution that may be disputed. Taking the pros and cons of both methods into account, it seems sensible to apply both of them. If outcomes are comparable, this adds robustness and confidence to the results we are interested in.

In order to explain the efficiency scores, we regress them on GDP per capita, Y, educational level, E, obesity, O, and tobacco consumption, T, as follows: [13]

$$\hat{\delta}_i = \beta_0 + \beta_1 Y_i + \beta_2 E_i + \beta_3 O_i + \beta_4 T_i + \varepsilon_i. \qquad (5)$$

[12] One can briefly compare our results with the ones reported by Afonso and St. Aubyn (2005), who addressed health efficiency for 2000 using a similar set of information but without principal component analysis. Interestingly, they reported that the countries labelled as efficient were Canada, Denmark, France, Japan, Korea, Norway, Portugal, Spain, Sweden, the United Kingdom and the United States, rather along the lines of our results.
[13] Educational level is given by the percentage of population that achieved tertiary education in 2000-2003, GDP per capita refers to PPP USD in 2003, obesity refers to the percentage of obese population in 2002, and smoking refers to the percentage of population that consumed tobacco in 2003 (see the Annex for details).
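A censored-regression (Tobit) estimate of an equation such as (5) can be sketched as a maximum likelihood problem in which scores equal to one are treated as left-censored. The Python code below is illustrative only and is not the authors' estimation code. The function name `fit_tobit`, the optimiser choice and the synthetic regressors (stand-ins for variables such as income or obesity) are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_tobit(delta, Z, lower=1.0):
    """ML Tobit fit of delta on Z with left-censoring at `lower`."""
    k = Z.shape[1]
    cens = ~(delta > lower)          # units sitting exactly on the frontier

    def negloglik(theta):
        beta, sigma = theta[:k], np.exp(theta[k])
        mu = Z @ beta
        # Uncensored observations contribute the normal density ...
        ll_obs = norm.logpdf((delta[~cens] - mu[~cens]) / sigma) - np.log(sigma)
        # ... censored ones the probability of falling at or below `lower`.
        ll_cens = norm.logcdf((lower - mu[cens]) / sigma)
        return -(ll_obs.sum() + ll_cens.sum())

    beta0 = np.linalg.lstsq(Z, delta, rcond=None)[0]   # OLS starting values
    sigma0 = np.std(delta - Z @ beta0)
    res = minimize(negloglik, np.r_[beta0, np.log(sigma0)], method="BFGS")
    return res.x[:k], np.exp(res.x[k])


# Synthetic data with made-up coefficients (not the paper's estimates).
rng = np.random.default_rng(0)
n = 500
Z = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
true_beta, true_sigma = np.array([1.2, -0.2, 0.15]), 0.2
latent = Z @ true_beta + rng.normal(scale=true_sigma, size=n)
delta = np.maximum(latent, 1.0)      # scores pile up at the frontier value 1
beta_hat, sigma_hat = fit_tobit(delta, Z)
```

The contrast with a truncated regression is that here the frontier units stay in the sample and enter the likelihood through a probability mass at one, rather than being excluded.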

We first report in Table 6 results from the censored normal Tobit regressions for several alternative specifications of equation (5).

[Insert Table 6 here]

Inefficiency in the health sector is strongly related to the four variables that are, at least in the short to medium run, beyond the control of governments: the economic background, proxied here by the country's GDP per capita, the level of education, smoking habits, and obesity. The estimated coefficients of the first two non-discretionary inputs are statistically significant and negatively related to the efficiency score (recall that a score closer to one means a position closer to the frontier). For instance, an increase in educational achievement reduces inefficiency, implying that the relevant DMU moves closer to the theoretical production possibility frontier. Therefore, the better the level of education, the higher the efficiency of health provision in a given country. The same reasoning applies to GDP, with higher GDP per capita resulting in more efficiency. On the other hand, efficiency is lower the stronger smoking habits are and the higher the percentage of obese population is.

We also considered other variables as non-discretionary inputs: income inequality via the Gini coefficient, the ratio of public-to-total expenditure in health, spending on pharmaceuticals as a percentage of health expenditure, percentage of population over 65 years, per capita alcohol and sugar consumption, and total calorie intake. However, none of these variables proved to be statistically significant and the estimation results are not reported for the sake of space.

Table 7 reports the estimation results from the bootstrap procedures employing algorithms 1 and 2, as described in sub-section 3.3. Estimated coefficients are essentially similar irrespective of the algorithm used to estimate them. Moreover, they are also close to the estimates derived from the more usual Tobit procedure, and, very importantly, they are highly significant.

[Insert Table 7 here]

Significance across different model formulations and estimation methods is important and constitutes robust empirical evidence that efficiency in health depends directly on a country's wealth and on education levels, and inversely on tobacco consumption and obesity. In a nutshell, poorer countries where education levels are low tend to underperform, so that results are further away from the efficiency frontier. The same reasoning applies to the other two environmental factors, with stronger smoking habits and higher obesity levels drawing countries away from efficient health-related performance.

Equation (5) can be regarded as a decomposition of the output efficiency score into two distinct parts:
- the one that is the result of a country's environment, and given by $\beta_0 + \beta_1 Y_i + \beta_2 E_i + \beta_3 O_i + \beta_4 T_i$;
- the one that includes all other factors having an influence on efficiency, including therefore inefficiencies associated with the health system itself, and given by $\varepsilon_i$.

We choose models 2 and 4 from Table 7 for our exercise of correcting for environmental variables in order to use versions with and without education as an exogenous factor. [14] The first column in Table 8 includes the bias-corrected scores for Model 2, the one with the best fit using bootstrap algorithms (as can be seen by the lower estimated standard deviation of $\varepsilon$). Algorithm 2 implies a bias correction after estimating output efficiency scores, taking into account the correlation between these scores and the environmental variables. We also present score corrections for the three environmental variables. GDP, obesity, and tobacco consumption corrections were computed as the changes in scores obtained by artificially setting Y, O, and T to the sample average in each country. Fully corrected scores, presented in column five, are estimates of output scores purged from environmental effects and result from the summation of the previous four columns, truncated to one when necessary.

[Insert Table 8 here]

[14] Models 2 and 4 differ from models 1 and 3 because income is introduced in logs. This formulation seems to provide a better fit, as checked by comparative values of $\hat{\sigma}_\varepsilon$.
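The correction just described amounts to simple arithmetic on the estimated coefficients. The Python sketch below is one reading of it, not the authors' code. It assumes the correction for each variable is its coefficient times the gap between the sample average and the country's own value; the function name and the three-country numbers are hypothetical.

```python
import numpy as np


def corrected_scores(bias_corrected, Z, beta):
    """Scores purged of environmental effects.

    bias_corrected: (n,) bias-corrected output scores.
    Z: (n, r) environmental variables, without a constant column.
    beta: (r,) estimated slope coefficients from an equation such as (5).
    Returns (per-variable corrections, fully corrected scores).
    """
    # Change in the fitted score if each variable moved to its sample average.
    corrections = (Z.mean(axis=0) - Z) * beta
    full = bias_corrected + corrections.sum(axis=1)
    return corrections, np.maximum(full, 1.0)   # truncate to one when necessary


# Hypothetical example: one environmental variable with a negative coefficient,
# so a higher value (e.g. income) lowers the score, i.e. raises efficiency.
scores = np.array([1.05, 1.20, 1.30])
Z = np.array([[1.0], [2.0], [3.0]])
beta = np.array([-0.1])
corrections, full = corrected_scores(scores, Z, beta)
print(corrections.ravel(), full)
```

In this toy case the first country has a below-average value of the variable, so moving it to the average lowers its score from 1.05 to 0.95, which is then truncated to one. The third country loses an environmental advantage and its score rises to 1.4.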

Comparing the ranks in the last column of Table 8, resulting from corrections for both bias and environmental variables, with the previously presented ranking from the standard DEA analysis (see Table 5 above), it is apparent that significant changes occurred. For the purpose of such comparison one should notice that the number of countries considered dropped from twenty-one in the DEA calculations to nineteen in the two-step analysis, since tobacco consumption data was not available for Austria and Portugal.

Some countries poorly ranked previously are now closer to the production possibility frontier: this is the case of Denmark, the Czech Republic, Hungary, Luxembourg, Poland, the Slovak Republic, and the UK. On the other hand, other countries see a worsening in their relative position after taking into account environmental variables, namely Canada, Sweden, and the US, and to a lesser extent, Japan. Lastly, countries like Korea and Spain keep their good positioning.

Additionally, by looking at GDP, obesity and tobacco consumption corrections in Table 8, it is apparent that in some countries, environmental harshness essentially results from low GDP per head, as in the Czech Republic, Korea, Poland and Spain. For instance, for the US, lower than average tobacco consumption is offset by above average obesity, while for Japan, Korea, Luxembourg, and Switzerland we see an opposite pattern. Finally, note that in countries like Germany and Italy, all three environmental variables push down performance, while an inverse result can be observed for Hungary.

Alternatively, a similar analysis can be conducted for Model 4, where we now have four environmental variables: GDP, education, obesity, and tobacco consumption (see Table 9).

[Insert Table 9 here]

From the results in Table 9 it is possible to conclude that education correction is not beneficial for countries such as Canada, the US, Japan or Korea. Indeed, and as results from both Tobit and bootstrap analysis indicate, the percentage of population with

tertiary education is a relevant exogenous variable in explaining health efficiency scores. On the other hand, the below average results in this variable for several other countries, such as the Czech Republic, Italy and Luxembourg, allow for an improvement in their efficiency rankings after making the corrections related to all four non-discretionary factors used in Model 4.

5. Conclusion

In this paper, we have evaluated efficiency in health services across countries by assessing outputs (life expectancy, infant survival rate, potential years of life not lost) against inputs directly used in the health system (doctors, nurses, beds, MRI units) and environment variables (wealth and country education level, smoking habits and obesity). In methodological terms, we have employed a two-stage semi-parametric procedure. Firstly, output efficiency scores were estimated by solving a standard DEA problem with countries as DMUs. Secondly, these scores were explained in a regression with the environmental variables as independent variables.

Results from the first stage imply that inefficiencies may be quite high. On average and as a conservative estimate, countries could have increased their results by 40 per cent using the same resources. Countries like Hungary, the Slovak Republic and Poland display significant room for improvement.

The fact that a country is seen as far away from the efficiency frontier is not necessarily a result of inefficiencies engendered within the health system. Our second-stage procedures show that GDP per head, educational attainment, tobacco consumption, and obesity are highly and significantly correlated to output scores: a wealthier and more cultivated environment is an important condition for better health performance, while a more obese population and prevalence of smoking habits worsen health performance. Moreover, it becomes possible to correct output scores by considering the harshness of the environment where the health system operates. Country rankings and output scores derived from this correction can be substantially different from standard DEA results.

Non-discretionary inputs considered here cannot be changed in the short run. For example, educational attainment is essentially given in the coming year. However, contemporaneous educational and social policy will have an impact on future educational attainment. A similar reasoning applies to smoking habits, which are difficult to change, but where, for instance, tax measures are usually considered and implemented by governments. Obesity problems also impinge negatively on the performance of the health system, and may be related to cultural traditions.

Finally, note that we have applied both the usual DEA/Tobit procedure and two very recently proposed bootstrap algorithms. Results were strikingly similar across these three different estimation processes, which brings increased confidence to the conclusions obtained.

Appendix 1 - Single and Double Bootstrap Procedures

This appendix briefly describes the single and double bootstrap procedures proposed by Simar and Wilson (2007) and applied in this paper. By assumption, the true efficiency score depends on the environmental variables $z_i$, so that

$$\delta_i = z_i \beta + \varepsilon_i \geq 1, \qquad \text{(A1.1)}$$

where $\beta$ is a vector of parameters and $\varepsilon_i$ is a truncated normal random variable, distributed $N(0, \sigma_\varepsilon^2)$ with left-truncation at $1 - \psi(z_i, \beta) = 1 - z_i \beta$ (see footnote 15). The efficiency score that solves problem (1) in the main text (the DEA problem), $\hat{\delta}_i$, is then considered as an estimate for $\delta_i$, and this is the first stage in the procedure. The second stage is designed to assess the influence of non-discretionary inputs on efficiency.

The first algorithm involves the following steps:

[1] The computation of $\hat{\delta}_i$ for all n decision units by solving (1).

[2] The estimation of equation (A1.1) by maximum likelihood, considering it is a truncated regression (and not a censored or Tobit regression). Denote by $\hat{\beta}$ and $\hat{\sigma}_\varepsilon$ the maximum likelihood estimates of $\beta$ and $\sigma_\varepsilon$.

[3] The computation of L bootstrap estimates for $\beta$ and $\sigma_\varepsilon$, in the following way: for $i = 1, \ldots, n$ draw $\varepsilon_i$ from a normal distribution with variance $\hat{\sigma}_\varepsilon^2$ and left truncation at $1 - z_i \hat{\beta}$, and compute $\delta_i^* = z_i \hat{\beta} + \varepsilon_i$. Then estimate the truncated regression of $\delta_i^*$ on $z_i$ by maximum likelihood, yielding a bootstrap estimate $(\hat{\beta}^*, \hat{\sigma}_\varepsilon^*)$.

With a large number of bootstrap estimates (e.g. L = 2000), it becomes possible to test hypotheses and to construct confidence intervals for $\beta$ and $\sigma_\varepsilon$. For example, suppose that we want to determine the p-value for a given estimate $\hat{\beta}_1 < 0$. This will be given by the relative frequency of nonnegative $\hat{\beta}_1^*$ bootstrap estimates.

It can be shown that the estimate $\hat{\delta}_i$ is biased towards 1 in small samples. Simar and Wilson's (2007) second bootstrap procedure, algorithm 2, includes a parametric bootstrap in the first-stage problem, so that bias-corrected estimates for the efficiency scores are produced. The production of these bias-corrected scores is done as follows:

[1] Compute $\hat{\delta}_i$ for all n decision units by solving problem (1).

[2] Estimate equation (A1.1) by maximum likelihood, considering it is a truncated regression. Let $\hat{\beta}$ and $\hat{\sigma}_\varepsilon$ be the maximum likelihood estimates of $\beta$ and $\sigma_\varepsilon$.

[3] Obtain $L_1$ bootstrap estimates for each $\delta_i$ in the following way: for $i = 1, \ldots, n$ draw $\varepsilon_i$ from a normal distribution with variance $\hat{\sigma}_\varepsilon^2$ and left truncation at $1 - z_i \hat{\beta}$, and compute $\delta_i^* = z_i \hat{\beta} + \varepsilon_i$. Let $y_i^* = y_i \, \hat{\delta}_i / \delta_i^*$ be a modified output measure. Compute $\hat{\delta}_i^*$ by solving the DEA problem (1) in the main text, where Y is replaced by $Y^* = [y_1^* \ldots y_n^*]$. (But note that $y_i$ is not replaced by $y_i^*$ in the left-hand side of the first restriction of the problem.)

[4] Compute the bias-corrected output inefficiency estimator as $\hat{\hat{\delta}}_i = 2 \hat{\delta}_i - \bar{\hat{\delta}}_i^*$, where $\bar{\hat{\delta}}_i^*$ is the bootstrap average of $\hat{\delta}_i^*$.

Once these first-stage bias-corrected measures are produced, algorithm 2 continues by replacing $\hat{\delta}_i$ with $\hat{\hat{\delta}}_i$ in algorithm 1, from step 2 onwards. Following Simar and Wilson (2007), we set $L_1 = 100$.

Footnote 15: In a truncated normal distribution, $\varepsilon_i$ is not observed when it would fall below $1 - z_i \beta$. In a censored model (the Tobit model), $\varepsilon_i$ is always observed, even if there is some information loss (it is exactly equal to $1 - z_i \beta$ when it would fall below this value).
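The second-stage steps of algorithm 1, together with the bias-correction formula of algorithm 2, can be sketched as follows. This is a rough illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the truncated draws are generated by inverting the normal CDF (one of several possible ways to sample them), the data are synthetic, and L is kept far below the 2000 replications mentioned in the text so that the example runs quickly. The DEA step that produces the scores is taken as given.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def truncreg_mle(delta, Z):
    """ML fit of delta_i = z_i beta + eps_i, observed only when delta_i >= 1."""
    k = Z.shape[1]

    def negloglik(theta):
        beta, sigma = theta[:k], np.exp(theta[k])  # optimise log(sigma) so sigma > 0
        mu = Z @ beta
        # normal log-density rescaled by the probability of lying above the truncation point
        ll = (norm.logpdf((delta - mu) / sigma) - np.log(sigma)
              - norm.logsf((1.0 - mu) / sigma))
        return -np.sum(ll)

    beta0 = np.linalg.lstsq(Z, delta, rcond=None)[0]  # OLS starting values
    log_s0 = np.log(np.std(delta - Z @ beta0))
    res = minimize(negloglik, np.append(beta0, log_s0), method="BFGS")
    return res.x[:k], float(np.exp(res.x[k]))


def draw_truncated_eps(mu, sigma, rng):
    """Draw eps_i from N(0, sigma^2) left-truncated at 1 - mu_i (inverse-CDF method)."""
    lower = norm.cdf((1.0 - mu) / sigma)
    u = rng.uniform(size=mu.shape)
    return sigma * norm.ppf(lower + u * (1.0 - lower))


def algorithm1(delta_hat, Z, L=200, seed=0):
    """Steps [2] and [3]: truncated regression followed by a parametric bootstrap."""
    rng = np.random.default_rng(seed)
    beta_hat, sigma_hat = truncreg_mle(delta_hat, Z)
    mu = Z @ beta_hat
    boot = np.empty((L, Z.shape[1]))
    for b in range(L):
        delta_star = mu + draw_truncated_eps(mu, sigma_hat, rng)
        boot[b], _ = truncreg_mle(delta_star, Z)
    return beta_hat, sigma_hat, boot


def bias_corrected(delta_hat, delta_star_draws):
    """Step [4] of algorithm 2: twice the estimate minus the bootstrap average."""
    return 2.0 * delta_hat - delta_star_draws.mean(axis=0)


# Synthetic illustration (not the paper's data)
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 2.0, size=200)
Z = np.column_stack([np.ones_like(x), x])
mu_true = Z @ np.array([2.5, -0.4])
delta = mu_true + draw_truncated_eps(mu_true, 0.5, rng)

beta_hat, sigma_hat, boot = algorithm1(delta, Z, L=100)
p_value = np.mean(boot[:, 1] >= 0.0)  # share of nonnegative bootstrap slope estimates
```

The last line mirrors the p-value rule quoted above for a negative coefficient estimate. Percentile confidence intervals could be read off the columns of `boot` in the same spirit.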

Appendix 2 - Potential Years of Life Not Lost

In this appendix we explain the derivation of the output variable Potential Years of Life Not Lost. According to OECD (2005), the variable Potential Years of Life Lost per 100 000 population is given by:

$$PYLL_t = \sum_{a=0}^{l-1} (l-a)\, \frac{d_{at}}{p_{at}}\, \frac{P_a}{P_n} \times 100000, \qquad \text{(A2.1)}$$

where $l$, the age limit, was set to 70 years, $d_{at}$ is the number of deaths at age $a$ at year $t$ and $p_{at}$ is the number of persons aged $a$ at year $t$. $P_a$ and $P_n$ are, respectively, the number of persons aged $a$ and the total number of persons in the reference population, the OECD total population in 1980. We define our relevant variable, Potential Years of Life Not Lost, PYLNL, as follows:

$$PYLNL_t = \sum_{a=0}^{l-1} (l-a)\, \frac{p_{at} - d_{at}}{p_{at}}\, \frac{P_a}{P_n} \times 100000. \qquad \text{(A2.2)}$$

Note that $p_{at} - d_{at}$ equals the number of persons aged $a$ at year $t$ that did not die. Equation (A2.2) is equivalent to:

$$PYLNL_t = \sum_{a=0}^{l-1} (l-a)\, \frac{P_a}{P_n} \times 100000 \; - \; \sum_{a=0}^{l-1} (l-a)\, \frac{d_{at}}{p_{at}}\, \frac{P_a}{P_n} \times 100000, \qquad \text{(A2.3)}$$

where the second term of the difference in the right-hand side is simply PYLL. The first term of the right-hand side of (A2.3) was computed by us via the very same population structure in 1980 used and reported by OECD (2005) when calculating the PYLL. It gives (see equation (4) in the text):

$$PYLNL = 3618010 - PYLL, \qquad \text{(A2.4)}$$

where 3 618 010 is interpretable as the number of potential years of life for a 100 000 population under 70 years.
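The identity behind (A2.3) and (A2.4) can be checked numerically. The sketch below uses made-up age profiles (the deaths, population counts and reference structure are hypothetical, not OECD data) and hypothetical function names. It only illustrates that PYLNL and PYLL always sum to the first term of (A2.3), and then applies the constant reported in (A2.4) to the PYLL value listed for Japan in Table A1.

```python
import math

AGE_LIMIT = 70  # l in equations (A2.1)-(A2.3)


def pyll(deaths, persons, ref_pop, ref_total):
    """Potential years of life lost per 100 000, equation (A2.1)."""
    return sum((AGE_LIMIT - a) * deaths[a] / persons[a] * ref_pop[a] / ref_total * 100000
               for a in range(AGE_LIMIT))


def pylnl(deaths, persons, ref_pop, ref_total):
    """Potential years of life not lost per 100 000, equation (A2.2)."""
    return sum((AGE_LIMIT - a) * (persons[a] - deaths[a]) / persons[a]
               * ref_pop[a] / ref_total * 100000
               for a in range(AGE_LIMIT))


def potential_years(ref_pop, ref_total):
    """First term on the right-hand side of (A2.3)."""
    return sum((AGE_LIMIT - a) * ref_pop[a] / ref_total * 100000 for a in range(AGE_LIMIT))


# Hypothetical age profiles for ages 0..69 (illustration only)
persons = [1000.0] * AGE_LIMIT
deaths = [float(a % 7 + 1) for a in range(AGE_LIMIT)]
ref_pop = [1000.0 + 10.0 * (AGE_LIMIT - a) for a in range(AGE_LIMIT)]
ref_total = sum(ref_pop)  # assumption: the reference total covers ages below the limit

lhs = pylnl(deaths, persons, ref_pop, ref_total)
rhs = potential_years(ref_pop, ref_total) - pyll(deaths, persons, ref_pop, ref_total)

# Equation (A2.4) with the paper's constant and Japan's PYLL from Table A1
pylnl_japan = 3618010 - 2917
```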

References

Adler, N. and Golany, B. (2001). Evaluation of deregulated airline networks using data envelopment analysis combined with principal component analysis with an application to Western Europe, European Journal of Operational Research, 132, 260-273.

Afonso, A.; Schuknecht, L. and Tanzi, V. (2005). Public Sector Efficiency: An International Comparison, Public Choice, 123 (3-4), 321-347.

Afonso, A. and St. Aubyn (2005). Non-parametric Approaches to Education and Health Efficiency in OECD Countries, Journal of Applied Economics, 8 (2), 227-246.

Afonso, A. and St. Aubyn (2006). Cross-country Efficiency of Secondary Education Provision: a Semi-parametric Analysis with Non-discretionary Inputs, Economic Modelling, 23 (3), 476-491.

Charnes, A.; Cooper, W. and Rhodes, E. (1978). Measuring the efficiency of decision making units, European Journal of Operational Research, 2 (6), 429-444.

Coelli, T.; Rao, P.; O'Donnell, C. and Battese, G. (2005). An Introduction to Efficiency and Productivity Analysis. Kluwer, Boston.

EC (2004). Public Finances in EMU - 2004. A report by the Commission services, SEC(2004) 761. Brussels.

Evans, D.; Tandon, A.; Murray, C. and Lauer, J. (2000). The Comparative Efficiency of National Health Systems in Producing Health: an Analysis of 191 Countries, GPE Discussion Paper Series 29, Geneva, World Health Organisation.

Everitt, B. and Dunn, G. (2001). Applied Multivariate Data Analysis, 2nd edition, Arnold, London.

Farrell, M. (1957). The Measurement of Productive Efficiency, Journal of the Royal Statistical Society, Series A, 120, Part 3, 253-290.

Gupta, S. and Verhoeven, M. (2001). The Efficiency of Government Expenditure: Experiences from Africa, Journal of Policy Modelling, 23, 433-467.

Jolliffe, I. (1972). Discarding variables in a principal component analysis 1: Artificial data, Applied Statistics, 21, 160-173.

OECD (2003). Enhancing the Cost Effectiveness of Public Spending, in Economic Outlook, vol. 2003/02, n. 74, December, OECD.

OECD (2005). OECD Health Data 2005, Paris, OECD.

Retzlaff-Roberts, D.; Chang, C. and Rubin, R. (2004). Technical efficiency in the use of health care resources: a comparison of OECD countries, Health Policy, 69, 55-72.

Ruggiero, J. (2004). Performance evaluation when non-discretionary factors correlate with technical efficiency, European Journal of Operational Research, 159, 250-257.

Simar, L. and Wilson, P. (2000). A General Methodology for Bootstrapping in Nonparametric Frontier Models, Journal of Applied Statistics, 27, 779-802.

Simar, L. and Wilson, P. (2007). Estimation and Inference in Two-Stage, Semi-Parametric Models of Production Processes, Journal of Econometrics, 136 (1), 31-64.

Spinks, J. and Hollingsworth, B. (2005). Health production and the socioeconomic determinants of health in OECD countries: the use of efficiency models, Monash University, Center for Health Economics, Working Paper 151.

Thanassoulis, E. (2001). Introduction to the Theory and Application of Data Envelopment Analysis, Kluwer Academic Publishers.

Annex - Data and sources

Table A1. Health indicators

| Country | Life expectancy 1/ | Infant mortality 2/ | Potential years of life lost 3/ | Practising physicians 4/ | Practising nurses 5/ | Acute care beds 6/ | MRI units 7/ |
|---|---|---|---|---|---|---|---|
| Australia | 79.8 | 5.0 | 3502 | 2.5 | 10.4 | 3.7 | 3.7 |
| Austria | 78.4 | 4.5 | 3700 | 3.3 | 9.3 | 6.1 | 12.4 |
| Belgium | 77.9 | 4.4 | .. | 3.9 | 5.6 | 4.0 | 6.6 |
| Canada | 79.5 | 5.3 | 3554 | 2.1 | 9.8 | 3.2 | 3.9 |
| Czech Republic | 75.2 | 4.0 | 4632 | 3.5 | 9.2 | 6.5 | 2.1 |
| Denmark | 77.1 | 4.6 | 4014 | 2.9 | 10.2 | 3.4 | 7.1 |
| Finland | 78.1 | 3.1 | 3907 | 2.6 | 8.8 | 2.4 | 11.6 |
| France | 79.2 | 4.2 | 4098 | 3.3 | 7.1 | 4.0 | 2.6 |
| Germany | 78.2 | 4.2 | 3736 | 3.3 | 9.6 | 6.7 | 5.7 |
| Greece | 78.1 | 5.0 | 3601 | 4.4 | 3.9 | .. | 2.2 |
| Hungary | 72.1 | 7.5 | 7056 | 3.2 | 5.0 | 6.0 | 2.3 |
| Iceland | 80.2 | 2.4 | 3054 | 3.5 | 13.4 | .. | 14.9 |
| Ireland | 77.2 | 5.3 | 4225 | 2.4 | 14.7 | 3.0 | .. |
| Italy | 79.8 | 4.5 | 3287 | 4.3 | 5.4 | 4.0 | 9.6 |
| Japan | 81.5 | 3.0 | 2917 | 2.0 | 7.7 | 9.1 | 32.3 |
| Korea | 76.2 | 6.2 | 4426 | 1.5 | 1.6 | 5.5 | 7.3 |
| Luxembourg | 78.1 | 5.3 | 3939 | 2.6 | 10.3 | 5.8 | 6.2 |
| Mexico | 74.5 | 21.3 | .. | 1.5 | 2.2 | 1.0 | 0.2 |
| Netherlands | 78.3 | 5.1 | 3447 | 3.2 | 13.0 | 3.3 | .. |
| New Zealand | 78.7 | 5.6 | 4149 | 2.2 | 9.4 | .. | 3.4 |
| Norway | 79.1 | 3.6 | 3515 | 3.0 | 10.4 | 3.1 | .. |
| Poland | 74.3 | 7.4 | 5974 | 2.3 | 4.9 | 5.0 | 0.9 |
| Portugal | 77.0 | 4.7 | 4934 | 3.3 | 3.9 | 3.2 | 3.6 |
| Slovak Republic | 73.6 | 7.2 | 5879 | 3.1 | 7.0 | 6.2 | 2.0 |
| Spain | 79.8 | 4.2 | 3597 | 3.1 | 7.0 | 3.2 | 6.0 |
| Sweden | 80.0 | 3.4 | 2937 | 3.2 | 10.0 | 2.4 | 7.9 |
| Switzerland | 80.1 | 4.6 | 3339 | 3.6 | 10.7 | 4.0 | 13.5 |
| Turkey | 68.4 | 36.3 | .. | 1.4 | 1.7 | 2.2 | 3.0 |
| United Kingdom | 78.2 | 5.3 | 3721 | 2.1 | 8.7 | 3.7 | 5.1 |
| United States | 77.0 | 6.9 | 5101 | 2.3 | 7.9 | 2.9 | 8.4 |
| Mean | 77.5 | 6.5 | 4083 | 2.8 | 8.0 | 4.2 | 6.8 |
| Median | 78.2 | 4.9 | 3736 | 3.1 | 8.8 | 3.7 | 5.7 |
| Minimum | 68.4 | 2.4 | 2917 | 1.4 | 1.6 | 1.0 | 0.2 |
| Maximum | 81.5 | 36.3 | 7056 | 4.4 | 14.7 | 9.1 | 32.3 |
| Standard deviation | 2.8 | 6.5 | 981.2 | 0.8 | 3.4 | 1.8 | 6.4 |
| Observations | 30 | 30 | 27 | 30 | 30 | 27 | 27 |

1/ Years of life expectancy, total population at birth. Average for 2000 and 2003. Source: OECD (2005).
2/ Deaths per 1000 live births. Average for 2000-2003. Source: OECD (2005).
3/ All causes, under 70 years, per 100 000. Average for 2000-2003. Source: OECD (2005).
4/ 5/ 6/ Density per 1000 population. Average for 2000-2003. Source: OECD (2005).
7/ Per million population. Average for 2000-2003. Source: OECD (2005).
.. not available.

Table A2. Non-discretionary factors

| Country | Per capita GDP 1/ | Education level 2/ | Obesity 3/ | Tobacco 4/ |
|---|---|---|---|---|
| Australia | 29143 | 19.5 | 21.7 # | 19.8 $ |
| Austria | 29972 | 7.0 | 9.1 # | .. |
| Belgium | 28396 | 12.7 | 11.7 $ | 27.0 |
| Canada | 30463 | 20.8 | 14.3 * | 17.0 |
| Czech Republic | 16448 | 11.4 | 14.8 | 24.1 * |
| Denmark | 31630 | 12.0 | 9.5 # | 28.0 |
| Finland | 27252 | 15.4 | 12.8 * | 22.2 |
| France | 27327 | 12.5 | 9.4 | 27.0 |
| Germany | 27609 | 13.6 | 12.9 * | 24.3 |
| Greece | 19973 | 12.2 | 21.9 | 35.0 # |
| Hungary | 14572 | 14.4 | 18.8 * | 33.8 |
| Iceland | 30657 | 18.9 | 12.4 | 22.4 |
| Ireland | 36775 | 14.3 | 13.0 | 27.0 * |
| Italy | 27050 | 10.0 | 8.5 | 24.2 |
| Japan | 28162 | 20.1 | 3.2 * | 30.3 |
| Korea | 17908 | 18.9 | 3.2 $ | 30.4 $ |
| Luxembourg | 62844 | 10.2 | 18.4 | 33.0 |
| Mexico | 9136 | 13.4 | 24.2 | 26.4 * |
| Netherlands | 29412 | 21.2 | 10.0 | 32.0 |
| New Zealand | 21177 | 14.6 | 20.9 * | 25.0 |
| Norway | 37063 | 27.5 | 8.3 | 26.0 |
| Poland | 11623 | 12.5 | 11.4 & | 27.6 $ |
| Portugal | 18444 | 7.1 | 12.8 | .. |
| Slovak Republic | 13469 | 10.4 | 22.4 | 24.3 * |
| Spain | 22264 | 17.1 | 13.1 * | 28.1 |
| Sweden | 26656 | 16.8 | 9.7 * | 17.5 |
| Switzerland | 30186 | 16.1 | 7.7 | 26.8 * |
| Turkey | 6749 | 8.9 | 12.0 * | 32.1 |
| United Kingdom | 27106 | 18.3 | 23.0 * | 26.0 |
| United States | 37352 | 28.7 | 30.6 | 17.5 |
| Mean | 25894 | 15.2 | 14.1 | 26.2 |
| Median | 27290 | 14.4 | 12.8 | 26.6 |
| Minimum | 6749 | 7.0 | 3.2 | 17.0 |
| Maximum | 62844 | 28.7 | 30.6 | 35.0 |
| Standard deviation | 10681 | 5.2 | 6.4 | 4.8 |
| Observations | 30 | 30 | 30 | 28 |

1/ GDP per capita (USD, PPP). GDP and population in 2003. Source: World Development Indicators Database, September 2003.
2/ Percentage of population at ISCED 5A (programmes at the tertiary level equivalent to university programmes; ISCED-76: level 6) and ISCED 6 (advanced research programmes at the tertiary level, equivalent to PhD programmes; ISCED-76: level 7). Average for 2000-2003. Source: OECD, Education at a Glance 2005, www.oecd.org/edu/eag2005.
3/ 2002 body weight, obese population (BMI > 30 kg/m2). Source: OECD Health Data 2005, Sept. 05. * - 2003; $ - 2001; # - 1999; & - 1996.
4/ Tobacco consumption (% of population), 2003. Source: OECD Health Data 2005, Sept. 05. * - 2002; $ - 2001; # - 2000.
.. not available.

Tables and figures

Table 1 - Public and total expenditure on health

| Country | Total expenditure, % of GDP, 1995 | Total expenditure, % of GDP, 2003 | Public expenditure, % of total expenditure, 1995 | Public expenditure, % of total expenditure, 2003 | Total health expenditure per capita, US$ PPP, 1995 | Total health expenditure per capita, US$ PPP, 2003 |
|---|---|---|---|---|---|---|
| Australia | 8.3 | 9.3 | 66.7 | 67.5 | 1745 | 2699 |
| Austria | 8.5 | 7.5 | 69.7 | 67.6 | 1973 | 2302 |
| Belgium | 8.4 | 9.6 | .. | .. | 1820 | 2827 |
| Canada | 9.2 | 9.9 | 71.4 | 69.9 | 2051 | 3001 |
| Czech Republic | 6.9 | 7.5 | 92.7 | 90.1 | 873 | 1298 |
| Denmark | 8.2 | 9.0 | 82.5 | 83.0 | 1848 | 2763 |
| Finland | 7.5 | 7.4 | 75.6 | 76.5 | 1433 | 2118 |
| France | 9.5 | 10.1 | 76.3 | 76.3 | 2033 | 2903 |
| Germany | 10.6 | 11.1 | 80.5 | 78.2 | 2276 | 2996 |
| Greece | 9.6 | 9.9 | 52.0 | 51.3 | 1253 | 2011 |
| Hungary | 7.5 | 8.4 | 84.0 | 72.4 | 676 | 1269 |
| Iceland | 8.4 | 10.5 | 83.9 | 83.5 | 1858 | 3115 |
| Ireland | 6.8 | 7.4 | 71.6 | 78.0 | 1216 | 2451 |
| Italy | 7.3 | 8.4 | 71.9 | 75.1 | 1535 | 2258 |
| Japan | 6.8 | 7.9 | 83.0 | 81.5 | 1538 | 2139 |
| Korea | 4.2 | 5.6 | 35.3 | 49.4 | 538 | 1074 |
| Luxembourg | 6.4 | 6.9 | 92.4 | 89.9 | 2059 | 3705 |
| Mexico | 5.6 | 6.2 | 42.1 | 46.4 | 382 | 583 |
| Netherlands | 8.4 | 9.8 | 71.0 | 62.4 | 1826 | 2976 |
| New Zealand | 7.2 | 8.1 | 77.2 | 78.7 | 1247 | 1886 |
| Norway | 7.9 | 10.3 | 84.2 | 83.7 | 1897 | 3807 |
| Poland | 5.6 | 6.5 | 72.9 | 69.9 | 417 | 744 |
| Portugal | 8.2 | 9.6 | 62.6 | 69.7 | 1079 | 1797 |
| Slovak Republic | 5.8 | 5.9 | 91.7 | 88.3 | 543 | 777 |
| Spain | 7.6 | 7.7 | 72.2 | 71.2 | 1198 | 1835 |
| Sweden | 8.1 | 9.4 | 86.6 | 85.2 | 1738 | 2703 |
| Switzerland | 9.7 | 11.5 | 53.8 | 58.5 | 2579 | 3781 |
| Turkey | 3.4 | 7.4 | 70.3 | 70.9 | 185 | 513 |
| United Kingdom | 7.0 | 7.7 | 83.9 | 83.4 | 1374 | 2231 |
| United States | 13.3 | 15.0 | 45.3 | 44.4 | 3654 | 5635 |
| Mean | 7.7 | 8.7 | 72.5 | 72.5 | 1494.8 | 2340 |
| Median | 7.8 | 8.4 | 72.9 | 75.1 | 1536.5 | 2280 |
| Standard deviation | 1.9 | 2.0 | 14.9 | 12.7 | 738.7 | 1115 |
| Minimum | 3.4 (TUR) | 5.6 (KOR) | 35.3 (KOR) | 44.4 (US) | 185.0 (TUR) | 513 (TUR) |
| Maximum | 13.3 (US) | 15.0 (US) | 92.7 (CZ) | 90.1 (CZ) | 3654.0 (US) | 5635 (US) |
| EU 15 average | 8.1 | 8.8 | 69.9 | 69.9 | 1644.1 | 2525 |

Sources: OECD Health Data 2005 - Frequently asked data (http://www.oecd.org/document/16/0,2340,en_2825_495642_2085200_1_1_1_1,00.html).
.. not available.

Table 2 - Summary statistics of the input and output data

| Variable | Mean | Standard deviation | Minimum | Maximum |
|---|---|---|---|---|
| Life expectancy (in years) 1/ | 77.5 | 2.8 | 68.4 (TUR) | 81.5 (JAP) |
| Infant mortality rate (deaths per 1000 live births) 2/ | 4.5 | 6.5 | 2.4 (ICE) | 36.3 (TUR) |
| Potential years of life lost (all causes, under 70 years, per 100 000) 2/ | 4083 | 981.2 | 2917 (JAP) | 7056 (HU) |
| Practising physicians, density per 1000 population 2/ | 2.8 | 0.8 | 1.4 (TUR) | 4.4 (GRC) |
| Practising nurses, density per 1000 population 2/ | 8.0 | 3.4 | 1.6 (KOR) | 14.7 (IRE) |
| Acute care beds, density per 1000 population 2/ | 4.2 | 1.8 | 1.0 (MEX) | 9.1 (JAP) |
| MRI units, per million population 2/ | 6.8 | 6.4 | 0.2 (MEX) | 32.3 (JAP) |

Notes: 1/ Average for 2000 and 2003. 2/ Average for 2000-2003. TUR - Turkey; JAP - Japan; ICE - Iceland; HU - Hungary; GRC - Greece; KOR - Korea; IRE - Ireland; MEX - Mexico.

Table 3 - Eigenvalues and cumulative R-squared of PCA on health input and output indicators

| Component | Input indicators (doctors, nurses, beds, and MRI units): Eigenvalue | Input indicators: Cumulative R-squared | Output indicators (life expectancy, infant survival rate, and potential number of years of life not lost): Eigenvalue | Output indicators: Cumulative R-squared |
|---|---|---|---|---|
| 1 | 1.0799 | 0.4275 | 2.5155 | 0.8385 |
| 2 | 1.1208 | 0.7077 | 0.4210 | 0.9789 |
| 3 | 0.7071 | 0.8845 | 0.6342E-01 | 1.0000 |
| 4 | 0.4621 | 1.0000 | | |
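As a small illustration of how the cumulative R-squared column of Table 3 relates to the eigenvalues, the sketch below assumes that it is the running sum of eigenvalues divided by their total, which is the usual convention for a PCA on standardised variables (an assumption here, since the text does not spell out the formula). The function name is hypothetical, and the eigenvalues are those reported for the output indicators.

```python
from itertools import accumulate


def cumulative_share(eigenvalues):
    """Running sum of eigenvalues as a share of their total."""
    total = sum(eigenvalues)
    return [running / total for running in accumulate(eigenvalues)]


# Output-indicator eigenvalues as reported in Table 3
output_eigenvalues = [2.5155, 0.4210, 0.6342e-01]
shares = cumulative_share(output_eigenvalues)
```

Under this assumption the first principal component alone accounts for roughly 84 per cent of the variation in the three output indicators, which is consistent with the use of a single output component in the DEA calculations.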

Table 4 - Principal components used in the DEA calculations

| Country | Output P1 | Input P1 | Input P2 | Input P3 |
|---|---|---|---|---|
| Australia | 4.093 | 3.338 | 4.886 | 1.343 |
| Austria | 3.890 | 4.591 | 4.333 | 2.641 |
| Belgium | | 3.452 | 5.160 | 3.584 |
| Canada | 3.971 | 3.007 | 4.546 | 1.055 |
| Czech Republic | 3.125 | 4.084 | 5.151 | 3.412 |
| Denmark | 3.496 | 3.593 | 4.934 | 1.385 |
| Finland | 4.222 | 3.329 | 4.401 | 1.000 |
| France | 3.972 | 3.178 | 5.177 | 2.962 |
| Germany | 3.921 | 4.340 | 4.792 | 3.120 |
| Greece | 3.735 | | | |
| Hungary | 1.000 | 3.293 | 4.455 | 4.182 |
| Iceland | 5.381 | | | |
| Ireland | 3.280 | | | |
| Italy | 4.302 | 3.756 | 5.224 | 3.739 |
| Japan | 5.296 | 5.778 | 1.000 | 2.265 |
| Korea | 2.921 | 2.369 | 2.303 | 3.501 |
| Luxembourg | 3.602 | 3.992 | 4.382 | 2.055 |
| Mexico | | 1.000 | 3.757 | 2.116 |
| Netherlands | 3.856 | | | |
| New Zealand | 3.526 | | | |
| Norway | 4.380 | | | |
| Poland | 1.829 | 2.645 | 4.016 | 3.324 |
| Portugal | 3.093 | 2.601 | 4.780 | 3.427 |
| Slovak Republic | 1.762 | 3.587 | 4.658 | 3.680 |
| Spain | 4.299 | 3.110 | 4.859 | 2.395 |
| Sweden | 4.871 | 3.520 | 5.345 | 1.280 |
| Switzerland | 4.301 | 4.447 | 5.006 | 1.612 |
| Turkey | | 1.316 | 3.135 | 2.412 |
| United Kingdom | 3.668 | 3.026 | 4.188 | 1.440 |
| United States | 2.707 | 3.006 | 4.148 | 1.334 |

Note: The original principal components data were increased by the most negative value plus one, in order to ensure strictly positive data.
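The shift mentioned in the note to Table 4 can be illustrated as follows. This is a minimal sketch on made-up scores, with a hypothetical function name: it reads "increased by the most negative value plus one" as adding the absolute value of the lowest score plus one, so that the lowest score in each column becomes exactly 1.000, as observed in the table.

```python
def shift_to_positive(scores):
    """Add |most negative value| + 1, i.e. subtract the minimum and add one."""
    lowest = min(scores)
    return [s - lowest + 1.0 for s in scores]


# Hypothetical raw principal component scores (not the paper's data)
raw_scores = [-2.0, 0.5, 1.5]
shifted = shift_to_positive(raw_scores)
```

The transformation is a pure translation, so differences between countries on each component are preserved while all values become strictly positive.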

Table 5 - DEA output efficiency results for health efficiency in OECD countries, 3 inputs (PCA on doctors, nurses, beds and MRI) and 1 output (PCA on life expectancy, infant survival rate, and potential number of years of life not lost)

| Country | VRS TE | Rank | Peers | Rank 2 |
|---|---|---|---|---|
| Australia | 1.101 | 10 | Canada, Sweden, Korea, Finland | 10 |
| Austria | 1.304 | 15 | Sweden, Japan | 15 |
| Canada | 1.000 | 1 | Canada | 6 |
| Czech Republic | 1.592 | 18 | Japan, Sweden | 18 |
| Denmark | 1.368 | 16 | Korea, Japan, Sweden, Finland | 16 |
| Finland | 1.000 | 1 | Finland | 4 |
| France | 1.106 | 11 | Sweden, Spain | 11 |
| Germany | 1.282 | 14 | Sweden, Japan | 14 |
| Hungary | 4.386 | 21 | Sweden, Japan, Korea | 21 |
| Italy | 1.143 | 12 | Sweden, Japan | 12 |
| Japan | 1.000 | 1 | Japan | 2 |
| Korea | 1.000 | 1 | Korea | 3 |
| Luxembourg | 1.372 | 17 | Korea, Japan, Sweden | 17 |
| Poland | 1.876 | 19 | Spain, Korea | 19 |
| Portugal | 1.083 | 9 | Korea, Spain | 9 |
| Slovak Republic | 2.667 | 20 | Korea, Sweden, Japan | 20 |
| Spain | 1.000 | 1 | Spain | 4 |
| Sweden | 1.000 | 1 | Sweden | 1 |
| Switzerland | 1.166 | 13 | Sweden, Japan | 13 |
| United Kingdom | 1.070 | 8 | Canada, Sweden, Korea, Finland | 8 |
| United States | 1.000 | 1 | United States | 7 |
| Average | 1.406 | | | |

Note: VRS TE - variable returns to scale technical efficiency. Rank 2 - ranking taking into account the number of times the efficient countries are peers of inefficient countries.
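Scores of the kind reported in Table 5 come from solving one linear programme per country. The sketch below assumes the textbook output-oriented DEA model with variable returns to scale, since problem (1) of the main text is not reproduced in this section. The function name and the toy data are hypothetical, and `scipy.optimize.linprog` is used only as a convenient solver, not as the authors' software.

```python
import numpy as np
from scipy.optimize import linprog


def dea_output_vrs(X, Y):
    """Output-oriented VRS scores (delta >= 1); X is (n, m) inputs, Y is (n, s) outputs."""
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for i in range(n):
        # Decision variables: [delta, lambda_1, ..., lambda_n]; maximise delta
        c = np.zeros(n + 1)
        c[0] = -1.0
        # Outputs: delta * y_i - sum_j lambda_j y_j <= 0
        A_out = np.hstack([Y[i].reshape(-1, 1), -Y.T])
        # Inputs: sum_j lambda_j x_j <= x_i
        A_in = np.hstack([np.zeros((m, 1)), X.T])
        A_ub = np.vstack([A_out, A_in])
        b_ub = np.concatenate([np.zeros(s), X[i]])
        # Variable returns to scale: the lambdas sum to one
        A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores[i] = res.x[0]
    return scores


# Toy data: three units, one input, one output (not the paper's data)
X = np.array([[1.0], [2.0], [2.0]])
Y = np.array([[1.0], [2.0], [1.0]])
scores = dea_output_vrs(X, Y)
```

In the toy data the third unit uses as much input as the second but produces half the output, so its score is 2: it could double its output with the same resources. Efficient units obtain a score of one, as in the VRS TE column of Table 5.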

Table 6 - Censored normal Tobit results (19 countries)

| | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| Constant | -3.2574 | 9.0162 (0.029) | -1.1185 (0.092) | 9.9146 (0.009) |
| Y | -4.38E-05 | | -4.44E-05 | |
| Log(Y) | | -1.2476 | | -1.1546 |
| E | | | -0.1060 (0.010) | -0.0891 (0.034) |
| O | 0.0895 | 0.0783 (0.001) | 0.0946 | 0.0841 |
| T | 0.1708 | 0.1453 | 0.1463 | 0.122 (0.001) |
| $\hat{\sigma}_\varepsilon$ | 0.5677 | 0.5600 | 0.4759 | 0.5088 |

Notes: Y - GDP per capita; E - Educational level; O - Obesity; T - Tobacco consumption. $\hat{\sigma}_\varepsilon$ - Estimated standard deviation of $\varepsilon$. P-values in brackets.

Table 7 - Bootstrap results (19 countries)

Columns: Model 1, Model 2, Model 3, Model 4. P-values in brackets.

Algorithm 1
Constant: -6.9657 (0.007)   6.6360 (0.005)   -1.8317 (0.009)
Y: -1.0697E-04   -0.6383E-04 (0.028)
Log(Y): -1.4625 (0.001)
E: -0.1800
O: 0.1555   0.1376   0.1080   (0.011) (0.008)
T: 0.29480   0.2596   0.2050   (0.011) (0.008)
$\hat{\sigma}_\varepsilon$: 0.5085   0.4155   0.4279
10.1002   -1.4967   -0.0962 (0.007)   0.1229   0.2076 (0.002)   0.3759

Algorithm 2
Constant: -7.3757 (0.00)   15.5263 (0.00)   -6.4315 (0.043)   20.4362
Y: -0.9365E-04 (0.00)   -0.8663E-04
Log(Y): -2.42259 (0.00)   -2.7953
E: -0.1133 (0.135)   -0.2223 (0.012)
O: 0.1545 (0.00)   0.1441 (0.00)   0.2399   0.1872
T: 0.3071 (0.00)   0.2795 (0.00)   0.2630   0.2978
$\hat{\sigma}_\varepsilon$: 0.5849   0.5338   0.7734   0.7728   (0.00) (0.00)

Notes: Y - GDP per capita; E - Educational level; O - Obesity; T - Tobacco consumption. $\hat{\sigma}_\varepsilon$ - Estimated standard deviation of $\varepsilon$. P-values in brackets.

Table 8 - Corrected output efficiency scores (for Model 2)

| Country | Bias corrected scores (1) | GDP correction (2) | Obesity correction (3) | Tobacco correction (4) | Fully corrected scores (5)=(1)+(2)+(3)+(4) | Rank |
|---|---|---|---|---|---|---|
| Australia | 1.144 | 0.381 | -1.114 | 1.555 | 1.966 | 13 |
| Canada | 1.048 | 0.489 | -0.048 | 2.338 | 3.826 | 18 |
| Czech Republic | 1.641 | -1.004 | -0.120 | 0.353 | 1.000 | 1 |
| Denmark | 1.430 | 0.580 | 0.644 | -0.737 | 1.917 | 12 |
| Finland | 1.068 | 0.219 | 0.168 | 0.884 | 2.339 | 16 |
| France | 1.160 | 0.225 | 0.658 | -0.458 | 1.586 | 10 |
| Germany | 1.324 | 0.250 | 0.154 | 0.297 | 2.026 | 14 |
| Hungary | 4.600 | -1.298 | -0.696 | -2.358 | 1.000 | 1 |
| Italy | 1.180 | 0.201 | 0.788 | 0.325 | 2.494 | 17 |
| Japan | 1.093 | 0.298 | 1.552 | -1.380 | 1.564 | 9 |
| Korea | 1.182 | -0.798 | 1.552 | -1.408 | 1.000 | 1 |
| Luxembourg | 1.443 | 2.243 | -0.639 | -2.135 | 1.000 | 1 |
| Poland | 2.091 | -1.846 | 0.370 | -0.625 | 1.000 | 1 |
| Slovak Republic | 2.775 | -1.489 | -1.215 | 0.297 | 1.000 | 1 |
| Spain | 1.058 | -0.271 | 0.125 | -0.765 | 1.000 | 1 |
| Sweden | 1.063 | 0.165 | 0.615 | 2.198 | 4.041 | 19 |
| Switzerland | 1.215 | 0.466 | 0.903 | -0.402 | 2.183 | 15 |
| United Kingdom | 1.131 | 0.206 | -1.302 | -0.178 | 1.000 | 1 |
| United States | 1.105 | 0.982 | -2.397 | 2.198 | 1.888 | 11 |
| Average | 1.513 | 0.000 | 0.000 | 0.000 | 1.781 | |

Note: the fully corrected scores do not always add up to the indicated sum, since for the cases where the result was below one we truncated it to unity.

Table 9 - Corrected output efficiency scores (for Model 4)

| Country | Bias corrected scores (1) | GDP correction (2) | Education correction (3) | Obesity correction (4) | Tobacco correction (5) | Fully corrected scores (6)=(1)+(2)+(3)+(4)+(5) | Rank |
|---|---|---|---|---|---|---|---|
| Australia | 1.141 | 0.440 | 0.840 | -1.447 | 1.657 | 2.630 | 15 |
| Canada | 1.489 | 0.564 | 1.129 | -0.062 | 2.491 | 5.611 | 19 |
| Czech Republic | 1.637 | -1.159 | -0.960 | -0.156 | 0.376 | 1.000 | 1 |
| Denmark | 1.416 | 0.669 | -0.827 | 0.836 | -0.785 | 1.309 | 9 |
| Finland | 1.066 | 0.252 | -0.071 | 0.219 | 0.942 | 2.407 | 13 |
| France | 1.158 | 0.260 | -0.716 | 0.855 | -0.487 | 1.069 | 8 |
| Germany | 1.318 | 0.289 | -0.471 | 0.200 | 0.317 | 1.652 | 12 |
| Hungary | 4.564 | -1.497 | -0.294 | -0.904 | -2.513 | 1.000 | 1 |
| Italy | 1.175 | 0.232 | -1.272 | 1.023 | 0.346 | 1.505 | 11 |
| Japan | 1.063 | 0.344 | 0.973 | 2.015 | -1.470 | 2.926 | 16 |
| Korea | 1.129 | -0.921 | 0.707 | 2.015 | -1.500 | 1.430 | 10 |
| Luxembourg | 1.427 | 2.588 | -1.227 | -0.829 | -2.274 | 1.000 | 1 |
| Poland | 2.049 | -2.130 | -0.716 | 0.481 | -0.666 | 1.000 | 1 |
| Slovak Republic | 2.757 | -1.718 | -1.183 | -1.578 | 0.317 | 1.000 | 1 |
| Spain | 1.057 | -0.313 | 0.306 | 0.163 | -0.815 | 1.000 | 1 |
| Sweden | 1.043 | 0.191 | 0.240 | 0.799 | 2.342 | 4.614 | 18 |
| Switzerland | 1.205 | 0.538 | 0.084 | 1.173 | -0.428 | 2.572 | 14 |
| United Kingdom | 1.188 | 0.237 | 0.573 | -1.690 | -0.190 | 1.000 | 1 |
| United States | 1.055 | 1.134 | 2.885 | -3.113 | 2.342 | 4.302 | 17 |
| Average | 1.523 | 0.000 | 0.000 | 0.000 | 0.000 | 2.054 | |

Note: the fully corrected scores do not always add up to the indicated sum, since for the cases where the result was below one we truncated it to unity.