Quantitative Methods in Regulation


Quantitative Methods in Regulation (DEA)

Data envelopment analysis is one of the methods commonly used in assessing efficiency for regulatory purposes, as an alternative to regression. The theoretical development of DEA is usually attributed to an economist, M.J. Farrell (1957), but the technique became operational much later, following work by the OR specialists Charnes, Cooper and Rhodes (1978) (CCR). Consequently, the DEA technique is more associated with the operations research and management science literature, although applications in the economics literature are becoming fairly common. Two orientations are possible, corresponding to the cost and output approaches respectively.

Figure 1 shows the standard regression-based approach: an OLS line fitted to cost (C) against size (X), C-hat = b0 + b1 X, together with a corrected OLS (COLS) line obtained by shifting the fitted line through the most efficient observation.

[Figure 1: Least-squares regression. OLS line C-hat = b0 + b1 X and corrected OLS line through the most efficient observation; cost (C) against size (X).]

Figure 2 is a version of the original Farrell diagram for the cost (strictly, input) orientation. There are five companies (A to E), each producing a unit of a single output (y) using two inputs (x1 and x2). Companies C, D and E are technically efficient. For example, C uses more of x1 and less of x2 compared with D, while company B is inefficient compared with D since it uses more of both x1 and x2. The efficient counterpart of B, i.e. D, is called B's reference group.

The term data envelopment analysis arises because DEA can be thought of as fitting a frontier which envelops the data. In Figure 2 the frontier is defined by CDE. Points C, D and E on the frontier represent real companies, while points on the line segments linking the real companies represent hypothetical ones. The technical efficiency of a company such as A is measured by comparing it with its corresponding hypothetical benchmark A'. The distance AA' is a measure of the efficiency of company A. The general version of DEA allows for many inputs in the calculation of technical efficiency.
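The COLS adjustment shown in Figure 1 can be sketched numerically. The cost and size figures below are invented purely for illustration:

```python
import numpy as np

# Hypothetical cost (C) and size (X) data for five companies
size = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
cost = np.array([25.0, 38.0, 60.0, 70.0, 95.0])

# Ordinary least squares: C-hat = b0 + b1 * X
b1, b0 = np.polyfit(size, cost, 1)
residuals = cost - (b0 + b1 * size)

# Corrected OLS: shift the intercept down so the line passes through
# the most efficient observation (the largest negative residual)
b0_cols = b0 + residuals.min()

# Cost efficiency of each company: frontier cost / actual cost (at most 1)
frontier = b0_cols + b1 * size
efficiency = frontier / cost
```

The most efficient observation scores exactly one and every other company scores below one, mirroring the corrected OLS line in the figure.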
City University

In the management science literature DEA is typically represented as a generalised ratio:

efficiency ratio = (weighted sum of outputs) / (weighted sum of inputs) = Σ_i q_i y_i / Σ_j p_j x_j    (3)

where the y's and x's are outputs and inputs respectively, while the q's and p's are firm-specific weights to be calculated by the DEA technique. (These are counterparts of the regression coefficients.) Each company is given a single efficiency measure from zero to one. The primal LP problem as defined by CCR is essentially to choose these q's and p's to maximise the efficiency score, subject to the constraint that, with these weights, no company gets a score higher than 1.00. The closer the score is to one, the more efficient the firm. DEA allocates specific weights (q's and p's) for each company on the basis of giving it the highest possible score. This is sometimes expressed as putting a company in the best possible light.

Here we consider the special case where inputs are aggregated by their prices, with the aim of deriving overall cost efficiency for (potentially) multiple outputs. Furthermore, the following description is the dual of the approach described by Charnes and Cooper and much of the management science literature, since it corresponds more closely to the economic interpretation of the frontier. As with COLS, DEA assumes that there is at least one efficient observation. As in the Farrell diagram, the cost frontier is a convex hull formed by joining adjacent efficient points together by hyperplanes. (Where there is only one x, one y and no z's, these can be represented by straight lines, as in the diagram above.) A separate analysis is carried out for each observation. In the special case here, where the inputs are aggregated into a single cost measure, we can express the production correspondence as an explicit function c(y, z):

c = c(y_0, z_0)

A technically efficient company in an environment z_0 would produce the given output y_0 at minimum cost (c_min), while an inefficient company would, under the same conditions, incur a cost greater than the minimum (c > c_min).
A measure of the company's technical efficiency could therefore be the ratio (c_min/c). The closer this ratio is to one, the higher the company's efficiency. In the dual of the CCR approach, for each observation the algorithm searches for an efficient set E_0 which minimises the efficiency score K. We use the efficient set (or reference group) to construct an artificial observation which is a linear combination of the efficient set. Thus the cost of the artificial observation, c_E, is formed from

c_E = Σ_{i ∈ E} λ_i c_i

The efficient set must produce at least as much of every output as observation 0:

y_Ej = Σ_{i ∈ E} λ_i y_ij ≥ y_0j    for each output j

Where there are additional noncontrollable factors z, these must also meet the weak inequality:

z_Ej = Σ_{i ∈ E} λ_i z_ij ≤ z_0j    for each noncontrollable j

The efficiency score is then K = c_E/c_0. (There are no noncontrollables in LAB7DEA Model1.) An additional constraint is that the weight on any observation, λ_i, is non-negative. In the original, constant returns to scale formulation the λ_i's are otherwise unconstrained. In the variable returns to scale approach, due to Banker, Charnes and Cooper (1984), the sum of the λ_i's is constrained to 1, which ensures that the artificial observation is not just a linear but a convex combination of the efficient set. The shadow prices on each constraint represent the weights (p's and q's) of the normal, primal analysis. Roughly, these correspond to the regression coefficients, except that each observation has its own weights, depending on which facet of the convex hull the reference group defines.

DEA also allows the inclusion not only of inputs and outputs but also of other variables describing a company's operating environment (often called non-controllable or environmental variables), thus enabling like-for-like comparisons. In DEA one has to decide on the relative importance of competing explanatory factors prior to the analysis. The inputs and the outputs are entered into the DEA optimisation algorithm, but there is no built-in test of their appropriateness. With DEA one also has to decide on the sign of these explanatory factors before running the DEA programme, whereas with RA the signs of the explanatory variables are calculated by the OLS algorithm. Without any means of determining the appropriate specification, DEA should not be used as the primary approach to comparative efficiency, especially when RA is possible.

DEA operates in one of two modes: input shrinkage (or minimisation) and output expansion (or maximisation).
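The dual (envelopment) problem set out above can be sketched with an off-the-shelf LP solver. This is a minimal sketch on made-up data (firm costs and outputs are illustrative assumptions, not taken from the text), with an optional flag for the Banker, Charnes and Cooper convexity constraint:

```python
import numpy as np
from scipy.optimize import linprog

def cost_efficiency(costs, outputs, k, vrs=False):
    """Efficiency score K for observation k: minimise K subject to
         sum_i lam_i * costs[i]     <= K * costs[k]
         sum_i lam_i * outputs[i,j] >= outputs[k,j]  for each output j
         lam_i >= 0, and (VRS only) sum_i lam_i == 1.
    """
    n, m = outputs.shape
    obj = np.zeros(n + 1)
    obj[0] = 1.0                                   # variables: [K, lam_1..lam_n]
    A_ub = [np.concatenate(([-costs[k]], costs))]  # cost of artificial obs <= K*c_0
    b_ub = [0.0]
    for j in range(m):                             # outputs of artificial obs >= y_0j
        A_ub.append(np.concatenate(([0.0], -outputs[:, j])))
        b_ub.append(-outputs[k, j])
    A_eq = b_eq = None
    if vrs:                                        # convexity: sum of lam_i equals 1
        A_eq = np.array([np.concatenate(([0.0], np.ones(n)))])
        b_eq = np.array([1.0])
    res = linprog(obj, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.fun

# Four hypothetical firms: one cost figure each, two outputs
costs = np.array([100.0, 120.0, 80.0, 150.0])
outputs = np.array([[10.0, 5.0], [12.0, 6.0], [10.0, 5.0], [11.0, 4.0]])
scores = [cost_efficiency(costs, outputs, k) for k in range(len(costs))]
```

Firm 2 produces the same outputs as firm 0 at lower cost, so it defines the frontier (score 1.0) while firm 0 scores 0.8. The shadow prices on the constraints are the primal weights discussed above.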
Figure 2 shows how, in the input minimisation mode, DEA can use data on several inputs to produce a performance score that is independent of any imposed weighting system for the inputs. The unit under consideration, D, is producing the same

amount (or less) of every output as units A, B, and C.

[Figure 2: Input efficiency - a comparison of units producing the same output. A, B and C are all technically efficient; B and C are the 'reference group' for unit D; input efficiency for D = OE/OD. Axes: input 1 (horizontal) and input 2 (vertical).]

The fundamental assumption of DEA is that, if B and C are feasible, then a linear combination such as E is also feasible. E represents a radial contraction of D, using proportionately less of every input. The ratio OE/OD is the Farrell measure of technical efficiency. B and C are said to be the reference group for unit D.

Figure 3 shows the equivalent measure in the output expansion mode. Units I, J and K are using the same (or less) of every input as unit G. I, J and K are regarded as technically efficient relative to the other points in the data set.
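The radial contraction OE/OD in Figure 2 can be reproduced with a small LP. The three units below are invented for illustration, each producing one unit of output from two inputs:

```python
import numpy as np
from scipy.optimize import linprog

def input_efficiency(X, Y, k):
    """Farrell input-oriented CRS score (theta) for unit k:
       minimise theta s.t. sum_i lam_i*X[i,j] <= theta*X[k,j] for each input j,
                           sum_i lam_i*Y[i,j] >= Y[k,j]       for each output j,
                           lam_i >= 0.
    """
    n, m_in = X.shape
    m_out = Y.shape[1]
    obj = np.zeros(n + 1)
    obj[0] = 1.0                                  # variables: [theta, lam_1..lam_n]
    A_ub, b_ub = [], []
    for j in range(m_in):                         # radial shrinkage of unit k's inputs
        A_ub.append(np.concatenate(([-X[k, j]], X[:, j])))
        b_ub.append(0.0)
    for j in range(m_out):                        # reference point produces at least as much
        A_ub.append(np.concatenate(([0.0], -Y[:, j])))
        b_ub.append(-Y[k, j])
    res = linprog(obj, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.fun

# Units B, C (efficient) and D (inefficient), one unit of output each
X = np.array([[2.0, 4.0],    # B
              [4.0, 2.0],    # C
              [4.0, 4.0]])   # D
Y = np.ones((3, 1))
theta_D = input_efficiency(X, Y, 2)   # radial contraction of D towards the origin
```

Here the best reference point E = 0.5*B + 0.5*C uses inputs (3, 3), so theta_D = OE/OD = 0.75. The output-expansion mode of Figure 3 is the same idea with the roles of inputs and outputs swapped and an expansion factor maximised instead.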

[Figure 3: Output efficiency - a comparison of units using the same inputs. I, J and K are technically efficient; I and J are the reference group for G; efficiency score of G = OG/OH. Axes: output 1 (horizontal) and output 2 (vertical).]

Constant and Variable Returns in DEA

The difference between the constant and variable returns to scale cases is illustrated in Figure 4 using a single-input, single-output example. The figure has a single output on the vertical axis and a single input on the horizontal axis. Points A, B, C, D, E and F represent actual companies with varying output-input ratios. Company C has the maximum output-input ratio. Under CRS (constant returns) the fitted DEA frontier is the ray OC; under VRS (variable returns) it is the envelope line ABCDE.

[Figure 4: Constant and varying returns to scale in DEA - output against input; companies A to F, with the CRS frontier the ray OC and the VRS frontier the envelope ABCDE.]

The companies below and to the left of company C are subject to increasing returns, while those above and to the right are subject to decreasing returns. Therefore, if a CRS frontier like OC is fitted, the companies on ABCDE which are not on the ray OC will be

classified as inefficient, partly due to scale inefficiency, so more companies are likely to be labelled efficient under VRS than under CRS.

3. RA versus DEA

RA and DEA are widely regarded as equivalent alternative techniques for estimating or fitting an efficiency frontier. Both are the result of a minimisation process: RA uses the least-squares algorithm to fit an average line, while DEA uses linear programming to fit a convex hull. However, the two techniques have fundamental differences:

1) RA calculates a fixed number of parameters, determined by the number of regressors k. The number of parameters calculated by DEA depends on the data set and on the number of factors used in the reference sets; the upper limit is N x k, where N is the number of observations and k the number of variables. By way of comparison, a third approach (known as parametric programming) combines the LP approach to minimisation with a fixed number of parameters, k.

2) RA makes assumptions about the stochastic properties of the observed data. Under RA the observed data points are assumed to be realisations of random variables following certain distributions, usually normal, which enables hypothesis testing. This is an important advantage of RA over DEA as it is normally practised, since it makes it possible to check the statistical significance of competing explanatory variables as well as the appropriateness of the estimated functional form. Since DEA has been developed mainly in a non-statistical framework, hypothesis testing is more problematic with DEA. Without hypothesis testing, model selection is problematic. Consider, for example, the case of fitting a cost frontier. Economic theory predicts that the quantity of output and factor prices should be among the exogenous determinants of costs. While this is true, there are also other important factors influencing costs in the real world. RA provides an empirical test, or a decision rule, for identifying the important ones.

The implied production possibility frontier of the two approaches is clearly different. In particular, whilst there is only one efficient observation in the econometric approach, DEA will tend to find many observations on the frontier. To carry out either of these methods, an essential first step is to find out which factors affect the raw performance reflected in the indicators. The econometric approach, with its statistical tests, is most useful in this process, and we used the results of our econometric analysis to inform our specification of the DEA model.

In the econometric approach the benchmark cost level is derived from a statistical cost function which provides the best fit to the data. Implementing this approach requires an assumption about the shape of the underlying cost function. DEA, on the other hand, does not require an assumption about the shape of the cost function and in some ways provides a more convenient framework where there are many outputs and inputs.

Finally, all attempts at calculating relative efficiency may be frustrated by difficulties in obtaining sufficient data of good enough quality. Remember GIGO.
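For completeness, the primal (multiplier) form described earlier can be sketched the same way: each unit chooses its own weights p and q to put itself in the best possible light, subject to no unit's ratio exceeding one. The two-unit data set is an invented illustration:

```python
import numpy as np
from scipy.optimize import linprog

def multiplier_score(X, Y, k):
    """Primal CCR score for unit k: choose input weights p and output
    weights q to maximise q.Y[k], with the ratio linearised by fixing
    p.X[k] = 1 and requiring q.Y[i] - p.X[i] <= 0 for every unit i.
    """
    n, m_in = X.shape
    m_out = Y.shape[1]
    obj = np.concatenate((np.zeros(m_in), -Y[k]))   # maximise q.Y[k]
    A_ub = np.hstack((-X, Y))                       # q.Y[i] - p.X[i] <= 0 for all i
    b_ub = np.zeros(n)
    A_eq = np.concatenate((X[k], np.zeros(m_out)))[None, :]   # p.X[k] == 1
    b_eq = np.array([1.0])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (m_in + m_out), method="highs")
    return -res.fun

# Two hypothetical units, one input and one output each;
# unit 1 is twice as productive as unit 0
X = np.array([[2.0], [1.0]])
Y = np.array([[1.0], [1.0]])
scores = [multiplier_score(X, Y, k) for k in range(2)]
```

By LP duality these scores coincide with the envelopment (dual) scores described in the text; each unit receives its own optimal weights, which is exactly the contrast with the fixed coefficients of RA noted in point 1 above.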
The implied production possibility frontier for the two approaches is clearly different. In particular, whilst there is only one efficient observation in the econometric approach, DEA will tend to find many observations on the frontier. In order to carry out either of these methods an essential first step is to find out which factors affect the raw performance as reflected in the indicators. The econometric approach, with its statistical tests, is most useful in this process, and we used the results of our econometric analysis to inform our specification of the DEA model. In the econometric approach the benchmark cost level is derived from a statistical cost function which provides the best fit to the data. Implementing this approach requires an assumption to be made about the shape of the underlying cost function. On the other hand, (DEA) does not require an assumption about the shape of the cost function, and, in some ways provides a more convenient framework where there are many outputs and inputs. Finally, all attempts at calculating relative efficiency may be frustrated by difficulties in obtaining sufficient data of good enough quality. Remember GIGO. City University 6