Data Envelopment Analysis

Data Envelopment Analysis (DEA) is an increasingly popular management tool. DEA is commonly used to evaluate the efficiency of a number of producers. A typical statistical approach is characterized as a central tendency approach: it evaluates producers relative to an average producer. In contrast, DEA compares each producer with only the "best" producers. By the way, in the DEA literature, a producer is usually referred to as a decision making unit, or DMU. DEA is not always the right tool for a problem, but it is appropriate in certain cases.

In DEA, there are a number of producers. The production process for each producer is to take a set of inputs and produce a set of outputs. Each producer has a varying level of inputs and gives a varying level of outputs. For instance, consider a set of nursing homes. Each nursing home has a certain number of registered nurses, other health care workers, a certain square footage of space, and a certain number of managers (the inputs). There are a number of measures of the output of nursing homes, for example, the numbers of different types of patients (the outputs). DEA attempts to determine which of the facilities are most efficient, and to point out specific inefficiencies of the others.

A fundamental assumption behind this method is that if a given producer, A, is capable of producing Y(A) units of output with X(A) inputs, then other producers should also be able to do the same if they were to operate efficiently. Similarly, if producer B is capable of producing Y(B) units of output with X(B) inputs, then other producers should also be capable of the same production schedule. Producers A, B, and others can then be combined to form a composite producer with composite inputs and composite outputs. Since this composite producer does not necessarily exist, it is typically called a virtual producer. The heart of the analysis lies in finding the "best" virtual producer for each real producer.
If the virtual producer is better than the original producer, either by making more output with the same input or by making the same output with less input, then the original producer is inefficient. The subtleties of DEA are introduced in the various ways that producers A and B can be scaled up or down and combined.

Numerical Example

To illustrate how DEA works, let's take an example of three banks. Each bank has exactly 10 tellers (the only input), and we measure a bank based on two outputs: checks cashed and loan applications. The data for these banks are as follows:

Bank A: 10 tellers, 1000 checks, 20 loan applications
Bank B: 10 tellers, 400 checks, 50 loan applications
Bank C: 10 tellers, 200 checks, 150 loan applications

Now, the key to DEA is to determine whether we can create a virtual bank that is better than one or more of the real banks. Any such dominated bank will be an inefficient bank.
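The dominance test described above can be sketched in a few lines of Python. This is only an illustration, not part of the DEA method proper: it takes the bank data as 10 tellers each with (checks, loans) of (1000, 20), (400, 50) and (200, 150), restricts itself to convex combinations of the other two banks on a coarse grid, and uses the hypothetical helper names `dominates` and `find_dominating_mix`.

```python
# Bank data assumed for this sketch: (tellers, checks cashed, loan applications).
BANKS = {
    "A": (10, 1000, 20),
    "B": (10, 400, 50),
    "C": (10, 200, 150),
}

def dominates(virtual, real):
    """True if `virtual` uses no more input and makes at least as much of
    every output as `real`, without being identical to it."""
    no_worse = virtual[0] <= real[0] and all(
        v >= r for v, r in zip(virtual[1:], real[1:])
    )
    return no_worse and virtual != real

def find_dominating_mix(target, steps=20):
    """Search convex combinations (weights sum to 1) of the other two banks
    on a coarse grid; return the first mix that dominates `target`, else None."""
    others = [name for name in BANKS if name != target]
    for k in range(steps + 1):
        w = k / steps
        virtual = tuple(
            w * a + (1 - w) * b
            for a, b in zip(BANKS[others[0]], BANKS[others[1]])
        )
        if dominates(virtual, BANKS[target]):
            return {others[0]: w, others[1]: 1 - w}
    return None

for name in BANKS:
    print(name, find_dominating_mix(name))
```

The search reports no dominating mix for banks A and C and finds one for bank B. It stops at the first grid point that works, so the mix it returns is just one of many virtual banks that dominate B.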
Consider trying to create a virtual bank that is better than Bank A. Such a bank would use no more inputs than A (10 tellers) and produce at least as much output (1000 checks and 20 loans). Clearly, no combination of banks B and C can possibly do that. Bank A is therefore deemed to be efficient. Bank C is in the same situation.

However, consider bank B. If we take half of Bank A and combine it with half of Bank C, then we create a bank that processes 600 checks and 85 loan applications with just 10 tellers. This dominates B (we would much rather have the virtual bank we created than bank B). Bank B is therefore inefficient.

Another way to see this is that we can scale down the inputs to B (the tellers) and still have at least as much output. If we assume (and we do) that inputs are linearly scalable, then we estimate that we can get by with 6.3 tellers. We do that by taking roughly 0.34 times bank A plus 0.29 times bank C. The result uses 6.3 tellers and produces at least as much as bank B does. We say that bank B's efficiency rating is 0.63. Banks A and C have an efficiency rating of 1.

Graphical Example

The single-input, two-output and two-input, one-output problems are easy to analyze graphically. The previous numerical example is now solved graphically. (An assumption of constant returns to scale is made and is explained in detail later.) The analysis of the efficiency for bank B looks like the following:

If it is assumed that convex combinations of banks are allowed, then the line segment connecting banks A and C shows the possibilities of virtual outputs that can be formed from these two banks.
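Both the scaled-down combination in the numerical example and the points on segment AC are weighted sums of banks A and C. As a rough check of the arithmetic, the sketch below solves for the two weights that exactly reproduce bank B's outputs. It assumes each bank has 10 tellers and outputs (checks, loans) of (1000, 20), (400, 50) and (200, 150), and that both output requirements hold with equality. The variable names are illustrative only.

```python
from fractions import Fraction

# Outputs assumed for this sketch: (checks cashed, loan applications).
A = (1000, 20)
B = (400, 50)
C = (200, 150)
TELLERS = 10  # every bank uses the same single input

# Solve  wa*A + wc*C = B  for the two weights by Cramer's rule.
det = A[0] * C[1] - C[0] * A[1]
wa = Fraction(B[0] * C[1] - C[0] * B[1], det)
wc = Fraction(A[0] * B[1] - B[0] * A[1], det)

# Each bank uses 10 tellers, so the combination uses 10*(wa + wc) tellers,
# and the sum of the weights is the fraction of B's input that is needed.
efficiency = wa + wc

print(float(wa), float(wc))          # weights on A and C
print(float(efficiency * TELLERS))   # tellers used by the combination
print(float(wa / efficiency), float(wc / efficiency))  # same mix, weights summing to 1
```

Rounded to two decimals the weights are 0.34 and 0.29, and their sum is 0.63, which is the efficiency rating quoted for bank B. Dividing the two weights by their sum gives roughly 54% A and 46% C, a point on segment AC.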
Similar segments can be drawn between A and B and between B and C. Since the segment AC lies beyond the segments AB and BC, a convex combination of A and C will create the most outputs for a given set of inputs. This line is called the efficiency frontier. The efficiency frontier defines the maximum combinations of outputs that can be produced for a given set of inputs.

Since bank B lies below the efficiency frontier, it is inefficient. Its efficiency can be determined by comparing it to a virtual bank formed from bank A and bank C. The virtual bank, called V, is approximately 54% of bank A and 46% of bank C. The efficiency of bank B is then calculated by finding the fraction of inputs that bank V would need to produce as many outputs as bank B. This is easily calculated by looking at the line from the origin, O, to V. The efficiency of bank B is OB/OV, which is approximately 63%. The figure also shows that banks A and C are efficient since they lie on the efficiency frontier. In other words, any virtual bank formed for analyzing banks A and C will lie on banks A and C respectively. Therefore, since the efficiency is calculated as the ratio OA/OV or OC/OV, banks A and C will have efficiency scores equal to 1.

The graphical method is useful in this simple two-dimensional example but gets much harder in higher dimensions. The normal method of evaluating the efficiency of bank B is by using a linear programming formulation of DEA.

Since this problem uses a constant input value of 10 for all of the banks, it avoids the complications caused by allowing different returns to scale. Returns to scale refers to increasing or decreasing efficiency based on size. For example, a manufacturer can achieve certain economies of scale by producing a thousand circuit boards at a time rather than one at a time: it might be only 100 times as hard as producing one at a time. This is an example of increasing returns to scale (IRS).
On the other hand, the manufacturer might find it more than a trillion times as difficult to produce a trillion circuit boards at a time, because of storage problems and limits on the worldwide copper supply. This range of production illustrates decreasing returns to scale (DRS). Combining the two extreme ranges would necessitate variable returns to scale (VRS).

Constant returns to scale (CRS) means that the producers are able to linearly scale the inputs and outputs without increasing or decreasing efficiency. This is a significant assumption. The assumption of CRS may be valid over limited ranges, but its use must be justified. As an aside, CRS tends to lower the efficiency scores while VRS tends to raise efficiency scores.

Using Linear Programming

Linear programming (LP) is a mathematical method for determining a way to achieve the best outcome (such as maximizing profit or minimizing cost) in a given mathematical model, subject to a set of requirements represented as linear relationships.
Data Envelopment Analysis is a linear programming procedure for a frontier analysis of inputs and outputs. DEA assigns a score of 1 to a unit only when comparisons with other relevant units do not provide evidence of inefficiency in the use of any input or output. DEA assigns an efficiency score less than one to (relatively) inefficient units. A score less than one means that a linear combination of other units from the sample could produce the same vector of outputs using a smaller vector of inputs. The score reflects the radial distance from the estimated production frontier to the DMU under consideration.

There are a number of equivalent formulations for DEA. The most direct formulation of the exposition I gave above is as follows:

Let $X_i$ be the vector of inputs into DMU $i$. Let $Y_i$ be the corresponding vector of outputs. Let $X_0$ be the inputs into a DMU for which we want to determine its efficiency and $Y_0$ be the outputs. So the X's and the Y's are the data. The measure of efficiency for DMU 0 is given by the following linear program:

\[
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & \sum_i \lambda_i X_i \le \theta X_0 \\
& \sum_i \lambda_i Y_i \ge Y_0 \\
& \lambda_i \ge 0
\end{aligned}
\]

where $\lambda_i$ is the weight given to DMU $i$ in its efforts to dominate DMU 0 and $\theta$ is the efficiency of DMU 0. So the $\lambda$'s and $\theta$ are the variables. Since DMU 0 appears on the left hand side of the equations as well, the optimal $\theta$ cannot possibly be more than 1.

When we solve this linear program, we get a number of things:

1. The efficiency of DMU 0 ($\theta$), with 1 meaning that the unit is efficient.
2. The unit's comparables (those DMUs with nonzero $\lambda_i$).
3. The goal inputs (the difference between $X_0$ and $\sum_i \lambda_i X_i$).
4. Alternatively, we can keep inputs fixed and get goal outputs ($\frac{1}{\theta} \sum_i \lambda_i Y_i$).

DEA assumes that the inputs and outputs have been correctly identified. Usually, as the number of inputs and outputs increases, more DMUs tend to get an efficiency rating of 1 as they become
too specialized to be evaluated with respect to other units. On the other hand, if there are too few inputs and outputs, more DMUs tend to be comparable. In any study, it is important to focus on correctly specifying inputs and outputs.

Example: 3 DMUs, 2 inputs and 3 outputs

DMU   Inputs      Outputs
1     5   14      9   4   16
2     8   15      5   7   10
3     7   12      4   9   13

The linear programs for evaluating the 3 DMUs are given by:

LP for evaluating DMU 1:
min theta
st
    5L1 + 8L2 + 7L3 - 5theta <= 0
    14L1 + 15L2 + 12L3 - 14theta <= 0
    9L1 + 5L2 + 4L3 >= 9
    4L1 + 7L2 + 9L3 >= 4
    16L1 + 10L2 + 13L3 >= 16
    L1, L2, L3 >= 0

LP for evaluating DMU 2:
min theta
st
    5L1 + 8L2 + 7L3 - 8theta <= 0
    14L1 + 15L2 + 12L3 - 15theta <= 0
    9L1 + 5L2 + 4L3 >= 5
    4L1 + 7L2 + 9L3 >= 7
    16L1 + 10L2 + 13L3 >= 10
    L1, L2, L3 >= 0

LP for evaluating DMU 3:
min theta
st
    5L1 + 8L2 + 7L3 - 7theta <= 0
    14L1 + 15L2 + 12L3 - 12theta <= 0
    9L1 + 5L2 + 4L3 >= 4
    4L1 + 7L2 + 9L3 >= 9
    16L1 + 10L2 + 13L3 >= 13
    L1, L2, L3 >= 0

SAS program and output

The LP Procedure for DMU 1

Variable Summary

Col  Variable Name  Status  Type      Price  Activity   Reduced Cost
1    x1             BASIC   NON-NEG   0      1          0
2    x2                     NON-NEG   0      0          1.444444
3    x3                     NON-NEG   0      0          0.9555556
4    theta          BASIC   NON-NEG   1      1          0
5    const1                 SLACK     0      0          0.2
6    const2         DEGEN   SLACK     0      0          0
7    const3                 SURPLUS   0      0          0.1111111
8    const4         DEGEN   SURPLUS   0      0          0
9    const5         DEGEN   SURPLUS   0      0          0
10   const6         BASIC   SURPLUS   0      1          0
11   const7         DEGEN   SURPLUS   0      0          0
12   const8         DEGEN   SURPLUS   0      0          0

The LP Procedure for DMU 2

Variable Summary

Col  Variable Name  Status  Type      Price  Activity   Reduced Cost
1    x1             BASIC   NON-NEG   0      0.249981   0
2    x2             BASIC   NON-NEG   0      0.444689   0
3    x3             BASIC   NON-NEG   0      0.632125   0
4    theta          BASIC   NON-NEG   1      0.753767   0
5    const1                 SLACK     0      0          0.788313
6    const2                 SLACK     0      0          0.246233
7    const3                 SURPLUS   0      0          0.51654
8    const4                 SURPLUS   0      0          0.718486
9    const5         BASIC   SURPLUS   0      2.667865   0
10   const6         BASIC   SURPLUS   0      0.249981   0
11   const7         BASIC   SURPLUS   0      0.444689   0
12   const8         BASIC   SURPLUS   0      0.632125   0

The LP Procedure for DMU 3

Variable Summary

Col  Variable Name  Status  Type      Price  Activity   Reduced Cost
1    x1             DEGEN   NON-NEG   0      0          0
2    x2             DEGEN   NON-NEG   0      0          0
3    x3             BASIC   NON-NEG   0      1          0
4    theta          BASIC   NON-NEG   1      1          0
5    const1                 SLACK     0      0          0.699677
6    const2                 SLACK     0      0          0.425188
7    const3         DEGEN   SURPLUS   0      0          0
8    const4                 SURPLUS   0      0          0.4366
9    const5                 SURPLUS   0      0          0.489774
10   const6         DEGEN   SURPLUS   0      0          0
11   const7         DEGEN   SURPLUS   0      0          0
12   const8         BASIC   SURPLUS   0      1          0

Note that DMUs 1 and 3 are overall efficient and DMU 2 is inefficient with an efficiency rating of 0.753767. Hence the efficient levels of inputs and outputs for DMU 2 are given by:

Efficient levels of inputs:

\[
0.249981 \begin{bmatrix} 5 \\ 14 \end{bmatrix}
+ 0.632125 \begin{bmatrix} 7 \\ 12 \end{bmatrix}
\approx \begin{bmatrix} 5.67 \\ 11.08 \end{bmatrix}
\]

Efficient levels of outputs:

\[
0.249981 \begin{bmatrix} 9 \\ 4 \\ 16 \end{bmatrix}
+ 0.632125 \begin{bmatrix} 4 \\ 9 \\ 13 \end{bmatrix}
\approx \begin{bmatrix} 4.78 \\ 6.69 \\ 12.22 \end{bmatrix}
\]
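The efficient levels are simply a weighted sum of the peer DMUs, which is easy to recompute. The sketch below assumes the data as they appear in the linear programs (DMU 1: inputs 5, 14 and outputs 9, 4, 16; DMU 3: inputs 7, 12 and outputs 4, 9, 13) and the weights 0.249981 and 0.632125 quoted in the text. The helper name `composite` is hypothetical.

```python
# Data as used in the linear programs: (inputs, outputs) for each peer DMU.
DMUS = {
    1: ((5, 14), (9, 4, 16)),
    3: ((7, 12), (4, 9, 13)),
}
# Weights quoted in the text for the evaluation of DMU 2.
WEIGHTS = {1: 0.249981, 3: 0.632125}

def composite(index):
    """Weighted sum of the peers' inputs (index 0) or outputs (index 1)."""
    size = len(DMUS[1][index])
    return [
        sum(WEIGHTS[d] * DMUS[d][index][k] for d in WEIGHTS)
        for k in range(size)
    ]

efficient_inputs = composite(0)
efficient_outputs = composite(1)
print([round(v, 3) for v in efficient_inputs])
print([round(v, 3) for v in efficient_outputs])
```

The printed values can be compared directly with the efficient levels listed above; small differences in the last digit come from rounding.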
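Note that, by the constraints of the LP, the outputs should be at least as large as the outputs currently produced by DMU 2, and the inputs at most 0.753767 times the inputs of DMU 2. This can be used in two different ways. The inefficient DMU should target cutting its inputs down to at most the efficient levels. Alternatively, an equivalent statement can be made by finding a set of efficient levels of inputs and outputs by dividing the levels obtained by the efficiency of DMU 2. This focus can then be used to set targets primarily for outputs rather than for reduction of inputs.

VRS (variable returns to scale)

\[
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & \sum_i \lambda_i = 1 \\
& \sum_i \lambda_i X_i \le \theta X_0 \\
& \sum_i \lambda_i Y_i \ge Y_0 \\
& \lambda_i \ge 0
\end{aligned}
\]

VRS (increasing returns to scale)

\[
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & \sum_i \lambda_i \ge 1 \\
& \sum_i \lambda_i X_i \le \theta X_0 \\
& \sum_i \lambda_i Y_i \ge Y_0 \\
& \lambda_i \ge 0
\end{aligned}
\]

VRS (non-increasing returns to scale)

\[
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & \sum_i \lambda_i \le 1 \\
& \sum_i \lambda_i X_i \le \theta X_0 \\
& \sum_i \lambda_i Y_i \ge Y_0 \\
& \lambda_i \ge 0
\end{aligned}
\]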
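For very small problems, the constant-returns LP (the formulation without a constraint on the sum of the weights) can be solved without an LP package by enumerating the vertices of the feasible region. The sketch below does this with exact fractions for the three-bank example, taking each bank to have 10 tellers and outputs (checks, loans) of (1000, 20), (400, 50) and (200, 150). It is a brute-force illustration under those assumptions, not how SAS or any other solver works, and the function names `solve_square` and `crs_efficiency` are hypothetical. Real studies would use a proper LP solver.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination.
    Returns None when the system is singular."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]

def crs_efficiency(inputs, outputs, k):
    """Input-oriented CRS model for DMU k: minimize theta over the vertices.
    Variables are (theta, lambda_0, ..., lambda_{n-1}); every constraint is
    stored as  row . v >= rhs."""
    n = len(inputs)
    rows, rhs = [], []
    for i in range(len(inputs[0])):   # theta*x_k - sum_j lambda_j*x_j >= 0
        rows.append([Fraction(inputs[k][i])]
                    + [Fraction(-inputs[j][i]) for j in range(n)])
        rhs.append(Fraction(0))
    for r in range(len(outputs[0])):  # sum_j lambda_j*y_j >= y_k
        rows.append([Fraction(0)]
                    + [Fraction(outputs[j][r]) for j in range(n)])
        rhs.append(Fraction(outputs[k][r]))
    for v in range(n + 1):            # theta >= 0 and lambda_j >= 0
        rows.append([Fraction(int(v == c)) for c in range(n + 1)])
        rhs.append(Fraction(0))
    best = None
    for active in combinations(range(len(rows)), n + 1):
        point = solve_square([rows[i] for i in active],
                             [rhs[i] for i in active])
        if point is None:
            continue
        feasible = all(sum(a * p for a, p in zip(row, point)) >= b
                       for row, b in zip(rows, rhs))
        if feasible and (best is None or point[0] < best[0]):
            best = point
    return best

banks_in = [[10], [10], [10]]
banks_out = [[1000, 20], [400, 50], [200, 150]]
for k, name in enumerate("ABC"):
    theta, *weights = crs_efficiency(banks_in, banks_out, k)
    print(name, float(theta), [float(w) for w in weights])
```

For bank B this gives an efficiency of about 0.63 with nonzero weights only on banks A and C, in line with the numerical and graphical examples. Banks A and C each score 1. The number of vertex candidates grows combinatorially, so this approach is only practical for toy examples.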
A Real Example

1. Efficient Pricing

[Figure: Efficiency vs. Direct Cost. Vertical axis: daily rate. Horizontal axis: efficiency. Fitted line: Cost = 175.75 - 51.53 * Efficiency.]

Introduce DEA

2. Explain the distribution above (no consistent pricing)

See example of the pricing by size
The Factors That Affect Efficiency

Factor                                     Sign   Statistical significance (1)
Private payers as share of total           +      *
Managed care patients as share of total    +      *
Facility bed utilization rate              +
Special care patients as share of total    +      *
Proprietary facility dummy                 +      *
Public facility dummy                      -
Bed size dummy (2)                         -      *
Difficulty index (3)                       +      *
Downstate dummy                            +      *
Quality score                              -

(1) An asterisk denotes that the estimate was significant at the 5 percent level.
(2) Dummy equals one if number of beds is greater than 3; zero otherwise.
(3) Case mix adjusted output divided by actual output.