



Designing a Call Center with Impatient Customers

O. Garnett, A. Mandelbaum, M. Reiman

Davidson Faculty of Industrial Engineering and Management, Technion, Haifa 32000, Israel
Davidson Faculty of Industrial Engineering and Management, Technion, Haifa 32000, Israel
Bell Laboratories, Murray Hill, New Jersey 07974
oferpbg@zahav.net.il, avim@tx.technion.ac.il, marty@research.bell-labs.com

The most common model to support workforce management of telephone call centers is the M/M/N/B model, in particular its special cases M/M/N (Erlang C, which models out busy signals) and M/M/N/N (Erlang B, disallowing waiting). All of these models lack a central prevalent feature, namely, that impatient customers might decide to leave (abandon) before their service begins. In this paper, we analyze the simplest abandonment model, in which customers' patience is exponentially distributed and the system's waiting capacity is unlimited (M/M/N+M). Such a model is both rich and analyzable enough to provide information that is practically important for call-center managers. We first outline a method for exact analysis of the M/M/N+M model, which, while numerically tractable, is not very insightful. We then proceed with an asymptotic analysis of the M/M/N+M model, in a regime that is appropriate for large call centers (many agents, high efficiency, high service level). Guided by the asymptotic behavior, we derive approximations for performance measures and propose "rules of thumb" for the design of large call centers. We thus add support to the growing acknowledgment that insights from diffusion approximations are directly applicable to management practice.

(Tele-Queues; Erlang C; Erlang A; Telephone Call and Contact Centers; Multiserver Exponential Queues; Workforce Management or Staffing; Queues with Abandonment; Diffusion Approximation)

1. Introduction and Summary

During recent decades there has been explosive growth in the number of companies that provide services via the telephone, and also in the variety of telephone services provided.
A central challenge in designing and managing any service operation is to achieve a balance between operational efficiency and service quality, and in telephone services this challenge is often pushed to the extreme: A large call center serves thousands of calls per day, each of which demands a response within seconds. Analytical models provide guidance regarding the sought-after balance.

Modeling a Call Center. A simplified representation of call-center flows is given in Figure 1. Incoming calls form a single queue to wait for service from one of $N$ statistically identical agents. $K$ telephone trunks are connected to an Automatic Call Distributor (ACD) that manages the queue, connects customers to available agents, and archives operational data. Customers arriving when all trunks are occupied encounter a busy signal. Such customers might try again later ("retrial") or give up ("lost demand"). Customers who succeed in getting through at a time when all agents are busy (that is, when there are at least $N$ but fewer than $K$ customers within the call center) are placed in the queue. If waiting customers run out of patience before their service begins, they hang up ("abandon"). After abandoning, customers might try calling again later.

Figure 1. Schematic Representation of a Telephone Call Center

Manufacturing & Service Operations Management, 2002 INFORMS, Vol. 4, No. 3, Summer 2002, pp. 208–227. 1523-4614/02/0403/0208$05.00, 1526-5498 electronic ISSN

In basic models of call centers it is commonly assumed that the only parameters under the system manager's control are the number of trunks available ($K$) and the number of agents ($N$). In most contexts the cost of trunk lines is trivial compared to personnel costs, so in this paper we focus on staffing decisions ($N$), assuming that $K = \infty$ for modeling purposes. Thus busy signals are absent in the models to be considered, though we would like to emphasize that in no way do we advocate here the practice of unconditional "no busy signal." Indeed, a trade-off between busy signals and abandonment, in the spirit of Borst et al. (2000), is a worthwhile direction for future research.

The classical M/M/N queueing model, also called the Erlang-C model, is obtained by further assuming Poisson arrivals, exponentially distributed service times, and no abandonment. It is the model most often used in call-center analysis, but it has one glaring defect: Call abandonment is not a negligible or minor aspect of call-center operations. This issue is broadly addressed in §2.

The Square Root Rule for Safety Staffing. In this paper, we consider the M/M/N model with abandonment added. To summarize the central findings, it will be useful to first review an important principle regarding capacity choice in the absence of abandonment. Let $R = \lambda/\mu$ denote the (average) offered load, where $\lambda$ is the average call arrival rate and $1/\mu$ is the mean call duration. ($R$ is measured in units of service duration per unit of time.) The principle is as follows: For moderate to large values of $R$ (or equivalently moderate to large $\lambda$, since we are assuming that $\mu$ is a fixed parameter of the call center), the appropriate staffing level is

$$N = R + \beta\sqrt{R}, \qquad (1)$$

where $\beta$ is a positive constant that depends on the desired level of service. Of course, in practice the value of $N$ derived from this formula must be rounded to an integer.
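As a small illustration of rule (1), the sketch below computes a staffing level from an arrival rate, a service rate, and a service-grade constant. The function name and the choice to round up (rather than to the nearest integer) are assumptions made here for illustration; the text only says the value must be rounded.

```python
import math


def square_root_staffing(arrival_rate, service_rate, beta):
    """Staffing level N = R + beta * sqrt(R), rounded up to an integer.

    R = arrival_rate / service_rate is the offered load. Rounding up is an
    assumption of this sketch; the rule itself only requires an integer.
    """
    offered_load = arrival_rate / service_rate
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


if __name__ == "__main__":
    # Hypothetical loads, all with the same service grade beta = 1.
    for load in (25.0, 100.0, 400.0):
        n = square_root_staffing(load, 1.0, beta=1.0)
        excess = (n - load) / load
        print(f"R={load:6.0f}  N={n:4d}  excess capacity={excess:.0%}")
```

With a fixed service grade, the relative excess capacity shrinks as the offered load grows, because the safety term is proportional to the square root of the load rather than to the load itself.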
The second term on the right side of (1) may be described as the excess capacity needed, beyond nominal requirements (the first term), to achieve the target service level in the face of stochastic variability. Equation (1) shows that the required excess capacity grows less than proportionately with the load of calls to be handled. This phenomenon is aptly described as statistical economies of scale.

The square root formula (1) was derived and discussed by Ward Whitt (1992), but as he explained, similar design rules had been advanced by a number of other authors during the 1970s and 1980s. (We refer the reader to Borst et al. (2000) for a historical perspective.) Whitt's treatment of this subject was based primarily, but not exclusively, on his pioneering work with Shlomo Halfin (Halfin and Whitt 1981) regarding diffusion approximations for many-server queues. To be more specific, the foundation of Whitt's argument is the following: If one considers a variety of systems with different moderate to large values of $R$, and if the number of agents $N$ is chosen according to (1) in each case, then the quality of service will be approximately the same in each system. The measure of service quality underlying this statement is the steady-state probability that a caller must wait in the queue before service. This probability is commonly referred to as the Erlang-C formula, hereafter abbreviated as $P\{W > 0\}$. Halfin and Whitt (1981) provide a formula for computing $\beta$ in terms of the target value for $P\{W > 0\}$.

The Square Root Rule with Abandonment. Enriching the M/M/N model to include abandonment, we assume the following: There is associated with each arriving caller an exponentially distributed random variable that quantifies the individual's patience; if a caller's waiting time in the queue grows to equal his or her patience, then the call is abandoned. (In the interest of tractability, we assume that customers who abandon do not retry.) The patience variables characterizing different callers are independent and identically distributed with mean $\theta^{-1}$, and they are independent of all other model elements as well. The positive quantity $\theta$ will be referred to as either the abandonment rate or the impatience parameter, depending on context. For this model, we use the notation M/M/N+M, as introduced by Baccelli and Hebuterne (1981), and we propose to refer to it as Erlang-A ("A" for Abandonment, and for the fact that it interpolates between Erlang-C and Erlang-B, the latter being M/M/N/N).

It will be shown in this paper that the square root rule (1) remains valid in the model with abandonment for moderate to large $R$. Of course, the formula for $\beta$ is different in our context: $\beta$ now depends on both the abandonment rate $\theta$ and the target value for $P\{W > 0\}$, and in our setting $\beta$ may be negative, even when a small probability of waiting is specified. That is, to achieve a given target value for $P\{W > 0\}$, it may be sufficient to take $N$ smaller than the offered load $R$. Furthermore, the appropriate value for $\beta$ in formula (1) is monotonically decreasing in $\theta$ for fixed $P\{W > 0\}$. That is, a higher abandonment rate reduces the amount of capacity one needs to achieve a given service level. This fact may be surprising initially, but the following makes it obvious: Temporarily denoting by $Q$ the steady-state number of callers in the system, we have $P\{W > 0\} = P\{Q \ge N\}$, and $Q$ decreases stochastically as one increases $\theta$. Of course, $P\{W > 0\}$ is not the only measure one could use to quantify the notion of "service level" or "service quality," but the discussion immediately below shows that performance according to other obvious measures is uniformly excellent when $R$ is not small and the square root rule is used for staffing.
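The monotonicity in the abandonment rate can be checked numerically. The following is a minimal sketch, not the authors' procedure: it treats the number of callers in an M/M/N+M system as a birth-death chain with birth rate $\lambda$ and death rate $\min(j, N)\mu + \max(j - N, 0)\theta$ in state $j$, and normalizes the stationary weights in log space. The function name, the returned dictionary keys, and the truncation cutoff are assumptions introduced here.

```python
import math


def erlang_a(lam, mu, theta, n, log_cutoff=40.0):
    """Steady-state measures of M/M/n+M, treated as a birth-death chain.

    Births occur at rate lam; in state j the death rate is
    min(j, n) * mu + max(j - n, 0) * theta. Weights are kept in log space
    and the infinite state space is truncated once the tail is negligible
    (the cutoff is an assumption of this sketch).
    """
    if theta <= 0:
        raise ValueError("this sketch needs theta > 0 so that the tail decays")
    log_w = [0.0]  # log of the unnormalized weight of the empty system
    peak = 0.0
    j = 0
    while True:
        j += 1
        death = min(j, n) * mu + max(j - n, 0) * theta
        log_w.append(log_w[-1] + math.log(lam / death))
        peak = max(peak, log_w[-1])
        # Stop once weights are decreasing and far below the largest one.
        if j > n and death > lam and log_w[-1] < peak - log_cutoff:
            break
    weights = [math.exp(v - peak) for v in log_w]
    total = sum(weights)
    p = [w / total for w in weights]

    mean_busy = sum(min(k, n) * pk for k, pk in enumerate(p))
    mean_queue = sum((k - n) * pk for k, pk in enumerate(p) if k > n)
    return {
        "p_wait": sum(p[n:]),                      # P{W > 0} = P{Q >= n}
        "p_abandon": 1.0 - mu * mean_busy / lam,   # share of arrivals not served
        "mean_wait": mean_queue / lam,             # Little's law, all callers
        "utilization": mean_busy / n,
        "mean_in_system": mean_busy + mean_queue,
    }


if __name__ == "__main__":
    # Same load and staffing, increasing impatience.
    for theta in (0.25, 0.5, 1.0):
        r = erlang_a(lam=48.0, mu=1.0, theta=theta, n=50)
        print(f"theta={theta}: P(W>0)={r['p_wait']:.3f}, P(Ab)={r['p_abandon']:.4f}")
```

A convenient sanity check on such a solver is the case $\theta = \mu$, where the death rate in state $j$ is simply $j\mu$, so the system behaves like an infinite-server queue and the mean number in system equals the offered load, even when $N < R$.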
Table 1. System Performance in Three Staffing Regimes

Regime              Staffing Level                               Performance Characteristics
Rationalized        $N \approx R + \beta\sqrt{R}$                $P\{W > 0\} \approx \alpha(\beta)$ and $P\{Ab\} \approx 0$
Quality-Driven      $N \approx R + \gamma R$, $\gamma > 0$       $P\{W > 0\} \approx 0$ and $P\{Ab\} \approx 0$
Efficiency-Driven   $N \approx R - \gamma R$, $\gamma > 0$       $P\{W > 0\} \approx 1$ and $P\{Ab\} \approx \gamma$

Three Staffing Regimes. Another fundamental measure of service quality is the steady-state probability that an arrival will abandon before getting service, hereafter abbreviated as $P\{Ab\}$. Table 1 summarizes the behavior of $P\{W > 0\}$ and $P\{Ab\}$ in three limiting regimes studied later in the paper, each of which represents a different philosophy with regard to the design of a call center. (The limit referred to here is $R \to \infty$.) The rationalized regime is that where the square root rule (1) is used to determine system capacity: A formula will be derived for the limiting $P\{W > 0\}$, denoted by $\alpha(\beta)$ in Table 1 (the formula appears below Table 4 in §5), and it will be shown that $P\{Ab\} \to 0$ as $R \to \infty$, regardless of how $\beta$ is chosen. The quality-driven regime is where capacity exceeds nominal requirements by a fixed percentage: It will be shown that both $P\{W > 0\}$ and $P\{Ab\}$ vanish as $R \to \infty$. Finally, in the efficiency-driven regime, capacity falls short of nominal requirements by a fixed percentage: It will be shown that virtually all arrivals wait in this case, but $P\{Ab\}$ is simply equal to the capacity shortfall, expressed as a fraction of the offered load.

Another interesting performance measure is the steady-state average waiting time before either service begins or the caller abandons, hereafter abbreviated $E[W]$. One has the useful identity $\theta\,E[W] = P\{Ab\}$, so results cited for $P\{Ab\}$ in Table 1 translate immediately into results for $E[W]$. Even in the efficiency-driven staffing regime where virtually all callers wait, both $P\{Ab\}$ and $E[W]$ remain small if the capacity shortfall is small. Actually, the results proved later about the three staffing regimes are both more refined and more extensive than the summary provided in Table 1, but this summary communicates the most important findings for purposes of system design.
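Both the identity linking abandonment to waiting and the efficiency-driven entry of Table 1 can be motivated by a short flow-balance argument. The following is only a heuristic sketch under the model's exponential assumptions, not the proof given in the paper; the symbols $L_q$ (steady-state mean queue length) and $B$ (steady-state mean number of busy agents) are introduced here for illustration.

```latex
\begin{align*}
\lambda\,P\{Ab\} &= \theta\,L_q
  && \text{(each waiting caller abandons at rate } \theta\text{)} \\
L_q &= \lambda\,E[W]
  && \text{(Little's law applied to the queue)} \\
\Longrightarrow\quad P\{Ab\} &= \theta\,E[W]. \\[6pt]
\lambda\,\bigl(1 - P\{Ab\}\bigr) &= \mu\,B \;\le\; N\mu
  && \text{(served callers leave through the agents)} \\
\Longrightarrow\quad P\{Ab\} &\ge 1 - \frac{N}{R} = \gamma
  && \text{when } N = R - \gamma R .
\end{align*}
```

When almost all agents are busy, $B$ is close to $N$ and the last inequality is nearly tight, which is consistent with the abandonment fraction matching the capacity shortfall in the efficiency-driven regime.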
Based on this analysis, we conclude that the rationalized regime is appropriate in most settings: By choosing the constant $\beta$ appropriately, a system manager using the square root rule (1) can achieve a rational balance between efficiency and service quality, unless one of those two concerns utterly dominates the other. Determination of $\beta$, based on economic considerations via (asymptotic) optimization, is an important topic under current research. (See Borst et al. 2000 for M/M/N dimensioning analysis.) The results reported in Table 1 support the following important, if unsurprising, conclusion: With impatient customers and a moderate to large call volume, system performance is relatively robust; any staffing level close to the nominal requirement $R$ produces good service. Callers who refuse to wait indefinitely impose smaller externalities on later arrivals than do patient callers, and generally speaking, they make the system designer's job easier.

Contributions to System Modeling. From a mathematical standpoint, the classical M/M/N model is fundamentally changed when one incorporates the phenomenon of abandonment. For example, the model with abandonment is stable for all parameter combinations, whereas the classical model achieves statistical equilibrium if and only if $R < N$. Also, as we show by numerical examples, the models with and without abandonment tend to give very different performance estimates in the parameter regime of primary interest, even when the abandonment rate is small. Because the M/M/N model is so commonly used for "quick-and-dirty" performance analysis, these effects of adding abandonment are thoroughly discussed and illustrated in this paper (see §2).

Denoting by $Q(t)$ the number of callers present in the system at time $t$, either waiting or being served, we focus on the stochastic process $Q = \{Q(t),\ t \ge 0\}$. This process is central in call centers, as will now be explained. First, it is a visible cue for management and agents, and its real-time value is often displayed for everyone to see. Moreover, in call centers that provide toll-free services, $Q$ is proportional to the cost of incoming calls.
However, the customer's point of view is represented by the waiting time, either potential (i.e., the time he would wait in queue for his service to commence if his patience was infinite) or actual. Indeed, in §3 we introduce a method for calculating a wide variety of performance measures that involve the potential and actual waiting times. An important outcome of our analysis, as shown in Theorem 3 below, is that potential waiting time and queue length are deterministically related in heavy traffic.

Given the various assumptions laid out earlier, $Q$ is a birth-and-death process, and so its steady-state distribution can be written out in an explicit formula. However, that formula is complicated enough to cause difficulties both in numerical evaluation and in qualitative understanding. After some brief remarks about numerical evaluation of exact formulas for steady-state performance measures, most of this paper deals with approximations in the heavy traffic regime, where $R$ is large and $R/N$ is near 1. That is, we develop approximations for call centers having a moderate to large number of agents and high agent utilization. Rather than developing approximations only for steady-state quantities, we show that a properly scaled version of the stochastic process $Q$ is well approximated by a certain diffusion process in the heavy traffic regime. This analysis parallels the diffusion approximation developed by Halfin and Whitt (1981) for many-server models without abandonment. It helps to explain the steady-state approximations that spawn the square root formula (1), and gives a more complete understanding of system behavior in the heavy traffic regime.

Related Research. As attributed in the sequel, some of our results are motivated by or based on previous work by Palm (1937, 1943, 1953), Riordan (1962), Baccelli and Hebuterne (1981), and especially Halfin and Whitt (1981) and Fleming et al. (1994). Indeed, both Halfin and Whitt (discussed earlier in this section) and Fleming et al. focus on heavy traffic analysis of M/M/N and M/M/N+M systems respectively. In Fleming et al. a diffusion approximation of the queue process is derived, leading to approximations for the fraction of customers abandoning and other performance measures. A general overview of models with abandonment appears in Boxma and de Waal (1994), with a review of relevant literature. There have been attempts to analyze more complex models, of which we mention a few: Ancker and Gafarian (1963) analyze queues with abandonment, multiple heterogeneous agents, and finite capacity through their steady-state equations and derive the waiting time density. Sze (1984) compares different approximations for an M/PH/N+PH model (PH stands for Phase Type distribution) with retrials, priorities, and nonstationary arrivals. The results are verified by simulation. In Harris et al. (1987) and Hoffman and Harris (1986) the basic model is M/M/N, with the addition of abandonment, retrials, and a variety of service disciplines. Assuming a heavily loaded call center and using some approximations, they arrive at a system of steady-state equations that can be solved numerically. In two recent papers by Brandt and Brandt (1998, 1999), M/M/N+G models with state-dependent arrivals are analyzed. Applications include systems with an integrated voice-mail server and cases in which idle agents initiate outbound calls. Finally, fluid and diffusion approximations for time-dependent models with abandonment and retrials are described in Mandelbaum et al. (1999, 2000), both of which are based on Mandelbaum et al. (1998).

Overall Contribution and Contents. The contributions of the present paper, in our opinion, are both theoretical and practical, but even more so the bridging of the two. Specifically:
- Extending the fundamental findings of Halfin and Whitt (1981) to accommodate abandonment (for example, Theorems 4 and 2, Table 4) and waiting times (Theorem 3).
- Revisiting the classical Erlang (1909, 1917) and Palm (1943) results, and adapting them to the environment of the modern call center (§3, §5.1, and §5.2).
- Adding support to the growing acknowledgment that insights from diffusion approximations are directly applicable to management practice (§5.2, §5.3, Appendix A).
- Preparing the necessary ground for an economic analysis of abandonment, following Borst et al. (2000).
The rest of the paper is organized as follows: In §2 we provide further motivation as to the relevance of our results, and of the M/M/N+M model in general, to the management of modern call centers. In §3 we outline a method for exact calculations of a wide variety of performance measures. Then in §4 we focus on heavy traffic limit theorems, which lead to some implementations discussed in §5. The paper also has three appendices: Appendix A displays graphs showing the (excellent) quality of a number of approximations derived in §5; Appendix B includes computational details of the method outlined in §3; and Appendix C contains proofs of Theorems 1–4 with an extended version of Theorem 2.

2. The Significance of Abandonment in Practice and Modeling

A major drawback of models that ignore abandonment is that they either distort or fail to provide information that is important to call-center managers. When trying to manage a large call center in heavy traffic, one must consider the effect of abandoning customers on service level. It is not enough to consider waiting times or busy signals, especially since abandonment statistics constitute the only ACD data that unveils customers' perception of service quality. According to the Help Desk and Customer Support Practices Report (1997), more than 40% of call centers set a target for fraction of abandonment, but in most cases this target is not achieved. Moreover, the lack of understanding of the abandonment phenomenon, and the scarcity of models that acknowledge it, has led practitioners to ignore it altogether. (For example, this led Cleveland and Mayben (1997) to conclude that abandonment is not a good indicator of call-center performance.) This can cause either under- or over-staffing: On the one hand, if service level is measured only for those customers who reach service, the result is unjustly optimistic, since the immediate effect of an abandonment is less delay for those further back in line, as well as for future arrivals. This would lead to under-staffing. On the other hand, using workforce management tools that ignore abandonment would result in over-staffing, as actually fewer agents are needed in order to meet most abandonment-ignorant service goals.

Figure 2. Fraction Queueing: M(48)/M(1)/N (a) vs. M(48)/M(1)/N+M(0.5) (b)

Table 2. Comparing Results for Models With/Without Abandonment

Model      Fraction      Average speed   Waiting time's     Average        Agents'
           abandoning    of answer       90th percentile    queue length   utilization
M/M/N      -             20.8 sec.       58.1 sec.          17             96%
M/M/N+M    3.1%          3.6 sec.        12.5 sec.          3              93%

Note. 50 agents, 48 calls per minute, 1 minute average service time, 2 minute average patience. The values were calculated, and can be verified, using the iprofiler tool available at www.4callcenters.com. This analysis tool is based on parts of the present work, in particular §3 and Appendix B.

The significance of abandonment can be seen in simple numerical examples. Figure 2 shows graphs of the fraction of customers queueing according to the M/M/N and M/M/N+M models. It is clear that these graphs convey a rather different picture of what is happening in the system they depict, in particular in the range of 40 to 50 agents. Table 2 displays some results from an M/M/N model and a corresponding M/M/N+M model with only 3% abandonment. There is a significant difference in the distributions of waiting time and queue length; in particular, the average wait and queue length are both strikingly shorter when abandonment is taken into account. It is important to realize that such good performance is not achieved if the arrival rate to the M/M/N system decreases by 3% (for example, the average speed of answer in such a case is approximately nine seconds). We note, however, that the performance of systems in such heavy traffic is very sensitive to the staffing level: adding three or four agents to the model without abandonment would result in performance similar to that displayed for the model with abandonment (the horizontal distance between the graphs in Figure 2 shows this "margin"). Nonetheless, since personnel costs are the major expense of call centers (prevalent estimates run at about 60–70% of total cost), even a 6–8% reduction in personnel is significant.
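The no-abandonment row of Table 2 can be cross-checked with the textbook Erlang-C formulas. The sketch below is not the iprofiler tool; it assumes first-come-first-served service, for which the waiting-time tail is $P\{W > t\} = P\{W > 0\}\,e^{-(N\mu - \lambda)t}$, and it computes the delay probability through the standard Erlang-B recursion. All function and variable names are illustrative.

```python
import math


def erlang_b(n, offered_load):
    """Erlang-B blocking probability via the standard stable recursion."""
    b = 1.0
    for k in range(1, n + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def erlang_c(n, offered_load):
    """Erlang-C delay probability P{W > 0}; requires offered_load < n."""
    if offered_load >= n:
        raise ValueError("M/M/N without abandonment is stable only if R < N")
    b = erlang_b(n, offered_load)
    return n * b / (n - offered_load * (1.0 - b))


# Parameters from the note to Table 2 (rates per minute).
lam, mu, n = 48.0, 1.0, 50
p_wait = erlang_c(n, lam / mu)
drain = n * mu - lam                          # spare service capacity per minute
asa_sec = 60.0 * p_wait / drain               # average speed of answer
p90_sec = 60.0 * math.log(p_wait / 0.10) / drain  # solves P{W > t} = 0.10
mean_queue = lam * p_wait / drain             # Little's law
utilization = lam / (n * mu)

if __name__ == "__main__":
    print(f"P(W>0)={p_wait:.3f}  ASA={asa_sec:.1f}s  90th pct={p90_sec:.1f}s  "
          f"queue={mean_queue:.1f}  utilization={utilization:.0%}")
```

The percentile line is valid here because the delay probability exceeds 0.10; if it did not, the 90th percentile of the wait would simply be zero.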
Both Figure 2 and Table 2 clearly indicate that it is possible, while operating in heavy traffic, to simultaneously achieve high efficiency (agent utilization near 100%) and good service (low, but nonnegligible, abandonment rate and waiting time). This is a direct outcome of economies of scale, as demonstrated in the next example.

The following is a possible scenario in which our staffing rules can be used (this scenario is revisited later in §5.3): A given call center with $N$ agents, service rate $\mu$, and arrival rate $\lambda$ (the offered load $R = \lambda/\mu$) has service grade $\beta$ (high values of $\beta$ correspond to high service levels). There is a forecast of a higher arrival rate $\hat\lambda$ ($\hat R = \hat\lambda/\mu$) during a forthcoming holiday. The call center's manager wishes to maintain the present service level at the call center during the holiday, and hence needs to decide on an increase in the number of agents $\hat N = \hat N(\hat\lambda)$ for the holiday shifts. To this end, the manager must first determine the operational regime of the call center, representing the desired balance between quality and efficiency. Our rules then provide $\hat N(\hat\lambda)$. For example, in the rationalized regime, our recommended staffing level is $\hat N(\hat\lambda) = \hat R + \beta\sqrt{\hat R}$, where $\beta = (N - R)/\sqrt{R}$. Moreover, our analysis actually yields explicit approximations for a wide range of performance measures (see §5.2). Specifically, the fraction of customers delayed in queue is expected to remain unchanged, while the fraction of customers abandoning, as well as the average waiting time, will decrease by a factor of $\sqrt{N/\hat N}$, thus exhibiting economies of scale.
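The holiday scenario amounts to a few lines of arithmetic. The sketch below uses hypothetical numbers (110 agents, 100 calls per minute today, 400 forecast) and takes the reduction factor for abandonment and average wait to be $\sqrt{N/\hat N}$, in line with the $1/\sqrt{N}$ scaling of the rationalized regime; rounding the staffing level up is an assumption of the sketch.

```python
import math


def service_grade(n_agents, offered_load):
    """Service grade beta = (N - R) / sqrt(R) implied by current staffing."""
    return (n_agents - offered_load) / math.sqrt(offered_load)


def holiday_staffing(n_agents, arrival_rate, forecast_rate, service_rate):
    """Keep the current service grade under a higher forecast arrival rate.

    Returns (beta, n_hat, factor): the implied grade, the recommended number
    of agents (rounded up, an assumption of this sketch), and the factor
    sqrt(N / N_hat) by which abandonment and average wait are expected to fall.
    """
    load = arrival_rate / service_rate
    forecast_load = forecast_rate / service_rate
    beta = service_grade(n_agents, load)
    n_hat = math.ceil(forecast_load + beta * math.sqrt(forecast_load))
    return beta, n_hat, math.sqrt(n_agents / n_hat)


if __name__ == "__main__":
    # Hypothetical call center: 110 agents, 100 calls/min now, 400 forecast.
    beta, n_hat, factor = holiday_staffing(110, 100.0, 400.0, 1.0)
    print(f"beta={beta:.2f}, holiday agents={n_hat}, reduction factor={factor:.3f}")
```

In this hypothetical case the load quadruples while the recommended staffing rises from 110 to 420 agents, so the safety margin only doubles.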

Figure 3. $\beta$ as a Service Grade for Large Call Centers: Correlation with Abandonment

Note. Each dot plotted represents data of a half-hour interval. The x-coordinate was calculated via $\beta = (N - R)/\sqrt{R}$, where $R$ and $N$ are averages over the half-hours, and the y-coordinate is the fraction abandoning during that half-hour.

One might question the practical value of the above staffing rules since they are based on limits for large systems operating in heavy traffic. In this regard, Figures 4–8 in Appendix A indicate that our results can be safely applied to call centers with as few as 30 or even 20 highly utilized agents. Moreover, analysis of data from moderate to large call centers shows that indeed both high efficiency and service quality are achieved as the centers operate in the rationalized regime with $0.5 < \beta < 1$. This is demonstrated by the scatter plots in Figure 3. The plots display real data from two call centers, collected in half-hour intervals during a single working day (discarding the opening and closing hours in which the traffic is not "heavy"). Both traffic volumes and staffing levels varied at these call centers throughout the day (left plot: from 350 to 900 calls per hour, 35 to 90 agents; right plot: 1,400–3,100 calls per hour, 125–250 agents).

3. Calculating Performance Measures

Here we present a useful format for expressing performance measures for an M/M/N+M model in steady state. Details about the calculations of these expressions appear in Appendix B, also covering models with finite capacity (M/M/N/B+M). Due to the underlying birth-death structure, such calculations are almost (but not quite, due to numerical issues) trivial. Nevertheless, it is natural and important to present them for practical completeness, as well as a lead to our approximations. Our calculation of performance measures is based on the assumption that the system has reached its steady state.
Although the arrival rate to many call centers is time varying (according to the time of day, day of the week, holidays, seasonal effects, etc.), and other parameters such as the number of agents on the shift may be subject to change, it is assumed that throughout short time intervals (e.g., an hour) such changes are small enough to disregard, and are slow relative to the speed at which the system reaches its new (e.g., hourly) steady state. This latter assumption can be safely applied to call centers at which the service rate is significantly higher than the rate of such time variations (e.g., hourly variations vs. average service time of a few minutes).

We are interested in a "typical" customer, arriving at the system in steady state (for a discussion on the rigorous meaning of "typical," see Appendix B). Let $V$ be the customer's potential waiting time, $X$ be his patience, and $W$ be the actual waiting time ($W = V \wedge X$). Many performance measures that are of interest to call-center managers can be expressed as expectations of simple functions of $V$ and $X$. A representative list appears in Table 3. Here we make use of indicator functions of the form $1_{(a,b]}(t)$ that are defined by

$$1_{(a,b]}(t) = \begin{cases} 1, & a < t \le b, \\ 0, & \text{otherwise.} \end{cases}$$

Table 3. Performance Measures of the Form $E[f(V, X)]$

$f(v, x)$                                                          $E[f(V, X)]$
$1_{(x,\infty)}(v)$                                                $P\{Ab\}$
$1_{(t,\infty)}(v \wedge x)$                                       $P\{W > t\}$
$1_{(t,\infty)}(v \wedge x)\,1_{(x,\infty)}(v)$                    $P\{W > t;\ Ab\}$
$(v \wedge x)\,1_{(x,\infty)}(v)$                                  $E[W;\ Ab]$
$(v \wedge x)\,1_{(t,\infty)}(v \wedge x)\,1_{(x,\infty)}(v)$      $E[W;\ W > t;\ Ab]$
$g(v \wedge x)$                                                    $E[g(W)]$

Some important performance measures cannot be expressed directly by the method proposed, but only as quotients of performance measures of the type $E[f(V, X)]$. For example, the fraction of customers abandoning out of those having to wait in queue is an important measure, yet some experienced managers of call centers tend to discard customers who were not willing to wait for even a short period of time $t$. In such a case one uses
$$P\{Ab \mid W > t\} = \frac{P\{V \wedge X > t;\ V > X\}}{P\{V \wedge X > t\}} = \frac{E[1_{(t,\infty)}(V \wedge X)\,1_{(X,\infty)}(V)]}{E[1_{(t,\infty)}(V \wedge X)]}.$$

4. Operational Regimes and Diffusion Approximations

A central outcome of the present paper is approximations of performance measures that yield insight as to their dependence on the model's parameters. From our theoretical results, which are also supported by prevailing practice, it follows that moderate to large telephone call centers are capable of delivering high service level while also operating under high utilization. This justifies our focus on approximations through heavy traffic limits, as $N \to \infty$ (this is equivalent to $R \to \infty$, since we will be assuming $N/R \to 1$). To this end, a subscript $N$ will now be added to our notation to indicate the processes and parameters of the $N$th system (i.e., associated with an M/M/N+M model). Motivated by the work of Halfin and Whitt (1981) we use two performance measures, the fraction of customers abandoning ($P_N\{Ab\}$) and the fraction delayed in queue ($P_N\{W > 0\}$), as guidelines for choosing appropriate operational regimes. Most telephone call centers try to avoid a high percentage of abandonment, without over-staffing. This usually translates into operating with a nonnegligible fraction of customers having to queue, and a small fraction of abandonment. We now characterize the dependence of the model's parameters on $N$.
Primarily, we are interested in a sequence in which $\lambda_N \to \infty$ as $N \to \infty$ and $\mu_N \equiv \mu$, which corresponds to scaling up the staffing level ($N$) to accommodate the increasing load ($\lambda_N$) while maintaining a service rate ($\mu$) that does not vary with staffing level or load. Furthermore, we restrict our current discussion to the case $\lim_{N\to\infty} \theta_N = \theta$, $0 < \theta < \infty$, although our results, as reported in Appendix C, include the regimes $\theta = 0$ (extreme patience) and $\theta = \infty$ (extreme impatience) as well.

We first introduce the notion of traffic intensity defined by $\rho_N = \lambda_N/(N\mu)$. The results of Theorems 1 and 2 below, together with the guidelines stated above, lead us to focus on the following regime:
$$\lim_{N\to\infty} \sqrt{N}\,(1 - \rho_N) = \beta, \qquad -\infty < \beta < \infty.$$
The discussion later in §5.3, formalized by Theorem 4, strongly supports our contention that this is indeed the regime of interest for moderate to large call centers. Following are Theorems 1 and 2, with a discussion about the derivation of the stated regimes from their results. Proofs of these theorems, and other selected results quoted throughout the paper, appear in Appendix C.

THEOREM 1. Assume that $\lim_{N\to\infty} \rho_N = \rho$, for some $\rho > 0$. Then the limiting behavior of the fraction of abandoning customers is given by
$$\lim_{N\to\infty} P_N\{Ab\} = \begin{cases} 0, & 0 < \rho \le 1, \\ 1 - \dfrac{1}{\rho}, & \rho > 1. \end{cases}$$

Based on this result, it seems clear that from the point of view of abandonment there is no reason to operate with $\rho < 1$: $\rho = 1$ already yields a vanishing abandonment probability. On the other hand, when $\rho > 1$, the limiting abandonment probability is higher than usually desired. Moreover, from the point of view of the agents' utilization (i.e., the fraction of time they spend answering calls, given by $[\lambda_N(1 - P_N\{Ab\})]/(N\mu)$), the maximum limiting utilization is already achieved with $\rho = 1$. Thus, $\rho = 1$ arises as a special balance point between the call center's efficiency and service quality.

The restriction to $\rho = 1$ is consistent with the work of Halfin and Whitt (1981) who analyzed the M/M/N model, and found that interesting limiting behavior occurs when $\rho_N \approx 1 - \beta/\sqrt{N}$, $\beta > 0$ ("interesting" in the sense that only then is the limiting behavior of the fraction of customers having to wait in queue nondegenerate). Because an M/M/N+M model with very patient customers is close to an M/M/N model (supported by Theorem 2* appearing in Appendix C, an extension of Theorem 2 below), we also restrict ourselves to the case of $\lim_{N\to\infty} \sqrt{N}(1 - \rho_N) = \beta$, but here $-\infty < \beta < \infty$. (As already mentioned, Halfin and Whitt (1981) covers only $\beta > 0$, because otherwise there is no steady state.)

Theorem 2 below states diffusion approximations for the process $Q_N = \{Q_N(t),\ t \ge 0\}$. We consider the sequence of stochastic processes $\{q_N\}$ that is obtained from $\{Q_N\}$ through centering and rescaling, namely
$$q_N(t) = \frac{Q_N(t) - N}{\sqrt{N}}.$$
Centering around $N$ gives rise to a process whose absolute value is either the queue length ($q > 0$) or the number of idle servers ($q < 0$). The rescaling factor $\sqrt{N}$ emerges as the appropriate order of magnitude, that gives rise to a nontrivial continuous limiting process $q$. The latter will be used to approximate our original birth-death processes $\{Q_N\}$, via $Q_N \approx N + \sqrt{N}\,q$, thus offering approximations for both transient ($Q_N(t)$, $t \ge 0$) and steady-state ($Q_N(\infty)$) behavior. The mathematical details of the theorem are not a prerequisite for following its consequences, which are explained immediately after the theorem and its corresponding remarks.

THEOREM 2. Assume that
$$\lim_{N\to\infty} \sqrt{N}\,(1 - \rho_N) = \beta,\ \ -\infty < \beta < \infty, \qquad \lim_{N\to\infty} \theta_N = \theta,\ \ 0 < \theta < \infty.$$
If $q_N(0) \Rightarrow q(0)$, then $q_N \Rightarrow q$, where $q$ is the unique solution of the following stochastic differential equation,
$$dq(t) = f(q(t))\,dt + \sqrt{2\mu}\,db(t), \qquad f(x) = \begin{cases} -\mu(\beta + x), & x \le 0, \\ -(\mu\beta + \theta x), & x > 0. \end{cases}$$
(Here $b$ denotes a standard Brownian motion, and $X_N \Rightarrow X$ denotes the weak convergence or convergence in distribution of a sequence $\{X_N\}$ to $X$.)

REMARKS. (1) This limit was conjectured (with a slightly different centering) by Fleming et al. (1994), and a proof was given for the weak limit of the stationary distributions (i.e., $q_N(\infty)$). An extended version of this theorem, including diffusion limits for the cases $\theta = 0$ and $\theta = \infty$, appears in Appendix C.
(2) The limiting process stated in this theorem is qualitatively characterized by a combination of two Ornstein-Uhlenbeck processes ($q > 0$, $q < 0$) with different restraining forces.

So far we have dealt with diffusion limits of the queue-length process $Q$. However, as displayed in Table 3, many performance measures involve the potential waiting time $V$. Now, the distribution of $V$ coincides (see Appendix B) with the stationary distribution of the process $\eta(t)$, the virtual waiting time at time $t$ (i.e., the time spent waiting in queue by a hypothetical infinitely patient customer arriving at time $t$). We shall thus focus on approximating the stationary distribution of $\eta_N(t)$, denoted $\eta_N(\infty)$.

A simple relationship between the diffusion limits of the queue-length process and the virtual waiting-time process can be motivated heuristically as follows: If there are idle agents, the virtual waiting time is zero; otherwise the number of waiting customers is approximately $\sqrt{N}\,q$ (in view of Theorem 2). How long does it take for a customer to pass through this queue? Customers will be leaving at a rate of $N\mu$ (through service) $+\ o(N)$ (abandonment; indeed, the abandonment rate of customers in front of our tagged customer is no greater

than $\theta\sqrt{N}\,q$). Dividing the queue length by the rate that customers are leaving it yields the virtual waiting time, which is therefore approximately $[q/(\sqrt{N}\mu)]^+$. Formally, such an approximation is derived through Theorem 3 that follows. Here we use the common notation for the standard normal density and distribution functions ($\varphi$ and $\Phi$ respectively)
$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}, \qquad \Phi(x) = \int_{-\infty}^{x} \varphi(y)\,dy,$$
as well as the hazard rate defined by
$$h(x) = \varphi(x)/[1 - \Phi(x)] = \varphi(x)/\Phi(-x).$$
Proof of this theorem is based on a useful result by Puhalskii (1994) that links diffusion approximations for $\eta_N$ and $Q_N$.

THEOREM 3. Let $v = [q/\mu]^+$, where $q$ solves the stochastic differential equation stated in Theorem 2. Then
(1) $\sqrt{N}\,\eta_N(t) \Rightarrow v(t)$, $0 \le t \le \infty$, where both $\eta_N(\infty)$ and $v(\infty)$ are limits in distribution, as $t \to \infty$, of $\eta_N(t)$ and $v(t)$, respectively.
(2) $v(\infty)$ has the distribution function $F_v$ given by
$$1 - F_v(x) = \begin{cases} w(-\beta, \sqrt{\mu/\theta}), & x = 0, \\[4pt] w(-\beta, \sqrt{\mu/\theta})\,\dfrac{h(\beta\sqrt{\mu/\theta})}{\psi(\beta\sqrt{\mu/\theta},\ \sqrt{\mu\theta}\,x)}, & x > 0. \end{cases}$$
Here
$$w(x, y) = \left[1 + \frac{h(-xy)}{y\,h(x)}\right]^{-1}, \qquad \psi(x, y) = \frac{\varphi(x)}{1 - \Phi(x + y)}.$$

Because $\sqrt{N}\,\eta_N(\infty) \approx v(\infty)$, the approximation we use is $V \approx \eta_N(\infty) \approx v(\infty)/\sqrt{N}$, which translates into $F_V(x) \approx F_v(\sqrt{N}\,x)$.

5. Implementation

In §3 and §4 we reported a variety of results concerning the M/M/N+M model. Because we advocate this model as a substitute for the M/M/N model commonly used in call-center analysis, it makes sense to shed some light on how to apply and interpret our results. The context of the discussion will be that of managing a moderate to large call center in heavy traffic ("heavy" such that the abandonment phenomenon is not negligible). First, we briefly address the issue of estimating the values of the model's parameters. Then we suggest which performance measures should be used by call-center managers to define the service level. Finally, we discuss the use of our approximation results and derive staffing rules.

5.1. Estimating the Parameters

To use the model and the results introduced, it is necessary to determine the values of the various parameters. The number of agents on shift is fully controlled by the call center's manager. Arrival and service rates are usually estimated from historical ACD data.
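A minimal sketch of this step, assuming the ACD report supplies a call count per interval and an average handling time: it converts those aggregates into $\lambda$, $\mu$ and the offered load $R = \lambda/\mu$, and then evaluates the service grade $(N - R)/\sqrt{R}$ used for the x-coordinate of Figure 3. The function names and the sample numbers are hypothetical, and rounding the staffing level up is an assumption about how the brackets in the staffing rule are read.

```python
import math

def offered_load(calls, interval_hours, mean_service_minutes):
    """Return (lambda, mu, R) from interval aggregates.

    lambda is in calls per hour, mu in calls per agent-hour, R = lambda / mu.
    """
    lam = calls / interval_hours
    mu = 60.0 / mean_service_minutes
    return lam, mu, lam / mu

def service_grade(n_agents, R):
    # beta = (N - R) / sqrt(R), the x-coordinate used in Figure 3.
    return (n_agents - R) / math.sqrt(R)

def staffing_level(R, beta):
    # Inverse relation N = R + beta * sqrt(R); rounding up is an assumption.
    return math.ceil(R + beta * math.sqrt(R))

# Hypothetical half-hour interval: 450 calls, 5-minute average handling time.
lam, mu, R = offered_load(calls=450, interval_hours=0.5, mean_service_minutes=5.0)
beta = service_grade(n_agents=80, R=R)
agents_for_half_grade = staffing_level(R, 0.5)
```

Keeping the interval short matters here for the reason given in §3: the rates are treated as constant within each interval.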
As discussed in §3 for time-varying arrival rates, small time intervals are selected, in which the arrival rate is approximately constant. The main difficulty is to estimate the abandonment rate ($\theta$) or equivalently, the average patience ($1/\theta$). The difficulty arises from the fact that the direct data we can collect is censored: we can only measure the patience of customers who abandon the system before their service began. For the customers receiving service we only have a lower bound for their patience, namely the amount of time they spent waiting in queue. There are statistical methods to deal with such censored samples. While we shall not discuss these methods here, interested readers are referred to the Appendix in Zohar et al. (2000) for a survey and further statistical references.

Another, more basic problem for estimating $\theta$, is that in most cases the ACD data only contains averages, as opposed to call-by-call measurements (see Mandelbaum et al. (2000) for call-by-call data analysis). To this end we suggest a method for estimating the average patience that is based on the following balance equation:
$$\theta \cdot E[\#\text{ waiting in queue}] = \lambda \cdot P\{Ab\}. \qquad (2)$$
This equation describes the steady-state balance between the rate that customers abandon the queue

(left-hand side) and the rate that abandoning customers (i.e., customers who will eventually abandon) enter the system. Through Little's theorem ($\lambda \cdot E[W] = E[\#\text{ waiting in queue}]$), we obtain an alternative equation
$$\theta \cdot E[W] = P\{Ab\}. \qquad (3)$$
The average wait in queue and fraction of customers abandoning are fairly standard ACD data outputs, thus providing the means for estimating $\theta$. We note, however, that (2) and (3) are known to hold exactly only under exponentially distributed patience. (See Zohar et al. (2000) for a discussion of (3) and generalizations.)

5.2. Approximations

As stated earlier, the abandonment phenomenon is extremely important to a call center's manager. Nevertheless, this does not imply that the only performance measure of interest is the fraction of customers abandoning the queue. There are many additional important performance measures, and it is necessary to select the few that best reflect the service level at the call center, and can serve as service goals and service grades.

Approximations can be used to overcome computational difficulties arising when attempting exact evaluation of performance measures, but they can also reveal how performance measures depend on the model's parameters. Such an understanding is necessary when trying to derive simple rules of thumb (see §5.3 below).

Combining the approximation for the stationary virtual waiting time (Theorem 3) with the general representation of performance measures in §3 enables us to derive approximations for many performance measures. These approximations should be most accurate in the case of a large call center operating in heavy traffic, with negligible blocking. (The accuracy of some of these approximations is demonstrated in Appendix A.)

EXAMPLE. Assume a performance measure that can be expressed as $E[g(W)]$ for some function $g$. Recall that $W = X \wedge V$, and that $X$ and $V$ are independent. Therefore,
$$E[g(W)] = \int_0^\infty\!\!\int_0^\infty g(x \wedge v)\,\theta e^{-\theta x}\,dx\,dF_V(v)$$
$$\approx E[g(0)]\,\bigl(1 - w(-\beta, \sqrt{\mu/\theta})\bigr) + \sqrt{N\mu\theta}\;w(-\beta, \sqrt{\mu/\theta}) \int_0^\infty\!\!\int_0^\infty g(x \wedge v)\,\theta e^{-\theta x}\,\psi\bigl(\sqrt{N\mu\theta}\,v + \beta\sqrt{\mu/\theta},\ -\sqrt{N\mu\theta}\,v\bigr)\,dv\,dx.$$
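The quantities used in §5.1 and in this section can be sketched numerically as follows. This is an illustrative sketch, not the authors' code: it estimates $\theta$ from equation (3), implements $h$, $w$ and $\psi$ as defined in Theorem 3, and evaluates $P\{W>0\} \approx w(-\beta, \sqrt{\mu/\theta})$ together with the tail $P\{V > t\} \approx 1 - F_v(\sqrt{N}\,t)$. All function names and the sample inputs are assumptions made for the example.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi_bar(x):
    """Standard normal tail 1 - Phi(x), computed via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def h(x):
    """Hazard rate phi(x) / (1 - Phi(x))."""
    return phi(x) / Phi_bar(x)

def w(x, y):
    """w(x, y) = [1 + h(-x*y) / (y*h(x))]^(-1), as in Theorem 3."""
    return 1.0 / (1.0 + h(-x * y) / (y * h(x)))

def psi(x, y):
    """psi(x, y) = phi(x) / (1 - Phi(x + y)), as in Theorem 3."""
    return phi(x) / Phi_bar(x + y)

def estimate_theta(frac_abandon, mean_wait):
    """Equation (3): theta * E[W] = P{Ab}. Units follow mean_wait."""
    return frac_abandon / mean_wait

def prob_wait(beta, mu, theta):
    """Approximate P{W > 0} by w(-beta, sqrt(mu/theta))."""
    return w(-beta, math.sqrt(mu / theta))

def virtual_wait_tail(t, n, beta, mu, theta):
    """Approximate P{V > t} by 1 - F_v(sqrt(N) * t), for t >= 0."""
    b_hat = beta * math.sqrt(mu / theta)
    x = math.sqrt(n) * t
    return prob_wait(beta, mu, theta) * h(b_hat) / psi(b_hat, math.sqrt(mu * theta) * x)

# Hypothetical ACD averages: 5% abandonment, mean wait of 0.5 minutes.
theta_hat = estimate_theta(0.05, 0.5)            # per minute
p_delay = prob_wait(beta=0.5, mu=1.0, theta=1.0)
p_tail = virtual_wait_tail(t=0.1, n=100, beta=0.5, mu=1.0, theta=1.0)
```

When $\theta = \mu$ the limiting diffusion is a single Ornstein-Uhlenbeck process, so the delay probability reduces to $1 - \Phi(\beta)$; this gives a convenient sanity check on the implementation.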
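The resulting approximations for several performance measures are
$$P\{W > 0\} \approx w(-\beta, \sqrt{\mu/\theta}),$$
$$P\{Ab \mid W > 0\} \approx 1 - \frac{h(\beta\sqrt{\mu/\theta})}{h\bigl(\beta\sqrt{\mu/\theta} + \sqrt{\theta/(N\mu)}\bigr)},$$
$$P\{Ab\} \approx \left[1 - \frac{h(\beta\sqrt{\mu/\theta})}{h\bigl(\beta\sqrt{\mu/\theta} + \sqrt{\theta/(N\mu)}\bigr)}\right] w(-\beta, \sqrt{\mu/\theta}),$$
$$E[W] \approx \frac{1}{\theta}\left[1 - \frac{h(\beta\sqrt{\mu/\theta})}{h\bigl(\beta\sqrt{\mu/\theta} + \sqrt{\theta/(N\mu)}\bigr)}\right] w(-\beta, \sqrt{\mu/\theta}),$$
$$E[\#\text{ busy agents}] \approx \frac{\lambda}{\mu}\left(1 - \left[1 - \frac{h(\beta\sqrt{\mu/\theta})}{h\bigl(\beta\sqrt{\mu/\theta} + \sqrt{\theta/(N\mu)}\bigr)}\right] w(-\beta, \sqrt{\mu/\theta})\right),$$
$$E[\#\text{ waiting in queue}] \approx \frac{\lambda}{\theta}\left[1 - \frac{h(\beta\sqrt{\mu/\theta})}{h\bigl(\beta\sqrt{\mu/\theta} + \sqrt{\theta/(N\mu)}\bigr)}\right] w(-\beta, \sqrt{\mu/\theta}),$$
$$P\{W > t\} \approx w(-\beta, \sqrt{\mu/\theta})\,\frac{h(\beta\sqrt{\mu/\theta})}{\psi\bigl(\beta\sqrt{\mu/\theta},\ \sqrt{N\mu\theta}\,t\bigr)}\,e^{-\theta t}, \quad t \ge 0,$$
$$P\{Ab \mid W > t\} \approx 1 - \frac{\psi\bigl(\beta\sqrt{\mu/\theta},\ \sqrt{N\mu\theta}\,t\bigr)}{\psi\bigl(\beta\sqrt{\mu/\theta} + \sqrt{\theta/(N\mu)},\ \sqrt{N\mu\theta}\,t\bigr)}\,e^{\theta t}, \quad t \ge 0.$$

REMARKS. (1) Along the same lines, we have developed further useful approximations, notably for $E[W \mid W > t]$ and $E[W \mid \text{Served}]$. These have been omitted due to excessive bulk.
(2) Some of the approximations above can also be derived via the diffusion limit of the queue-length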

process (Theorem 2). Note, however, that the approximating expressions thus arrived at are not identical, but they coincide as $N \to \infty$. For example, considering the fraction of customers abandoning, we have on the one hand
$$P\{Ab\} = E[1_{(X,\infty)}(V)] = \int_0^\infty\!\!\int_0^t \theta e^{-\theta x}\,dx\,dF_V(t) \approx \int_0^\infty \bigl(1 - e^{-\theta t/\sqrt{N}}\bigr)\,dF_v(t),$$
and on the other hand (using (2))
$$P\{Ab\} = \frac{\theta}{\lambda}\,E[\#\text{ waiting in queue}] \approx \frac{\theta\sqrt{N}}{\lambda}\int_0^\infty t\,dF_q(t) = \frac{\theta\mu\sqrt{N}}{\lambda}\int_0^\infty t\,dF_v(t).$$

5.3. Staffing Rules

It is important for a call center's manager to be able to anticipate the impact of changes on the service level. Examples of such a change are an increase in the call arrival rate due to a marketing campaign, or a change in the number of agents on shift. Most expressions for performance measures derived using the M/M/N+M model are quite complex. Even the approximations in §5.2 tend to be too complex to enable an understanding of how the values of the parameters affect the performance measure. It is desirable, therefore, to derive simple rules of thumb to support decision making. We have the following result, analogous to the result by Halfin and Whitt (1981) that concerns M/M/N queues.

THEOREM 4. Assume that $\theta_N \to \theta$, $0 < \theta < \infty$. Then
$$\lim_{N\to\infty} \sqrt{N}\,(1 - \rho_N) = \beta, \quad -\infty < \beta < \infty,$$
if and only if
$$\lim_{N\to\infty} P_N\{W > 0\} = \alpha, \quad 0 < \alpha < 1,$$
if and only if
$$\lim_{N\to\infty} \sqrt{N}\,P_N\{Ab\} = \Delta, \quad 0 < \Delta < \infty,$$
in which case
$$\alpha = w(-\beta, \sqrt{\mu/\theta}), \qquad \Delta = \bigl[\sqrt{\theta/\mu}\;h(\beta\sqrt{\mu/\theta}) - \beta\bigr]\,\alpha.$$
(Here $w$ and $h$ are as in Theorem 3 above.)

Table 4. Staffing Rules for Three Operational Regimes

Regime               Staffing Level                                          Guidelines
Rationalized         $N = [R + \beta\sqrt{R}]$, $-\infty < \beta < \infty$   $P\{W > 0\} \approx \alpha(\beta)$ and $P\{Ab\} \approx \Delta(\beta)/\sqrt{N}$
Quality-Driven       $N = [R + \gamma R]$, $\gamma > 0$                      $P\{W > 0\} \approx 0$ and $P\{Ab\} = o(1/\sqrt{N})$
Efficiency-Driven    $N = [R - \gamma R]$, $\gamma > 0$                      $P\{W > 0\} \approx 1$ and $P\{Ab\} \approx \gamma$

REMARK. This result holds at the extremes as well, namely $\beta = -\infty$ iff $\alpha = 1$ iff $\Delta = \infty$, and $\beta = \infty$ iff $\alpha = 0$ iff $\Delta = 0$.

We deduce from the above results that also for M/M/N+M queues the interesting (as explained in §4) limiting behavior is when $\rho_N \approx 1 - \beta/\sqrt{N}$, but here (in contrast to Halfin and Whitt 1981) $\beta$ is not restricted to be positive. In light of Theorem 4 and Theorem 1, we introduce in Table 4 three regimes of operation, with three matching staffing rules (this is a slightly extended version of Table 1 from §1).

REMARKS.
(1) The staffing level for the rationalized regime is derived directly from $\rho_N \approx 1 - \beta/\sqrt{N}$ (see §1 and §2 in Whitt (1992) for a detailed discussion). The extremes of Theorem 4 only set bounds for the staffing levels. Any staffing level such as $N = R \pm \gamma R^{a}$ with $\gamma > 0$, $1 \ge a > 0.5$, is adequate ($+$ for quality-driven, $-$ for efficiency-driven). However, we suggest taking $a = 1$, with which there is a clear differentiation between slightly underloaded call centers (quality-driven), slightly overloaded call centers (efficiency-driven), and the critically loaded call centers (rationalized).
(2) The guidelines above follow directly from Theorem 4, except for the fraction abandoning ($P\{Ab\}$) in the efficiency-driven regime that involves Theorem 1.

Following the result of Theorem 4, and continuing

in the spirit of Whitt (1992), we suggest $\beta$ (or $\alpha$) as a service grade. The main significance of this grade is for comparing two systems, in particular, in the case of a single system before and after an expected change. Once a manager has decided which of the three regimes of operation is suitable for her call center, she can determine the service grade and use the appropriate staffing rule. Moreover, analysis of empirical data (Figure 3 is representative) shows that, in practice, the value of $\beta$ lies in the range of 0.5 to 1.

We conclude by revisiting the scenario from §2: Suppose a given call center operates in the rationalized regime with $N$ agents, service rate $\mu$, and arrival rate $\lambda$. The service level is quantified by a service grade $\beta$. There is a forecast of a higher arrival rate $\hat\lambda$ during a holiday. The call center's manager wishes to maintain the service level at the call center, and needs to decide how many agents to have on shift ($\hat N$). Based on the appropriate staffing rule $N \approx \lambda/[\mu(1 - \beta/\sqrt{N})]$, we get
$$\hat N \approx \hat\lambda/\mu + \beta\sqrt{\hat\lambda/\mu}.$$
Moreover, the anticipated holiday performance is:
(1) Fraction waiting: $P\{W > 0\} \approx \alpha(\beta)$ (as in the original system).
(2) Fraction abandoning: $P\{Ab\} \approx \Delta(\beta)/\sqrt{\hat N}$ (decreased by a factor of $\sqrt{\hat N/N}$).

Acknowledgments

The authors thank the Editor-in-Chief, the Senior Editor, and the referees for an exceptionally thorough review process. Their helpful comments turned the very different originally submitted manuscript into the present, much more readable version. A. Mandelbaum thanks Professor I. Meilijson of Tel-Aviv University and Professor O. Kella of the Hebrew University in Jerusalem. These colleagues have contributed greatly to his understanding of call centers, and to the challenge and joy of analyzing them. Mandelbaum's research was supported by the fund for the promotion of research at the Technion, by the Technion V.P.R. funds Smoler Research Fund and B. and G. Greenberg Research Fund (Ottawa), and by the Israel Science Foundation (grant no. 388/99).

Appendix A. Accuracy of the Approximations

The approximations derived in §5.3 are based on heavy traffic limit theorems in which the number of agents and call volume are taken to infinity. It is therefore of practical interest to see how accurate these approximations are when applied to call centers that are not extraordinarily large. In Figures 4-8 we plot approximations ("+" signs) for a number of performance measures vs. exact values (solid line) based on the analysis of an M/M/N+M model. All cases assume a call volume of 50 calls per minute, each requiring an average handling time of one minute. Staffing levels range from 20 to 80 agents (implying traffic intensities from 0.625 up to 2.5!). Three different values of average patience are considered:
Graph a: 10 minutes (very patient).
Graph b: 1 minute (moderately patient).
Graph c: 6 seconds (very impatient).

Figure 4. Approximating $P\{\text{Wait} > 0\}$
Figure 5. Approximating $P\{\text{Abandon}\}$

The main conclusion from this display is that in most cases these approximations are excellent (for any practical use) even in the case

of a medium-size call center handling moderate traffic intensities. Some additional, specific remarks are stated below.

Figure 6. Approximating $P\{\text{Wait} > 10\text{ secs.}\}$
Figure 7. Approximating $P\{\text{Abandon} \mid \text{Wait} > 10\text{ secs.}\}$
Figure 8. Approximating $E[\text{Wait} \mid \text{Served}]$

REMARKS. (1) Figure 5 shows only a single graph (b) because the graphs (both exact and approximate) of the other two cases (a and c) practically coincide with it. This is easy to explain from our theoretical results. By Theorem 1, to leading order the abandonment probability with $\rho > 1$ does not depend on $\theta$. Although the abandonment probability with $\rho < 1$ is more sensitive to $\theta$ (the leading order term is zero), the differences do not show up due to the scale of the graph.
(2) In Figure 7 the exact-value graphs for cases b and c were not calculated for all values up to 80 agents, due to numerical difficulties in obtaining these values. Specifically, because we use $P\{Ab \mid W > t\} = P\{Ab;\ W > t\}/P\{W > t\}$, when $P\{W > t\}$ becomes extremely small (see Figure 6), one encounters precision difficulties. To overcome such problems, the exact-value graph for case c was produced using simulations. This graph is limited to 50 agents because as the number of agents increases, the event in which a call waits in queue for more than 10 seconds becomes extremely rare, thus requiring very long simulations. The difficulties we encountered here provide a good example of the benefits of having approximations for such performance measures. Indeed, the approximation is not as accurate as the others, but the values it provides are useful and capture exact behavior.
(3) Note that the scale in Figure 8 is of log-type ($\log_{10}$), for benefit of the clarity of display. The upper range of the graphs (a, b, and c) in this case is 550, 55, and 5 seconds, respectively.

Appendix B. Calculating $E[f(V, X)]$ in an M/M/N/B+M Model

To calculate $E[f(V, X)]$, we start with the following decomposition:
$$E[f(V, X)] = E[f(V, X)\,1_{(0,\infty)}(V)] + E[f(V, X)\,1_{\{0\}}(V)] = E[f(V, X)\,1_{(0,\infty)}(V)] + \Bigl(\pi_B + \sum_{k=0}^{N-1} \pi_k\Bigr)\,E[f(0, X)]. \qquad (4)$$
Here we use $\pi$ to denote the stationary distribution of the queue-length process $Q(t)$, namely
$$\pi_n = \lim_{t\to\infty} P\{Q(t) = n\}, \qquad n = 0, 1, 2, \ldots, B.$$
A general expression for these probabilities is given by

$$\pi_k = \begin{cases} \dfrac{(\lambda/\mu)^k}{k!}\,\pi_0, & 0 \le k \le N, \\[8pt] \displaystyle\prod_{j=N+1}^{k} \frac{\lambda}{N\mu + (j - N)\theta}\;\frac{(\lambda/\mu)^N}{N!}\,\pi_0, & N < k \le B, \end{cases}$$
where
$$\pi_0 = \left[\sum_{k=0}^{N} \frac{(\lambda/\mu)^k}{k!} + \sum_{k=N+1}^{B}\ \prod_{j=N+1}^{k} \frac{\lambda}{N\mu + (j - N)\theta}\;\frac{(\lambda/\mu)^N}{N!}\right]^{-1}.$$

REMARK. For a blocked customer (i.e., the queue was full upon his arrival) the convention $V = 0$ is introduced.

For all functions $f$ which seem of interest in our case, $E[f(0, X)]$ evaluates to zero or one. Therefore, we proceed to calculate the first expression. We present three different methods for performing this calculation, each with its own virtues and drawbacks.

Our calculations require the distribution function of $V$. Recall that $V$ is the potential waiting time of a "typical" customer. What is meant by a "typical" customer? Consider the sequence $\{w_n,\ n \in \mathbb{N}\}$, where $w_n$ is the potential waiting time of the $n$th customer. Let $F_w$ be the stationary distribution of this sequence. Quoting from Baccelli and Hebuterne (1981), $F_w$ is also the stationary distribution of the process $\eta(t)$, the virtual waiting time at time $t$ (i.e., the time spent waiting in queue of a hypothetical infinitely patient customer arriving at time $t$). Therefore, a typical customer's potential waiting time, $V$, has distribution function $F_w$.

Similarly, we are interested in $V_n$, which is a random variable whose distribution is that of $V$ given $n$ customers in queue upon arrival, and all agents busy, $n = 0, 1, \ldots$; $V_n$ has distribution function $F_n$. The distribution of $V$ is not given beforehand, and is derived through analysis of the model. On the other hand, $V_n$ can be expressed as the sum of $n + 1$ independent exponential random variables with parameters $N\mu,\ N\mu + \theta,\ \ldots,\ N\mu + n\theta$, the $i$th of these representing the period of time the customer spent in the $i$th place in queue, before advancing to the $(i - 1)$th (due to end of service or abandonment from the queue in front of him).

Method A. Conditioning on the number of customers in the queue upon arrival, and substituting the explicit expression given by Riordan (1962) (Equation (83) on p. 111) for $\bar F_n(t) = 1 - F_n(t)$, we have
$$E[f(V, X)\,1_{(0,\infty)}(V)] = \pi_N\,c \sum_{k=0}^{B-N-1} \frac{(-1)^k (\lambda/\theta)^k}{k!} \left[\sum_{n=k}^{B-N-1} \frac{(\lambda/\theta)^{n-k}}{(n - k)!}\right] I(k), \qquad (5)$$
where
$$I(k) = \theta^2 \int_0^\infty\!\!\int_0^\infty f(t, x)\,e^{-(c+k)\theta t}\,e^{-\theta x}\,dx\,dt \quad \text{and} \quad c = N\mu/\theta.$$
Calculating the values of $I(k)$ is usually a simple task. The main drawback of this method is the alternating signs in the first sum, which cause it to be numerically unstable. Therefore, we present the next method, which avoids this problem.

Method B. Starting similarly to Method A, and using the relation
$$\sum_{k=0}^{n} \binom{n}{k} \bigl(-e^{-\theta t}\bigr)^k = \bigl(1 - e^{-\theta t}\bigr)^n$$
to eliminate one sum, we arrive at
$$E[f(V, X)\,1_{(0,\infty)}(V)] = \pi_N\,c\,\theta^2 \sum_{n=0}^{B-N-1} \frac{(\lambda/\theta)^n}{n!}\,J(n), \qquad (6)$$
where
$$J(n) = \int_0^\infty\!\!\int_0^\infty f(t, x)\,e^{-\theta(x + ct)}\,\bigl(1 - e^{-\theta t}\bigr)^n\,dx\,dt. \qquad (7)$$
Here, calculating the values of $J(n)$ tends to be more costly because the integrals must usually be solved numerically.

These methods lose some of their attractiveness when dealing with infinite buffers ($B = \infty$). Then, sums appearing in both methods become infinite, and must be truncated at some point for implementation (the alternating signs in Method A can be problematic in the aspect of truncation). Because this case forces us to consider the issue of precision tolerance, we present the third method, which is a straightforward numerical integration.

Method C. Following Riordan (1962), and solving the more general case of any buffer size $B$, we arrive at the function $f_V$, where $f_V/P\{V > 0\}$ is a density function, given by
$$f_V(t) = \pi_N N\mu \left[1 - \frac{\gamma\bigl(B - N,\ \tfrac{\lambda}{\theta}(1 - e^{-\theta t})\bigr)}{\Gamma(B - N)}\right] \exp\Bigl\{\tfrac{\lambda}{\theta}\bigl(1 - e^{-\theta t}\bigr) - N\mu t\Bigr\}, \quad t \ge 0. \qquad (8)$$
Here $\Gamma$ and $\gamma$ denote the gamma and incomplete gamma functions, respectively, defined by
$$\Gamma(x) = \int_0^\infty t^{x-1} \exp(-t)\,dt \quad \text{and} \quad \gamma(x, y) = \int_0^y t^{x-1} \exp(-t)\,dt, \quad y \ge 0.$$
Now we are left with the evaluation of the double integral
$$E[f(V, X)\,1_{(0,\infty)}(V)] = \int_0^\infty\!\!\int_0^\infty f(t, x)\,\theta e^{-\theta x}\,f_V(t)\,dx\,dt. \qquad (9)$$
The integral with respect to $x$ is usually solved analytically and rather easily (depending on $f$), leaving us to perform one numerical integration (with respect to $t$). Some additional remarks concerning the infinite buffer case follow.

REMARKS. (1) When the system's buffer is unlimited, solving the stationary distribution equations involves an infinite sum.
A solution is given by Palm (1943), expressing the stationary distribution as a function of the easily calculated blocking probability in an M/M/N/N system (denoted here by $P\{Bl\}$), with the same arrival and service rates:

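$$\pi_n = \begin{cases} \pi_N\,\dfrac{N!}{n!\,(\lambda/\mu)^{N-n}}, & 0 \le n \le N, \\[8pt] \pi_N\,\dfrac{(\lambda/\theta)^{n-N}}{\prod_{k=1}^{n-N} (N\mu/\theta + k)}, & n > N, \end{cases} \qquad \pi_N = \frac{P\{Bl\}}{1 + \bigl[A\bigl(\tfrac{\lambda}{N\mu}, \tfrac{N\mu}{\theta}\bigr) - 1\bigr] P\{Bl\}},$$
where
$$A(x, y) = \frac{y\,e^{xy}}{(xy)^y}\,\gamma(y, xy).$$
(2) For $B = \infty$ the density function $f_V$ given here becomes a special case of the result by Baccelli and Hebuterne (1981) for an M/M/N+G model with patience distribution $F$, namely
$$f_V(t) = \pi_N N\mu \exp\Bigl\{\lambda \int_0^t (1 - F(u))\,du - N\mu t\Bigr\}, \quad t \ge 0.$$

Appendix C. Outlines of Proofs

Outlines for the proofs of Theorems 1-4 follow.

PROOF OF THEOREM 1. We first point out an intuitive approach for the overloaded case ($\rho > 1$), based on the fact that in systems with many agents it is possible to achieve very high utilization. Indeed, in an M/M/N+M model the utilization is given by the rate at which work reaches the agents ($\lambda_N(1 - P_N\{Ab\})$) divided by the maximum rate at which it can be processed ($N\mu$). Thus when $N \to \infty$, assuming that the utilization is approximately 1, we obtain the result.

We now proceed with the rigorous proof, based on bounding the sequence $\{P_N\{Ab\}\}$ from above and below, with the two bounds converging to the desired limit. We begin with the lower bound, which is more intuitive. The utilization of agents in an M/M/N+M queue in steady state must be less than one. Therefore,
$$\lambda_N(1 - P_N\{Ab\}) < N\mu,$$
from which we obtain
$$\liminf_{N\to\infty} P_N\{Ab\} \ge 1 - \frac{1}{\rho}.$$
Before turning to the upper bound, we note two monotonicity properties of $P\{Ab\}$ that are proved in Bhattacharya and Ephremides (1991):
(i) With $N$, $\mu$, and $\theta$ fixed, $P\{Ab\}$ is increasing in $\lambda$ (or $\rho$).
(ii) With $N$, $\lambda$, and $\mu$ fixed, $P\{Ab\}$ is increasing in $\theta$.

Now we deal with the upper bound. Note that $P_N\{Bl\}$, the probability of blocking in an M($\lambda_N$)/M($\mu$)/N/N queue, is the limit of $P_N\{Ab\}$ as $\theta \to \infty$ in the M($\lambda_N$)/M($\mu$)/N+M($\theta$) queue. Thus, by (ii) above, $P_N\{Ab\} \le P_N\{Bl\}$ for any $\theta$ with $0 < \theta < \infty$. If $\lambda_N = \rho N\mu$, with $\rho > 1$, it was shown by Jagerman (1974, p. 538), that
$$P_N\{Bl\} = \left[\frac{\rho}{\rho - 1} - \frac{\rho}{N(\rho - 1)^3} + \frac{2\rho^2 + \rho}{N^2(\rho - 1)^5} - \cdots\right]^{-1}. \qquad (10)$$
We deal with $\lambda_N = \rho N\mu + o(N)$ as follows. Choose $0 < \varepsilon < \rho - 1$, and define $\lambda_N^{+} = (\rho + \varepsilon)N\mu$, $\lambda_N^{-} = (\rho - \varepsilon)N\mu$. Hence we have (through Jagerman's result and monotonicity)
$$1 - \frac{1}{\rho - \varepsilon} \le \liminf_{N\to\infty} P_N\{Bl\} \le \limsup_{N\to\infty} P_N\{Bl\} \le 1 - \frac{1}{\rho + \varepsilon},$$
and taking $\varepsilon \to 0$ yields
$$\limsup_{N\to\infty} P_N\{Ab\} \le \lim_{N\to\infty} P_N\{Bl\} = 1 - \frac{1}{\rho} \quad \text{for } \rho > 1.$$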
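We complete the proof for $\rho \le 1$ using (i) above: Because $P_N\{Ab\}$ with $\rho \le 1$ must be smaller than with $\rho = 1 + \varepsilon$ for any $\varepsilon > 0$, we have that for $\rho \le 1$
$$\limsup_{N\to\infty} P_N\{Ab\} \le \lim_{\varepsilon \to 0} \left[1 - \frac{1}{1 + \varepsilon}\right] = 0.$$

An extended version of Theorem 2 that includes diffusion limits for the cases $\theta = 0$ and $\theta = \infty$ follows.

THEOREM 2*. Assume that
$$\lim_{N\to\infty} \sqrt{N}\,(1 - \rho_N) = \beta,\ \ -\infty < \beta < \infty, \qquad \lim_{N\to\infty} \theta_N = \theta,\ \ 0 \le \theta \le \infty.$$
If $q_N(0) \Rightarrow q(0)$, then
(1) Weak convergence: $q_N \Rightarrow q$, where $q$ is the unique solution of a stochastic differential equation, according to the following regimes:

$\theta = 0$:
$$dq(t) = f(q(t))\,dt + \sqrt{2\mu}\,db(t), \qquad f(x) = \begin{cases} -\mu(\beta + x), & x \le 0, \\ -\mu\beta, & x > 0. \end{cases}$$
$0 < \theta < \infty$:
$$dq(t) = f(q(t))\,dt + \sqrt{2\mu}\,db(t), \qquad f(x) = \begin{cases} -\mu(\beta + x), & x \le 0, \\ -(\mu\beta + \theta x), & x > 0. \end{cases}$$
$\theta = \infty$:
$$dq(t) = -\mu(\beta + q(t))\,dt + \sqrt{2\mu}\,db(t) - dY(t), \qquad q \le 0;\ \ Y(0) = 0,\ \ Y \text{ nondecreasing},\ \ \int_0^\infty q\,dY = 0.$$
(2) Interchangeable limits: $\lim_{N\to\infty} P\{q_N(\infty) \le x\} = \lim_{t\to\infty} P\{q(t) \le x\}$.

PROOF OF THEOREM 2*, PART 1. We will deal with each of the three cases (corresponding to the value of $\theta$) separately. When $\theta = 0$ and $0 < \theta < \infty$, Stone's criteria (1963) hold, hence the limiting process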

is easily found through the convergence of the infinitesimal expectation and variance. The case $\theta = \infty$ is more difficult because the state space shrinks. We omit a discussion of uniqueness and refer readers to Dupuis and Ishii (1993) and Mandelbaum and Pats (1995).

$\theta = 0$: Here the abandonment rate $\theta_N$ converges to 0. As $N$ grows, the abandonment becomes less significant, and indeed the limiting process is identical to the heavy traffic limit of a sequence of M/M/N queues (Halfin and Whitt 1981). The proof in this case is almost identical to that in Halfin and Whitt (see the proof of Theorem 2) by using Stone's criteria: The state space of the rescaled process $q_N$ becomes dense in $\mathbb{R}$ as $N \to \infty$; the infinitesimal expectation ($m_N$) and variance ($\sigma_N^2$) are given by
$$m_N(x) = \begin{cases} \dfrac{\lambda_N - N\mu}{\sqrt{N}} - \mu x, & x \le 0, \\[8pt] \dfrac{\lambda_N - N\mu}{\sqrt{N}} - \theta_N x, & x > 0, \end{cases} \qquad \sigma_N^2(x) = \begin{cases} \dfrac{\lambda_N + N\mu}{N} + \dfrac{\mu x}{\sqrt{N}}, & x \le 0, \\[8pt] \dfrac{\lambda_N + N\mu}{N} + \dfrac{\theta_N x}{\sqrt{N}}, & x > 0, \end{cases}$$
converging, as $N \to \infty$, to
$$\lim_{N\to\infty} m_N(x) = \begin{cases} -\mu(\beta + x), & x \le 0, \\ -\mu\beta, & x > 0, \end{cases} \qquad \lim_{N\to\infty} \sigma_N^2(x) = 2\mu.$$

$0 < \theta < \infty$: This case appears in Fleming et al. (1994) as a conjecture without a proof, with slightly different centering. It can be proved either as in the case $\theta = 0$ or using Fleming et al.

$\theta = \infty$: Here the proof is more complex. Stone's criteria do not hold because the state space of the limiting process shrinks to $(-\infty, 0]$, exhibiting reflection at the origin. We circumvent this difficulty as follows: Let $X_N = Q_N - N$, and define two complementary and disjoint subsets of $\mathbb{R}_+$, corresponding to the times $X_N$ spent in $(-\infty, 0]$ or in $(0, \infty)$. Thus via time changes we obtain (from $X_N$) two processes, each existing in a different part of $\mathbb{R}$. We then show that the process existing in $(-\infty, 0]$ converges to the proposed limit. This is achieved using the procedure introduced in Mandelbaum and Pats (1995). Because the process $X_N$ makes alternating excursions to $(-\infty, 0]$ ("negative excursions") and $(0, \infty)$ ("positive excursions"), by showing that the duration of the negative excursions is of order $1/\sqrt{N}$ and that of the positive excursions is $o(1/\sqrt{N})$, we conclude that the time spent by $X_N$ in $(0, \infty)$ becomes negligible as $N \to \infty$. The proof is then completed using an Inverse Random Time Change Theorem (see the Appendix in Nguyen 1993).
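The limiting diffusion of the case $0 < \theta < \infty$ can also be explored numerically. The following is an illustrative Euler-Maruyama sketch, not part of the proof: it simulates $dq = f(q)\,dt + \sqrt{2\mu}\,db$ with the piecewise drift of Theorem 2 and records the long-run fraction of time with $q > 0$, which should be close to $w(-\beta, \sqrt{\mu/\theta})$. The step size, horizon, seed and function names are arbitrary choices made for the example.

```python
import math
import random

def drift(x, beta, mu, theta):
    """Piecewise-linear drift f(x) of the limiting diffusion (Theorem 2)."""
    if x > 0:
        return -(mu * beta + theta * x)   # queue present: abandonment pulls back
    return -mu * (beta + x)               # idle agents: service pulls back

def fraction_positive(beta, mu, theta, dt=0.02, steps=200_000, seed=1):
    """Euler-Maruyama estimate of the long-run fraction of time with q > 0."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(2.0 * mu * dt)
    q = -beta                              # start at the equilibrium of the x <= 0 branch
    positive = 0
    for _ in range(steps):
        q += drift(q, beta, mu, theta) * dt + noise_sd * rng.gauss(0.0, 1.0)
        if q > 0:
            positive += 1
    return positive / steps

# With theta == mu the process is a single Ornstein-Uhlenbeck process with
# mean -beta and unit variance, so the target value is 1 - Phi(beta).
estimate = fraction_positive(beta=0.5, mu=1.0, theta=1.0)
target = 0.5 * math.erfc(0.5 / math.sqrt(2.0))
```

The estimate carries both discretization bias and Monte Carlo noise, so it should be compared with the closed-form value only up to a loose tolerance.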
PROOF OF THEOREM 2*, PART 2. The proof of the interchangeable limits is done through specific calculation of both cases, namely the stationary distribution of the diffusion limits (right-hand side, Rhs below) and the weak limit of the stationary distributions (left-hand side, Lhs below). Here we also deal separately with the three cases corresponding to the value of $\theta$.

Rhs: First, we find the stationary distribution of the diffusion limits. This is accomplished by using the results of Browne and Whitt (1995, §18.3). They provide a simple procedure for calculating the density function ($f(x)$) of the stationary distribution for diffusion processes that have piecewise continuous parameters; reflecting boundary points, if finite, or inaccessible, if infinite. Following their procedure we obtain

$\theta = 0$:
$$f(x) = \begin{cases} \alpha(\beta)\,\beta\,\dfrac{\varphi(x + \beta)}{\varphi(\beta)}, & x \le 0, \\[8pt] \alpha(\beta)\,\beta \exp(-\beta x), & x > 0, \end{cases}$$
where $\alpha(\beta) = [1 + \beta\,\Phi(\beta)/\varphi(\beta)]^{-1}$ is the delay probability of Halfin and Whitt (1981).

$0 < \theta < \infty$:
$$f(x) = \begin{cases} \sqrt{\theta/\mu}\;h(\beta\sqrt{\mu/\theta})\;w(-\beta, \sqrt{\mu/\theta})\,\dfrac{\varphi(x + \beta)}{\varphi(\beta)}, & x \le 0, \\[8pt] \sqrt{\theta/\mu}\;h(\beta\sqrt{\mu/\theta})\;w(-\beta, \sqrt{\mu/\theta})\,\dfrac{\varphi\bigl(x\sqrt{\theta/\mu} + \beta\sqrt{\mu/\theta}\bigr)}{\varphi(\beta\sqrt{\mu/\theta})}, & x > 0. \end{cases}$$

$\theta = \infty$:
$$f(x) = \begin{cases} \dfrac{\varphi(x + \beta)}{\Phi(\beta)}, & x \le 0, \\[6pt] 0, & x > 0. \end{cases}$$

REMARK. When $\theta = 0$, a stationary distribution exists only for positive values of $\beta$.

Lhs: Now we find the weak limit (if it exists) of the sequence of stationary distributions $\{q_N(\infty),\ N = 1, 2, \ldots\}$. Note that these distributions always exist because $\theta_N > 0$. Our discussion is in terms of the sequence of cumulative distribution functions, denoted by $\{F_N\}$ converging to $F$. We deal separately with the intervals $x \le 0$ (corresponding to $Q_N(\infty) \le N$ in the original system) and $x > 0$. Given $x \le 0$ there is no queue and therefore no abandonment. Hence the conditional distribution (denoted $F^{-}$) is identical to that emerging from a sequence of M/M/N/N queues, namely
$$F^{-}(x) = \begin{cases} \dfrac{\Phi(x + \beta)}{\Phi(\beta)}, & x \le 0, \\[6pt] 1, & x > 0. \end{cases}$$
This leaves us with determining $F$ on $x > 0$. For the $0 < \theta < \infty$ case we quote the result by Fleming et al. (1994), with a slight adjustment since their rescaling is $\tilde q_N = (Q_N - \lambda_N/\mu)/\sqrt{N}$. This difference only amounts to a shift of the distribution

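$$q_N(\infty) = \frac{Q_N(\infty) - N}{\sqrt{N}} = \frac{Q_N(\infty) - \lambda_N/\mu}{\sqrt{N}} - \sqrt{N}\,(1 - \rho_N) \approx \tilde q_N(\infty) - \beta.$$
Therefore, the density of $q(\infty)$ is obtained by shifting the density of $\tilde q(\infty)$ by $\beta$, which yields
$$f(x) = \begin{cases} \sqrt{\theta/\mu}\;h(\beta\sqrt{\mu/\theta})\;w(-\beta, \sqrt{\mu/\theta})\,\dfrac{\varphi(x + \beta)}{\varphi(\beta)}, & x \le 0, \\[8pt] \sqrt{\theta/\mu}\;h(\beta\sqrt{\mu/\theta})\;w(-\beta, \sqrt{\mu/\theta})\,\dfrac{\varphi\bigl(x\sqrt{\theta/\mu} + \beta\sqrt{\mu/\theta}\bigr)}{\varphi(\beta\sqrt{\mu/\theta})}, & x > 0. \end{cases}$$
Now we use this result as an upper and lower bound for the $\theta = 0$ and $\theta = \infty$ cases, respectively.

When $\theta = 0$ we must assume $\beta > 0$, otherwise the sequence is not tight. Denoting $\hat F_N(x) = P\{q_N(\infty) \le x \mid q_N(\infty) > 0\}$, we find the limit of this sequence, by "sandwiching" it between two converging sequences with a common limit. The lower sequence (bounding from below) is of conditional stationary distributions corresponding to a sequence of M/M/N queues, denoted by $\{\underline{F}_N\}$. According to Halfin and Whitt (1981), this sequence has a limit
$$\underline{F}(x) = \begin{cases} 0, & x \le 0, \\ 1 - e^{-\beta x}, & x > 0. \end{cases}$$
As stated above, the upper sequence corresponds to a sequence of M/M/N+M queues with $\theta > 0$, and is denoted by $\{\overline{F}{}_N^{\theta}\}$. Here we have that
$$\overline{F}{}^{\theta}(x) = 1 - \frac{1 - \Phi\bigl(x\sqrt{\theta/\mu} + \beta\sqrt{\mu/\theta}\bigr)}{1 - \Phi(\beta\sqrt{\mu/\theta})} = 1 - \frac{h(\beta\sqrt{\mu/\theta})\;\varphi\bigl(x\sqrt{\theta/\mu} + \beta\sqrt{\mu/\theta}\bigr)}{h\bigl(x\sqrt{\theta/\mu} + \beta\sqrt{\mu/\theta}\bigr)\;\varphi(\beta\sqrt{\mu/\theta})}.$$
By taking $\theta \to 0$ and relying on the asymptotic behavior of $h(t)$ as $t \to \infty$, we get
$$\lim_{\theta \to 0} \overline{F}{}^{\theta}(x) = 1 - \lim_{\theta \to 0} \frac{\beta\sqrt{\mu/\theta}}{x\sqrt{\theta/\mu} + \beta\sqrt{\mu/\theta}} \exp\Bigl\{-\tfrac{1}{2}\Bigl(\tfrac{\theta}{\mu}x^2 + 2\beta x\Bigr)\Bigr\} = 1 - e^{-\beta x}.$$
This has completed the "sandwich." These results, put together, yield the density function for this case:
$$f(x) = \begin{cases} \alpha(\beta)\,\beta\,\dfrac{\varphi(x + \beta)}{\varphi(\beta)}, & x \le 0, \\[8pt] \alpha(\beta)\,\beta \exp(-\beta x), & x > 0. \end{cases}$$
Finally, by taking $\theta \to \infty$ in the $0 < \theta < \infty$ case we get that for $\theta = \infty$ all the mass of the distribution is concentrated in $x \le 0$, and $F = F^{-}$. Therefore, for this case we have
$$f(x) = \begin{cases} \dfrac{\varphi(x + \beta)}{\Phi(\beta)}, & x \le 0, \\[6pt] 0, & x > 0. \end{cases}$$

PROOF OF THEOREM 3. We start out by showing that $\sqrt{N}\,\eta_N \Rightarrow [q/\mu]^+$, a result that relies on a corollary by Puhalskii (1994) dealing with first passage times. Most of the notation we use here follows the example in Puhalskii (pp. 951-954), replacing the superscript $n$ with subscript $N$ for the parameters and processes corresponding to a model with $N$ agents. Hence we have $Q_N = \{Q_N(t),\ t \ge 0\}$, $A_N = \{A_N(t),\ t \ge 0\}$, $D_N = \{D_N(t),\ t \ge 0\}$, as the queue, arrival, and departure processes, respectively. Let $w_N(t)$ be the virtual waiting time at $t$:
$$w_N(t) = \inf\{s \ge 0 : D_N(s + t) \ge Q_N(0) + A_N(t) - (N - 1)\}.$$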
We define rescaled processes

$$X_N(t) = \frac{1}{N} D_N(t), \qquad Y_N(t) = \frac{1}{N} A_N(t), \qquad K_N(t) = \frac{1}{N} Q_N(t),$$

and an additional process $Z_N$, characterized via $w_N(t) = (Z_N(t) - t)^+$, or equivalently

$$Z_N(t) = \inf\{s \ge 0 : X_N(s) \ge Y_N(t) + K_N(0) - (1 - 1/N)\}.$$

Now introduce $X(t) = \mu t$ (so that $X'(t) = \mu$), $Y(t) = \mu t$, $K(0) = 1$, and a first passage time

$$Z(t) = \inf\{s \ge 0 : X(s) \ge Y(t)\},$$

noting that $Z(t) = t$. Finally, let $U_N = \sqrt{N}\,(X_N - X)$ and $V_N = \sqrt{N}\,\big(Y_N + K_N(0) - (1 - 1/N) - Y\big)$, so that

$$U_N(t) \Rightarrow q(0) - \mu\beta t + b(t) - q(t), \qquad V_N(t) \Rightarrow -\mu\beta t + b(t) + q(0).$$

From here, applying Puhalskii (1994) and the result of Theorem 2* for the $0 < \theta < \infty$ case, we get

$$\sqrt{N}\,(Z_N(t) - t) \Rightarrow \frac{q(t)}{\mu},$$

which yields, through continuous mapping,

$$\sqrt{N}\,w_N(t) = \sqrt{N}\,(Z_N(t) - t)^+ \Rightarrow \left[\frac{q(t)}{\mu}\right]^+,$$

completing the proof.

Part 1: Follows immediately from Theorem 2* ($0 < \theta < \infty$ case), where the parameters of the diffusion process $q$ are provided, and the density of $q(\infty)$ is given (in the proof above).

Part 2: Note that $q_N$ has a stationary distribution and let $q_N(0)$ have this distribution. Hence, for all $t \ge 0$, $q_N(t)$ has this distribution. Therefore, the scaled virtual waiting time at $t$ also has the same distribution for all $t \ge 0$. Using the opening result completes the proof.

PROOF OF THEOREM 4. The directions going from the center ($\beta$) outward are by-products of Lemmas 1 and 2 below, as are the explicit expressions for $\alpha$ and $\Delta$. The remaining directions are

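dealt with by taking $\beta$ up to $\infty$ and down to $-\infty$, using the known asymptotic behavior of $h(t)$: $h(t) \sim t$, $t \to \infty$.

$\beta \to \infty$: An increase in $\beta$ represents a decrease in congestion, and therefore $\alpha$ (and $\Delta$) also decreases. The limit of $\Delta$ is found by upper bounding the fraction abandoning with the fraction blocked in an M/M/N/N queue. Hence, as $\beta \to \infty$,

$$\lim_{\beta \to \infty} w\big(-\beta,\sqrt{\mu/\theta}\big) = \lim_{\beta \to \infty} \left[1 + \frac{h\big(\beta\sqrt{\mu/\theta}\big)}{\sqrt{\mu/\theta}\;h(-\beta)}\right]^{-1} = 0,$$

$$\lim_{\beta \to \infty} \lim_{N \to \infty} \sqrt{N}\,P_N\{Ab\} \le \lim_{\beta \to \infty} \lim_{N \to \infty} \sqrt{N}\,P_N\{Bl\} = \lim_{\beta \to \infty} h(-\beta) = 0.$$

$\beta \to -\infty$: Here we use the reverse argument, bounding from below:

$$\lim_{\beta \to -\infty} w\big(-\beta,\sqrt{\mu/\theta}\big) = \lim_{\beta \to -\infty} \left[1 + \frac{h\big(\beta\sqrt{\mu/\theta}\big)}{\sqrt{\mu/\theta}\;h(-\beta)}\right]^{-1} = 1,$$

$$\lim_{\beta \to -\infty} \Big[\sqrt{\theta/\mu}\;h\big(\beta\sqrt{\mu/\theta}\big) - \beta\Big] = \infty.$$

LEMMA 1.
$$\lim_{N \to \infty} P_N\{W > 0\} = \begin{cases} \big[1 + \beta\,\Phi(\beta)/\varphi(\beta)\big]^{-1}, & \theta = 0,\\ w\big(-\beta,\sqrt{\mu/\theta}\big), & 0 < \theta < \infty,\\ 0, & \theta = \infty. \end{cases}$$

PROOF. Calculating directly, we have

$$\lim_{N \to \infty} P_N\{W > 0\} = \lim_{N \to \infty} P\{Q_N(\infty) \ge N\} = P\{q(\infty) \ge 0\}.$$

Hence, this result is arrived at through simple integration of the densities found in part 2 of the proof of Theorem 2*.

LEMMA 2. Assume $0 < \theta < \infty$. Then

$$\lim_{N \to \infty} \sqrt{N}\,P_N\{Ab\} = \Big[\sqrt{\theta/\mu}\;h\big(\beta\sqrt{\mu/\theta}\big) - \beta\Big]\, w\big(-\beta,\sqrt{\mu/\theta}\big).$$

PROOF. First, we express $P_N\{Ab\}$ as a function of $P_N\{W > 0\}$ and $P_N\{Bl\}$, whose asymptotic behavior is known (see Jagerman 1974 and Lemma 1). In §5.1 we give the balance equation $P_N\{Ab\} = \theta\,E[W]$, which can be rewritten as $P_N\{Ab\} = \theta\,E[W \mid W > 0]\,P_N\{W > 0\}$. Inserting Riordan's (1962) expression for the conditional expectation, we get

$$P_N\{Ab\} = \left[\frac{(\lambda_N/\theta)^{N\mu/\theta - 1}\,e^{-\lambda_N/\theta}}{\gamma(N\mu/\theta,\ \lambda_N/\theta)} + 1 - \frac{1}{\rho_N}\right] P_N\{W > 0\},$$

where $\gamma$ is the lower incomplete gamma function. Through Palm's (1943) representation, which connects $P_N\{W > 0\}$ with the blocking probability $P_N\{Bl\}$, we obtain after a few simple manipulations

$$P_N\{Ab\} = \left[1 - \frac{1}{\rho_N}\left(1 - \frac{P_N\{Bl\}}{1 - P_N\{Bl\}} \cdot \frac{1 - P_N\{W > 0\}}{P_N\{W > 0\}}\right)\right] P_N\{W > 0\}.$$

Finally, separating the terms, we get

$$P_N\{Ab\} = \left(1 - \frac{1}{\rho_N}\right) P_N\{W > 0\} + \frac{1}{\rho_N} \cdot \frac{P_N\{Bl\}}{1 - P_N\{Bl\}}\,\big(1 - P_N\{W > 0\}\big).$$

Now multiplying by $\sqrt{N}$ and taking $N \to \infty$, with $\sqrt{N}(1 - \rho_N) \to \beta$, $\sqrt{N}\,P_N\{Bl\} \to h(-\beta)$, and $P_N\{W > 0\} \to w(-\beta,\sqrt{\mu/\theta})$, we get

$$\lim_{N \to \infty} \sqrt{N}\,P_N\{Ab\} = \left[\frac{h(-\beta)\,\big(1 - w(-\beta,\sqrt{\mu/\theta})\big)}{w(-\beta,\sqrt{\mu/\theta})} - \beta\right] w\big(-\beta,\sqrt{\mu/\theta}\big),$$

which completes the proof because $h(x)\,(1 - w(x, y)) = (1/y)\,h(-xy)\,w(x, y)$.

References

Ancker, C.J. Jr., A.V. Gafarian. 1963. Queueing with reneging and multiple heterogeneous servers. Naval Res. Logist. Quart. 10 125-149.
Baccelli, F., G. Hebuterne. 1981. On queues with impatient customers. F.J. Kylstra, ed. Performance '81. North-Holland, Amsterdam, The Netherlands, 159-179.
Bhattacharya, P.P., A. Ephremides. 1991. Stochastic monotonicity properties of multiserver queues with impatient customers. J. Appl. Probab. 28 673-682.
Borst, S., A. Mandelbaum, M. Reiman. 2000.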
Dimensioning of large call centers. Submitted for publication. ie.technion.ac.il/serveng.
Boxma, O.J., P.R. de Waal. 1994. Multiserver queues with impatient customers. The Fundamental Role of Teletraffic in the Evolution of Telecommunications Networks. J. Labetoulle and J.W. Roberts, eds. Elsevier, Amsterdam, The Netherlands, 743-756.
Brandt, A., M. Brandt. 1998. On a two-queue priority system with impatience and its application to a call center. Preprint.
Brandt, A., M. Brandt. 1999. On the M(n)/M(m)/s queue with impatient calls. Performance Evaluation 35 1-18.
Browne, S., W. Whitt. 1995. Piecewise-linear diffusion processes. J.H. Dshalalow, ed. Probability and Stochastic Series: Advances in Queueing. Theory, Methods, and Open Problems. CRC Press, Boca Raton, FL, 463-480.
Cleveland, B., J. Mayben. 1997. Call Center Management on Fast Forward. Call Center Press, Annapolis, MD.
Dupuis, P., H. Ishii. 1993. SDE's with oblique reflection on nonsmooth domains. Ann. Probab. 21 554-580.
Erlang, A.K. 1909. The theory of probabilities and telephone conversations. Nyt Tidsskrift Mat. B 20 33-39.
Erlang, A.K. 1917. Solutions of some problems in the theory of probabilities of significance in automatic telephone exchanges. Electroteknikeren (Danish) 13 5-13. English translation. 1917-1918. P.O. Electr. Engrg. J. 10 189-197.
Fleming, P.J., A. Stolyar, B. Simon. 1994. Heavy traffic limit for a mobile phone system loss model. Proc. 2nd Internat. Conf. Telecommunication Systems, Modeling, and Anal., Nashville, TN, 158-176.
Halfin, S., W. Whitt. 1981. Heavy-traffic limits for queues with many exponential servers. Oper. Res. 29 567-587.
Harris, C.M., K.L. Hoffman, P.B. Saunders. 1987. Modeling the IRS telephone taxpayer information system. Oper. Res. 35 504-523.
Help Desk and Customer Support Practices Report. 1997. Survey results, The Help Desk Institute, SOFTBANK Forums (May).
Hoffman, K.L., C.M. Harris. 1986. Estimation of a caller retrial rate for a telephone information system. Eur. J. Oper. Res. 27 207-214.
Jagerman, D.L. 1974. Some properties of the Erlang loss function. Bell System Tech. J. 53 (3) 525-551.
Mandelbaum, A., G. Pats. 1995. State-dependent queues: Approximations and applications. F.P. Kelly and R.J. Williams, eds. Stochastic Networks. Springer-Verlag, New York, 239-282.
Mandelbaum, A., W.A. Massey, M. Reiman. 1998. Strong approximations for Markovian service networks. Queueing Systems: Theory and Applications (QUESTA) 30 149-201.
Mandelbaum, A., A. Sakov, S. Zeltyn. 2000. Empirical analysis of a call center. Technical report. ie.technion.ac.il/serveng/course/096324.
Mandelbaum, A., W.A. Massey, M. Reiman, B. Rider. 1999. Time varying multiserver queues with abandonment and retrials. Teletraffic Engineering in a Competitive World. P. Key and D. Smith, eds. Elsevier, Amsterdam, The Netherlands.
Mandelbaum, A., W.A. Massey, M. Reiman, B. Rider, A. Stolyar. 2000. Queue lengths and waiting times for multiserver queues with abandonment and retrials. Submitted to Selected Proc. 5th INFORMS Telecomm. Conf.
Nguyen, V. 1993. Processing networks with parallel and sequential tasks: Heavy traffic analysis and Brownian limits. Ann. Appl. Probab. 3 28-55.
Palm, C. 1937. Etude des delais d'attente. Ericsson Technics 5 37-56.
Palm, C. 1943. Intensitatsschwankungen im fernsprechverkehr. Ericsson Technics 44(1) 1-189.
Palm, C. 1953. Methods of judging the annoyance caused by congestion. Tele 4 189-208.
Puhalskii, A. 1994. On the invariance principle for the first passage time. Math. Oper. Res. 19 (4) 946-954.
Riordan, J. 1962. Stochastic Service Systems. Wiley, New York.
Stone, C. 1963. Limit theorems for random walks, birth and death processes, and diffusion processes. Illinois J. Math. 7 638-660.
Sze, D.Y. 1984. A queueing model for telephone operator staffing. Oper. Res. 32 229-249.
Whitt, W. 1992. Understanding the efficiency of multi-server service systems. Management Sci. 38 (5) 708-723.
Zohar, E., A. Mandelbaum, N. Shimkin. 2000. Adaptive behavior of impatient customers in tele-queues: Theory and empirical support. Technical report. ie.technion.ac.il/serveng/course/096324.

The consulting Senior Editor for this manuscript was Michael Harrison. This manuscript was received on October 27, 1999, and was with the authors 684 days for 3 revisions. The average review cycle time was 77 days.

MANUFACTURING & SERVICE OPERATIONS MANAGEMENT Vol. 4, No. 3, Summer 2002