Int. J. Operational Research, Vol. 1, Nos. 1/2, 2005

Staffing and routing in a two-tier call centre

Sameer Hasija*, Edieal J. Pinker and Robert A. Shumsky
Simon School, University of Rochester, Rochester 14627, NY, USA
Fax: 585 273 1140
E-mail: hasijas1@simon.rochester.edu
E-mail: pinker@simon.rochester.edu
E-mail: shumsky@simon.rochester.edu
*Corresponding author

Abstract: This paper studies service systems with gatekeepers who diagnose a customer problem and then either refer the customer to an expert or attempt treatment. We determine the staffing levels and referral rates that minimise the sum of staffing, customer waiting, and mistreatment costs. We also compare the optimal gatekeeper system (a two-tier system) with a system staffed with only experts (a direct-access system). When waiting costs are high, a direct-access system is preferred unless the gatekeepers have a high skill level. We also show that an easily computed referral rate from a deterministic system closely approximates the optimal referral rate.

Keywords: staffing; routing in queueing systems; call centres; gatekeeper systems.

Reference to this paper should be made as follows: Hasija, S., Pinker, E.J. and Shumsky, R.A. (2005) 'Staffing and routing in a two-tier call centre', Int. J. Operational Research, Vol. 1, Nos. 1/2, pp.8–29.

Biographical notes: Sameer Hasija is currently a PhD candidate in Operations Management at the Simon Business School, University of Rochester. He has a BTech (Major: Naval Architecture and Ocean Engineering, Minor: Industrial Engineering) from the Indian Institute of Technology at Madras and an MS (Management Science Methods) from the University of Rochester. His research interests include efficiency and flexibility issues in service systems, with a focus on Call-Center and Health Care Management.

Edieal J. Pinker is an Associate Professor of Computers and Information Systems at the Simon School of Business, University of Rochester.
He conducts research on the use of contingent workforces, cross-training, and experience-based learning in service sector environments as it applies to work and workflow design. He also studies the use of online auctions in electronic commerce and the issues faced by legacy firms trying to transition into electronic commerce. Pinker has consulted for the United States Postal Service, the financial services industry, and the auto industry. His work has been published in Management Science, Manufacturing and Service Operations Management, the European Journal of Operational Research, IIE Transactions and the Communications of the Association for Computing Machinery. He serves on the editorial boards of M&SOM, POMS, Decision Sciences and IJOR. Professor Pinker earned his MS and PhD in Operations Research from the Massachusetts Institute of Technology.

Copyright © 2005 Inderscience Enterprises Ltd.
Robert A. Shumsky is an Associate Professor of Operations Management at the Simon School of Business, University of Rochester. Professor Shumsky has research and teaching interests in the modelling and control of service systems. Current research focuses on the dynamic use of flexible capacity, the use of incentives for operational control of service systems, and the application of revenue management under competition. His research has been published in Management Science, Operations Research, Manufacturing & Service Operations Management (M&SOM), and Air Traffic Control Quarterly. He is an Associate Editor for Management Science and Operations Research, serves on the editorial review boards of M&SOM and the Journal of Revenue and Pricing Management, and is a Senior Editor for Production and Operations Management. He has also conducted research on the USA air traffic management system and studied transportation operations for the Massachusetts Port Authority and the Federal Aviation Administration. Professor Shumsky earned his MS and PhD in Operations Research from the Massachusetts Institute of Technology.

1 Introduction

In this paper we consider the problem of capacity planning and call routing in a two-tier service system in which the first tier consists of gatekeepers who diagnose the customer's problem, may solve the problem, or may refer the customer to an expert in the second tier. A typical example: call centres for health-care services are often staffed by certified nurses who diagnose the problem and provide advice. If justified by the nature and severity of the call, a nurse may refer a call to a specialist (Bernett 2003). Shumsky and Pinker (2003) provide additional examples of service systems with this two-tier architecture, but with the health-care motivation in mind, we refer to the resolution of a customer's problem as a successful treatment.
Call centres must balance the relatively low cost of less-skilled gatekeepers with the benefits of the expert's ability to handle difficult calls. A manager of such a call centre must determine the staffing levels of gatekeepers and experts as well as the optimal referral rate to minimise total costs. These costs may include the cost of staffing, the cost of customer waiting time, and mistreatment costs, which are incurred when a customer must see an expert for successful treatment after already spending time receiving (unsuccessful) treatment from the gatekeeper.

This paper focuses on a staffing and referral strategy that is based only on call difficulty and steady-state queue lengths and not on real-time queue lengths (the policies are static, not dynamic). We use a square-root staffing rule to approximate the optimal staffing for both tiers, given any particular referral rate. This rule is asymptotically optimal as system size increases. We then use the staffing approximation to determine the optimal referral rate and to compare the gatekeeper (or 'two-tier') system with a direct-access (or 'one-tier') system in which customers do not encounter a gatekeeper but instead immediately see an expert. Our main findings are:
1 The choice of system (gatekeeper or direct access) has a complex relationship with the customer's waiting time cost. In particular, as the waiting cost increases, it may or may not be optimal to choose a direct-access system, depending on the other parameters such as the gatekeeper's skill level.

2 If it is optimal to choose the direct-access system when the unit cost of a customer's waiting time is low (or in a deterministic system in which waiting costs are not assessed), then it is optimal to choose the direct-access system when the cost of the customer's waiting time is higher. However, this does not imply that a high waiting cost always leads to a one-tier system. As in point 1, it is possible to prefer a gatekeeper system, no matter how high the waiting cost per unit time.

3 When two-tier systems are preferred, a simple, deterministic model can be used to choose the referral rate. The optimal referral rate converges to this deterministic referral rate as the size of the system grows.

4 If we construct an optimal staffing plan, given the deterministic referral rate, then we will only see a small increase in cost over the cost of the globally optimal system that uses the optimal referral rate.

In Section 2, we review the related literature. In Section 3 we introduce our model and in Section 4 we describe an approximation that generates asymptotically optimal staffing levels and referral rates as the system size increases. In Section 5 we use the approximation to characterise the behaviour of the referral rate as a variety of parameters change. In that section we also use numerical experiments to test the accuracy of the approximation and to confirm the four observations described above. Section 6 contains a discussion of our results and describes additional directions for research.

2 Literature review

This paper is related to the literature on capacity planning and routing in queueing networks.
Halfin and Whitt (1981) establish heavy-traffic stochastic limits for multi-server queues in which the number of servers is allowed to increase along with the traffic intensity but the steady-state probability that all servers are busy is held fixed. Whitt (1992) shows that by using the square-root staffing principle, discussed below in Section 4, one can generate the same limiting regime as in Halfin and Whitt (1981). Borst et al. (2004) use a similar framework to approximate the waiting time distribution of an M/M/N queue and demonstrate the asymptotic optimality of the square-root staffing principle, given a cost function involving both waiting and staffing costs. We apply their approximation to a two-tier queueing network. We use the square-root staffing rule to find the number of servers for each level as a function of the routing strategy. We then determine the total cost of operating such a system and minimise that cost with the static routing strategy as the decision variable.
There exists a substantial literature on optimal routing strategies in call centres with cross-trained servers ('skill-based routing'). For example, Örmeci (2004) and Chevalier et al. (2004) study loss models with specialised and fully flexible servers. Wallace and Whitt (2004) examine systems with an arbitrary cross-training pattern (e.g., each server may be cross-trained in any subset of six skills). They use heuristics and simulation to find the minimum number of cross-trained servers needed to satisfy performance goals for each customer type. However, these models of skill-based routing differentiate calls by type, and not by difficulty level; a server is either sufficiently skilled to handle a call or is not. In our model, gatekeepers have some probability of success with a particular call.

As in our model, de Véricourt and Zhou (2004) assume that each server has a different call resolution probability (p), and they also assume that each server may have a different service rate (µ). They identify the routing policy of calls to servers (a 'pµ rule') that minimises the total time a call spends in the system, including re-calls. While they assume that the staffing level is given (one server of each type), our model considers both the staffing and routing problem for large systems. The structure of our service system is also quite different. We assume that there are two pools of servers: the expert pool has a resolution probability equal to one, and the gatekeeper pool attempts to treat calls or passes them along to the expert pool.

Our model is closest to Shumsky and Pinker (2003). They determine the optimal routing strategy for a deterministic system and then formulate a principal/agent model to determine the impact of performance-based incentives on the gatekeeper's behaviour. Their model does not incorporate queueing effects, for they assume that the firm maintains a level of staffing sufficient to satisfy exogenous waiting time goals.
Here we model queueing effects but we do not consider the incentive problem. We show how the cost-minimising referral rate varies with changes in parameters related to queueing, i.e., arrival rates and service rates. We also show that as the arrival rate increases, the optimal referral rate converges to the optimal referral rate for the deterministic case.

3 The queueing network model

In this section we describe an open queueing network model of a service centre with gatekeepers. The network is essentially two queues in series: $n_g$ gatekeepers and $n_e$ experts, with staffing costs $c_g$ and $c_e$ per unit time, respectively (see Figure 1). Customers (or 'calls') arrive to the gatekeepers according to a Poisson process with rate $\lambda$. To the gatekeepers, the calls vary in difficulty and complexity, and we represent the difficulty of each call with a random draw from a uniform distribution, U[0, 1]. This random variable represents the call's percentile in a ranking of calls by treatment complexity. Given that a call has complexity $x$, the probability that the customer can be treated successfully by the gatekeeper is $f(x)$. As in Shumsky and Pinker (2003), we will refer to $f(x)$ as the treatment function. Because complexity increases with $x$, we assume that $f'(x) \le 0$.
Figure 1 Customer flows

With each new call, a gatekeeper spends time diagnosing the problem and determining the complexity (the value of $x$). The gatekeeper may then either send the call directly to the expert pool or attempt to solve the problem. If the gatekeeper successfully solves, or treats, the problem, the call leaves the system. If the gatekeeper attempts to treat and the treatment fails, we assess a cost $m$ due to the inconvenience to the customer, and the call is sent to the expert pool. Once a call has reached an expert, it is served and leaves the system. Both server pools have unlimited waiting space, and there is a cost $w$ for each unit of time spent waiting.

The time required for an expert to treat a call averages $1/\mu$. The time for a gatekeeper to diagnose a call averages $1/\mu_d$, while the average time to diagnose and treat is $1/\mu_t > 1/\mu_d$. If the gatekeeper follows a static policy and treats a proportion $k$ of calls, then the gatekeeper's service rate is

$$\bar{\mu} = \left(\frac{1-k}{\mu_d} + \frac{k}{\mu_t}\right)^{-1}. \tag{1}$$

We assume that service times are distributed as independent, exponential random variables, even when the gatekeeper only diagnoses some calls and combines diagnosis with treatment in other calls. Given these assumptions, the gatekeeper and expert pools can each be modelled as M/M/N queueing systems (see Gross and Harris, 1985, Section 4.1), where the arrival rate to the expert pool is the sum of the rate of calls untreated by the gatekeeper and the rate of calls mistreated by the gatekeeper. We will discuss additional implications of the exponential service-time assumption in Section 5.3.

Our objective is to minimise the sum of staffing, waiting, and mistreatment costs. Given the complexity of a call, we must decide whether the gatekeeper should treat the call or refer it immediately to an expert. Suppose that the staffing is fixed at $(n_g, n_e)$ and the gatekeeper treats all calls in $S$, where $S$ is a (possibly non-contiguous) subset of [0, 1].
Let $k$ be the proportion of the range [0, 1] covered by $S$. If the gatekeeper replaces the set $S$ with the set $[0, k]$, we know that

1 The gatekeeper's service rate does not change because the proportion $k$ does not change.
2 The rate of untreated calls does not change.
3 The rate of mistreated calls and the waiting time stay the same or decrease because $f'(x) \le 0$.

Therefore, given any staffing configuration and treatment set $S$ with proportion $k$, the waiting, staffing, and mistreatment costs will not increase if the gatekeeper instead treats calls in $[0, k]$. This argument indicates that the optimal treatment set $S$ takes the form $[0, k]$, and we will refer to $k$ as the treatment threshold. Given treatment threshold $k$, the gatekeeper refers a proportion $1 - k$ of calls without attempting treatment. The expected fraction of calls treated successfully by a gatekeeper is $F(k) = \int_0^k f(x)\,dx$, the fraction mistreated is $k - F(k)$, and the fraction of calls seen by the expert pool is $1 - F(k)$.

We now develop the objective function for our problem. The decision variables are $k$, the proportion of calls treated by the gatekeeper, and the staffing levels $n_g$ and $n_e$. Let $q(n, \lambda, \mu)$ be the expected wait for an M/M/N queueing system with $n$ servers, arrival rate $\lambda$, and service rate $\mu$. The total cost per unit time is:

$$C_2(n_g, n_e, k) = w\lambda\left[q(n_g, \lambda, \bar{\mu}) + (1 - F(k))\,q(n_e, (1 - F(k))\lambda, \mu)\right] + c_g n_g + c_e n_e + m\lambda(k - F(k)). \tag{2}$$

The subscript 2 indicates that this is a cost function for a two-tier service system (as opposed to the one-tier direct-access system described below). The first term is the expected cost of waiting in front of the gatekeeper and expert pools, the second and third terms are the staffing costs, and the last term is the mistreatment cost. Therefore, we consider the following problem:

$$\min_{n_g,\, n_e,\, k} C_2(n_g, n_e, k) \tag{3}$$

subject to

$$n_g > \lambda/\bar{\mu} \tag{4}$$

$$n_e > \lambda(1 - F)/\mu. \tag{5}$$

The constraints ensure that the gatekeeper and expert pools are stable.
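As an illustration of cost function (2), the following Python sketch evaluates it for given staffing levels and a given threshold. It assumes that $q(n, \lambda, \mu)$ is the standard Erlang C mean wait in queue for an M/M/N system; the function names and the convention of passing $F$ as a callable are choices made for this sketch and do not come from the paper.

```python
import math

def erlang_c(n, a):
    """Probability of delay in an M/M/n queue with offered load a = lam/mu < n."""
    # Erlang B by the usual stable recursion, then converted to Erlang C.
    b = 1.0
    for j in range(1, n + 1):
        b = a * b / (j + a * b)
    return n * b / (n - a * (1.0 - b))

def expected_wait(n, lam, mu):
    """q(n, lam, mu): mean wait in queue; infinite if the pool is unstable."""
    a = lam / mu
    if n <= a:
        return math.inf
    return erlang_c(n, a) / (n * mu - lam)

def gatekeeper_rate(k, mu_d, mu_t):
    """Equation (1): service rate when a proportion k of calls is treated."""
    return 1.0 / ((1.0 - k) / mu_d + k / mu_t)

def two_tier_cost(n_g, n_e, k, lam, mu, mu_d, mu_t, c_g, c_e, w, m, F):
    """Equation (2). F is a callable giving the fraction treated successfully."""
    to_expert = 1.0 - F(k)  # fraction of calls that reach the expert pool
    wait = (expected_wait(n_g, lam, gatekeeper_rate(k, mu_d, mu_t))
            + to_expert * expected_wait(n_e, to_expert * lam, mu))
    return w * lam * wait + c_g * n_g + c_e * n_e + m * lam * (k - F(k))
```

Returning an infinite wait when constraint (4) or (5) is violated keeps infeasible staffing levels from being selected by any search built on top of `two_tier_cost`.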
In the following sections we will be comparing this two-tier system with an all-expert system in which customers do not see gatekeepers, but instead go directly for treatment in the expert server pool. This one-tier system is simply an M/M/N system with $n_e$ servers, arrival rate $\lambda$ and service rate $\mu$. Therefore, the total cost is

$$C_1(n_e) = w\lambda\, q(n_e, \lambda, \mu) + c_e n_e. \tag{6}$$

While we can numerically find a routing policy and staffing levels that minimise $C_1$ and $C_2$, in the next section we will use a square-root staffing heuristic that will
1 allow us to solve these problems quickly
2 enable us to characterise the effects of certain parameters on the optimal solution
3 allow for direct comparison between the one-tier and two-tier systems.

4 An approximation using a square-root staffing rule

In both the one and two-tier systems, each server pool is an M/M/N queue with linear staffing and waiting-time costs. Borst et al. (2004) demonstrate that when the number of servers is adjusted to minimise total staffing and waiting costs, and when we allow $\lambda \to \infty$, the ratio of staffing and waiting costs is bounded. Such a system is described as being in the rationalised regime. Building on the work of Whitt, Borst et al. (2004) also show that for systems in the rationalised regime, a simple square-root staffing heuristic is asymptotically optimal as $\lambda \to \infty$.

In this section we describe the heuristic and apply it to our system. In Section 4.1 we show how the heuristic can be used to generate both near-optimal staffing levels and an approximation of the total cost function for a single server pool. Section 4.1 is primarily a summary of the work of Borst et al., and these results can be applied directly to the one-tier system. In Section 4.2 we apply these staffing results to the two-tier system, so that the routing and staffing problem reduces to a single-variable optimisation in the treatment threshold, $k$.

4.1 Approximation for an M/M/N queue

Consider an M/M/N queue with load $\rho = \lambda/\mu$, staffing cost $c$ per unit time and waiting cost $w$ per unit time. Borst et al. show that staffing $n^H$ servers according to the following square-root rule is asymptotically optimal (the superscript $H$ refers to either 'heuristic' or 'Halfin-Whitt'):

$$n^H = \rho + y^*(c, w)\sqrt{\rho}. \tag{7}$$

At least $\rho$ agents are needed to guarantee stability, and $y^*(c, w)\sqrt{\rho}$ is the safety staffing for protection against stochastic variability. The quantity $y^*(c, w)$ can be thought of as the optimal service level and is found by balancing the staffing and waiting costs.
Specifically, $y^*(c, w)$ minimises the function

$$\alpha(y, c, w) = cy + \frac{w\,\pi(y)}{y}, \tag{8}$$

where

$$\pi(y) = \left[1 + \frac{y\,\Phi(y)}{\varphi(y)}\right]^{-1} \tag{9}$$

and $\varphi(y)$ and $\Phi(y)$ are the unit normal pdf and cdf, respectively. That is,

$$y^*(c, w) = \arg\min_{y > 0} \alpha(y, c, w). \tag{10}$$
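Equations (7) to (10) can be sketched numerically as follows. The golden-section search, its bracket `[1e-6, 10]`, and all function names are assumptions of this sketch; the search relies on $\alpha(y, c, w)$ having a single minimum on the bracket, and any other one-dimensional minimiser would serve equally well.

```python
import math

def phi(y):
    """Unit normal pdf."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def Phi(y):
    """Unit normal cdf."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def delay_prob(y):
    """Halfin-Whitt delay function pi(y), equation (9)."""
    return 1.0 / (1.0 + y * Phi(y) / phi(y))

def alpha(y, c, w):
    """Equation (8): staffing plus waiting cost per sqrt(rho) of safety staffing."""
    return c * y + w * delay_prob(y) / y

def y_star(c, w, lo=1e-6, hi=10.0, tol=1e-9):
    """Equation (10) by golden-section search (assumes one minimum in [lo, hi])."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    x1, x2 = b - g * (b - a), a + g * (b - a)
    f1, f2 = alpha(x1, c, w), alpha(x2, c, w)
    while b - a > tol:
        if f1 < f2:  # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - g * (b - a)
            f1 = alpha(x1, c, w)
        else:        # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + g * (b - a)
            f2 = alpha(x2, c, w)
    return 0.5 * (a + b)

def square_root_staffing(rho, c, w):
    """Equation (7): load plus safety staffing (not rounded to an integer)."""
    return rho + y_star(c, w) * math.sqrt(rho)
```

For example, `square_root_staffing(100.0, 1.0, 5.0)` returns the (fractional) staffing level for an offered load of 100 with $c = 1$ and $w = 5$; in practice the value would be rounded up to an integer.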
Because the function $\alpha(y, c, w)$ has a finite, unique, and positive minimum value, $y^*(c, w)$ can be found quickly by numerical methods.

The function $\pi(y)$ has an important interpretation that will be useful for constructing the approximate cost function. It is known as the Halfin-Whitt delay function, and it is an asymptotically exact approximation of the probability of delay, Pr{wait > 0}, for the M/M/N queue.

Let $D(\rho, c, w)$ be an approximation for the total cost per unit time of staffing and waiting under the rationalised regime, given load $\rho$, unit staffing cost $c$, and unit waiting cost $w$. Given that $\pi(y)$ is the approximation for the probability of delay under the rationalised regime,

$$D(\rho, c, w) = c\,n^H + \frac{w\lambda\,\pi(y^*(c, w))}{n^H\mu - \lambda} = c\rho + \alpha(y^*(c, w), c, w)\sqrt{\rho}. \tag{11}$$

The second expression follows by substitution for $n^H$ and the definition of $\alpha$. Because the one-tier system is simply an M/M/N queue, our approximation of the optimal total cost for this direct-access system is

$$C_1^H = D(\lambda/\mu, c_e, w), \tag{12}$$

where $\lambda$ is the arrival rate to the system, $\mu$ is the service rate of the experts, $c_e$ is the cost of experts per unit time, and $w$ is the waiting cost per unit time.

In numerical experiments, Borst et al. show that this staffing heuristic is remarkably robust, even for offered loads as low as 10. We present similar results in our numerical experiments (see Section 5.3), and we also find that using the approximate total cost function $D(\rho, c, w)$ allows us to find near-optimal solutions to the staffing and routing problem in the two-tier system with gatekeepers.

4.2 Approximation for a two-tier system

Given the size of the load to each server pool in the two-tier system, we use the square-root staffing heuristic to determine the optimal number of servers for that pool. In the two-tier system, the choice of the treatment threshold $k$ determines the arrival and service rates of the gatekeeper and expert pools, and therefore determines the load for each pool.
Specifically, the load for the gatekeeper pool is $\rho_g(k) = \lambda/\bar{\mu}(k)$ and the load for the expert pool is $\rho_e(k) = (1 - F(k))\lambda/\mu$. Therefore, for a given $k$, the number of servers in each pool is

$$n_i^H = \rho_i + y^*(c_i, w)\sqrt{\rho_i}, \quad i = g, e, \tag{13}$$

and our approximation for the total cost of the two-tier system is

$$C_2^H = D(\rho_g, c_g, w) + D(\rho_e, c_e, w) + m\lambda(k - F). \tag{14}$$

Because the square-root staffing rule specifies the number of servers in each pool, $k$ is the remaining decision variable, and our problem is to find the cost-minimising value of $k$,

$$k^H = \arg\min_{0 \le k \le 1} C_2^H, \tag{15}$$
and to compare the optimal two-tier cost $C_2^H(k^H)$ with the one-tier cost, $C_1^H$. While $k^H$, $C_2^H(k^H)$, and $C_1^H$ are approximations, in numerical experiments we will see that these approximations follow closely the optimal values derived from a more realistic model. (This alternate model relaxes both the asymptotic assumptions of the rationalised regime and the Markovian assumptions of the original network model presented in Section 3.)

For an arbitrary treatment function $f(k)$, $C_2^H(k)$ can take an arbitrary form, e.g., it need not be unimodal. In the next section we assume that the treatment function is linear. Working with these approximations, and with a linear treatment function, allows us to analytically characterise the behaviour of the (approximately) optimal treatment threshold and to quickly identify relative advantages of the one-tier and two-tier systems as the system parameters change.

5 Analysis and numerical experiments with a linear treatment function

Now assume that $f(k)$ belongs to a class of linear functions, $f(k) = b(1 - k)$, where $b \in [0, 1]$. With this treatment function, gatekeepers have a positive chance to successfully treat all calls, although the probability approaches 0 for the most difficult calls. Parameter $b$ is a measure of the gatekeeper's skill: as $b$ rises, the gatekeeper has a greater chance to handle all calls. For analytical tractability, we have chosen a functional form for $f$ so that the vertical intercept and the magnitude of the slope are both equal to $b$. A byproduct of this choice is that as $b$ increases, the implied variance in call difficulty to the gatekeeper increases as well.

For brevity, throughout this section we use the following notation:

$\rho = \lambda/\mu$
$\rho_t = \lambda/\mu_t$
$\rho_d = \lambda/\mu_d$
$y_i = y^*(c_i, w)$ for $i = g, e$
$\alpha_i = \alpha(y_i, c_i, w)$ for $i = g, e$

The following Proposition states that the total cost functions $C_2^H(k)$ and $C_1^H$ are minimised with a single, optimal system design and treatment threshold. All proofs are in the appendix.
Proposition 1: When minimising $C_2^H(k)$ to find $k^H$, and when comparing $C_1^H$ with $C_2^H(k^H)$, there are two possible outcomes:

1 a two-tier system with a unique treatment threshold $k^H$ is optimal
2 a one-tier system is optimal.
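The comparison in Proposition 1 can be sketched in Python by evaluating approximation (14) on a grid of thresholds and comparing the best value with the one-tier cost (12). The grid search, the function names, and the parameter values in the usage note are assumptions made for this illustration; the linear treatment function gives $F(k) = bk - bk^2/2$.

```python
import math

def phi(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def Phi(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def alpha(y, c, w):
    """Equation (8), with pi(y) from equation (9)."""
    pi_y = 1.0 / (1.0 + y * Phi(y) / phi(y))
    return c * y + w * pi_y / y

def min_alpha(c, w, lo=1e-6, hi=10.0, tol=1e-9):
    """alpha(y*(c, w), c, w) by golden-section search (assumes one minimum)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    x1, x2 = b - g * (b - a), a + g * (b - a)
    f1, f2 = alpha(x1, c, w), alpha(x2, c, w)
    while b - a > tol:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = b - g * (b - a)
            f1 = alpha(x1, c, w)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = a + g * (b - a)
            f2 = alpha(x2, c, w)
    return alpha(0.5 * (a + b), c, w)

def D(rho, c, alpha_star):
    """Equation (11), second form."""
    return c * rho + alpha_star * math.sqrt(rho)

def choose_design(lam, mu, mu_d, mu_t, c_g, c_e, w, m, b, grid=2001):
    """Return (design, k, cost of chosen design, one-tier cost)."""
    a_g, a_e = min_alpha(c_g, w), min_alpha(c_e, w)  # independent of k
    one_tier = D(lam / mu, c_e, a_e)                 # equation (12)
    best_k, best_cost = None, math.inf
    for i in range(grid):
        k = i / (grid - 1)
        F = b * k - 0.5 * b * k * k                  # linear treatment function
        rho_g = lam * ((1.0 - k) / mu_d + k / mu_t)
        rho_e = (1.0 - F) * lam / mu
        cost = D(rho_g, c_g, a_g) + D(rho_e, c_e, a_e) + m * lam * (k - F)
        if cost < best_cost:
            best_k, best_cost = k, cost
    if best_cost < one_tier:
        return "two-tier", best_k, best_cost, one_tier
    return "one-tier", None, one_tier, one_tier
```

With illustrative values such as `choose_design(lam=100.0, mu=1.0, mu_d=5.0, mu_t=0.75, c_g=1.0, c_e=4.0, w=5.0, m=1.0, b=1.0)`, the sketch selects a two-tier design with an interior threshold, while a low skill level such as `b=0.1` leads to the one-tier design.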
A comparison of $C_2^H(k^H)$ and $C_1^H$ also demonstrates that a two-tier system is favoured when parameters $c_g$, $m$, and $\mu$ are low, and when $c_e$, $\mu_d$ and $\mu_t$ are high. Before considering how $k^H$ changes as the parameters change, it is convenient to introduce a simple, deterministic model and the deterministic treatment threshold, $k^d$.

5.1 The deterministic model

Consider a deterministic model of the two-tier system with no stochastic variability in the arrival or service rates, so that the capacity of the gatekeeper and expert pools are set equal to the load. Given the linear treatment function $f(k)$, the total cost of this system is

$$C^d(k) = c_g\lambda\left(\frac{1-k}{\mu_d} + \frac{k}{\mu_t}\right) + \frac{c_e\lambda(1 - bk + bk^2/2)}{\mu} + m\lambda(k - bk + bk^2/2) \tag{16}$$

and the optimal treatment threshold is

$$k^d = \left[1 - \frac{1}{b}\cdot\frac{m + c_g(1/\mu_t - 1/\mu_d)}{m + c_e(1/\mu)}\right]^+. \tag{17}$$

Note that $k^d$ is equivalent to the optimal treatment threshold for the model in Shumsky and Pinker (2003), which also focuses on a deterministic gatekeeper system. A one-tier deterministic model has total cost

$$C_1^d = \frac{c_e\lambda}{\mu}. \tag{18}$$

The quantity $k^d$ will be useful in the following analysis and will also be useful for generating a simple rule of thumb for the system design in the numerical experiments.

5.2 Analysis of the optimal treatment threshold

Here we examine how the optimal treatment threshold is affected by the system's parameters. In this section, we limit our attention to cases where both $0 < k^H < 1$ and $0 < k^d < 1$. The proof of Proposition 1 demonstrated that when $k^H$ is an interior solution, $\partial^2 C_2^H/\partial k^2 > 0$. By using this fact, and applying the implicit function theorem to $C_2^H(k^H)$, we find: $\partial k^H/\partial c_g < 0$, $\partial k^H/\partial c_e > 0$, $\partial k^H/\partial m < 0$, $\partial k^H/\partial \mu_t > 0$, $\partial k^H/\partial \mu_d < 0$, $\partial k^H/\partial \mu < 0$ and $\partial k^H/\partial b > 0$. Therefore, for large values of $c_g$, $m$, $\mu_d$, $\mu$ and small values of $c_e$, $\mu_t$, $b$, it is optimal for gatekeepers to treat only the less difficult calls.

The impact of the arrival rate $\lambda$ and the waiting cost $w$ is more complex. First, we consider $\lambda$.
We find that as $\lambda$ increases, $k^H$ can either fall or rise, and that it monotonically converges to $k^d$.
Proposition 2:

1 If $k^H \ge k^d$ then $\partial k^H/\partial\lambda \le 0$
2 If $k^H < k^d$ then $\partial k^H/\partial\lambda > 0$
3 $k^H \to k^d$ as $\lambda \to \infty$.

Figures 2 and 3 show convergence from above and below, respectively. Convergence to $k^d$ has an intuitive explanation: for very large $\lambda$, waiting costs are relatively small, compared to the sum of staffing and mistreatment costs. Therefore, for very large $\lambda$ it is optimal to use the treatment threshold from the deterministic model, which only considers staffing and mistreatment costs.

Figure 2 Treatment threshold k vs. λ. Other parameters are µ_t = 0.75, µ_d = 5, µ = 1, c_g = 1, c_e = 4, m = 1, b = 1, w = 5

Figure 3 Treatment threshold k vs. λ. Other parameters are the same as for Figure 2 except b = 0.8 and w = 0.5
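The deterministic threshold to which $k^H$ converges can be computed directly from the closed form (17). The sketch below also checks that closed form against a brute-force minimisation of the deterministic cost (16); the function names and the numerical values are illustrative assumptions.

```python
def deterministic_threshold(mu, mu_d, mu_t, c_g, c_e, m, b):
    """Equation (17): closed-form k^d, truncated at zero."""
    if b <= 0.0:
        return 0.0  # gatekeepers never succeed, so they should not treat
    ratio = (m + c_g * (1.0 / mu_t - 1.0 / mu_d)) / (m + c_e / mu)
    return max(0.0, 1.0 - ratio / b)

def deterministic_cost(k, lam, mu, mu_d, mu_t, c_g, c_e, m, b):
    """Equation (16): staffing equals load, plus mistreatment cost."""
    F = b * k - 0.5 * b * k * k  # fraction treated successfully
    return (c_g * lam * ((1.0 - k) / mu_d + k / mu_t)
            + c_e * lam * (1.0 - F) / mu
            + m * lam * (k - F))

# Illustrative values (assumed for this sketch).
params = dict(mu=1.0, mu_d=5.0, mu_t=0.75, c_g=1.0, c_e=4.0, m=1.0, b=1.0)
k_closed = deterministic_threshold(**params)

# Brute-force check of the closed form on a fine grid of thresholds.
grid = [i / 10000 for i in range(10001)]
k_grid = min(grid, key=lambda k: deterministic_cost(k, lam=100.0, **params))
```

Because $\lambda$ multiplies every term of (16), the minimiser does not depend on the arrival rate, which is why `deterministic_threshold` takes no `lam` argument.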
To understand the impact of $w$ on the optimal treatment threshold, it is useful to examine the expression for the partial derivative of $k^H$ with respect to $w$:

$$\frac{\partial k^H}{\partial w} = \frac{\dfrac{\rho b(1 - k^H)\,\pi(y_e)}{2\,y_e\sqrt{\rho\left(1 - bk^H + b(k^H)^2/2\right)}} - \dfrac{(\rho_t - \rho_d)\,\pi(y_g)}{2\,y_g\sqrt{\rho_d + k^H(\rho_t - \rho_d)}}}{\partial^2 C_2^H(k^H)/\partial k^2}. \tag{19}$$

Given that $k^H$ is an interior solution, the denominator is positive. Therefore, the sign of the derivative depends upon the sign of the numerator. If the numerator is multiplied by $w$, then the first term is the marginal decline in the cost of waiting at the expert queue as $k$ increases. The second term is the marginal cost of waiting at the gatekeeper queue as $k$ increases. Therefore, if the marginal cost at the gatekeeper queue is lower, then $k^H$ rises with $w$, shifting some of the workload to the gatekeepers and reducing the expert queue.

Expression (19) allows us to see how the parameters affect the relationship between $w$ and $k^H$. For example, if $b$ is high and the gatekeeper is skilled, the first term in the numerator dominates, and $k^H$ rises as $w$ rises. On the other hand, if $\rho_t - \rho_d$ is large, implying that treatment by the gatekeepers is slow, then the second term dominates and $k^H$ falls as $w$ rises. Figure 4 shows how $k^H$ changes with $w$ for three different gatekeeper skill levels (other parameters are the same as for Figure 2).

Figure 4 Treatment threshold k vs. w for b = 0.7, 0.8, and 1

Intuitively, as $w$ rises, queueing economies of scale become more important. These economies of scale imply that it is more efficient to have one large and one small server pool, rather than two pools that are closer in size. If the parameters give an advantage to the gatekeepers, then a rising $w$ implies a rise in $k^H$, expanding the ranks of the gatekeepers and reducing the expert pool. If the parameters give an advantage to the experts, then rising $w$ implies that pooling should occur on the expert level, dropping $k^H$ and eventually producing a one-tier system. The following proposition shows that this effect is monotonic if $k^H$ is above $k^d$.
Proposition 3: If $k^H > k^d$, then $\partial k^H/\partial w > 0$.

In the next section we will see numerical examples of these effects, and we will compare the one and two-tier systems under a variety of system parameters.

5.3 Numerical experiments

In this section we demonstrate the accuracy of the approximation described above, and investigate how the optimal design of the service centre is influenced by the parameter values. In particular, we see numerically how the optimal treatment threshold changes, and we compare one-tier and two-tier systems under a variety of scenarios. We also show that the treatment threshold derived from the deterministic model, $k^d$, is an excellent approximation for the optimal treatment threshold in stochastic systems, as long as it is optimal to use a two-tier, rather than a one-tier, system.

Recall that in the model introduced in Section 3, we assume that the gatekeeper's service time is exponentially distributed with mean $1/\bar{\mu}$. However, the gatekeeper's actual service time is a mixture of time spent only diagnosing a customer and time spent both diagnosing and treating. Because these two types of services may have significantly different average times, a more accurate model would use a mixture of two exponential service times: a proportion $k$ with mean $1/\mu_t$ and $1 - k$ with mean $1/\mu_d$. Given that the gatekeeper's service times follow such a distribution, we model the gatekeeper pool as an M/H$_2$/N queue and the expert pool as a G/M/N queue. The arrival process to the expert pool is difficult to characterise, and we use the approximation suggested by Adan (2004).

In this section, we compare our heuristic solution, using the square-root staffing rule, with the optimal solution determined by numerically solving a model based on the more general queueing systems described above. We use a software package that implements the G/G/N approximations by Whitt (1993) to find the optimal combination of $k^*$, $n_g^*$, $n_e^*$ that minimises the total staffing, waiting, and mistreatment cost.
Henceforth we will call this solution the optimal staffing and routing strategy and we will call the values $k^H$, $n_g^H$, $n_e^H$, determined by equations (7), (10), and (15), the heuristic staffing and routing strategy.

We first verify the accuracy of the square-root staffing rule in the two-tier setting. The accuracy of this approximation will be driven by the sensitivity of the results to the assumption that the gatekeeper's service times are exponential and that arrivals to the experts are Poisson. Therefore, the larger the difference between $\mu_d$ and $\mu_t$, the worse the performance of the heuristic. However, we find that even with extremely large differences ($\mu_d/\mu_t$ as large as 100), the cost of a system operated according to the heuristic is within 1% of the optimal cost, as long as λ > 0. We also observe that the referral rates and staffing levels generated from the heuristic converge quickly to the optimal levels as $\lambda$ increases.

In this section, we will focus on more reasonable examples than $\mu_d/\mu_t = 100$; we set $\mu_t = 0.75$ and $\mu_d = 5$, while varying other parameters, such as the skill level $b$ and the waiting cost $w$. For example, Figure 2 shows how $k^*$ converges to $k^H$ as the arrival rate increases, and how $k^H$ converges to $k^d$ from above, as implied by Proposition 2. Most of the variation of $k^*$ around $k^H$ is due to the integrality of $n_g^*$ and $n_e^*$. Figure 3 shows a similar pattern, although with a lower value of $b$ and $w$, and here $k^H$ converges to $k^d$ from below. In Figure 5, we see that using the heuristic does not significantly increase system
costs, given large $\lambda$, and in Figure 6 we see that the staffing levels $n_g^*$, $n_e^*$ and $n_g^H$, $n_e^H$ are nearly identical. (Figures 5 and 6 use the set of parameters that led to Figure 2.)

Figure 5 Percentage cost penalty for using heuristic solution rather than the optimal solution

Figure 6 Staffing levels vs. λ
In all remaining figures, we do not show the optimal solution, but in every case the heuristic and optimal solutions are nearly identical, and the difference in total cost when using each is negligible. We will also be using as a baseline the parameter values for the system shown in Figures 2, 5 and 6. While we will only be presenting a subset of our experiments, we observed that the heuristic solution was nearly optimal over a wide range of parameter values for labour and waiting costs, service times, and gatekeeper skills.

Figure 7 plots the staffing levels of each server pool as a function of $b$, a measure of the gatekeepers' skills. Below a certain skill level a direct-access system is optimal and above that skill level a two-tier system is optimal. As the skill level continues to increase, the gatekeeper pool grows and the expert pool shrinks.

Figure 7 Staffing levels vs. b

Figure 8 shows $k^H$ and $k^d$ as a function of the skill level. The steep, initial increase in each threshold represents the transition from one to two-tier systems; note that $k^d$ rises at a lower value of $b$ than $k^H$. We consistently observed this phenomenon in all numerical experiments we conducted. To understand why $k^d$ should rise before $k^H$, it is useful to interpret $k^d$ as the optimal treatment threshold when $w$ is extremely low (a deterministic system essentially ignores waiting costs). To justify having gatekeepers in systems with high waiting costs, they must have higher skills than needed to justify gatekeepers in systems with lower waiting costs. In other words, if it is optimal to choose the direct-access system when $w$ is very low (as in the deterministic system), then it is optimal to choose the direct-access system when the customer's cost of waiting is higher. This effect can be explained by the fact that a one-tier system offers benefits from pooling and that these benefits are more powerful when waiting costs are high. However, this does not imply that a high waiting cost always leads to a one-tier system.
As in Proposition 3, it is possible to prefer a gatekeeper system no matter how high the value of w (we saw an example of that in Figure 4).
Figure 8 Treatment threshold k vs. b

Figure 9 shows the contribution of mistreatment, waiting, and staffing costs to the total cost as a function of gatekeeper skill. This plot shows the actual costs, calculated from the G/G/N models, given the heuristic solution. The staffing cost decreases as b increases because we substitute gatekeepers for experts, as seen in Figure 7. An increase in b coupled with an increase in k also implies that a higher fraction of calls leave the system after successful treatment by the gatekeeper, thus reducing the queues and the waiting time in the system. It is somewhat counterintuitive that total mistreatment cost increases with b. On the one hand, increasing b reduces the probability that gatekeepers mistreat each call that they address. On the other hand, increasing k increases the number of calls treated by the gatekeeper, and therefore increases the mistreatment rate. In all of our experiments, we observed that the second effect dominates the first, so that rising b always increases total mistreatment costs.

Figure 9 Cost of staffing, waiting and mistreatment for the heuristic solution vs. b
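The two opposing effects on mistreatment cost can be written out explicitly. As a sketch, assume the mistreatment cost under the linear treatment function is $M(b,k) = m\lambda\,(k - bk + bk^2/2)$; this form is inferred here from the term $m\lambda(1-b+bk)$ in equation (20) of the appendix and is an assumption of this note. Treating the chosen threshold as a function $k(b)$ of the skill level, the total derivative splits into a direct effect and a volume effect:

```latex
\frac{dM}{db}
  = \underbrace{-\,m\lambda\,k\left(1-\frac{k}{2}\right)}_{\text{fewer mistakes per treated call}\ (\le 0)}
  \;+\;
  \underbrace{m\lambda\,(1-b+bk)\,\frac{dk}{db}}_{\text{more calls treated}\ (\ge 0 \text{ if } b \le 1,\ dk/db \ge 0)}
```

The numerical observation reported above corresponds to the second term outweighing the first over the parameter ranges tested.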
In Section 5, we saw that the response of the optimal treatment threshold to changes in w is complex. In Figure 4 we plot $\hat{k}$ and $k^d$ for different values of b as a function of waiting cost. We see that for high values of b, $\hat{k}$ increases with waiting cost, while for low values of b the optimal treatment threshold decreases until a direct-access system is preferable. As we discussed in Section 5, the optimal location to pool resources as w increases depends upon the skill level of the gatekeepers and the cost parameters.

In all of our experiments, we noticed that when a two-tier system is preferred to a direct-access system, $\hat{k}$ and $k^d$ are remarkably close (see, for example, Figures 4 and 8). Therefore using $k^d$ as an estimate of the optimal treatment threshold does not increase total costs significantly (see Figure 10). However, the choice of a one- or two-tier system should be based on a cost comparison that takes optimal staffing and waiting costs into account, as was seen in Figures 4 and 8. Therefore, we propose the following rule of thumb for choosing the optimal system:

1. Calculate $k^d$ using equation (17).
2. Using $k^d$ as the treatment threshold, use the square-root staffing rule to determine the number of gatekeepers and experts in a two-tier system. Given these staffing levels, calculate the total cost $C(k^d)$.
3. Also using the square-root staffing rule, determine the number of experts in the direct-access system and calculate the cost $C_1$.
4. If $C(k^d) < C_1$, choose a two-tier system using $k^d$ as the treatment threshold. Otherwise, choose a direct-access system.

This rule of thumb does not require managers to find $k^*$ or $\hat{k}$, both of which require significant computational effort compared to finding $k^d$.

Figure 10 Percentage cost penalty for using $k^d$ and the square-root staffing rule rather than the optimal solution
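The rule of thumb can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation. Equation (17) is not reproduced in this section, so `k_deterministic` uses the root of the bracketed numerator in equation (25), truncated to [0, 1]. The staffing coefficient `alpha` assumes M/M/N pools with the Halfin-Whitt delay function rather than the G/G/N approximation used in the experiments. The `Params` class, all function names and the parameter values are hypothetical.

```python
import math
from dataclasses import dataclass, replace


@dataclass
class Params:
    lam: float   # arrival rate of calls
    mu_d: float  # gatekeeper rate for diagnosis only (assumed meaning)
    mu_t: float  # gatekeeper rate for diagnosis plus treatment (assumed meaning)
    mu_e: float  # expert service rate
    c_g: float   # gatekeeper staffing cost rate
    c_e: float   # expert staffing cost rate
    m: float     # cost per mistreated call
    b: float     # gatekeeper skill, 0 < b <= 1
    w: float     # waiting cost rate per waiting customer


def delay_prob(y):
    """Halfin-Whitt delay probability pi(y); an M/M/N assumption."""
    pdf = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    return 1.0 / (1.0 + y * cdf / pdf)


def alpha(c, w, lo=1e-3, hi=10.0, iters=200):
    """min over y of c*y + w*pi(y)/y by golden-section search (assumes one minimum)."""
    g = lambda y: c * y + w * delay_prob(y) / y
    r = (math.sqrt(5.0) - 1.0) / 2.0
    left, right = lo, hi
    for _ in range(iters):
        x1, x2 = right - r * (right - left), left + r * (right - left)
        if g(x1) < g(x2):
            right = x2
        else:
            left = x1
    return g(0.5 * (left + right))


def k_deterministic(p):
    """Root of c_g(rho_t-rho_d) - rho*c_e*b(1-k) + m*lam*(1-b+b*k), clipped to [0, 1]."""
    rho_d, rho_t, rho = p.lam / p.mu_d, p.lam / p.mu_t, p.lam / p.mu_e
    num = rho * p.c_e * p.b - p.c_g * (rho_t - rho_d) - p.m * p.lam * (1.0 - p.b)
    return min(1.0, max(0.0, num / (p.b * (rho * p.c_e + p.m * p.lam))))


def two_tier_cost(k, p):
    """Staffing + waiting + mistreatment cost with square-root staffing in both pools."""
    rho_d, rho_t, rho = p.lam / p.mu_d, p.lam / p.mu_t, p.lam / p.mu_e
    load_g = rho_d + k * (rho_t - rho_d)
    load_e = rho * (1.0 - p.b * k + p.b * k * k / 2.0)
    mistreat = p.m * p.lam * (k - p.b * k + p.b * k * k / 2.0)
    return (p.c_g * load_g + alpha(p.c_g, p.w) * math.sqrt(load_g)
            + p.c_e * load_e + alpha(p.c_e, p.w) * math.sqrt(load_e)
            + mistreat)


def direct_access_cost(p):
    """Cost C_1 of an experts-only system with square-root staffing."""
    rho = p.lam / p.mu_e
    return p.c_e * rho + alpha(p.c_e, p.w) * math.sqrt(rho)


def choose_system(p):
    """Apply the four-step rule of thumb; returns (label, threshold, cost)."""
    k_d = k_deterministic(p)
    c_two, c_one = two_tier_cost(k_d, p), direct_access_cost(p)
    if c_two < c_one:
        return ("two-tier", k_d, c_two)
    return ("direct-access", None, c_one)


if __name__ == "__main__":
    base = Params(lam=100.0, mu_d=4.0, mu_t=1.0, mu_e=1.0,
                  c_g=0.5, c_e=2.0, m=0.5, b=0.9, w=1.0)
    print(choose_system(base))
    print(choose_system(replace(base, m=5.0)))  # costlier mistreatment
```

Only the closed-form threshold and two one-dimensional minimisations for `alpha` are needed, which is the computational saving the rule of thumb is meant to deliver.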
6 Conclusions

In practice most call centres have multiple tiers, where the tiers are distinguished by their abilities to serve the customers. Differing abilities typically imply differing compensation rates and service rates as well. Managers must determine staffing at each tier in conjunction with routing rules to balance customer queueing delay costs, mistreatment costs and staffing costs. In this paper we have developed an approach that greatly simplifies this complex managerial problem. By drawing upon recent results showing the asymptotic optimality of square-root staffing rules for stand-alone queues, we have shown that the optimal design of a two-tier system can be reduced to determining an optimal routing rule. Further, we have shown that the easily computed routing rule from a deterministic system can be used whenever a two-tier system is preferred to a one-tier system.

It is well known in the queueing literature that pooling resources can create economic benefits by reducing variability. In a two-tier system in which the second tier is staffed with higher skilled and more expensive servers, it is not clear how to take advantage of pooling. We find that when waiting costs are higher, gatekeepers need a higher skill level to be worthwhile. That is, pooling economies are achieved using the experts only. However, we also see that if the gatekeepers' skills are high enough, it is optimal to achieve pooling economies at the first tier even for very high values of the waiting cost, w. So we see that, depending on the combination of (b, w), we may seek pooling economies at different locations in the system.

Much of our analysis was restricted to the case of a linear treatment function. Further research is necessary to test the validity of our results for more general treatment functions.
Other possible areas for future research include extending the model and analysis to three or more tiers of servers, considering dynamic routing policies, and incorporating incentive systems for controlling gatekeeper referral behaviour into the model, as is done in Shumsky and Pinker (2003).

Acknowledgement

We would like to thank Harry Groenevelt for supplying us with his software package, QMacros, and for patiently answering our questions about the software. QMacros includes an implementation of the G/G/N approximation proposed by Whitt (1993), and we used the software for the numerical experiments in Section 5.

References

Adan, I. (2004) Teaching Note on Multi-Machine Systems, available at http://www.win.tue.nl/~iadan/sdp/11.pdf.

Bernett, H. (2003) Healthcare call centers: a technology migration, Horizons, Perspectives in Healthcare Management and Information Technology, September, pp.17–20.

Borst, S., Mandelbaum, A. and Reiman, M.I. (2004) Dimensioning large call centers, Operations Research, Vol. 52, No. 1, pp.17–34.

Chevalier, P., Shumsky, R.A. and Tabordon, N. (2004) Routing and Staffing in Large Call Centers with Specialized and Fully Flexible Servers, working paper, Simon School, University of Rochester, Rochester, NY.
de Véricourt, F. and Zhou, Y.-P. (2004) A Routing Problem for Call Centers with Customer Callbacks after Service Failure, Working Paper, Fuqua School of Business, Duke University, Durham, North Carolina.

Gross, D. and Harris, C.M. (1985) Fundamentals of Queueing Theory, Second Edition, Wiley, New York.

Halfin, S. and Whitt, W. (1981) Heavy-traffic limits for queues with many exponential servers, Operations Research, Vol. 29, No. 3, pp.567–587.

Örmeci, E.L. (2004) Dynamic admission control in a call center with one shared and two dedicated service facilities, IEEE Transactions on Automatic Control, Vol. 49, No. 7, pp.1157–1161.

Shumsky, R.A. and Pinker, E.J. (2003) Gatekeepers and referrals in services, Management Science, Vol. 49, No. 7, pp.839–856.

Whitt, W. (1992) Understanding the efficiency of multi-server service systems, Management Science, Vol. 38, No. 5, pp.708–723.

Whitt, W. (1993) Approximations for the GI/G/m queue, Production and Operations Management, Vol. 2, No. 2, pp.114–161.

Wallace, R.B. and Whitt, W. (2004) Resource Pooling and Staffing in Call Centers with Skill-Based Routing, Working Paper, Columbia University, http://www.columbia.edu/~ww2040/pooling.pdf.

Appendix: Proofs

Proof of Proposition 1: For the given treatment function,

$$\frac{\partial C}{\partial k} = c_g(\rho_t - \rho_d) + \frac{(\rho_t - \rho_d)\,\alpha_g}{2\sqrt{\rho_d + k(\rho_t - \rho_d)}} - \rho c_e b(1-k) - \frac{\sqrt{\rho}\,b(1-k)\,\alpha_e}{2\sqrt{1 - bk + bk^2/2}} + m\lambda(1 - b + bk) \tag{20}$$

and

$$\frac{\partial^2 C}{\partial k^2} = -\frac{(\rho_t - \rho_d)^2\,\alpha_g}{4[\rho_d + k(\rho_t - \rho_d)]^{3/2}} + \rho c_e b + \frac{\sqrt{\rho}\,b(2-b)\,\alpha_e}{4[1 - bk + bk^2/2]^{3/2}} + m\lambda b. \tag{21}$$

The total cost function has the following properties:

P1: $\left.\partial C/\partial k\right|_{k=1} > 0$,

P2: $\partial^2 C/\partial k^2$ is an increasing function in k,

P3: The cost of staffing no gatekeepers is less than the cost of staffing gatekeepers who only do a diagnosis of the incoming calls. The cost of having no gatekeepers is given by

$$C_1 = \rho c_e + \sqrt{\rho}\,\alpha_e. \tag{22}$$
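Equations (20) and (21), together with properties P1 and P2, can be sanity-checked numerically. The sketch below assumes the total cost takes the square-root staffing form $C(k) = c_g R_g + \alpha_g\sqrt{R_g} + c_e R_e + \alpha_e\sqrt{R_e} + m\lambda(k - bk + bk^2/2)$, with $R_g = \rho_d + k(\rho_t - \rho_d)$ and $R_e = \rho(1 - bk + bk^2/2)$. That form is inferred here from the derivatives and is not restated in this appendix. It then compares the closed-form derivatives with central finite differences. All parameter values are hypothetical.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
rho_d, rho_t, rho = 25.0, 100.0, 100.0
c_g, c_e, m, lam, b = 0.5, 2.0, 0.5, 100.0, 0.9
alpha_g, alpha_e = 0.72, 1.92


def cost(k):
    """Assumed total cost C(k) under square-root staffing."""
    load_g = rho_d + k * (rho_t - rho_d)
    g = 1.0 - b * k + b * k * k / 2.0
    return (c_g * load_g + alpha_g * math.sqrt(load_g)
            + c_e * rho * g + alpha_e * math.sqrt(rho * g)
            + m * lam * (k - b * k + b * k * k / 2.0))


def d_cost(k):
    """First derivative, equation (20)."""
    load_g = rho_d + k * (rho_t - rho_d)
    g = 1.0 - b * k + b * k * k / 2.0
    return (c_g * (rho_t - rho_d)
            + (rho_t - rho_d) * alpha_g / (2.0 * math.sqrt(load_g))
            - rho * c_e * b * (1.0 - k)
            - math.sqrt(rho) * b * (1.0 - k) * alpha_e / (2.0 * math.sqrt(g))
            + m * lam * (1.0 - b + b * k))


def d2_cost(k):
    """Second derivative, equation (21)."""
    load_g = rho_d + k * (rho_t - rho_d)
    g = 1.0 - b * k + b * k * k / 2.0
    return (-(rho_t - rho_d) ** 2 * alpha_g / (4.0 * load_g ** 1.5)
            + rho * c_e * b
            + math.sqrt(rho) * b * (2.0 - b) * alpha_e / (4.0 * g ** 1.5)
            + m * lam * b)


def max_fd_errors(points=(0.1, 0.5, 0.9), h=1e-5):
    """Largest gap between closed forms and central finite differences."""
    e1 = max(abs((cost(k + h) - cost(k - h)) / (2 * h) - d_cost(k)) for k in points)
    e2 = max(abs((d_cost(k + h) - d_cost(k - h)) / (2 * h) - d2_cost(k)) for k in points)
    return e1, e2


if __name__ == "__main__":
    print(max_fd_errors())
    print(d_cost(1.0) > 0.0)                           # property P1
    print(d2_cost(0.1) < d2_cost(0.5) < d2_cost(0.9))  # property P2
```

A check of this kind does not replace the proof, but it catches sign or exponent slips when the derivatives are transcribed.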
Explore all possible cases. For each case, we see that either the direct-access system is optimal, or the two-tier system has a unique cost-minimising solution, $\hat{k}$.

I  $\left.\partial^2 C/\partial k^2\right|_{k=0} > 0$. From P2, $\partial^2 C/\partial k^2 > 0$ for all values of k, so C is convex in the domain $k \in [0,1]$. From P1 observe that two subcases are possible.

(a) $\left.\partial C/\partial k\right|_{k=0} > 0$. Here, $\partial C(k)/\partial k = 0$ has no root on the interval [0, 1]. In this case it is optimal for the centre to staff only experts.

(b) $\left.\partial C/\partial k\right|_{k=0} < 0$. Here, $\partial C(k)/\partial k = 0$ has one root in the interval [0, 1]. $\hat{k}$ is the unique point which minimises $C(k)$. If $C(\hat{k}) < C_1$, then staff generalists who treat incoming calls of difficulty level $< \hat{k}$, else only staff specialists.

II  $\left.\partial^2 C/\partial k^2\right|_{k=0} < 0$ and $\left.\partial^2 C/\partial k^2\right|_{k=1} > 0$. P2 implies that there exists $\bar{k} \in (0,1)$ such that $C(k)$ is concave for $k < \bar{k}$ and is convex for $k > \bar{k}$. There will be four subcases here:

(a) No root for $\partial C(k)/\partial k = 0$ in the interval [0, 1]. It is optimal to staff only experts in this case.

(b) One root for $\partial C(k)/\partial k = 0$ in the interval [0, 1]. It is optimal to staff only experts in this case.

(c) $\left.\partial C/\partial k\right|_{k=0} < 0$. This case will also have one root in [0, 1] for $\partial C(k)/\partial k = 0$. Compare the total cost at that root ($\hat{k}$) with $C_1$ to determine which system is optimal.

(d) $\partial C(k)/\partial k = 0$ has two roots in [0, 1]. This case is shown in Figures 4 and 9. Compare the total cost at the larger root ($\hat{k}$) with $C_1$.
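III  $\left.\partial^2 C/\partial k^2\right|_{k=0} < 0$ and $\left.\partial^2 C/\partial k^2\right|_{k=1} < 0$. Therefore, $C(k)$ is concave in the range $k \in [0,1]$. Here $\partial C(k)/\partial k = 0$ has no roots in [0, 1] and it is optimal to staff the centre with only experts.

Proof of Proposition 2: Using implicit differentiation with the first-order condition $\partial C/\partial k = 0$ produces

$$\frac{\partial \hat{k}}{\partial \lambda} = -\frac{A}{B} \tag{23}$$

where, writing $\rho_t = \lambda/\mu_t$, $\rho_d = \lambda/\mu_d$ and $\rho = \lambda/\mu$,

$$A = c_g\left(\frac{1}{\mu_t} - \frac{1}{\mu_d}\right) + \frac{\left((1/\mu_t) - (1/\mu_d)\right)\alpha_g}{4\sqrt{\lambda}\,\sqrt{(1/\mu_d) + \hat{k}\left((1/\mu_t) - (1/\mu_d)\right)}} - \frac{1}{\mu}\,c_e b(1-\hat{k}) - \frac{b(1-\hat{k})\,\alpha_e}{4\sqrt{\mu\lambda}\,\sqrt{1 - b\hat{k} + b\hat{k}^2/2}} + m(1 - b + b\hat{k})$$

and

$$B = -\frac{(\rho_t - \rho_d)^2\,\alpha_g}{4[\rho_d + \hat{k}(\rho_t - \rho_d)]^{3/2}} + \rho c_e b + \frac{\sqrt{\rho}\,b(2-b)\,\alpha_e}{4[1 - b\hat{k} + b\hat{k}^2/2]^{3/2}} + m\lambda b. \tag{24}$$

Substituting terms in A,

$$2\lambda\,\frac{\partial \hat{k}}{\partial \lambda} = -\left[c_g(\rho_t - \rho_d) - \rho c_e b(1-\hat{k}) + m\lambda(1 - b + b\hat{k})\right]/B. \tag{25}$$

The expression B is always positive. Therefore the sign of $\partial \hat{k}/\partial \lambda$ will be the opposite of the sign of the numerator of the r.h.s. of (25). The numerator is an increasing function of k and is positive at k = 1. If $k^d = 0$, then the numerator is non-negative for $k \ge 0$. Therefore, $\hat{k} \ge k^d$ for any $\hat{k}$, and $\partial \hat{k}/\partial \lambda \le 0$. If $k^d > 0$, then the numerator is non-negative for $k \ge k^d$. Therefore, if $\hat{k} \ge k^d$ then $\partial \hat{k}/\partial \lambda \le 0$. If $\hat{k} < k^d$, the numerator is negative, and $\partial \hat{k}/\partial \lambda > 0$.

For the third statement in the proposition, note that as $\lambda \to \infty$, the solution to the first-order condition $\partial C/\partial k = 0$ approaches $k^d$. Statements 1 and 2 of this proposition imply monotonic convergence.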
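The substitution that leads to equation (25) can be spelled out as follows. In this sketch $\hat{k}$ denotes the solution of the first-order condition, $N(k)$ denotes the bracketed numerator in (25), and $T(k)$ denotes the two square-root terms in $\partial C/\partial k$. The names $N$ and $T$ are introduced here only for illustration. Since $\rho_t$, $\rho_d$ and $\rho$ are proportional to $\lambda$, $N$ is linear in $\lambda$ and $T$ is proportional to $\sqrt{\lambda}$, while $\alpha_g$ and $\alpha_e$ do not depend on $\lambda$:

```latex
\begin{aligned}
\left.\frac{\partial C}{\partial k}\right|_{k=\hat{k}}
  &= N(\hat{k}) + T(\hat{k}) = 0
  \quad\Longrightarrow\quad T(\hat{k}) = -N(\hat{k}),\\[4pt]
A &= \frac{\partial}{\partial \lambda}\bigl[N + T\bigr]_{k=\hat{k}}
   = \frac{N(\hat{k})}{\lambda} + \frac{T(\hat{k})}{2\lambda}
   = \frac{N(\hat{k})}{2\lambda},\\[4pt]
\frac{\partial \hat{k}}{\partial \lambda}
  &= -\frac{A}{B} = -\frac{N(\hat{k})}{2\lambda B}.
\end{aligned}
```

Because $N$ is the first-order expression of the deterministic system, its root is $k^d$, which is why the sign of $\partial \hat{k}/\partial \lambda$ depends only on whether $\hat{k}$ lies above or below $k^d$.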
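Proof of Proposition 3: From equation (19),

$$\mathrm{Sign}\left(\frac{\partial \hat{k}}{\partial w}\right) = \mathrm{Sign}\left(\frac{\pi(y_e)}{y_e}\,\frac{\sqrt{\rho}\,b(1-\hat{k})}{\sqrt{1 - b\hat{k} + b\hat{k}^2/2}} - \frac{\pi(y_g)}{y_g}\,\frac{(\rho_t - \rho_d)}{\sqrt{\rho_d + \hat{k}(\rho_t - \rho_d)}}\right). \tag{26}$$

From Proposition 2, $\hat{k} > k^d$ implies

$$\alpha_e\,\frac{\sqrt{\rho}\,b(1-\hat{k})}{\sqrt{1 - b\hat{k} + b\hat{k}^2/2}} - \alpha_g\,\frac{(\rho_t - \rho_d)}{\sqrt{\rho_d + \hat{k}(\rho_t - \rho_d)}} > 0. \tag{27}$$

Therefore proving the following inequality completes the proof:

$$\frac{\pi(y_e)\,y_g}{y_e\,\pi(y_g)} > \frac{\alpha_e}{\alpha_g}. \tag{28}$$

It can be verified that the above inequality is equivalent to proving:

$$\frac{c_g y_g^2}{\pi(y_g)} > \frac{c_e y_e^2}{\pi(y_e)}. \tag{29}$$

The first-order condition for $\alpha_i$ is

$$c_i + \frac{w\pi'(y_i)}{y_i} - \frac{w\pi(y_i)}{y_i^2} = 0 \quad \text{for } i = g, e, \tag{30}$$

and therefore

$$\frac{c_i y_i^2}{\pi(y_i)} = w\left(1 - \frac{y_i\,\pi'(y_i)}{\pi(y_i)}\right) \quad \text{for } i = g, e. \tag{31}$$

The r.h.s. of equation (31) is increasing in $y_i$. Further, it can be shown that $y_g > y_e$. This implies that the inequalities given by (28) and (29) hold.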
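The equivalence between (28) and (29) takes only a few lines. The sketch below assumes that the staffing coefficient has the form $\alpha_i = c_i y_i + w\pi(y_i)/y_i$ at the optimal $y_i$. That definition is not restated in this appendix and is inferred here from the first-order condition (30):

```latex
\begin{aligned}
\frac{\pi(y_e)\,y_g}{y_e\,\pi(y_g)} > \frac{\alpha_e}{\alpha_g}
&\iff \frac{\pi(y_e)}{y_e}\left(c_g y_g + \frac{w\,\pi(y_g)}{y_g}\right)
    > \frac{\pi(y_g)}{y_g}\left(c_e y_e + \frac{w\,\pi(y_e)}{y_e}\right)\\[4pt]
&\iff \frac{\pi(y_e)}{y_e}\,c_g y_g > \frac{\pi(y_g)}{y_g}\,c_e y_e\\[4pt]
&\iff \frac{c_g y_g^2}{\pi(y_g)} > \frac{c_e y_e^2}{\pi(y_e)} .
\end{aligned}
```

The waiting-cost terms $w\,\pi(y_e)\pi(y_g)/(y_e y_g)$ appear on both sides and cancel in the second step. The last step multiplies through by the positive quantity $y_e y_g/(\pi(y_e)\pi(y_g))$.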