Production Release Control: Paced, WIP-Based or Demand-Driven? Revisiting the Push/Pull and Make-to-Order/Make-to-Stock Distinctions

George Liberopoulos
Department of Mechanical Engineering, University of Thessaly, Volos, Greece, e-mail: glib@mie.uth.gr

Abstract We consider three elementary mechanisms for controlling the release of parts for production in manufacturing systems: 1) setting an external production pace by controlling the raw part arrival process, 2) authorizing releases based on the work-in-process (WIP) level in all or parts of the system, and 3) releasing new parts in response to actual demands. We present and compare different variants of these mechanisms on a simple serial flow line, using a queuing network representation, and we use these variants as a basis to revisit the push/pull and make-to-order/make-to-stock distinctions. We extend our descriptions and definitions to include advance demand information and forecasts.

1 Introduction

The last two decades have seen a surge in the literature related to pull control, kanban-type control, WIP control, and more generally token-based production control systems. Not only have many generalizations, extensions, and variants of the original kanban system been introduced, analyzed, and compared (e.g., generalized kanban control system (GKCS), CONstant WIP (CONWIP), production authorization card (PAC), paired-cell overlapping loops of cards with authorization (POLCA), extended kanban control system (EKCS), customized token-based system (CTBS), heijunka kanban, among others), but several reviews (e.g., [30], [16], [31], [13], [17]) and new approaches for representing and analyzing these systems (e.g., [7], [9], [12], [2], [1]) have appeared in the literature in the last five years alone.

New developments have also taken place within the last five years in extending the analysis and performance of pull systems to include features such as advance demand information (e.g., [47], [33], [38], [8], [22]), lot sizing (e.g., [45]), multiple products (e.g., [44], [19], [29]), parameter and system optimization (e.g., [18], [41]), and control point optimization (e.g., [3], [48]), and new simulation-based studies have been published (e.g., [15], [28], [43], [40], [26]).

Despite this intensive activity in the literature, or perhaps because of it, the definition of certain important concepts remains unclear after all these years. Different authors still use the same name to describe different production release control concepts or different names to describe the same concept. This would not be a problem if the description of the concept were absolutely clear. Often, however, this is not the case, because many descriptions involve imprecise statements, such as "production release is based on system status" (what is system status? WIP, pending orders?) or "production release is done in advance of demand" (when is the demand timed? at the arrival time of a customer order, at the due date of a customer order?).

An important concept that is still a source of confusion is the distinction between push and pull and its relationship to the distinction between make-to-order (MTO) and make-to-stock (MTS). After more than two decades of following the related literature, we find that there is still no generally agreed-upon definition of the main distinction between push and pull. Not only have there been several definitions of the push/pull distinction, but some researchers and practitioners seem to have shifted their perception of this distinction over time. There have also been several review and overview papers that discuss the push/pull distinction; most of these papers end up adopting one or the other definition. The same holds for many textbooks.
For example, in Chapter 7 of his book Production and Operations Analysis, Nahmias [42] adopts the more traditional view that a pull system is one in which items are moved from one level to the next only when requested, while a push system is one in which production planning is done in advance. Under this view, he states that MRP is the basic push system and kanban is the earliest of the pull systems. To further clarify the push/pull distinction, he cites the definition in [24], according to which a pull system initiates production as a reaction to present demand, while a push system initiates production in anticipation of future demand. Thus, he writes, MRP incorporates forecasts of future demands, while JIT (the philosophy that grew out of the kanban system) does not.

In Chapter 13 of their book Manufacturing Planning and Control for Supply Chain Management, Vollman et al. [46] state that the key distinction between push and pull pertains to whether the individual work centers are allowed to utilize capacity ("keep busy") without being driven by a specific end-item schedule (push) or are authorized to produce only when it has been signaled that there is a need for more parts in a downstream department (pull).

A somewhat related view is taken by Zipkin [49] in Chapter 8 of his book Foundations of Inventory Management, where he writes that in a pull system, customer demands trigger all other events in the system, directly or indirectly, as the demand information propagates backwards from the end to the beginning of the system. Under this definition, he states that both kanban and base-stock are pull systems.

In a paper that presents a unified framework for modeling and comparing pull systems, Liberopoulos and Dallery [34] side with the view of Zipkin that in a pull system, production is triggered by actual demands for finished products, which implies that in a push system, production is initiated independently of demands. In the follow-up paper, Liberopoulos and Dallery [36] chronicle the debate on the push/pull distinction and extend their framework to include lot sizing. This framework is further extended in [39] to include advance demand information.

In Chapter 10 of their book, Factory Physics, Hopp and Spearman [20] take a seemingly different view and state that a pull system authorizes the release of work based on system status, while a push system schedules the release of work based on demand. They further specify that a pull system only allows the release of work when a signal that is generated by a change of system status (typically, the completion of work at some point in the system) calls for it. Another useful way to think about this distinction, they argue, is that push systems are inherently MTO, while pull systems are MTS and that, viewed this way, the base-stock system is a pull system, whereas MRP is a push system. They maintain that the key benefits of a pull system arise when it establishes a WIP cap, i.e., a limit on the maximum amount of inventory in the system. Hence, a fundamental distinction between push and pull, they conclude, is that pull systems control WIP and observe throughput, while push systems control throughput and observe WIP. They also describe pull systems as being inherently rate-driven, in that we fix the level of WIP in them and let them run. These statements suggest that pull systems are not necessarily driven by demands, although at some later point, the authors write that CONWIP and kanban are both pull systems in the sense that releases into the line are triggered by external demands.
In [21], Hopp and Spearman argue that practitioners initially equated pull with kanban and MTS, and push with MRP and MTO, at a strategic level, but after the 1990s, these associations were completely reversed and pull became associated with MTO, and push with MTS, at a tactical level, causing confusion among practitioners. They then proceed to define a pull system as one that explicitly limits the amount of WIP that can be in the system, and a push system as one that has no explicit limit on the amount of WIP that can be in the system. They also revise their earlier view that push systems are inherently MTO and pull systems are MTS by stating that the MTO/MTS distinction is orthogonal to the push/pull distinction. They argue that, under this view, kanban, CONWIP, (K,S), POLCA, PAC, and MRP with a WIP constraint are pull systems, whereas MRP, base-stock, and installation-stock (Q,R) are push systems. In a recent book, Engineering Production Control Strategies, Karrer [25] adopts the definition of Hopp and Spearman. Other authors (e.g., [17]) also adopt this definition.

Based on the discussion above, we can group the definitions of the push/pull distinction into the following three general definitions:

Definition 1. A pull system initiates production as a reaction to present demand, while a push system initiates production in anticipation of future demand.

Definition 2. In a pull system, production is triggered by actual demands for finished products, while in a push system, production is initiated independently of demands.

Definition 3. A pull system is one that explicitly limits the amount of WIP that can be in the system, while a push system has no explicit limit on the amount of WIP that can be in the system.

In this chapter, we adopt Definition 2, but we also discuss the other two definitions.

While the push/pull distinction is still in question, the MTO/MTS distinction should be easier to agree on. The MTO/MTS distinction has to do with whether finished goods are produced to be stocked or to fill specific customer orders (demands). Liberopoulos and Dallery [36] define a MTS system as one in which parts are produced up to a certain target inventory level before the actual demands for them have arrived. When a demand arrives at the system, it is satisfied from the stock of finished goods, if such stock is available, and it triggers a production release order for a new part to replenish the finished goods inventory. In a MTO system, no inventory is produced ahead of time; instead, production is initiated to satisfy a particular order whenever such an order arrives at the system. Therefore, production follows demand. Somewhere in between MTO and MTS, but perhaps closer to the latter, lies the notion of make-to-forecast (MTF). In a MTF system, parts are produced ahead of time to meet forecasted demands, before the actual demands for finished parts have arrived. We summarize these descriptions in the following definition:

Definition 4. In a MTO system, production releases are initiated to meet actual customer orders (demand), while in a MTF system, production releases are initiated to meet forecasts of customer orders. In a MTS system, production releases are initiated to replenish the finished goods inventory and bring it up to a specified target level. Therefore, in MTO systems, production follows demand, while in MTF and MTS systems, production precedes demand, where the demand is timed at the due date and not the arrival time of a customer order.
One of the issues that we will address in this chapter is the relationship between the push/pull distinction and the MTO/MTS distinction. As we wrote earlier, Hopp and Spearman [21] maintain that Definition 3 of the push/pull distinction, which they propose, is orthogonal to the definition of MTO/MTS. Therefore, they argue that both push and pull systems can be either MTS or MTO and illustrate this with some examples. We agree that the push/pull distinction should be separate from the MTO/MTS distinction. However, as we adopt Definition 2 for the push/pull distinction, we argue that the MTO/MTS distinction, which has to do with the timing of production releases relative to the timing of demands, only makes sense in pull systems, because in push systems, production is initiated independently of demands. Moreover, in Section 4.2, we argue that MTF systems can be either pull or push, depending on whether forecasts are based on actual demands or are generated independently of the demands. Finally, Definition 1 of the push/pull distinction seems to equate pull with MTO and push with MTF and MTS, since under it pull reacts to present demand while push anticipates future demand.

Our goal in this chapter is to try to sort out the above concepts. To this end, we will present different production control systems and describe exactly how each system works using a queuing network representation. In the end, we will put labels on these systems (e.g., push, pull, etc.), but we need to point out that our intention is not to convince the reader that these labels are written in stone. Our intention is to clarify and classify different ways of production release control and see how these ways relate to each other and can be combined with each other. Ultimately, we want to be able to make statements such as, "Here is a particular production release control system and this is exactly how it works. We happen to call it X (e.g., a push system with a WIP cap). Others call it Y. This is fine, as long as we understand that we are talking about the same thing."

The remainder of this chapter is organized as follows. In Section 2, we present several production control systems in the absence of demands. First, we consider a basic system with controlled raw part arrivals but without any WIP control, and then we turn our attention to systems with WIP control. In Section 3, we revisit some of these systems in the presence of demands and use them as a basis to discuss the push/pull and MTO/MTS distinctions. In Section 4, we extend our descriptions and definitions to include advance demand information and forecasts. Finally, we conclude in Section 5.

2 Production Control in the Absence of Demands

In this section, we present several production control systems without accounting for the demands for finished goods. Initially, we look at a system without WIP control but with controlled raw part arrivals. Then we turn our attention to systems in which WIP is controlled in various ways. In practice, no production system operates in the absence of demands. Even when demand is excessive, as is often the case in the initial phase of a popular gadget's lifecycle, production is still driven by orders, which are unavoidably backordered.
Therefore, the systems that we present in this section are not encountered in real life. We consider them because they are key components of more complex systems that take into account the demand for finished goods. They can also be viewed as demand-saturated systems, i.e., systems with infinite demands, that can be used for design purposes to estimate the maximum throughput of the physical production system that they are applied to. The maximum throughput is important to know because it sets the maximum demand rate that the system can meet in the long run.

2.1 System without WIP Control

Figure 1 shows a basic production system which is a simple flow line consisting of four workstations in series, separated by buffers of infinite capacity. The workstations are represented by ovals and are denoted by WS_i, i = 1, ..., 4. The buffers are represented by open square boxes and are denoted by P_i, i = 1, ..., 3. Each workstation consists of a machine, represented by a circle, with an input buffer of infinite capacity in front of it, represented by a small open square box. Upstream of WS_1 there is a raw parts buffer, denoted by P_0, which receives raw parts that arrive according to a controlled raw part arrival process, denoted by RP. RP can be thought of as the machine of a pseudo-workstation that is supplied by an infinite source of parts. Downstream of WS_4 there is a finished goods buffer, denoted by P_4.

Fig. 1 Production system with infinite-capacity buffers

We assume that there is no control of the machine processing rates, so when a machine works on a part, it processes it at full speed. Thus, when the machine of workstation WS_i, i = 1, ..., 4, finishes processing a part, it pushes it to the input buffer of the next downstream workstation, WS_{i+1} (or to the exit of the system, if i = 4). The machine then pulls a new part from its input buffer and starts processing it as fast as it can. If no part is available in its input buffer, then the machine is starved. The part that is pushed downstream passes through buffer P_i but does not actually spend any time in P_i; hence P_i is always empty. The same holds for buffer P_0, which is fed by the pseudo-machine RP. As buffers P_0 to P_4 are always empty, they are drawn with dotted lines to indicate that they could have been omitted from the picture without changing anything in the system behavior.

We should mention that in some real-life manufacturing systems, it has been observed that the workload affects the performance of the system, e.g., the processing rates of the machines. One of the reasons for this is that the workers who operate the machines tend to work more efficiently when the workload is at some ideal value. If the workload is below this value, they are not pressured enough, and if it is above it, they are overpressured. This is an interesting issue, but it is outside the scope of this chapter.
We refer the interested reader to [4].

Unlike the production rates of the machines in workstations WS_i, i = 1, ..., 4, the production rate of pseudo-workstation RP (raw part arrival rate) is controlled and essentially sets the production pace for the rest of the system. In fact, it is the only control parameter in the system. Normally, the production rate of RP (raw part arrival rate) should be set to a value which is lower than the production rates of the actual workstations. In this case, RP acts as the bottleneck (slowest) workstation, and all the buffers downstream of RP have a finite number of parts; hence, the WIP is finite. Moreover, the throughput (output rate) of the system is equal to the production rate of RP. Such a system would be characterized as push, based on Definition 3, because throughput is controlled and WIP is observed.

If the production rate of RP is set higher than the production rate of WS_1, then eventually the input buffer in front of WS_1 will be flooded with an infinite number of raw parts and therefore WS_1 will always be busy. In fact, if all the workstations (including RP) upstream of any given workstation WS_i are faster than WS_i, then the input buffer of WS_i will eventually be flooded with an infinite number of parts, even if WS_i is not the bottleneck workstation. The only way that this will not happen is if the bottleneck workstation is RP.

Many readers will recognize in Figure 1 an open network of tandem queues where RP is the arrival process of jobs to the rest of the system.

2.2 Systems with WIP Control

In this section, we present several known WIP control mechanisms applied to the basic production system considered in Section 2.1. To motivate the discussion, we first consider a classical flow line model with finite buffers and then move on to describe the WIP control mechanisms.

2.2.1 Flow Line with Finite-Capacity Buffers

Figure 2 shows a basic production system consisting of four workstations separated by finite-capacity buffers. The system is identical to the basic production system shown in Figure 1, except that there are no input buffers in front of the machines, and buffers P_i, i = 1, ..., 4, have finite capacities. To indicate this, they are shown as closed square boxes with partitions. The total capacity of WS_i plus P_i is denoted by K_i. Assuming that the machine in WS_i can hold only one part, this means that the capacity of buffer P_i is K_i - 1.

Fig. 2 Flow line with finite-capacity buffers

In addition to the control of raw part arrivals through process RP, the system also has WIP control. Namely, the number of parts in WS_i plus P_i is not allowed to exceed the WIP limit K_i, i = 1, ..., 4. More specifically, when workstation WS_i, i = 1, ..., 4, finishes processing a part, it pushes it downstream, but it does not immediately pull a new part from its upstream buffer P_{i-1}, unless there is available space in P_i to store that part when it is finished. If there is no available space in P_i, then the machine in WS_i is blocked from pulling a new part. Of course, if no part is available in buffer P_{i-1}, then the machine is starved. Note that a machine may be blocked and starved at the same time.
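The blocking rule just described can be written as a recursion on the start and finish times of each part at each machine. The following is only a minimal sketch under stated assumptions: parts are served first-come-first-served, processing times are given in advance, machines are indexed from 0, and the names bbs_pno_line, arrivals, service and caps are hypothetical and introduced here for illustration only.

```python
def bbs_pno_line(arrivals, service, caps):
    """Start/finish times for a serial line with the blocking rule of Fig. 2.

    arrivals[n]   : time raw part n reaches P_0 (the RP process)
    service[i][n] : processing time of part n on machine i (0-based)
    caps[i]       : K_i, the capacity of WS_i plus its downstream buffer P_i
    """
    m, n_parts = len(service), len(arrivals)
    start = [[0.0] * n_parts for _ in range(m)]
    finish = [[0.0] * n_parts for _ in range(m)]
    for n in range(n_parts):
        for i in range(m):
            # Not starved: part n has arrived from upstream.
            ready = arrivals[n] if i == 0 else finish[i - 1][n]
            # Machine i has finished the previous part.
            free = finish[i][n - 1] if n > 0 else 0.0
            # Not blocked: part n - K_i has left P_i, i.e. started on machine i + 1.
            # The last machine is never blocked, because parts leave at once.
            space = 0.0
            if i < m - 1 and n >= caps[i]:
                space = start[i + 1][n - caps[i]]
            start[i][n] = max(ready, free, space)
            finish[i][n] = start[i][n] + service[i][n]
    return start, finish


# Saturated raw parts buffer (all parts available at time 0), slow third machine.
N = 20
arrivals = [0.0] * N
service = [[1.0] * N, [1.0] * N, [2.0] * N, [1.0] * N]
caps = [2, 2, 2, 1]
start, finish = bbs_pno_line(arrivals, service, caps)
print(finish[-1][-1] - finish[-1][-2])  # spacing between the last two departures
print(start[0][-1])                     # start of the last part on the first machine
```

In this deterministic example the slow third machine sets the output pace, and the blocking terms propagate that pace upstream, so the first machine starts its later parts well after it would have in the line of Figure 1.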
The machine in WS_4 is never blocked, because when a part finishes its processing at WS_4, it immediately leaves the system. Before leaving, the finished part instantaneously passes through the finished goods buffer P_4 but does not spend any time in it; therefore, buffer P_4 is always empty. As previously, it is drawn with dotted lines to indicate that it could have been omitted from the picture without changing anything in the system behavior.

If the production rate of RP (raw part arrival rate) is set higher than the production rate of WS_1, then the raw parts buffer P_0 will eventually be flooded with an infinite number of raw parts and consequently WS_1 will never be starved. In this case, the system will have given up the raw part arrival control, but the WIP control will still be there to ensure that no buffer (except P_0) grows to infinity. Of course, in real life, no buffer can accommodate an infinite number of parts. In practice, however, it is not unusual that the raw parts supply department of a firm would try to ensure that P_0 almost never runs out of raw parts.

The system described above is a specific variant of a manufacturing flow line with finite buffers. The study of such lines has been a particularly active area of research for over 30 years. Much of the literature on flow lines has focused on throughput analysis. For a recent review, see [32]. The blocking mechanism that we described above is only one of several possible blocking mechanisms and is referred to as blocking before service with position non-occupied (BBS-PNO). More blocking mechanisms and flow line variations are described in [10].

The representation of the system shown in Figure 2 is not sufficient for describing the blocking mechanism. An alternative, precise representation of the system described above is shown in Figure 3, where the WIP control on each workstation and its downstream buffer is implemented with the use of production authorization (PA) cards. In this representation, buffers PA_i, i = 1, ..., 4, have infinite capacity, as do buffers P_i in the system in Figure 1.

Fig. 3 Flow line with finite-capacity buffers and BBS-PNO, represented with the use of PA cards
In order for the machine in WS_i to pull a new part from buffer PA_{i-1} (buffer P_0, in the case of WS_1) and start working on it, a free PA card must be available in buffer A_i. If such a card is available, then it is attached onto a part in buffer PA_{i-1} and together they are released into WS_i; therefore, buffers A_i and PA_{i-1} are linked together in a synchronization station. When the part finishes its processing in WS_i, it is pushed into buffer PA_i, with the card still attached to it. The card is freed from the part when the part leaves PA_i to enter the next downstream workstation WS_{i+1}. The free card is returned to buffer A_i. The intermediary buffers are denoted by PA_i instead of by P_i to indicate that they contain parts ("P") with production authorization cards ("A") attached to them. Essentially, the cards in buffer A_i represent the number of free positions in finite buffer P_i in Figure 2. Note that as PA_4 is always empty, buffer A_4 will always have either K_4 or K_4 - 1 free cards, given that WS_4 can only hold zero or one part; therefore, the behavior of the system for any value of K_4 > 1 is identical to its behavior when K_4 = 1.

2.2.2 System with Kanban Control at the Workstation Level

The system in Figure 4 is identical to the system in Figure 3, except that each machine has an input buffer of infinite capacity in front of it, as was the case in the system in Figure 1. Alternatively, the system in Figure 4 is identical to the system in Figure 1, except that WIP is controlled at the individual workstation level. More specifically, the number of parts in each workstation WS_i plus its downstream buffer PA_i is not allowed to exceed the WIP limit K_i; therefore, K_i is a WIP cap, i = 1, ..., 4. In the case of the last workstation, PA_4 is always empty (hence it is drawn in dotted lines). This implies that if each finished part released its PA card immediately after exiting WS_4, instead of after exiting PA_4, nothing would change in the behavior of the system.

Fig. 4 Production system with kanban control at the workstation level

If the production rate of RP (raw part arrival rate) is set higher than the production rate of WS_1, then the raw parts buffer P_0 will eventually be flooded with an infinite number of raw parts. In this case, as soon as a free PA card of WS_1 is returned from buffer PA_1 to buffer A_1, it will immediately be attached onto a raw part in P_0, authorizing its release into WS_1. This means that buffer A_1 will always be empty, and the number of parts in WS_1 plus PA_1 will always be equal to K_1; K_1 will therefore be a WIP constant rather than a WIP cap. When P_0 has an infinite number of raw parts, the production system attains its maximum throughput. Such a system would be characterized as pull, based on Definition 3, because WIP is controlled and throughput is observed.
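The difference between a WIP cap and a WIP constant can be checked numerically. The sketch below is an illustration under stated assumptions, not the chapter's own model: service is first-come-first-served with known processing times, stages are indexed from 0, the last stage returns its card as soon as the part leaves the line, and the function names kanban_line and stage_wip_after_release are hypothetical.

```python
from bisect import bisect_right


def kanban_line(arrivals, service, cards):
    """Entry times for a serial line with kanban control at each workstation.

    enter[i][n] is the time part n is released into stage i (the input buffer
    of machine i); enter[m][n] is the time it leaves the line.
    cards[i] is K_i, the number of PA cards of stage i.
    """
    m, n_parts = len(service), len(arrivals)
    enter = [[0.0] * n_parts for _ in range(m + 1)]
    done = [[0.0] * n_parts for _ in range(m)]
    for n in range(n_parts):
        for i in range(m):
            t = arrivals[n] if i == 0 else done[i - 1][n]
            if n >= cards[i]:
                # Wait for the card freed when part n - K_i entered the next stage.
                t = max(t, enter[i + 1][n - cards[i]])
            enter[i][n] = t
            prev = done[i][n - 1] if n > 0 else 0.0
            done[i][n] = max(t, prev) + service[i][n]
        enter[m][n] = done[m - 1][n]
    return enter


def stage_wip_after_release(enter, i):
    """Parts in WS_i plus PA_i right after each release into stage i."""
    return [n + 1 - bisect_right(enter[i + 1], enter[i][n])
            for n in range(len(enter[i]))]


N = 30
service = [[1.0] * N, [2.0] * N, [1.0] * N, [1.0] * N]
cards = [2, 2, 2, 2]
saturated = kanban_line([0.0] * N, service, cards)                # P_0 never empty
paced = kanban_line([4.0 * n for n in range(N)], service, cards)  # slow RP
wip_saturated = stage_wip_after_release(saturated, 0)
wip_paced = stage_wip_after_release(paced, 0)
print(wip_saturated[:5], max(wip_paced))  # [1, 2, 2, 2, 2] 1
```

With a saturated raw parts buffer, the first stage fills up to K_1 = 2 and stays there after every release, whereas with a slow RP the same cap is never reached.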
If the production rate of RP is lower than the maximum throughput of the system, then the system will be able to absorb all the raw parts generated by RP, and the throughput (output rate) of the system will be equal to the production rate of RP (arrival rate of raw parts). Such a system might be characterized as hybrid push/pull, based on Definition 3, because throughput is controlled and WIP is limited.

Many readers will recognize in Figure 4 the classical single-card kanban system, where the kanbans (PA cards) are defined at the individual workstation level. For this reason, we use the name "kanban control at the workstation level" to describe the system. We caution, however, that the system in Figure 4 is not a complete kanban system, because it is not driven by demands. Its behavior, however, is identical to the behavior of a saturated kanban system, i.e., a kanban system with infinite demands for finished goods.

2.2.3 System with CONWIP Control

Controlling WIP at individual workstations (Figure 4) is at one extreme of the different ways of controlling WIP. Figure 5 shows the other extreme case where WIP is controlled at the level of the entire system. More specifically, in the system in Figure 5, the number of parts in the entire system is not allowed to exceed the WIP limit K_{1–4}; hence, K_{1–4} is the WIP cap of the entire system. As in the system in Figure 4, buffer PA_4 is always empty (hence it is drawn in dotted lines). This implies that if each finished part released its PA card immediately after exiting WS_4, instead of after exiting the finished goods buffer, as is shown in Figure 6, nothing would change in the behavior of the system. In addition, all the other buffers, PA_1 to PA_3, are also always empty.

Fig. 5 Production system with CONWIP control

Fig. 6 Production system with CONWIP control (equivalent representation)

If the production rate of RP (raw part arrival rate) is set higher than the production rate of WS_1, then the raw parts buffer P_0 will eventually be flooded with an infinite number of raw parts. In this case, as soon as a free PA card is returned from buffer PA_4 to buffer A_1, it will immediately be attached onto a raw part in P_0, authorizing its release into WS_1. This means that buffer A_1 will always be empty, and the number of parts in the entire system will always be equal to K_{1–4}; K_{1–4} will therefore be a WIP constant rather than a WIP cap, and the resulting system will be a CONWIP (CONstant WIP) system [20]. Based on Definition 3, this is a pure pull system, because WIP is controlled and throughput is observed.

Note that the PA mechanism in Figure 5 is identical to the kanban mechanism in Figure 4, except that the PA cards (kanbans) are not defined at the individual workstation level but at the level of the entire system.
For this reason, some authors (e.g., [34]) view the CONWIP system in Figure 5 as a single-stage kanban system, because all the workstations have been grouped into a single stage and the WIP of that stage is controlled with a kanban-type mechanism.

2.2.4 System with Multi-Stage Sequential Kanban Control

As Hopp and Spearman [20] note, the systems in Figures 4 and 5 are at the extremes in a continuum of CONWIP-based configurations. Figure 7 shows an intermediate case where the system is divided into two stages and a WIP control loop is imposed on each stage. Hopp and Spearman [20] call such a system multi-loop CONWIP. Other authors (e.g., [34]) use the name multi-stage (sequential) kanban to describe it, because the WIP control is implemented by a kanban-type mechanism. Irrespective of the name, the operation of the system is the same. More specifically, the number of parts in workstations WS_1 and WS_2 plus their downstream buffers PA_1 and PA_2 is not allowed to exceed the WIP cap K_{1–2}. A similar WIP cap, K_{3–4}, is set for workstations WS_3 and WS_4 plus their downstream buffers PA_3 and PA_4.

Fig. 7 Production system with multi-stage sequential kanban control

Fig. 8 Production system with multi-stage sequential kanban control (equivalent representation)

As in the other two systems in Figures 4 and 5, the finished goods buffer PA_4 is always empty. This implies again that if each finished part released its PA card immediately after exiting WS_4, instead of after exiting the finished goods buffer, as is shown in Figure 8, nothing would change in the behavior of the system. Also, as in the previous two WIP-controlled systems, if the production rate of RP (raw part arrival rate) is set higher than the production rate of WS_1, then the raw parts buffer P_0 will eventually be flooded with an infinite number of raw parts. In this case, buffer A_1 will always be empty, and the number of parts in WS_1, PA_1, WS_2, and PA_2 will always be equal to K_{1–2}; K_{1–2} will therefore be a WIP constant rather than a WIP cap.
Based on Definition 3, such a system will be a pure pull system.
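The systems of Figures 4, 5 and 7 differ only in how the workstations are grouped into WIP control loops, so one recursion can cover all three. The sketch below is a hedged illustration: it assumes first-come-first-served service, exponentially distributed processing times with mean 1 (an arbitrary choice), 0-based stage indices, a saturated raw parts buffer, and a hypothetical description of each loop as a tuple (first stage, last stage, number of cards). The function name token_line is likewise made up for this example.

```python
import random


def token_line(arrivals, service, loops):
    """Entry times for a serial line with non-nested or nested PA-card loops.

    loops is a list of (first, last, cards): a part needs a card of the loop
    to enter stage `first` and returns it on leaving the output buffer of
    stage `last`. enter[m][n] is the departure time of part n from the line.
    """
    m, n_parts = len(service), len(arrivals)
    enter = [[0.0] * n_parts for _ in range(m + 1)]
    done = [[0.0] * n_parts for _ in range(m)]
    for n in range(n_parts):
        for i in range(m):
            t = arrivals[n] if i == 0 else done[i - 1][n]
            for first, last, cards in loops:
                if first == i and n >= cards:
                    # Card freed when part n - cards left the loop.
                    t = max(t, enter[last + 1][n - cards])
            enter[i][n] = t
            prev = done[i][n - 1] if n > 0 else 0.0
            done[i][n] = max(t, prev) + service[i][n]
        enter[m][n] = done[m - 1][n]
    return enter


rng = random.Random(1)
N = 2000
service = [[rng.expovariate(1.0) for _ in range(N)] for _ in range(4)]
saturated = [0.0] * N  # P_0 never runs out of raw parts

configs = {
    "kanban": [(0, 0, 2), (1, 1, 2), (2, 2, 2), (3, 3, 2)],  # Fig. 4
    "two_stage": [(0, 1, 4), (2, 3, 4)],                     # Fig. 7
    "conwip": [(0, 3, 8)],                                   # Fig. 5
}
throughput = {
    name: N / token_line(saturated, service, loops)[-1][-1]
    for name, loops in configs.items()
}
for name, rate in throughput.items():
    print(f"{name}: {rate:.3f} parts per unit time")
```

All three configurations use eight cards in total and the same processing times. In this sketch, every release constraint of the CONWIP loop is implied by the constraints of the two-stage system, which are in turn implied by those of the workstation-level kanban system, so coarser grouping cannot delay any departure on the same sample path.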

2.2.5 System with Echelon Kanban Control

The most comprehensive way of controlling WIP is to control the WIP between the entrances of any two workstations. In our 4-workstation example, this would be equivalent to setting up WIP control loops in workstations 1, 1–2, 1–3, 1–4, 2, 2–3, 2–4, 3, 3–4, and 4. This would imply that some of the WIP control loops are overlapping; as a result, a part would be carrying several PA cards from different WIP control loops as it moved down the production process. Such configurations have been studied in [14], [16], and [17], under the name customized token-based systems (CTBS). Although it is possible that some combinations of overlapping WIP control loops might perform better than others, we think that using too many overlapping WIP control loops would be too confusing and difficult to handle for practical purposes. We find that among all the possible combinations of overlapping WIP control loops, the case of nested kanban loops, such as the one shown in Figure 9, may be of interest for the following reason.

Fig. 9 Production system with echelon kanban control

In general, CONWIP (or single-stage kanban) control allows more flexibility in the production system than any other WIP control mechanism, because it controls the release of parts at the entrance of the system only and nowhere else. CONWIP is also very simple to implement. A potential shortcoming of CONWIP is that when a part is released into the system, it is pushed through without any further control. This may be fine in many situations, but it may be problematic in others. What would happen, for example, if the parts in the last two workstations, WS_3 and WS_4, required special storage conditions, making the inventory holding cost much higher in them than in the first two workstations, WS_1 and WS_2? What if, additionally, WS_3 and WS_4 were relatively slow?
In this case, the CONWIP system would indiscriminately push parts to workstations WS 3 and WS 4. These parts would then accumulate in front of the slow workstations WS 3 and WS 4, incurring high inventory cost. The first step towards dealing with such a situation is to recognize that the last two workstations should be treated differently than the first two workstations. One way to signal this is to set up two sequential WIP control loops (sequential multi-stage kanban or multi-loop CONWIP), as in Figure 7, and set K 3 4 to a small value to limit the WIP in the last two workstations. The sequential multi-stage kanban system in Figure 7, however, is a local control scheme, because the decision of authorizing the release of parts in the first two workstations is based on the WIP in

these workstations and does not take into account the WIP in the last two workstations. The nested kanban mechanism in Figure 9, on the other hand, is a global control scheme, because the decision of authorizing the release of parts at any control point in the system (including the first two workstations) is based on the WIP in the entire system downstream of the control point. Due to its global nature, Buzacott and Shanthikumar [6] call the demand-driven version of the system in Figure 9 an integral control system. In a way, the movement of PA cards in the nested system of Figure 9 is similar to the movement of demands in an echelon stock (Q, R) policy, whereas the movement of PA cards in the sequential system of Figure 7 is similar to the movement of demands in an installation stock (Q, R) policy; for this reason, the demand-driven version of the system in Figure 9 is referred to as echelon kanban control system in [36] and [27]. We adopt the same name here. Gonzalez-R et al. [17], on the other hand, call it token-based base-stock system, most likely because the movement of PA cards in the nested system of Figure 9 resembles the movement of demands in a base-stock system, which we will examine in Section 3.1.

3 Production Control in the Presence of Demands

In this section, we revisit some of the production control systems that we presented in Section 2, only this time in the presence of demands for finished goods. We distinguish between two cases. In the first case, the demands for finished goods do not generate any further demands upstream of the finished goods buffer. In the second case, the demands for finished goods generate further demands for semi-finished goods and raw parts that are transferred upstream through the system. We will use these two distinct cases to characterize the systems as push or pull, based on Definition 2. We will further characterize the pull systems as either MTO or MTS.
3.1 System without WIP Control in the Presence of Demands

We revisit the system without WIP control that we presented in Section 2.1, only this time in the presence of demands for finished goods.

3.1.1 System with Demands for Finished Goods Only

Figure 10 shows a system which is identical to the basic production system depicted in Figure 1, except that the finished goods coming out of WS 4 do not immediately leave the system by instantaneously passing through the finished goods buffer P 4; instead, they are stored in P 4, waiting to be matched to customer demands for finished goods that arrive to buffer D 5. This means that P 4 is not always empty, as was the case in Figure 1; for this reason it is not drawn in dotted lines.

Fig. 10 Production system with infinite-capacity buffers and demands for finished goods

Buffers P 4 and D 5 are linked in a synchronization station. If a part is available in P 4, but no customer demand is available in buffer D 5, then the part waits in P 4 until such a demand arrives to D 5. Similarly, if a demand is available in D 5 but no parts are in P 4, then the demand waits in D 5 until a part enters P 4 from WS 4. If a part is available in P 4 and a demand for such a part is available in buffer D 5, then the part is immediately delivered to the customer that placed that demand and the demand is satisfied; hence, it is dropped from D 5. This means that at least one of the two buffers, P 4 and D 5, is empty at all times.

Note that the incoming customer demands are for finished goods only and do not generate any further demands for semi-finished goods or raw parts. Therefore, the entire system upstream of P 4 is not informed of the demands and behaves exactly like the system in Figure 1, i.e., it produces parts with no control other than that stemming from the controlled raw part arrival process RP. This type of control, however, is completely exogenous or open loop, because it does not take into account either the state of the system (WIP) or the external disturbance that is supposed to drive the system (demand). In Section 3.1.3, we claim that the latter is a characteristic of a push system. Normally, the production rate of RP (arrival rate of raw parts) should be set equal to the average demand rate, so that eventually all the finished parts will be matched to demands and vice versa. In this case, the number of parts in buffer P 4 and the number of demands in D 5 will be finite.
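The matching logic of the synchronization station that links P 4 and D 5 can be sketched in a few lines. The class and method names are hypothetical and chosen only for illustration; the sketch simply counts parts and demands and matches them as soon as both are present.

```python
class SynchronizationStation:
    """Matches finished parts in P 4 with customer demands in D 5 (illustrative)."""

    def __init__(self) -> None:
        self.parts = 0      # finished parts waiting in P 4
        self.demands = 0    # unsatisfied demands waiting in D 5
        self.delivered = 0  # parts already delivered to customers

    def _match(self) -> None:
        # Pair off as many parts and demands as possible.
        matched = min(self.parts, self.demands)
        self.parts -= matched
        self.demands -= matched
        self.delivered += matched

    def part_arrives(self) -> None:
        self.parts += 1
        self._match()

    def demand_arrives(self) -> None:
        self.demands += 1
        self._match()
```

Because every arrival is followed by a matching step, at most one of the two counters can be positive after any event, which is the property stated in the text that at least one of P 4 and D 5 is empty at all times.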
If the demand arrival process were fairly invariable and the production workstations well balanced (i.e., with more or less equal production rates), then the paced production system in Figure 10 might perform reasonably well, as it would result in a relatively smooth material flow.

3.1.2 System with Demands for Finished Goods that also Generate Demands for Semi-Finished Goods and Raw Parts

Figure 11 shows a system which is identical to the system in Figure 10, except that each incoming customer demand for finished goods also generates a demand for a semi-finished part in buffer P 2 and a demand for a raw part stored in P 0; these two demands enter buffers D 3 and D 1, respectively.

Fig. 11 Production system with infinite-capacity buffers and demands for finished goods, semi-finished goods, and raw parts (base-stock system)

In this case, in order for a raw part to enter WS 1, not only must such a part exist in P 0, but a demand for it must also exist in buffer D 1. Similarly, in order for a semi-finished part in P 2 to be released in WS 3, a demand for it must be available in buffer D 3. Although the release of raw parts into the system is still controlled by the exogenous raw part arrival process RP, the release of parts at various other control points of the system (including the entrance of the system) is also driven by demands. These control points are the entrance of WS 1, the entrance of WS 3 and the exit of P 4. Of course, there could be other control points (e.g., the entrance of WS 2 and the entrance of WS 4), but we omit them for space considerations.

The maximum throughput of the demand-responsive production system in Figure 11 is equal to the throughput of the demand-ignoring system in Figure 1. If the maximum throughput is smaller than the demand rate, then the demand-responsive system in Figure 11 will not be able to meet the demands, and the demand buffers D 1, D 3, and D 5 will eventually grow to infinity. In this case, the demand-responsive system in Figure 11 will behave exactly as the demand-ignoring system in Figure 1. If the maximum throughput of the system in Figure 11 is larger than the demand rate, however, then the system will be able to meet all the demands, and its throughput will be exactly equal to the demand rate. Moreover, the WIP in the system will be finite. This latter case is more realistic and is of interest.
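The two throughput regimes described above, one where capacity falls short of demand and one where it exceeds it, can be summarized in a single expression. The symbols are introduced here only for illustration and do not appear in the original: $\lambda_D$ denotes the demand rate, $TH_{\max}$ the maximum throughput of the system (equal to the throughput of the system in Figure 1), and $TH$ the realized throughput of the system in Figure 11.

```latex
\[
TH = \min\{\lambda_D,\; TH_{\max}\}
\]
```

When $TH_{\max} < \lambda_D$ the minimum is attained by $TH_{\max}$ and the demand buffers grow without bound; when $TH_{\max} > \lambda_D$ it is attained by $\lambda_D$ and the WIP remains finite.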
If the production rate of RP (raw part arrival rate) is set higher than the production rate of WS 1, then the raw parts buffer P 0 will eventually be flooded with infinite raw parts. In this case, the system will have given up the raw part arrival control, but the demand response will still be there. A potential disadvantage of having infinite raw parts (or practically, a very large number of raw parts) is that if a very large number of customer demands arrive in a short period of time, then an equal (very large) number of raw parts will enter WS 1, unnecessarily burdening the WIP. Normally, however, the processing rate of RP (raw part arrival rate) should be set equal to the demand rate, which implies that buffer P 0 will not grow to infinity. In this case, the raw part arrival process plays an important control role as it sets a limit on the release pace. Thus, if a very large number of customer demands arrive in a

short period of time, the controlled raw part arrival process RP prohibits the release of an equal (very large) number of raw parts into WS 1, which may unnecessarily burden the WIP.

The initial state of the controlled buffers P 2 and P 4 is defined as the number of parts in these buffers before any customer demands have arrived to the system. Contrary to all the systems presented earlier, the initial states of buffers P 2 and P 4 in the system in Figure 11, denoted by S 2 and S 4, respectively, play an important role, because they set upper limits for these buffers. These limits will be reached again and again if no customer demands arrive to the system for a long enough time so that the rest of the system will have been cleared out of parts. Many readers will recognize the system in Figure 11 as a base-stock system [34]. S 2 is the base-stock level of the part of the system that includes workstations WS 1 and WS 2. Similarly, S 4 is the base-stock level of the part of the system that includes workstations WS 3 and WS 4.

3.1.3 On the Push/Pull and MTO/MTS Distinction

Having presented the two systems in Figures 10 and 11, we are now in a position to comment on the push/pull and MTO/MTS distinctions. As we wrote in Section 1, we adopt Definition 2 for the push/pull distinction, but we also discuss the other two definitions. Based on Definitions 2 and 3, the system in Figure 10 is push, but for different reasons; in the case of Definition 2, because production is initiated independently of demands, and in the case of Definition 3, because the WIP in the system is not limited. Definition 1 does not cover this system. As we mentioned in Section 1, Hopp and Spearman [20], who propose Definition 3, argue that the MTO/MTS distinction is orthogonal to the push/pull distinction; however, we are not sure how they would characterize the system in Figure 10 in terms of the MTO/MTS distinction.
In our view, the system in Figure 10 is neither MTO nor MTS, because parts in it are neither produced to meet actual customer orders (demands) nor are they produced to replenish finished goods inventory when it is depleted by demands. This view is in line with our statement in Section 1 that, according to Definition 2, the MTO/MTS distinction does not make sense in push systems.

The base-stock system in Figure 11 is pull, based on Definitions 1 and 2, because production is driven by actual demands. Based on Definition 3, however, it is push, because the WIP in it is not limited. Hopp and Spearman [20] propose Definition 3 for the push/pull distinction as a refinement of the more general definition that in pull systems the release of work is authorized based on system status. Their refinement lies in that they restrict the system status to mean WIP. Under this definition of system status, the base-stock system in Figure 11 is certainly not pull. It is important to note, however, that the production release decisions at each control point of the base-stock system are based on system status, if by system status we mean the echelon inventory position of finished goods. The echelon inventory position is defined as the sum of the pending orders from the control point to the end of

the system (hence the term echelon) plus the on-hand inventory of finished parts in P 4 minus the backordered demands in D 5. More specifically, the rule that drives production release decisions in the base-stock system is that the echelon inventory position must always be constant and equal to the so-called echelon base-stock level. In the system in Figure 11 there are two control points: one at the entrance of the first stage and the other at the entrance of the second stage. For the first control point, the echelon base-stock level is S 2 + S 4, and the pending orders are defined as the sum of the unprocessed orders in D 1 plus the in-process orders in the entire manufacturing system from WS 1 through to WS 4. For the second control point, the echelon base-stock level is S 4, and the pending orders are defined as the sum of the unprocessed orders in D 3 plus the in-process orders in the second stage of the manufacturing system, namely from WS 3 through to WS 4.

Based on the discussion above, we can conclude that all three definitions of the push/pull distinction agree that in pull systems, production release decisions are based on system status. The difference is that in Definitions 1 and 2, the system status is defined as the inventory position, whereas in Definition 3, it is defined as the WIP.

Concerning the MTO/MTS distinction, the system in Figure 11 can be characterized as MTO or MTS, depending on whether the echelon base-stock level is zero or strictly greater than zero. More specifically, if the echelon base-stock level of the first control point is zero, i.e., if S 2 + S 4 = 0, which means that S 2 = S 4 = 0, then any arriving customer demand will trigger the release of a raw part into the system. When this part is completed and exits WS 4, it will be matched to the demand that triggered its release, hence it will have been made to order (MTO).
On the other hand, if S 4 > 0, then any arriving customer demand will be satisfied by a finished part from buffer P 4 that has been produced before the arrival of that demand, i.e., that has been made to stock (MTS). The arriving demand will also trigger the release of a raw part into the system to replenish the inventory in buffer P 4. When this part is finished and exits WS 4, it will not be matched to the demand that triggered its release, but to a subsequent demand. Finally, if S 4 = 0 and S 2 > 0, the second stage will be MTO (because its echelon base-stock level S 4 will be zero) but the first will be MTS (because its echelon base-stock level S 2 + S 4 will be strictly positive).

Of course, it is possible for a system to be partly push and partly pull, as well as partly MTO and partly MTS. An example is the system in Figure 12. In that system, the demands that are generated by each customer demand go as far upstream as the intermediary semi-finished goods buffer, P 2, instead of to the raw parts buffer P 0; hence the part of the system upstream of P 2 behaves like the push system in Figure 10, whereas the part downstream of P 2 behaves like the pull system in Figure 11. Moreover, if S 3 > 0 but S 4 = 0, then the part of the pull system upstream of buffer P 3 is MTS, whereas the part of the system downstream of P 3 is MTO.
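The MTO/MTS rule stated above for the two-stage base-stock system in Figure 11 can be written as a small sketch. The function names and the dictionary layout are hypothetical; only the echelon base-stock levels S 2 + S 4 and S 4 and the definition of the echelon inventory position are taken from the text.

```python
from typing import Dict, Tuple


def echelon_base_stock_levels(s2: int, s4: int) -> Tuple[int, int]:
    """Echelon base-stock levels at the first and second control points."""
    return s2 + s4, s4


def classify_stages(s2: int, s4: int) -> Dict[str, str]:
    """A stage is MTO if its echelon base-stock level is zero, MTS otherwise."""
    level_1, level_2 = echelon_base_stock_levels(s2, s4)
    return {
        "stage 1": "MTS" if level_1 > 0 else "MTO",
        "stage 2": "MTS" if level_2 > 0 else "MTO",
    }


def echelon_inventory_position(pending: int, on_hand: int, backorders: int) -> int:
    """Pending orders downstream of the control point, plus on-hand finished
    parts in P 4, minus backordered demands in D 5."""
    return pending + on_hand - backorders
```

For example, `classify_stages(2, 0)` labels the first stage MTS and the second stage MTO, matching the case S 4 = 0 and S 2 > 0 discussed in the text.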

Fig. 12 Production system with infinite-capacity buffers and demands for finished goods and semi-finished goods

3.2 Systems with WIP Control in the Presence of Demands

In this section we revisit some of the systems with WIP control that we presented in Section 2.2, only this time in the presence of demands for finished goods. As in the case of the systems with no WIP control that we discussed in Section 3.1, we consider both cases where the demands for finished goods do or do not generate further demands for semi-finished goods and raw parts.

3.2.1 System with Demands for Finished Goods that also Generate Demands for Semi-Finished Goods and Raw Parts

In this section, we look at systems where the demands for finished goods also generate demands for semi-finished goods and raw parts. We distinguish between two cases, one where the demands for semi-finished goods and raw parts are carried upstream by PA cards, and another where they are transferred upstream independently of the PA card movement.

Systems where Demands are Carried Upstream by PA Cards

The system in Figure 13 is identical to the multi-stage sequential kanban system in Figure 7, except that the finished goods coming out of WS 4 do not immediately leave the system by instantaneously passing through the finished goods buffer PA 4, but are stored in PA 4, waiting to be matched to customer demands that arrive to buffer D 5. This means that PA 4 is not always empty, as was the case in Figure 7; for this reason it is not drawn in dotted lines. Viewed from a different angle, the system in Figure 13 appears to be identical to the system in Figure 10, on which a multi-stage (2-stage) kanban mechanism has been superimposed to control the WIP. The truth is, however, that it is far from identical. Besides the obvious difference in WIP control, there is another fundamental difference between the systems in Figure 13 and Figure 10, regarding the demands.

Fig. 13 Production system with multi-stage sequential kanban control and demands for finished goods, semi-finished goods, and raw parts

Namely, in the system in Figure 10, each incoming customer demand for finished goods does not generate any further demands upstream of buffer P 4; for this reason, we characterized that system as a push system. In the system in Figure 13, on the other hand, each incoming customer demand for a finished part in PA 4 also generates a demand for a part stored in the semi-finished goods buffer PA 2 and a demand for a raw part stored in P 0, as was the case in the base-stock system in Figure 11. These two demands are transferred upstream to buffers DA 3 and DA 1, respectively. Unlike in the base-stock system in Figure 11, however, the demands are not transferred to their respective buffers instantly upon the arrival of the customer demand that generated them. Instead, they are carried upstream by the returning free PA cards (kanbans). Thus, each time a kanban is freed from buffer PA 4 and is returned upstream to buffer DA 3, it carries with it a demand for a semi-finished part in PA 2 and a demand for a raw part in P 0. If a semi-finished part is available in PA 2, then this part enters WS 3, after liberating the stage-1 kanban that was attached to it and picking up a free stage-2 kanban from buffer DA 3. The demand for a semi-finished part that was attached to this kanban is satisfied and hence dropped. The other demand, for a raw part, that was also attached to the stage-2 kanban, is attached to the liberated stage-1 kanban and is carried upstream to buffer DA 1. The buffers of free PA cards are denoted by DA i instead of by A i to indicate that they contain authorization cards (A) attached to demands (D). The system in Figure 13 is called multi-stage (sequential) kanban control system in [6] and [34].
Figure 14 shows a system which is called integral control system in [6] and echelon kanban control system in [34]. It is identical to the system in Figure 9, except that it is driven by customer demands for finished goods, each of which also generates a demand for a part stored in the semi-finished goods buffer PA 2 and a demand for a raw part stored in P 0, as was the case in the kanban system in Figure 13. The difference with the kanban system in Figure 13 is that when a finished part leaves the finished goods buffer PA 4, it releases simultaneously two kanbans: one kanban returns to buffer DA 3 and the other to buffer DA 1. The first kanban carries with it a demand for a semi-finished part in PA 2 and the second carries a demand for a raw part in P 0.
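The difference in how kanbans and their attached demands travel upstream in the sequential system (Figure 13) and the echelon system (Figure 14) can be made concrete with a small sketch. The `CardBuffers` class, the function names, and the string labels for the two schemes are assumptions made for illustration; the sketch tracks only the number of free cards in DA 1 and DA 3 and the number of semi-finished parts in PA 2.

```python
from dataclasses import dataclass

SCHEMES = ("sequential", "echelon")


@dataclass
class CardBuffers:
    da1: int = 0  # free cards (with raw-part demands) waiting in DA 1
    da3: int = 0  # free stage-2 cards (with demands) waiting in DA 3
    pa2: int = 0  # semi-finished parts waiting in PA 2


def deliver_finished_part(b: CardBuffers, scheme: str) -> None:
    """A finished part leaves PA 4 and frees its card(s)."""
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    b.da3 += 1          # both schemes return a card to DA 3
    if scheme == "echelon":
        b.da1 += 1      # echelon: a second card goes straight to DA 1


def start_stage_2(b: CardBuffers, scheme: str) -> bool:
    """A semi-finished part enters WS 3 if a part and a free card are present."""
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    if b.pa2 == 0 or b.da3 == 0:
        return False
    b.pa2 -= 1
    b.da3 -= 1
    if scheme == "sequential":
        b.da1 += 1      # liberated stage-1 card carries the raw-part demand
    return True
```

With PA 2 empty, a delivery in the sequential scheme leaves DA 1 unchanged until a semi-finished part becomes available, whereas in the echelon scheme the raw-part demand reaches DA 1 immediately.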