Markovian inventory policy with application to the paper industry




Computers and Chemical Engineering 26 (2002) 1399-1413
www.elsevier.com/locate/compchemeng

K. Karen Yin a,*, Hu Liu a, Neil E. Johnson b

a Department of Wood and Paper Science, University of Minnesota, St. Paul, MN 55108, USA
b IT Minnesota Pulp and Paper Division, Potlatch Corporation, Cloquet, MN 55720, USA

* Corresponding author. Tel.: +1-612-624-1761; fax: +1-612-625-6286. E-mail addresses: kyin@umn.edu (K.K. Yin), liux0295@tc.umn.edu (H. Liu), neil.johnson@potlatchcorp.com (N.E. Johnson).

Received 20 August 2001; received in revised form 18 April 2002; accepted 18 April 2002

Abstract

This paper concerns problem formulation and solution procedures for inventory planning with Markov decision process models. Using data collected from a large paper manufacturer, we develop inventory policies for the finished products. To incorporate both the variability and the regularity of the system into the mathematical formulation, we analyze the probability distribution of the demand, explore its connection with the corresponding Markov chains, and integrate these into our decision making. In particular, we formulate the Markov decision model by identifying the chain's state space and transition probabilities, specify the cost structure and evaluate its individual components, and then use the policy-improvement algorithm to obtain the optimal policy. Application examples are provided for illustration. © 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Inventory; Production planning; Markov chain; Markov decision process; Optimal policy; Paper manufacturing

1. Introduction

Inventory management is one of the crucial links of any supply chain. To manufacturers, it entails managing product stocks, in-process inventories of intermediate products, and inventories of raw material, equipment and tools, spare parts, supplies used in production, and general maintenance supplies. In a broader sense, it comprises inventories of all kinds required to run a business, including storage, personnel, cash and transportation facilities, etc. This paper focuses on the inventory of finished products. A manufacturing company needs an inventory policy for each of its products to govern when and by how much it should be replenished. Good inventory management offers the potential not only to cut costs but also to generate new revenues and higher profits. Undersupply causes stockouts and leads to lost sales, whereas oversupply hinders free cash flow and may cause forced markdowns. As results of improper inventory policies, both will diminish earnings and can have enough impact to make a company non-profitable. Due to the ever-changing market conditions, the dynamic and random nature of the demands, the close and complicated relationship between resource/production planning and product inventory management, as well as the process uncertainties, matching supply with demand has always been a great challenge. Being able to offer the right product at the right time for the right price remains "frustratingly elusive" (Fisher, Raman, & McClelland, 2000) to manufacturers and retailers. Process scheduling and planning have attracted growing attention in many industries. Numerous papers in the area of design, operation and optimization of batch as well as continuous plants have been published (see Applequist, Samikoglu, Pekny, & Reklaitis, 1997; Bassett, Pekny, & Reklaitis, 1997; Pekny & Miller, 1990; Petkov & Maranas, 1997 and the references therein).
The main objective of inventory management is to increase profitability. A frequently used criterion for choosing the optimal policy is to minimize the total costs, which in many cases is equivalent to maximizing the net income. Scientific inventory management requires a sound mathematical model to describe the behavior of the underlying system and, quite often, an optimal policy with respect to the model. A large number of works have been published in the past decades (see, e.g., Arrow, Karlin, & Scarf, 1958; Buchan & Koenigsberg, 1963; Buffa, 1980; Gupta, Maranas, & McDonald, 2000; Johnson & Montgomery, 1974; Starr & Miller, 1962; Veinott, 1965, among many others), and many models have been developed for various inventory situations. The first inventory model, which appeared in the literature more than 70 years ago (Wilson, 1934), is frequently referred to as the Wilson formulation. It is a fixed order quantity system that selects the order quantity to minimize the total costs of inventory management. Several of its variations, such as the modified reorder point system with periodic inventory counts, the replenishment system, and multiple reorder systems, have been widely used. Many inventory systems possess complications that require models capable of handling specific problems in certain situations. Despite the large number of models developed, however, there is still a wide gap between theory and practice.

Similar to many other dynamic processes in the real world, the demand variation encountered by retailers or manufacturers is both random and seasonal in nature. A random/stochastic process may be considered as an ensemble of random variables defined on a common probability space and evolving over time. The observed data are statistical time series, which are single realizations of the underlying process. Contrary to those from deterministic processes, the outcomes of a stochastic process are not unique: time series collected from repetitions of the same experiment will not be the same, and levels of demand for a product change from week to week. It is desirable, and sometimes necessary, to quantify the dynamic relationships among these random events so as to better understand and effectively handle process uncertainties. Considering that the dynamics of such systems are often governed by Markov chains, we resort to Markovian models for solution.

The Markov chain, a well-known subject introduced by Markov in 1906, has been studied by a host of researchers for many years (Chung, 1960; Doob, 1953; Feller, 1971; Kushner & Yin, 1997). Markovian formulations (see Chiang, 1980; Taylor & Karlin, 1998; Yang, Yin, Yin, & Zhang, 2002; Yin, Zhang, Yang, & Yin, 2001; Yin & Zhang, 1997, 1998; Yin, Yin, & Zhang, 1995 and the references therein) are useful in solving a number of real-world problems under uncertainty, such as determining inventory levels for retailers, maintenance scheduling for manufacturers, and scheduling and planning in production management. The Markov chain approach has been applied in the design, optimization, and control of queueing systems, manufacturing processes, reliability studies, and communication networks, where the underlying system is formulated as a stochastic control problem driven by Markovian noise.

This paper is concerned with problem formulation and solution procedures for inventory planning using Markov decision process (MDP) models. We conducted this work using data collected from the Potlatch Corporation, a diversified forest products company whose manufacturing facilities convert wood fiber into fine, coated paper. In order to improve productivity and product quality as well as to lower production costs, the company recently completed a 15-year modernization project.
Much effort has also been made to take advantage of the tremendous progress in information technology to capture, store and analyze production and trade data so as to enhance production planning and supply chain management. This work is concerned with inventory planning for the coated fine paper produced by Potlatch Corporation's Minnesota Pulp and Paper Division (MPPD). In the pulp and paper industry, the search for good inventory control policies started several decades ago, and many models have been developed and used. Nevertheless, inventory management remains a challenging problem for paper manufacturers and wholesalers. The dynamic and random nature of the demands makes forecasting them very difficult or sometimes impossible. Despite the existence of a large number of inventory models, managers very often cannot find a single one suited to their needs. As a result, decisions have to be made based on a combination of experience, mathematical models, and even the gut feelings of a few individuals. It is desirable to shift such experience-based decision making to information-based decision making. Doing so requires a systematic use of historical data and a theoretically sound mathematical model applicable to the real situation. This work is intended in that direction. Incorporating both the variability and the regularity of the system into the mathematical formulation, we analyze the probabilistic structure of the system of interest, explore its connection with the corresponding Markov chains, and integrate these into our decision making. In this work, we consider the demand to be a random variable. We assume periodic review; that is, the inventory level is checked at fixed intervals and ordering decisions are made only at those times. In addition to the MDP model, a replenishment system model is also used for comparison.

This paper is organized as follows. Several important concepts and frequently used models in inventory management are outlined first, followed by a review of Markov chains and MDP models in Section 3. Section 4 discusses the optimal policy and the policy-improvement algorithm. Application examples are included in Section 5 for illustration and comparison. A summary and further discussion are given in Section 6.

2. Several inventory control models

Operations in inventory systems, as well as several important quantities involved, are illustrated schematically in Fig. 1. After receiving an order of amount Q, the inventory level diminishes continuously due to sales until the receipt of a new order. The average inventory level I during a period has a direct impact on the carrying costs. Since inventory stock represents a considerable investment, it is desirable to maintain it at the lowest level at which customer service can still be guaranteed.

Fig. 1. Operation of an inventory system.

One class of models belongs to the fixed order quantity systems, in which the reorder point P is the level of inventory at which a reorder should be placed. There is usually a time lag, also referred to as the lead time, L, between placing a reorder and receiving it. Therefore, the average expected demand during this lead time should be included in the reorder amount. To avoid situations such as temporary stockouts, back orders, and/or possible lost sales resulting from demand variation, it is also necessary to include a buffer/safety stock, B, in determining the reorder point. Another class of models is the so-called replenishment system models, in which the reorder time is fixed and the reorder quantity varies according to the stock on hand (not shown in the figure). A fixed amount M, called the replenishment level, is used as the upper limit in determining the reorder quantity.

The original Wilson formulation is also known as the economic order quantity model or the economic lot-size model. A fixed quantity system, it assumes that the costs in inventory management consist of two parts, the ordering cost and the carrying cost, and its essence is to choose an economic order quantity that minimizes the total costs. A reorder of quantity Q is placed whenever the inventory falls below the reorder point P. Such a system is based on the assumption that perpetual inventory records are kept, so that whenever the inventory level falls below the reorder point a new order can be placed immediately. There are several modifications of the simple Wilson formulation that are useful for cases in which perpetual inventory records are not available. The modified reorder point system with periodic inventory reviews was developed for systems in which reviews are made regularly; the review time is needed in using this model. The original Wilson model is a single reorder point system. To handle situations where demand during the lead time frequently exceeds the order quantity, multiple reorder systems have been developed. One of the two popular approaches uses a prescribed customer service level (the probability of being out of stock) to seek the optimal quantity policy by minimizing the costs. An alternative approach determines the cost of stockouts first and then adds it to the cost function to be minimized.

The inventory cost is not explicitly considered in a replenishment system, and neither is a fixed reorder quantity included; however, periodic reviews are required. Both the original replenishment model and many of its variations have been widely used. In the original model, the replenishment level M is determined by

$$M = B + S_w (L + r), \qquad (1)$$

where B is the safety stock, S_w denotes the average sales per unit time, and L and r are the lead time and the time interval between reviews, respectively. The reorder quantity Q is computed from

$$Q = \begin{cases} M - I & \text{for } L \le r, \\ M - I - q_0 & \text{for } L > r, \end{cases} \qquad (2)$$
where I is the current inventory level and q_0 is the quantity already on order. A widely used optional replenishment system, also known as the (S, s) policy, places a lower limit s on the size of the reorder. Since the optional replenishment model has the advantages (Buchan & Koenigsberg, 1963) of responding quickly to increased demand and of preventing high inventory during periods of lower demand, it is applicable to a wide range of operating conditions and can usually result in lower costs. For more information on the various inventory models, the reader is referred to Buchan and Koenigsberg (1963), Hillier and Lieberman (1999), Taylor and Karlin (1998) and the references therein.
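The replenishment rule of Eqs. (1) and (2) is simple enough to state directly in code. The following Python sketch is illustrative only: the function and parameter names (replenishment_level, on_order, etc.) are our own choices, and the numbers are made up rather than taken from the case study.

```python
def replenishment_level(safety_stock, avg_sales, lead_time, review_interval):
    """Eq. (1): M = B + S_w * (L + r)."""
    return safety_stock + avg_sales * (lead_time + review_interval)

def reorder_quantity(m_level, inventory, on_order, lead_time, review_interval):
    """Eq. (2): order up to M; when L > r, stock already on order is subtracted."""
    if lead_time <= review_interval:
        return m_level - inventory
    return m_level - inventory - on_order

# Example: weekly reviews (r = 1) and a 3-week lead time, amounts in lbs.
M = replenishment_level(safety_stock=50_000, avg_sales=20_000,
                        lead_time=3, review_interval=1)
Q = reorder_quantity(M, inventory=60_000, on_order=20_000,
                     lead_time=3, review_interval=1)
print(M, Q)  # 130000 50000
```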

3. Markov decision processes

Many processes, such as real-world inventory systems, have uncertainty associated with them; in the meantime, they exhibit some degree of regularity. It is desirable to incorporate both the variability and the regularity into mathematical models and to treat them quantitatively from a probability point of view. Advances in statistics and stochastic processes allow us to do so. A stochastic process (see Chiang, 1980; Chung, 1960; Doob, 1953; Feller, 1971) may be considered as a collection of random variables {ξ_t(ω)} (or {ξ_t}) defined on a common probability space and indexed by the parameter t ∈ T, where T is a suitable set and the index t often represents time. For a fixed t, ξ_t(·) is a random variable; for each ω, ξ_t(ω) is called a sample path or a realization of the process. Note that the parameter ω is often omitted for brevity. The state space of ξ(t) is the collection of all values it may take. Stochastic processes can be classified by their index, their state space, and other properties such as stationary vs. non-stationary and jump vs. smooth sample paths.

3.1. Markov chain and Markov property

A Markov chain is concerned with a particular kind of dependence among the random variables involved: when the random variables are observed in sequence, the distribution of a random variable depends only on the immediately preceding observed random variable and not on those before it. In other words, given the current state, the probability of the chain's future behavior is not altered by any additional knowledge of its past behavior. This is the so-called Markovian property. For a discrete-time process, T = {0, 1, 2, ...} and

$$P\{\xi_{t+1} = j \mid \xi_0 = i_0, \ldots, \xi_{t-1} = i_{t-1}, \xi_t = i\} = P\{\xi_{t+1} = j \mid \xi_t = i\}. \qquad (3)$$

If the index t of ξ_t is continuous, i.e. T = [0, ∞), the Markov property means that, given ξ_t, the values of ξ_s (s > t) are not influenced by the values of ξ_u for u < t. We consider discrete-time processes in this work. A stochastic process is a Markov chain if it possesses the Markovian property and its state space is finite or countable. Throughout the paper, we consider time-homogeneous Markov chains, namely

$$P_{ij}^{t,t+1} = P\{\xi_{t+1} = j \mid \xi_t = i\} \quad \text{for all } t. \qquad (4)$$

In other words, the transition probabilities are independent of t. In this case, we say that the Markov chain is stationary and we may drop the time variable t for simplicity.

3.2. Problem statement

We are interested in inventory policies capable of handling situations where the demand during a period is random and where stock replenishments take place periodically, e.g. every week, every 2 weeks, or once a month. Assume that the total aggregate demand for a specific product during any given period n is a random variable d_n. The frequency distribution of the demands encountered in a real-world situation often approximates one of three probability distributions: Poisson, normal, and exponential (see, e.g., Buchan & Koenigsberg, 1963; Hillier & Lieberman, 1999). The goodness of fit can be examined using historical data. For cases where none of these three is suitable, the sample mean, variance, and sample frequency distribution of the demands are still easily obtainable from the past records. Making decisions on whether and how much to stock requires a stochastic model. One possible approach is to use a multi-period model that incorporates the probabilistic aspect into dynamic programming; an alternative approach resorts to the MDP model for solution.
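The sample statistics and empirical frequency distribution mentioned above are straightforward to extract from weekly records. A minimal Python sketch, using made-up weekly demand figures rather than the paper's (confidential) data:

```python
import statistics
from collections import Counter

# Illustrative weekly demands in lbs; the real study used 83 weeks of records.
weekly_demand = [12_000, 0, 35_000, 18_000, 22_000, 60_000, 15_000, 9_000]

mean = statistics.mean(weekly_demand)
std = statistics.stdev(weekly_demand)

# Empirical frequency distribution over 20,000-lb bins.
bin_width = 20_000
counts = Counter(d // bin_width for d in weekly_demand)
pmf = {int(b * bin_width): c / len(weekly_demand) for b, c in sorted(counts.items())}

print(f"mean = {mean:.0f} lbs, std/mean = {std / mean:.2f}")
print("empirical pmf by bin lower edge:", pmf)
```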
Observing the randomness and regularity in the inventory process, we choose to describe it with discrete-time finite-state Markov chains. Establishing the mathematical model requires connections linking the elements of the physical process to the logical system, i.e. the Markov chain. More specifically, we need to specify the key elements of the discrete-time Markov chain, which entails designating its state space and prescribing the dependence relations among the random variables based on the real process data.

3.3. State space and transition probabilities of a Markov chain

Let d_1, d_2, ... represent the demands (in lbs) for a particular product during the first week, the second week, and so on. Assume that the d_n are independent, identically distributed random variables whose future values are unknown. Let X_n denote the stock of a certain product on hand at the end of the nth week. The states of the stochastic process {X_n} consist of the possible values of its stock size. The stock levels at two consecutive periods are related by the current demand d_n and the inventory policy chosen. For example, under the simple (S, s) policy, which replenishes up to S if the stock level is lower than s and otherwise does not replenish, the current and next stocks X_n and X_{n+1} satisfy

$$X_{n+1} = \begin{cases} X_n - d_{n+1} & \text{if } s \le X_n \le S, \\ S - d_{n+1} & \text{if } X_n < s. \end{cases} \qquad (5)$$

Since the successive demands d_1, d_2, ... are independent random variables, the amounts in stock, X_0, X_1, ..., constitute a Markov chain whose transition probability matrix can be calculated according to Eq. (5). The weight in stock at time n, X̄_n, is a continuous random variable. To simplify the solution procedure, we discretize it via the following transformation. Let u denote the minimum order/production amount. For ease of presentation, assume the amounts of replenishment to be u or its multiples; more general cases with arbitrary replenishment amounts can be treated similarly. Let s ≥ 0 and S > s be a low level and the highest possible level of the inventory, and let

$$X_n = \left\lfloor \frac{\bar{X}_n - s}{u} \right\rfloor, \qquad (6)$$

where ⌊Z⌋ denotes the integer part of Z. Observe that X_n is a discrete random variable which indicates the level of the stocks and takes values in M = {0, ..., m}, where

$$m = \left\lfloor \frac{S - s}{u} \right\rfloor. \qquad (7)$$

Such discretization allows us to model this inventory system by an (m+1)-state Markov chain, whose state space is shown in Table 1.

Table 1. States of the Markov chain

State    Amount of stock at hand
0        0 ≤ X̄_i < s
1        s ≤ X̄_i < s + u
2        s + u ≤ X̄_i < s + 2u
3        s + 2u ≤ X̄_i < s + 3u
...      ...
m − 1    s + (m − 2)u ≤ X̄_i < s + (m − 1)u
m        s + (m − 1)u ≤ X̄_i ≤ S

Due to the periodic replenishment and the random demand, at the end of each period the stock level may undergo m + 1 possible events: it may jump from the current level i to a higher one j (i < j); it may fall to a lower level k (0 ≤ k < i); or it may stay in the same state. Establishing a stochastic model requires the transition probabilities, which are the basis for dynamic modeling and optimization. The transition probability matrix P = (P_ij) of the stationary Markov chain {X_n} has the form

$$P = \begin{pmatrix} P_{00} & P_{01} & P_{02} & \cdots & P_{0m} \\ P_{10} & P_{11} & P_{12} & \cdots & P_{1m} \\ P_{20} & P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \vdots & & \vdots \\ P_{m0} & P_{m1} & P_{m2} & \cdots & P_{mm} \end{pmatrix}. \qquad (8)$$

To completely define a Markov process requires specifying its initial state (or, in general, the initial probability distribution) and its transition probability matrix. For the inventory management problem, the former is usually available, whereas the latter is affected by the random demand as well as the replenishment activities.
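To make the construction in Eqs. (6)-(8) concrete, the sketch below discretizes the stock level and assembles the transition matrix under "no replenishment" from a demand distribution, using midpoint intervals of width u around each whole number of levels; the same recipe, applied to the fitted demand distribution, yields the numerical matrix shown later in Eq. (9). The exponential demand and all parameter values are illustrative assumptions of ours, not the paper's fitted model.

```python
import math

def stock_state(stock_lbs, s, u):
    """Eq. (6): X_n = floor((X̄_n − s) / u), with negative values clipped to state 0."""
    return max(0, math.floor((stock_lbs - s) / u))

def no_replenish_matrix(m, demand_cdf, u):
    """Without replenishment the level can only stay or drop; dropping (i − j)
    levels corresponds to a demand in the interval ((i − j) ∓ 1/2)·u."""
    P = [[0.0] * (m + 1) for _ in range(m + 1)]
    P[0][0] = 1.0  # state 0 is absorbing when nothing is ordered
    for i in range(1, m + 1):
        for j in range(1, i + 1):
            drop = i - j
            P[i][j] = demand_cdf((drop + 0.5) * u) - demand_cdf((drop - 0.5) * u)
        P[i][0] = 1.0 - sum(P[i][1:])  # all remaining demand mass empties the stock
    return P

u = 59_400.0  # minimum order amount, lbs (Section 5 uses this value for P2)
cdf = lambda x: 1.0 - math.exp(-x / u) if x > 0 else 0.0  # assumed exponential demand
P0 = no_replenish_matrix(m=4, demand_cdf=cdf, u=u)
for row in P0:
    print([round(p, 2) for p in row])
```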
3.4. Decisions, actions, and policies

As mentioned earlier, the inventory system evolves over time according to the joint effect of the probability laws and the sequence of decisions and actions; it fits the general framework of finite-state discrete-time MDPs. The stock on hand at the end of each period is recorded; subsequently, a decision is made and an action is taken. Table 2 lists the possible decisions, labeled 0, 1, 2, ..., K, and a verbal description of their corresponding actions.

Table 2. Decisions and actions

Decision   Action
0          Do not replenish
1          Replenish u lbs
2          Replenish 2u lbs
3          Replenish 3u lbs
...        ...
K          Replenish Ku lbs

The question to be answered is which decision should be chosen at any given time and state; in other words, an inventory policy is needed. A policy is a rule that prescribes the decisions to be made for each state of the system during the entire time period of interest. Conceivably, there are a number of possible policies for each problem. Characterized by the values {δ_0(R), δ_1(R), ..., δ_m(R)}, any policy R specifies decisions δ_i(R) = k (k = 0, 1, ..., K) for all states i (i = 0, 1, ..., m) at every time instant. We consider stationary policies only; namely, the decision is determined by the current state of the system, regardless of time. Very often the problem is to choose a policy that minimizes the long-run expected (average) cost. Table 3 presents two of the many potential policies, R_a and R_b, applicable to this inventory problem.

Table 3. Examples of possible policies

Policy   Description                               δ_0(R)   δ_1(R)   δ_2(R)   ...   δ_m(R)
R_a      Replenish (m − i)u lbs in every state i   m        m − 1    m − 2    ...   0
R_b      Replenish (m − i)u lbs for states i < 2   m        m − 1    0        ...   0

A policy R requires that the decision δ_i(R) be made whenever the system is in state i. Affected by this policy as well as by the random demand, the system moves to a new state j according to the corresponding probabilities P_ij. For countable items, the determination of the transition probability matrix is relatively straightforward. Since we are dealing with products measured by weight, we have discretized them into different levels as illustrated in Eq. (6) and Table 1. Eq. (9) gives the transition probabilities P_ij (i = 0, 1, ..., 4; j = 0, 1, ..., 4) of a system with five states (M = {0, 1, ..., 4}) when the policy prescribes decision 0 for all states, i.e. no replenishment at any time:

$$P = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
P\{d \ge \tfrac{u}{2}\} & P\{d < \tfrac{u}{2}\} & 0 & 0 & 0 \\
P\{d \ge \tfrac{3u}{2}\} & P\{\tfrac{u}{2} \le d < \tfrac{3u}{2}\} & P\{d < \tfrac{u}{2}\} & 0 & 0 \\
P\{d \ge \tfrac{5u}{2}\} & P\{\tfrac{3u}{2} \le d < \tfrac{5u}{2}\} & P\{\tfrac{u}{2} \le d < \tfrac{3u}{2}\} & P\{d < \tfrac{u}{2}\} & 0 \\
P\{d \ge \tfrac{7u}{2}\} & P\{\tfrac{5u}{2} \le d < \tfrac{7u}{2}\} & P\{\tfrac{3u}{2} \le d < \tfrac{5u}{2}\} & P\{\tfrac{u}{2} \le d < \tfrac{3u}{2}\} & P\{d < \tfrac{u}{2}\}
\end{pmatrix}. \qquad (9)$$

3.5. Markov decision process

Owing to their applicability to a wide range of problems in engineering, management science, and the biological and social sciences, MDP models have attracted growing attention in recent years. Many problems in operations research, such as stock options, resource allocation, queueing and machine maintenance, fit well into the framework of MDP models (Taylor & Karlin, 1998; Yin & Zhang, 1997). Very often the problem is to choose an optimal policy, i.e. the best rule for making a decision at each time instant. For most problems it is sufficient to consider only those policies (Hillier & Lieberman, 1999) that depend on the state of the system at the present time and the possible decisions available; the above-mentioned inventory management problem is one such example. As has been noted, the evolution of the system is affected by the random demands as well as by the replenishment activities, which are governed by the inventory policy. Let X_n be the state of the system at time n, and let Δ_n be the decision/action chosen. Then under any fixed policy R the pair Y_n = (X_n, Δ_n) forms a two-dimensional Markov chain with transition probabilities

$$P[X_{n+1} = j, \Delta_{n+1} = k' \mid X_n = i, \Delta_n = k] = p(j \mid i, k)\, p(k' \mid j), \qquad (10)$$

where p(j | i, k) is the conditional probability of the chain moving to state j at time n + 1 given that the current state is X_n = i and decision Δ_n = k is taken, and p(k' | j) is the probability of a specific decision Δ_{n+1} = k' being chosen at a particular state X_{n+1} = j. For a given feedback policy, the decision δ_i(R) = k is prescribed for every state i = 0, 1, ..., m; thus p(k | i) = 1. Consequently, when the system is in state i, the policy R is used, and the action based on the decision δ_i(R) = k is exercised, the probability P_ij of moving to state j at the next time period is given by

$$P[X_{n+1} = j \mid X_n = i, \delta_i(R) = k] = p(j \mid i, k). \qquad (11)$$

Starting from X_0, the realization of the underlying stochastic process is X_0, X_1, ..., and the decisions made are Δ_0, Δ_1, .... Note that Δ_n = δ_{X_n}(R) ∈ {0, 1, 2, ..., K} if the feedback policy is used. The sequences of observed states and decisions made constitute the MDP.

3.6. The long-run expected average cost

Among the many candidate policies, we seek the optimal one in the sense that it minimizes the long-run expected average cost per unit time. It should be noted that another consideration in practice is that the policy should be relatively simple and easily implementable. Suppose a cost C_{X_n Δ_n} is incurred when the process is in state X_n and decision Δ_n is made. A function of both X_n = 0, 1, ..., m and Δ_n = 0, 1, ..., K, the cost C_{X_n Δ_n} is also a random variable. Its long-run expected average value per unit time over a period of N is

$$\lim_{N \to \infty} \frac{1}{N} E\left[\sum_{n=0}^{N-1} C_{X_n \Delta_n}\right] = \sum_{i=0}^{m} \sum_{k=0}^{K} \pi_{ik} C_{ik}, \qquad (12)$$

where π_ik is the stationary (limiting) probability distribution associated with the transition probabilities in Eq. (10). Note that for a regular Markov chain (Taylor & Karlin, 1998) the limiting probability distribution satisfies π_ik ≥ 0 for all i, k, and

$$\sum_{i=0}^{m} \sum_{k=0}^{K} \pi_{ik} = 1. \qquad (13)$$

Also,

$$\pi_{jk'} = \sum_{i=0}^{m} \sum_{k=0}^{K} \pi_{ik}\, p(j \mid i, k)\, p(k' \mid j) \quad \text{for } j = 0, 1, \ldots, m \text{ and } k' = 0, 1, \ldots, K. \qquad (14)$$

Our objective is to find a policy that minimizes the long-run expected average cost given in Eq. (12), where π_ik is related to the policy through Eqs. (13) and (14).

4. Optimal policy and the policy-improvement algorithm

A policy R can also be written in matrix form

$$D = \begin{pmatrix} D_{00} & D_{01} & D_{02} & \cdots & D_{0K} \\ D_{10} & D_{11} & D_{12} & \cdots & D_{1K} \\ D_{20} & D_{21} & D_{22} & \cdots & D_{2K} \\ \vdots & \vdots & \vdots & & \vdots \\ D_{m0} & D_{m1} & D_{m2} & \cdots & D_{mK} \end{pmatrix}. \qquad (15)$$

The first subscript i of any element D_ik in the matrix D represents the state, and the second subscript k stands for the decision. Observe that D_ik takes the values 0 and 1 only: D_ik = 1 means δ_i(R) = k, which calls for decision k and its corresponding action when the system is in state i; in particular, D_j0 = 1 means δ_j(R) = 0, i.e. no action is taken when the system is in state j. Since for any given policy the decision to be made and the action to be taken in each state i have been specified, Eqs. (12)-(14) can be further simplified by replacing π_ik with π_i, giving

$$\lim_{N \to \infty} \frac{1}{N} E\left[\sum_{n=0}^{N-1} C_{X_n \Delta_n}\right] = \sum_{i=0}^{m} \pi_i C_{ik} \qquad (16)$$

and

$$\pi_i \ge 0 \quad \text{for } i = 0, 1, \ldots, m; \qquad \sum_{i=0}^{m} \pi_i = 1, \qquad (17)$$

where π_i is the stationary distribution and

$$\pi_i = \lim_{n \to \infty} P_{ji}^{(n)} = \sum_{j=0}^{m} \pi_j P_{ji}. \qquad (18)$$

Although the expected cost of a policy can usually be expressed as a linear function of the D_ik, the linear programming (LP) method is not directly applicable here because it requires continuous variables whereas the D_ik are discrete. This difficulty is surmountable (Hillier & Lieberman, 1999) by modifying the interpretation of the policy displayed in the matrix of Eq. (15). In the modification, the D_ik are considered to be probability distributions for decision k being made when the system is in state i, i.e.

$$D_{ik} = P\{\text{decision} = k \mid \text{state} = i\} \quad \text{for } i = 0, 1, \ldots, m;\ k = 0, 1, \ldots, K, \qquad (19)$$

thus

$$0 \le D_{ik} \le 1 \quad \text{and} \quad \sum_{k=0}^{K} D_{ik} = 1. \qquad (20)$$

Having changed the D_ik into continuous variables, this treatment enables us to use the LP approach to find the optimal solution. Rather than using the LP method, however, we resort in this work to the policy-improvement algorithm, where the D_ik take the values 0 and 1 only. Let g(R) represent the long-run expected average cost per unit time following any given policy R, i.e.

$$g(R) = \sum_{i=0}^{m} \pi_i C_{ik}. \qquad (21)$$

Denote by v_i^n(R) the total expected cost of a system starting in state i and evolving for a period of length n. By definition, it satisfies the recursive formula

$$v_i^n(R) = C_{ik} + \sum_{j=0}^{m} P_{ij}(k)\, v_j^{n-1}(R). \qquad (22)$$

Eq. (22) says that this total expected cost v_i^n(R) consists of two parts: the cost incurred in the first time period, C_ik, and the total expected costs thereafter. Note that C_ik is itself an expected cost,

$$C_{ik} = \sum_{j=0}^{m} q_{ij}(k)\, P_{ij}(k), \qquad (23)$$

where q_ij(k) and P_ij(k) = p(j | i, k) are, respectively, the expected cost and the probability when the system moves from state i to state j after decision k is made. It can be shown (Hillier & Lieberman, 1999) that

$$g(R) = C_{ik} - v_i(R) + \sum_{j=0}^{m} P_{ij}(k)\, v_j(R) \quad \text{for } i = 0, 1, \ldots, m. \qquad (24)$$

For a system of m + 1 states, Eq. (24) consists of m + 1 simultaneous equations in m + 2 unknowns, g(R) and v_i(R) (i = 0, 1, ..., m). To obtain a unique solution, it is customary to set v_m(R) = 0. Solving the system of equations (Eq. (24)) yields the long-run expected average cost per unit time g(R) if the policy R is used.
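Since Eq. (24) is linear in g(R) and the v_i(R) once v_m(R) is pinned at zero, evaluating a fixed policy amounts to a single linear solve. A minimal sketch, assuming the transition matrices P(k) and expected costs C_ik have already been tabulated; the function names are ours.

```python
import numpy as np

def evaluate_policy(P, C, policy):
    """Solve Eq. (24): g(R) = C_ik − v_i(R) + Σ_j P_ij(k) v_j(R), with v_m(R) = 0.
    P[k] is the transition matrix under decision k, C[i][k] the expected cost of
    taking decision k in state i, and policy[i] the decision prescribed for state i.
    Returns g(R) and the vector (v_0, ..., v_m)."""
    n = len(policy)                       # number of states, m + 1
    A = np.zeros((n, n))
    b = np.zeros(n)
    for i, k in enumerate(policy):
        A[i, 0] = 1.0                     # coefficient of the unknown g(R)
        for j in range(n - 1):            # unknowns v_0 .. v_{m-1}; v_m is fixed at 0
            A[i, 1 + j] = (1.0 if i == j else 0.0) - P[k][i][j]
        b[i] = C[i][k]
    x = np.linalg.solve(A, b)             # nonsingular for a unichain policy
    return x[0], np.append(x[1:], 0.0)
```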

An optimal policy is one that results in the lowest cost, g(R*). The policy-improvement algorithm, an iteration procedure consisting of two steps, allows us to obtain it. Since there are only a finite number of possible stationary policies when the state space is finite, the optimal policy is reached in a finite number of iterations (Ross, 1983). The procedure begins by choosing an arbitrary policy R_1. For the given policy R_1, the transition probabilities P_ij(k) are available, hence the expected costs C_ik under R_1 in Eq. (23) can be computed. Subsequently, the values of g(R_1), v_0(R_1), v_1(R_1), ..., v_{m−1}(R_1) are obtained from Eq. (24). In the second step, the current values of v_i(R_1) are used to find an improved policy R_2: for each state i, choose the decision δ_i(R_2) that makes the right-hand side of Eq. (24) a minimum, i.e.

$$\delta_i(R_2) = \arg\min_k \left[ C_{ik}\big|_{R_2} - v_i(R_1) + \sum_{j=0}^{m} P_{ij}(k \mid R_2)\, v_j(R_1) \right] \quad \text{for all } i, \qquad (25)$$

where argmin_k f(k) is the value of k ∈ {0, 1, ..., K} that minimizes f(k). The set of best decisions for all states (i = 0, 1, ..., m) constitutes the second, improved policy R_2, whose long-run expected average cost per unit time is given by Eq. (21). This iteration procedure is repeated until two successive R's are the same. For more information on the policy-improvement algorithm, readers are referred to the book by Hillier and Lieberman (1999).

5. Application examples

The tools outlined in the previous sections are powerful for formulating models and obtaining optimal policies for controlling systems that fit the MDP framework. Their application to a real inventory management problem is illustrated in this section. We examine the demand data, study their probability distribution, designate the possible decisions and the corresponding actions, identify the Markov decision model of the underlying system by defining its state space and transition probabilities, specify the cost structure and evaluate its individual components, and then use the policy-improvement algorithm to obtain the optimal policy. The sales data show that the demands can be subdivided into groups based on their distribution and/or their variability. We therefore use two products with different probability distributions and different levels of variation for comparison.

5.1. The random demands

The database includes customer demand for the MPPD's more than 1100 products during the period June 1999-December 2000. To protect private commercial information, the names of the products used in this paper have been changed, and the original data have been distorted with the main features preserved. Demands for some of the products show trends and/or seasonality. However, many of them exhibit random, unpredictable and rapid changes, which renders any attempt at forecasting infeasible. Fig. 2 and Fig. 3 display the weekly demands for products P1 and P2. Variation of the former is relatively small: the ratio of its standard deviation to its mean is 0.5. The changes in demand for P2 appear more erratic; its standard deviation is 1.7 times its mean value. Over the more than 1100 products we studied, this ratio ranges from 0.4 to 9.1, and only one third of the products have a ratio below one.

Fig. 2. Weekly demands for P1 (June 7, 1999-December 25, 2000).
Fig. 3. Weekly demands for P2 (June 7, 1999-December 25, 2000).

Examining the actual frequency distributions of the weekly demand data suggests that many of them approximate either a normal or an exponential distribution. P2, for example, fits an exponential distribution approximately, as displayed in Fig. 4. Product P1 represents another class of demands, having a close-to-normal distribution (Fig. 5). For those fitting neither of these two, the statistics such as means and standard deviations needed in our procedure are computed from the available data. Since the finished-product inventory under consideration fits well into the framework of the MDP formulation, we have formulated it as an MDP and defined the cost function. Our objective is to seek inventory policies that enable us to maintain a high level of customer service at a minimum cost. To the best of our knowledge, this is the first attempt at using MDP models in the paper industry.

Fig. 4. Comparisons of the actual demand distribution of P2 with (a) exponential and (b) normal distributions.
Fig. 5. Comparisons of the actual demand distribution of P1 with (a) exponential and (b) normal distributions.
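A comparison of the kind shown in Figs. 4 and 5 can be reproduced by fitting both candidate distributions by moments and setting their binned probabilities against the observed frequencies. The sketch below uses only the standard library (scipy.stats would work equally well); the demand list is made up for illustration.

```python
import math
import statistics
from collections import Counter

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def expon_cdf(x, mu):
    return 1.0 - math.exp(-x / mu) if x > 0 else 0.0

def binned_fit_comparison(demand, bin_width):
    """Observed bin frequencies vs. moment-fitted exponential and normal bins."""
    mu, sigma = statistics.mean(demand), statistics.stdev(demand)
    counts = Counter(d // bin_width for d in demand)
    for b in sorted(counts):
        lo, hi = b * bin_width, (b + 1) * bin_width
        print(f"[{lo:>6}, {hi:>6}): observed {counts[b] / len(demand):.2f}  "
              f"exponential {expon_cdf(hi, mu) - expon_cdf(lo, mu):.2f}  "
              f"normal {normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma):.2f}")

weekly_demand = [0, 5_000, 12_000, 20_000, 33_000, 8_000, 55_000, 16_000]  # lbs, illustrative
binned_fit_comparison(weekly_demand, bin_width=10_000)
```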

5.2. Lead time, buffer stock and a replenishment model

The inventory of each product is checked weekly to determine whether, and how much of, the item should be produced; an order is placed upon that determination. It usually takes 3 weeks to fill an order, so we consider the lead time to be a constant of 3 weeks herein. In calculating the stock level, it is also assumed that backlogged demand is allowed and the unsatisfied portion is transferred to the next period. In practice, it is necessary to have enough stock on hand to cover the expected sales during the lead time, i.e. the expected demand during the upcoming 3 weeks. In addition, since the actual demand during the lead time often exceeds this expected value because of demand fluctuation, a buffer/safety stock is needed. Presumably, the size of the buffer should be determined by both the degree of variation and the required customer service level. Let D_1 and D̄_1 be the random weekly demand and its expected value, and let α be the designated service level, e.g. 95%, 97%, etc. Let c_1 be the critical value at which

$$\Pr\{D_1 \ge c_1\} = 1 - \alpha. \qquad (26)$$

In other words, the probability of the demand being greater than c_1 is 1 − α. The buffer stock is then determined by

$$B = (c_1 - \bar{D}_1) L, \qquad (27)$$

where L is the lead time. Note that the service level used in determining the buffer stock requires a prescribed confidence level: the higher the confidence level, the larger the buffer needed and hence the higher the inventory level. Having established the demand distribution, the lead time, and the size of the buffer stock, we can proceed to formulate the MDP model from the available demand data and, subsequently, to develop the optimal policy using the policy-improvement algorithm. The policy so obtained is then examined for adequacy and performance, as measured by the required inventory level, the occurrence of stockouts, and the number of orders placed, as detailed in the following subsections.

For comparison, an (S, s) replenishment policy is also applied to the same system. This policy requires that if the end-of-period stock is less than s, an amount sufficient to raise the stock on hand to the level S is ordered; otherwise, no replenishment is undertaken. The replenishment level S = M is determined by

$$M = B + c_3, \qquad (28)$$

where, corresponding to the random tri-weekly demand D_3, c_3 is the critical value at which Pr{D_3 ≥ c_3} = 1 − α. The reorder point P is the same as the lower level s and is determined by

$$P = B - \bar{D}_1. \qquad (29)$$

5.3. The MDP model and the optimal policy

In practice, the order amount is not totally arbitrary: there is usually a minimum order/production amount, u, to be used. The choice of u affects the state space of the Markov chain and hence the final policy through our formulation (Eq. (6) and Table 1). We choose u to be the mean value of the 3-week demand. The state space of the Markov chain for P2 is shown in Table 4. The upper bound of state 0 is the level of the buffer stock; we choose the upper bound of state 4 to be M, i.e. the replenishment level in Eq. (28).

Table 4. States of the Markov chain for P2

State   Inventory amount (×10,000 lbs)
0       0 ≤ X̄_i < 14.78
1       14.78 ≤ X̄_i < 20.72
2       20.72 ≤ X̄_i < 26.67
3       26.67 ≤ X̄_i < 32.61
4       32.61 ≤ X̄_i ≤ 38.55

Table 5. Decisions and actions

Decision   Action
0          Do not replenish
1          Replenish 59,400 lbs
2          Replenish 118,900 lbs
3          Replenish 178,300 lbs
4          Replenish 237,700 lbs
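If the weekly demand is taken to be exponential, the critical values in Eqs. (26) and (28) have closed forms, and B, M and P follow directly; the paper reads them off the fitted empirical distribution instead. Treating the tri-weekly demand D_3 as exponential with mean 3·D̄_1 is a further simplification of ours, and the reading of Eq. (29) as P = B − D̄_1 follows our reconstruction of the garbled source, so the sketch is illustrative only.

```python
import math

def critical_value_exponential(mean, alpha):
    """c with Pr{D >= c} = 1 − α for exponential demand: c = −mean·ln(1 − α)."""
    return -mean * math.log(1.0 - alpha)

mean_weekly = 19_800.0   # lbs, P2's weekly mean (Section 5)
alpha = 0.976            # designated service level
L = 3                    # lead time, weeks

c1 = critical_value_exponential(mean_weekly, alpha)        # Eq. (26)
B = (c1 - mean_weekly) * L                                  # Eq. (27)
c3 = critical_value_exponential(3 * mean_weekly, alpha)     # tri-weekly critical value
M = B + c3                                                  # Eq. (28)
P = B - mean_weekly                                         # Eq. (29), as reconstructed
print(f"B = {B:,.0f} lbs, M = {M:,.0f} lbs, P = {P:,.0f} lbs")
```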
The decisions and their corresponding actions are listed in Table 5. Orders are placed in the minimum amount u = 59,400 lbs or its multiples, as discussed in Section 3.4. The inventory level at the end of each 3-week period is calculated by

Inventory level = beginning inventory + amount received − demand.

As mentioned in the previous sections, under a given policy the pair of random variables Y_n = (X_n, Δ_n) forms a two-dimensional Markov chain. Evaluating a policy requires p(j | i, k), the conditional probability of the chain moving to state j at the (n+1)th time period given the current state i and the kth decision specified by R. In particular, for a system having five states and five different decisions, there are five transition probability matrices to be evaluated.
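Because replenishing k·u lbs simply starts the period k levels higher, each matrix P(k) in Eq. (30) below is the no-replenishment matrix P(0) with its rows shifted upward; decisions that would push the level above state m are inadmissible. A sketch of this construction, where None stands for the dashes in Eq. (30) and the P(0) entries are the numerical values given there:

```python
def decision_matrices(P0, K):
    """Build P(k) for k = 0..K from the no-replenishment matrix P(0):
    row i of P(k) is row i + k of P(0); rows with i + k > m are infeasible."""
    m = len(P0) - 1
    return [[P0[i + k] if i + k <= m else None for i in range(m + 1)]
            for k in range(K + 1)]

P0 = [
    [1.00, 0.00, 0.00, 0.00, 0.00],
    [0.63, 0.37, 0.00, 0.00, 0.00],
    [0.26, 0.37, 0.37, 0.00, 0.00],
    [0.11, 0.15, 0.37, 0.37, 0.00],
    [0.04, 0.07, 0.15, 0.37, 0.37],
]
Ps = decision_matrices(P0, K=4)
print(Ps[1][0])  # state 0 under "replenish u": [0.63, 0.37, 0.0, 0.0, 0.0]
```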

The procedure for determining these transition probabilities is the same as that used for Eq. (9). Eq. (30) presents the five transition probability matrices of this MDP of P2 under the five decisions tabulated in Table 5; the elements of the matrix P(k) are the transition probabilities P_ij(k) under decision k. An entry of "-" means that the decision is not valid due to an infeasible action, e.g. the suggested replenishment amount would exceed the maximum value M; such inadmissible decisions/actions are not used.

$$P(0) = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.63 & 0.37 & 0 & 0 & 0 \\ 0.26 & 0.37 & 0.37 & 0 & 0 \\ 0.11 & 0.15 & 0.37 & 0.37 & 0 \\ 0.04 & 0.07 & 0.15 & 0.37 & 0.37 \end{pmatrix} \quad
P(1) = \begin{pmatrix} 0.63 & 0.37 & 0 & 0 & 0 \\ 0.26 & 0.37 & 0.37 & 0 & 0 \\ 0.11 & 0.15 & 0.37 & 0.37 & 0 \\ 0.04 & 0.07 & 0.15 & 0.37 & 0.37 \\ - & - & - & - & - \end{pmatrix}$$

$$P(2) = \begin{pmatrix} 0.26 & 0.37 & 0.37 & 0 & 0 \\ 0.11 & 0.15 & 0.37 & 0.37 & 0 \\ 0.04 & 0.07 & 0.15 & 0.37 & 0.37 \\ - & - & - & - & - \\ - & - & - & - & - \end{pmatrix} \quad
P(3) = \begin{pmatrix} 0.11 & 0.15 & 0.37 & 0.37 & 0 \\ 0.04 & 0.07 & 0.15 & 0.37 & 0.37 \\ - & - & - & - & - \\ - & - & - & - & - \\ - & - & - & - & - \end{pmatrix}$$

$$P(4) = \begin{pmatrix} 0.04 & 0.07 & 0.15 & 0.37 & 0.37 \\ - & - & - & - & - \\ - & - & - & - & - \\ - & - & - & - & - \\ - & - & - & - & - \end{pmatrix} \qquad (30)$$

In an inventory system, the main costs that affect profit may include the manufacturing cost, holding cost, shortage cost, salvage cost, and discount rates, etc. In this work we consider two components: an average manufacturing cost of $C_m/lb and an average shortage cost of $C_s/lb. The manufacturing costs include all those incurred during production; the shortage costs result from sales lost due to insufficient inventory. Let d denote the demand, Q_k the ordering amount under decision k, and X̄_i the average inventory amount when the system is in state i. Then the expected cost is

$$C_{ik} = C_m Q_k + C_s \sum_{j=0}^{4} \max\{(d - \bar{X}_i), 0\}\, P_{ij}(k) \quad \text{for } i = 0, \ldots, 4,\ k = 0, \ldots, 4, \qquad (31)$$

where P_ij(k) is given in Eq. (30). The choice of the initial policy R_1 is arbitrary. Using the policy-improvement algorithm as outlined in Section 4 usually yields the optimal policy in only a few iterations. Table 6 presents the intermediate and final results when the first policy calls for ordering 59,400 lbs if the system is in state 0 and no order otherwise.

Table 6. Iteration results

Iteration no.   Policy
1               1 0 0 0 0
2               4 3 2 1 0
3               3 2 1 0 0
4               3 2 1 0 0

Using the Markov decision model and the policy-improvement algorithm, we obtain the optimal policy shown in Eq. (32), in which an entry D_ik = 1 means that the policy calls for the kth decision if the system is in state i. For example, D_21 = 1 means that decision 1 and its corresponding action (replenish 59,400 lbs) will be exercised if the system is in state 2 (stock on hand between 207,200 and 266,700 lbs).

$$D_{P_2} = \begin{pmatrix} D_{00} & D_{01} & D_{02} & D_{03} & D_{04} \\ D_{10} & D_{11} & D_{12} & D_{13} & D_{14} \\ D_{20} & D_{21} & D_{22} & D_{23} & D_{24} \\ D_{30} & D_{31} & D_{32} & D_{33} & D_{34} \\ D_{40} & D_{41} & D_{42} & D_{43} & D_{44} \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{pmatrix} \qquad (32)$$

Table 7. The optimal policy

State   Inventory amount (×10,000 lbs)   Action
0       0 ≤ X̄_i < 14.78                 Replenish 178,300 lbs
1       14.78 ≤ X̄_i < 20.72             Replenish 118,900 lbs
2       20.72 ≤ X̄_i < 26.67             Replenish 59,400 lbs
3       26.67 ≤ X̄_i < 32.61             Do not replenish
4       32.61 ≤ X̄_i ≤ 38.55             Do not replenish
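The pieces above can be wired together with the earlier sketches: build C_ik in the shape of Eq. (31), mark the inadmissible pairs, and hand everything to policy_iteration. The unit costs and the per-state expected-shortage figures below are placeholders of ours (the paper's cost data are confidential), so the printed policy will not necessarily match Table 7.

```python
def expected_costs(Q, shortage, c_m, c_s, feasible, K):
    """Assemble C_ik in the spirit of Eq. (31): ordering cost c_m·Q_k plus
    shortage cost c_s times an estimate of E[max(d − X̄_i, 0)] for state i.
    Infeasible decisions get infinite cost so they are never chosen."""
    C = [[float("inf")] * (K + 1) for _ in range(len(shortage))]
    for i, es in enumerate(shortage):
        for k in feasible[i]:
            C[i][k] = c_m * Q[k] + c_s * es
    return C

u = 59_400
Q = [0, u, 2 * u, 3 * u, 4 * u]                    # ordering amounts, Table 5
shortage = [30_000, 8_000, 2_000, 500, 0]          # placeholder shortage estimates, lbs
feasible = [list(range(5 - i)) for i in range(5)]  # decision k admissible iff i + k <= 4
C = expected_costs(Q, shortage, c_m=0.30, c_s=1.50, feasible=feasible, K=4)

policy, g = policy_iteration(Ps, C, feasible, initial_policy=[1, 0, 0, 0, 0])
print(policy, round(g))  # a state-to-decision list like the rows of Table 6
```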

Table 7 provides a verbal description of this optimal policy, which prescribes the decision/action to be made under all possible conditions. It can be seen that the concepts and development of the MDP model are more involved than those of the most frequently used inventory models. Once a policy as depicted in Table 7 becomes available, however, its implementation is rather straightforward, and it usually provides better results, as illustrated in Fig. 6 and Fig. 7.

Fig. 6. A comparison of the MDP policy and the conventional replenishment policy. Product: P2; service level: 97.6%.
Fig. 7. A comparison of the MDP policy and the conventional replenishment policy. Product: P1; service level: 98%.

Fig. 6 compares the MDP policy with the conventional (S, s) inventory policy (Eqs. (28) and (29)) based on the real demand data for product P2, whose weekly mean was 19,800 lbs. Its actual weekly demand ranges from a low of zero to a high of 140,000 lbs. Such high variability makes the prediction of future demand very difficult; using conventional policies often leads to stockouts or requires a higher inventory level. Since the randomness has been included in the MDP model, better results can be expected.

This is indeed the case, as shown in our comparison. Fig. 6 shows that no stockout occurs under either policy. The same upper bound M = 386,000 lbs was applied to both methods in their policy development. However, the MDP policy results in an average inventory level of 236,000 lbs, much lower than the 315,000 lbs required by the conventional replenishment policy. The service level in the figure is a prescribed number: a service level of α = 97% means that the allowed probability of stockout is 1 − 97% = 3%. The calculation shows that with the MDP method a total of 21 reorders were made during the 83-week period, also lower than the 28 required by the conventional method. For the low-variability class, the MDP policy also performs better, though not as dramatically as for the high-variability one. Fig. 7 shows no stockout in either case; the average inventory level of 90,000 lbs under the MDP policy is lower than the 125,000 lbs required by the conventional replenishment policy, and its reorder count of 27 is also lower than the conventional method's 52.

It is conceivable that the inventory level can be reduced further if a small amount of stockout is allowed. As displayed in Fig. 8, a change of the service level from 97.6% to 96.5% lowers the inventory level from 236,000 lbs (Fig. 6) to 189,000 lbs, corresponding to a 25% inventory saving; its trade-off is a 3/82 = 3.65% stockout rate. Fig. 9 shows that when a 2/82 = 2.4% stockout rate is allowed, a 10% saving in inventory is achieved for P1.

Fig. 8. A comparison of the MDP policy and the conventional replenishment policy. Product: P2; service level: 96.5%.

The same procedure was applied to several other products, and similar conclusions were obtained. Of the two methods, the MDP system consistently yielded better policies than the traditional replenishment ones. In general, it results in a lower average inventory level and/or fewer stockouts and requires fewer reorders to be placed. These advantages are more pronounced for products with highly variable demands, for which most other methods do not perform well. The computation time needed to develop an MDP policy and to obtain results such as those shown in Fig. 7 was 19 s on a Dell 700 MHz PC.

6. Summary and discussion

This paper concerns Markovian inventory policies and their application in the paper industry. Markov decision models have proved useful in many systems and are powerful tools for inventory planning. After discussing the model formulation and the solution procedure in general, we applied them to a problem of inventory control of finished paper products. Emphasis has been put on several key steps in the model development, such as obtaining the state space of the Markov chain, designating the possible decisions and actions, calculating the transition probabilities, defining the cost structure and evaluating the cost function, and determining the optimal policy. Practical issues such as lead time, inventory level, and stockouts are discussed in detail. We have simplified the decision-making procedure by discretizing the continuous demands into discrete levels.

Fig. 9. A comparison of the MDP policy and the conventional replenishment policy. Product: P1; service level: 97%.

Such treatment is applicable to many cases in the chemical industry where products are measured by weight. Our calculation results have shown that the MDP models consistently provide better results than conventional models, especially for systems exhibiting high variability. When there are trends in the system, the mean is not constant and the covariance depends on more than just the time lag, in a much more complex fashion; such non-homogeneous MDPs can be treated using the methods discussed in Yin and Zhang (1998). It is well known that many time series, particularly sales data, often show seasonality, and building seasonal models and using them for forecasting have become routine in inventory planning. This issue was not addressed herein because the data we used are not seasonal; if a series shows a marked seasonal pattern, the methods developed above need to be modified.

Forecasting product demand with mathematical models derived from historical data has been a common practice in industry. It can lead to reasonably good predictions provided that the demand variation is relatively small or that certain trends, such as seasonality, are easily identifiable. The basic assumption of forecasting is that markets and demands are predictable to within a certain accuracy. Given the rapid and often unpredictable changes in today's global economy, the numerous uncertainties involved in the dynamic process, and the complicated relationships among the elements and links of any given supply chain, this assumption often does not hold, rendering the forecast unreliable or totally mistaken. As a result, production and inventory decisions can no longer be made based on forecasting alone; other tools are needed to adequately address the variability issue, and MDP models offer a good alternative for this purpose. Stochastic modeling and simulation have become frequently used and powerful tools for quantifying the dynamic relations of sequences of random events and uncertainties. Considering the complex and random nature of many chemical processes, and the still limited use of stochastic modeling and simulation there, it is conceivable that Markov chains and stochastic modeling will find more applications in the chemical engineering field.

References

Applequist, G., Samikoglu, O., Pekny, J., & Reklaitis, G. (1997). Issues in the use, design and evolution of process scheduling and planning systems. ISA Transactions, 36(2), 81-121.
Arrow, K. J., Karlin, S., & Scarf, H. (1958). Studies in the mathematical theory of inventory and production. Stanford, CA: Stanford University Press.
Bassett, M. H., Pekny, J. F., & Reklaitis, G. V. (1997). Using detailed scheduling to obtain realistic operating policies for a batch processing facility. Industrial & Engineering Chemistry Research, 36(5), 1717-1726.
Buchan, J., & Koenigsberg, E. (1963). Scientific inventory management. Englewood Cliffs, NJ: Prentice-Hall.
Buffa, E. S. (1980). Modern production/operations management (6th ed.). New York: Wiley.
Chiang, C. L. (1980). An introduction to stochastic processes and their applications. New York: Robert E. Krieger Publishing Co.

Chung, K. L. (1960). Markov chains with stationary transition probabilities. Berlin: Springer.
Doob, J. L. (1953). Stochastic processes. New York: Wiley.
Feller, W. (1971). An introduction to probability theory and its applications. New York: Wiley.
Fisher, M. L., Raman, A., & McClelland, A. S. (2000). Rocket science retailing is almost here: are you ready? Harvard Business Review, July-August, 115-125.
Gupta, A., Maranas, C. D., & McDonald, C. M. (2000). Mid-term supply chain planning under demand uncertainty: customer demand satisfaction and inventory management. Computers & Chemical Engineering, 24(12), 2613-2621.
Hillier, F. S., & Lieberman, G. J. (1999). Introduction to operations research (6th ed.). Oakland, CA: Holden-Day.
Johnson, L. A., & Montgomery, D. C. (1974). Operations research in production planning and inventory control. New York: Wiley.
Kushner, H. J., & Yin, G. G. (1997). Stochastic approximation algorithms and applications. New York: Springer.
Pekny, J., & Miller, D. (1990). Exact solution of the no-wait flowshop scheduling problem with a comparison to heuristic methods. Computers & Chemical Engineering, 14(9), 1009-1023.
Petkov, S. B., & Maranas, C. D. (1997). Multiperiod planning and scheduling of multiproduct batch plants under demand uncertainty. Industrial & Engineering Chemistry Research, 36(11), 4864-4881.
Ross, S. (1983). Introduction to stochastic dynamic programming. New York: Academic Press.
Starr, M., & Miller, D. (1962). Inventory control: theory and practice. Englewood Cliffs, NJ: Prentice-Hall.
Taylor, H. M., & Karlin, S. (1998). An introduction to stochastic modeling. Boston: Academic Press.
Veinott, A. F. (1965). The optimal inventory policy for batch orderings. Operations Research, 13(3), 424-432.
Wilson, R. H. (1934). A scientific routine for stock control. Harvard Business Review, 13.
Yang, H., Yin, G., Yin, K., & Zhang, Q. (2002). Control of singularly perturbed Markov chains: a numerical study. To appear in The ANZIAM Journal (formerly Journal of the Australian Mathematical Society, Series B: Applied Mathematics).
Yin, G., Zhang, Q., Yang, H., & Yin, K. (2001). Discrete-time dynamic systems arising from singularly perturbed Markov chains. Nonlinear Analysis: Theory, Methods & Applications, 47(7), 4763-4774.
Yin, G. G., & Zhang, Q. (1998). Continuous-time Markov chains and applications: a singular perturbation approach. New York: Springer.
Yin, G. G., & Zhang, Q. (1997). Mathematics of stochastic manufacturing systems. Providence, RI: American Mathematical Society.
Yin, K., Yin, G., & Zhang, Q. (1995). Approximating the optimal threshold levels under robustness cost criteria for stochastic manufacturing systems. Proceedings of the IFAC Conference of Youth Automation YAC '95, 450-454.