QUANTITATIVE METHODS IN BUSINESS
PROBLEM #2: INTRODUCTION TO SIMULATION
Dr. Arnold Kleinstein and Dr. Sharon Petrushka
version July 27, 2000

Introduction

Simulation means imitation of reality. In this problem, we will simulate a random variable that describes the number of sick days taken by an employee during a month.
Mathematics

Discrete Random Variable

A discrete random variable, X, is a set of numeric outcomes, together with an assignment of probability, p(x), to each outcome of the random variable, X. The values of X can be listed in a table.

    X       outcomes
    p(x)    probability of each outcome

Example: The manager of a computer store believes, from his study of records, that the daily demand for a certain new type of computer is a discrete random variable. The random variable is defined in the following table.

    x = demand    p(x)
    0             0.05
    1             0.20
    2             0.25
    3             0.35
    4             0.15

Simulating a Discrete Random Variable

To better serve his customers, the manager would like to see how sales would go for a typical week. To do this, he simulates the sales of these new computers. Simulation means that for each day he must pick a number 0, 1, 2, 3, or 4 according to the probabilities in the above table. That is, he must ensure that on each day the number 0 has a 5% chance of coming up, the number 1 a 20% chance, etc.
One way to do this is to first create the cumulative probability distribution for the random variable.

    x = demand    Cumulative probability p(demand < x)
    0             0.00
    1             0.05
    2             0.25
    3             0.50
    4             0.85
    5             1.00

Consider the number line interval, [0, 1], as being divided according to the cumulative probability numbers.

    0    .05    .25    .5    .85    1

Randomly choose a number, R#, between 0 and 1. There is a 5% chance that this number will be between 0 and .05, a 20% chance that it will be between .05 and .25, etc. Thus, we can simulate the random variable if we choose the number of computers that will be sold according to the following rule:

    If R# falls in the interval    Then choose x to be
    0 to less than .05             0
    .05 to less than .25           1
    .25 to less than .50           2
    .50 to less than .85           3
    .85 to less than 1.00          4

Example: If R# = .32, we determine that the number of computers sold is 2, since .32 is between .25 and .50.
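The interval rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the handout itself uses a calculator or Excel, and the names demand_pmf, cumulative_upper_bounds, and simulate_once are hypothetical, chosen for this example.

```python
import bisect
import random

# Demand table from the computer store example: outcome -> probability.
demand_pmf = {0: 0.05, 1: 0.20, 2: 0.25, 3: 0.35, 4: 0.15}


def cumulative_upper_bounds(pmf):
    """Return the sorted outcomes and the upper end of each outcome's interval."""
    outcomes = sorted(pmf)
    bounds = []
    running = 0.0
    for x in outcomes:
        running += pmf[x]
        bounds.append(round(running, 10))  # round away floating-point drift
    return outcomes, bounds


def simulate_once(pmf, r):
    """Map a random number r, with 0 <= r < 1, to an outcome by the interval rule."""
    outcomes, bounds = cumulative_upper_bounds(pmf)
    # bisect_right counts the upper bounds that are <= r, which is the
    # position of the interval [lower, upper) that contains r.
    return outcomes[bisect.bisect_right(bounds, r)]


print(simulate_once(demand_pmf, 0.32))  # 2, as in the worked example

# One simulated week of five days; random.random() returns a value in [0, 1).
week = [simulate_once(demand_pmf, random.random()) for _ in range(5)]
print(week)
```

Because each interval includes its left endpoint and excludes its right endpoint, a random number of exactly .05 gives a demand of 1, in agreement with the rule table.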
Exercise: Simulate 5 days' worth of computer sales using the above random variable. Choose the random numbers using a computer or calculator.

    Day    Random Number    Computers Sold
    1
    2
    3
    4
    5
Excel

=RAND function

To randomly choose a number between 0 and 1, use the built-in Excel function =RAND().

Example: Type =RAND() into cell A4. A randomly chosen number between 0 and 1 will appear in the cell.

Table

A table in Excel is a rectangular array of cells. To designate a table, give the cell address of the top left corner cell of the table, then a colon, then the cell address of the bottom right corner cell.

Example: The table below has been entered into cells A2 through B6. In Excel, this table is referred to as A2:B6.

    Cum. Probabilities    x-values
    0                     20
    0.1                   25
    0.5                   30
    0.8                   35
    1                     40

=VLOOKUP

To simulate a discrete random variable in Excel, we use the built-in function =VLOOKUP. The inputs for VLOOKUP are a random number between 0 and 1, and the random variable's cumulative distribution table.

    =VLOOKUP(R#, cumulative distribution table with columns reversed, column number of X-values)

R#
R# is a random number between 0 and 1. It is chosen using the =RAND() function.

Cumulative Distribution Table with Columns Reversed
The cumulative distribution table for VLOOKUP is the cumulative probability table for the random variable, but with the columns reversed. The first column is the vector of cumulative probabilities. The second column is the vector of random variable values. This table is entered into the spreadsheet. The address of the table is entered into VLOOKUP.

Column Number of X-Values
Since the X-values are now in the second column, the number 2 is entered here.

Example: Suppose a random variable X has the following cumulative probability distribution table:

    X     Cum Pr
    20    0
    25    .1
    30    .5
    35    .8
    40    1.0

To obtain the cumulative distribution table in Excel, enter the cumulative probability table with its columns reversed, so that the cumulative probabilities are in the first column. The X-values are in the column to the right.

To simulate the random variable, enter the table and VLOOKUP as below. The table has been entered into cells A2 through B6, and the random number in cell A10.

    Cum. Probabilities    x-values
    0                     20
    0.1                   25
    0.5                   30
    0.8                   35
    1                     40

    R#         Simulated Value
    =RAND()    =VLOOKUP(A10,A2:B6,2)
VLOOKUP will choose values for the random variable, X, using the same procedure described in the math section.

    If random number is     Choose X to be:
    0 to less than .1       20
    .1 to less than .5      25
    .5 to less than .8      30
    .8 to less than 1       35

Thus, if the random number is .351707, VLOOKUP chooses the value 25 for X, since .351707 is in the interval [.1, .5).

    Cum. Prob    x-values
    0            20
    0.1          25
    0.5          30
    0.8          35

    R#          Simulated Value
    0.351707    25
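For readers who want to see the lookup step outside of Excel, the following Python sketch mimics what VLOOKUP does here when its optional fourth argument is omitted: it uses the last row whose first-column entry is less than or equal to the lookup value, which requires the first column to be sorted in ascending order. The function name approx_lookup and the variable table are hypothetical, and the sketch is not a full reimplementation of VLOOKUP.

```python
import bisect

# Lookup table from the example: (cumulative probability, x-value),
# sorted in ascending order of the first column.
table = [(0.0, 20), (0.1, 25), (0.5, 30), (0.8, 35), (1.0, 40)]


def approx_lookup(value, rows, column):
    """Return the entry in the given column of the last row whose first entry is <= value."""
    keys = [row[0] for row in rows]
    index = bisect.bisect_right(keys, value) - 1
    if index < 0:
        raise ValueError("value is smaller than every entry in the first column")
    return rows[index][column - 1]  # column is 1-based, as in the spreadsheet formula


print(approx_lookup(0.351707, table, 2))  # 25, since .1 <= .351707 < .5
```

The value 40 in the last row is returned only for a lookup value of 1 or more, which a random number between 0 and 1 (excluding 1) never reaches.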
Business Problem

Simulate a random variable, X, that describes the number of sick days taken by an employee during the month of November.

Solution:

First we must construct the random variable. We can do this using historical data and the relative frequency definition of probability.

Suppose the personnel office has compiled the number of sick days taken by each of 400 employees during the month of November. Let X = the number of sick days taken by an employee. The results are tabulated below. Note that the second column of the table contains the number of employees who took a particular number of sick days, and the third column contains the probability that a randomly selected employee took that number of sick days. This probability is computed using the relative frequency definition of probability, by dividing the number of employees taking that number of sick days by the total number of employees.

    X = number of days    Number of Employees    p(x) = number of employees / 400
    0                     80                     .2000
    1                     34                     .0850
    2                     62                     .1550
    3                     129                    .3225
    4                     51                     .1275
    5                     23                     .0575
    6                     12                     .0300
    7                     9                      .0225

The cumulative probability distribution records the probability that the random variable lies below certain values. We construct the cumulative distribution for the random variable as follows:

    x = # of sick days    cumulative probability
    less than 0           .0000
    less than 1           .2000
    less than 2           .2850
    less than 3           .4400
    less than 4           .7625
    less than 5           .8900
    less than 6           .9475
    less than 7           .9775
    less than 8           1.0000
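The arithmetic behind the two tables can be checked with a short Python sketch. The employee counts are taken from the table above; the variable names counts, pmf, and cumulative_below are hypothetical and chosen only for this illustration.

```python
# Number of employees who took each number of sick days (from the table above).
counts = {0: 80, 1: 34, 2: 62, 3: 129, 4: 51, 5: 23, 6: 12, 7: 9}

total = sum(counts.values())  # 400 employees in all

# Relative frequency definition: p(x) = number of employees / total.
pmf = {x: n / total for x, n in counts.items()}

# Cumulative probability that X is less than x.
cumulative_below = {}
running = 0  # running count of employees; integers avoid rounding drift
for x in sorted(counts):
    cumulative_below[x] = running / total
    running += counts[x]
cumulative_below[max(counts) + 1] = running / total  # "less than 8"

for x, p in cumulative_below.items():
    print(f"less than {x}: {p:.4f}")
```

Accumulating the integer counts and dividing once per row keeps each printed value equal to the four-decimal entry in the table.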
Simulate the Random Variable

In the spreadsheet we enter the cumulative distribution table into cells A6:B14. The random number generator RAND() is entered in cells B18 through B27. The function VLOOKUP is entered into cells C18 through C27. The entry in cell C18 is: =VLOOKUP(B18,$A$6:$B$14,2)

          A                         B                     C
     1    Simulate the Number of Sick Days for Ten Employees
     2
     3    Cumulative Distribution Table
     4
     5    cumulative probability    X = # of sick days
     6    0.0000                    0
     7    0.2000                    1
     8    0.2850                    2
     9    0.4400                    3
    10    0.7625                    4
    11    0.8900                    5
    12    0.9475                    6
    13    0.9775                    7
    14    1.0000                    8
    15
    16
    17    Employee                  Random Number         Number of sick days
    18    1                         0.530                 3
    19    2                         0.667                 3
    20    3                         0.518                 3
    21    4                         0.344                 2
    22    5                         0.458                 3
    23    6                         0.880                 4
    24    7                         0.651                 3
    25    8                         0.292                 2
    26    9                         0.148                 0
    27    10                        0.204                 1
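As a cross-check on the spreadsheet, the same lookup can be sketched in Python. The sketch feeds in the ten random numbers listed in column B and reproduces column C; it then draws ten fresh random numbers with random.random(), used here as a stand-in for RAND(). The names cum_probs, sick_days, and lookup_sick_days are hypothetical.

```python
import bisect
import random

# Cumulative distribution table from cells A6:B14.
cum_probs = [0.0000, 0.2000, 0.2850, 0.4400, 0.7625, 0.8900, 0.9475, 0.9775, 1.0000]
sick_days = [0, 1, 2, 3, 4, 5, 6, 7, 8]


def lookup_sick_days(r):
    """Return the sick-day count in the last row whose cumulative probability is <= r."""
    index = bisect.bisect_right(cum_probs, r) - 1
    return sick_days[index]


# Random numbers shown in column B of the spreadsheet.
listed = [0.530, 0.667, 0.518, 0.344, 0.458, 0.880, 0.651, 0.292, 0.148, 0.204]
print([lookup_sick_days(r) for r in listed])
# [3, 3, 3, 2, 3, 4, 3, 2, 0, 1]

# A fresh simulation for ten employees; results change on every run.
fresh = [lookup_sick_days(random.random()) for _ in range(10)]
print(fresh)
```

Like RAND(), a fresh run gives different numbers each time, so only the first printed list is expected to match the spreadsheet.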
Additional Problems

1. Simulate 10 days' worth of air conditioner sales.

    x = daily demand for air conditioners    Cumulative probability p(demand < x)
    0                                        0.00
    1                                        0.05
    2                                        0.25
    3                                        0.50
    4                                        0.85
    5                                        1.00

2. Simulate 10 weeks of refrigerator sales when the historical sales data are as below:

    X = number of refrigerators sold/week    Number of weeks    p(x) = number of weeks / Total
    0                                        4
    1                                        6
    2                                        7
    3                                        10
    4                                        12
    5                                        7
    6                                        3
    7                                        1

Research problem: Using historical data, simulate some aspect of a business.