A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data

Athanasius Zakhary, Neamat El Gayar
Faculty of Computers and Information, Cairo University, Giza, Egypt
athanasius.it@gmail.com, hmg@link.net

Amir F. Atiya
Dept. of Computer Engineering, Cairo University, Giza, Egypt
amir@alumni.caltech.edu

Abstract

Detailed forecasts are major inputs to modern Hotel Revenue Management Systems. Accurate forecasts are crucial to improve rate and availability recommendations for rooms. The data used for hotel demand forecasting are based on current booking activities (reservations) and on historical information regarding daily arrivals or rooms sold. Bookings are recent data that, if used adequately, can make the forecasting process more responsive to demand shifts. Very little work has been done on forecasting techniques that use reservation data. In this paper, we examine in more detail a popular forecasting model that uses reservation data, referred to in the literature as the pickup method. In particular, we present a new framework for the pickup technique with 8 different variations and compare the results of these variations using a variety of simulated hotel reservation data.

Keywords: Pickup, Reservation-based Forecasting.

1 Introduction

Hotel Revenue Management (RM) is commonly practiced in the hotel industry to help hotels decide on room rates and allocation. Detailed and accurate forecasts are crucial to good RM [1]. Inaccurate predictions lead to suboptimal decisions about the rate and availability recommendations produced by the RM system, which in turn have a negative effect on hotel revenue [2]. Accurate forecasting can also help hotels make better staffing, purchasing and budgeting decisions [3].

RM forecasting methods fall into one of three types: historical booking models, advanced booking models and combined models. Historical booking models consider only the final number of rooms or arrivals on a particular stay night.
Advanced booking models include only the buildup of reservations over time for a particular stay night. Combined models use either regression or a weighted average of historical models and advanced booking models to develop forecasts [1].

Bookings in hotels occur over an extended period of time. Hotels may take reservations for rooms days, weeks or even months in advance. This so-called partial booking data, while incomplete, can be very useful in forecasting [4]. In particular, partial booking data are recent data that can reflect demand shifts [5].

The pickup forecasting model is a popular advanced booking model that exploits the unique characteristics of reservation data, instead of relying only on complete arrival histories, to make better forecasts. The main idea of the pickup method is to estimate the increments of bookings to come and then aggregate these increments to obtain a forecast of the total demand to come [4]. Pickup is defined as the number of reservations picked up from a given point in time to a different point in time over the booking process [1].

Although the pickup technique is widely applied in many Revenue Management applications, very little work discusses the pickup method in detail. Moreover, no detailed comparisons of different variations of the pickup method on hotel reservation data have been reported. This work presents a new framework for the pickup technique with its different variations and compares the results of these variations using a variety of simulated hotel reservation data. The main goal of using the pickup method in this paper is to forecast the final number of arrivals for every future day within a given horizon. This horizon may be the next month or any other fixed period of time.

This paper is organized as follows: In Section (2), the different variations of the pickup method are described. Section (3) presents the data used, experiments conducted

and the error measures used. In Section (4) results are summarized and discussed. Finally, concluding remarks are given in Section (5).

2 Variations of the Pickup Method

In this section we present our view of how different pickup method variations can be implemented. Variations can be classified into three distinct groups that identify how the data is preprocessed, what portion of the reservation data is used and, finally, what technique is used for forecasting the increments. Accordingly, we can group pickup methods into: Additive or Multiplicative, Classical or Advanced, and using Simple or Weighted forecasting techniques. A single choice is to be selected from each of the above-mentioned categories. Each group and its alternatives are discussed in more detail in the following subsections.

Table (1) shows a typical Cumulative Booking Matrix that was captured on May 5th. Question marks were placed in the cells of the bookings to come. Each row in this matrix represents the buildup of bookings for the corresponding arrival date.

2.1 Additive vs. Multiplicative Pickup Variation

Additive pickup techniques assume that the number of reservations on hand at a particular day before arrival is independent of the number of rooms sold, and hence add the average pickup to the current bookings to arrive at the final bookings [6]. The multiplicative pickup techniques, on the other hand, assume that future bookings are proportional to current bookings; therefore, to get the final forecast, current bookings are multiplied by the average pickup ratio [4]. We outline the difference between the additive and multiplicative pickup techniques in more detail below.

2.1.1 The Additive Technique

To implement the additive pickup technique, the cumulative reservation data (Table (1)) is processed as follows. Each column is subtracted from the column to its left. The result is called the incremental or net additive booking table $C^{Add}$:

$$C^{Add}_{i,j} = C_{i,j} - C_{i,j+1}, \quad j = 0, 1, \ldots, h, \tag{1}$$

where:

$C_{i,j}$: the number in the cell in row $i$ (arrival date) and column $j$ (days before) of the Cumulative Booking Matrix. $C_{i,h+1}$ is implicitly assumed to be 0.

Table (2) shows the resulting incremental additive booking table $C^{Add}$. The final number of arrivals on a certain day in the future can then be calculated by summing up the values of the corresponding row:

$$A_i = \sum_{j=0}^{h} C^{Add}_{i,j}, \tag{2}$$

where:

$A_i$: the final number of arrivals on day $i$.
$C^{Add}_{i,j}$: the number in the cell in row $i$ and column $j$ of the net additive booking table.
$h$: the length of the booking horizon.

2.1.2 The Multiplicative Technique

For the multiplicative pickup technique, each column in the cumulative reservation data (Table (1)) is divided by the column to its right to obtain the incremental or net multiplicative booking table $C^{Mult}$, as shown in Table (3). Again, this procedure can be described mathematically as follows:

$$C^{Mult}_{i,j} = C_{i,j} / C_{i,j+1}, \quad j = 0, 1, \ldots, h, \tag{3}$$

where:

$C_{i,j}$: the number in the cell in row $i$ (arrival date) and column $j$ (days before) of the Cumulative Booking Matrix. $C_{i,h+1}$ is implicitly assumed to be 1.

The obvious limitation of the multiplicative formula (3) is that it cannot be used if one or more $C_{i,j} = 0$. This event is not uncommon.

The final number of arrivals on a certain day in the future can be calculated by multiplying the values of the corresponding row:

$$A_i = \prod_{j=0}^{h} C^{Mult}_{i,j}, \tag{4}$$

where:

$A_i$: the final number of arrivals on day $i$.
$C^{Mult}_{i,j}$: the number in the cell in row $i$ and column $j$ of the net multiplicative booking table.
$h$: the length of the booking horizon.

2.2 Classical vs. Advanced Pickup Method

In the classical pickup method, only the booking data for completed booking curves are used in the forecasting process [1]. In Tables (1)-(3), only the booking data of arrival dates up to day 4 are used in the forecasting phase. The incomplete booking data of arrival dates from May 5th onward are not used in the forecasting phase.
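The additive and multiplicative transformations of Section 2.1 can be illustrated with a minimal Python sketch. It uses the four completed rows of the Cumulative Booking Matrix in Table (1); the function names are hypothetical and chosen only for this illustration, and the zero check simply makes the stated limitation of the multiplicative formula explicit.

```python
from math import prod

# Completed rows (arrival dates 1-4) of the Cumulative Booking Matrix in Table (1).
# Column j holds the bookings on hand j days before arrival (j = 0..8).
CUMULATIVE = [
    [104, 94, 79, 58, 58, 38, 24, 14, 6],
    [113, 101, 87, 63, 59, 38, 23, 12, 7],
    [87, 76, 60, 38, 43, 29, 13, 4, 4],
    [107, 95, 79, 55, 59, 41, 25, 17, 8],
]


def net_additive(row):
    """Equation (1): C_add[j] = C[j] - C[j+1], with C[h+1] taken as 0."""
    padded = row + [0]
    return [padded[j] - padded[j + 1] for j in range(len(row))]


def net_multiplicative(row):
    """Equation (3): C_mult[j] = C[j] / C[j+1], with C[h+1] taken as 1."""
    padded = row + [1]
    # The ratio is undefined when a divisor is zero (the limitation noted in the text).
    if any(value == 0 for value in padded[1:]):
        raise ValueError("multiplicative pickup is undefined when a divisor is 0")
    return [padded[j] / padded[j + 1] for j in range(len(row))]


for arrival_date, row in enumerate(CUMULATIVE, start=1):
    additive = net_additive(row)
    multiplicative = net_multiplicative(row)
    # Equations (2) and (4): summing or multiplying a net row recovers A_i = C[i][0].
    print(arrival_date, sum(additive), round(prod(multiplicative), 6))
```

Both reconstructions return the final arrivals in column 0 of the cumulative matrix, which is why either net table can serve as the basis for forecasting the missing increments.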
The classical pickup method hence ignores available information on incomplete arrival dates that might be useful [6]. The advanced pickup method, on the other hand, uses all the available complete and incomplete booking data [1] in the forecasting phase, and hence uses reservation data of arrival dates that have not yet occurred.

2.3 Simple vs. Weighted Pickup Method

The goal of the pickup method is ultimately to estimate the increments of the bookings for all the days to come in

Table 1: Cumulative Booking Matrix

Arrival   Number of days before the arrival date
date        0     1     2     3     4     5     6     7     8
  1       104    94    79    58    58    38    24    14     6
  2       113   101    87    63    59    38    23    12     7
  3        87    76    60    38    43    29    13     4     4
  4       107    95    79    55    59    41    25    17     8
  5         ?    89    74    52    52    33    17     8     7
  6         ?     ?    78    53    51    34    19    10     5
  7         ?     ?     ?    61    56    38    23    15     7
  8         ?     ?     ?     ?    57    39    22    16     9
  9         ?     ?     ?     ?     ?    41    29    15     9
 10         ?     ?     ?     ?     ?     ?    23    11     7

Table 2: Net Additive Booking Matrix

Arrival   Number of days before the arrival date
date        0     1     2     3     4     5     6     7     8
  1        10    15    21     0    20    14    10     8     6
  2        12    14    24     4    21    15    11     5     7
  3        11    16    22    -5    14    16     9     0     4
  4        12    16    24    -4    18    16     8     9     8
  5         ?    15    22     0    19    16     9     1     7
  6         ?     ?    25     2    17    15     9     5     5
  7         ?     ?     ?     5    18    15     8     8     7
  8         ?     ?     ?     ?    18    17     6     7     9
  9         ?     ?     ?     ?     ?    12    14     6     9
 10         ?     ?     ?     ?     ?     ?    12     4     7

Table 3: Net Multiplicative Booking Matrix

Arrival   Number of days before the arrival date
date        0       1       2       3       4       5       6       7       8
  1       1.106   1.189   1.362   1       1.526   1.583   1.714   2.333   6
  2       1.119   1.1609  1.381   1.068   1.553   1.652   1.917   1.714   7
  3       1.145   1.267   1.579   0.884   1.483   2.231   3.25    1       4
  4       1.126   1.203   1.436   0.932   1.44    1.64    1.471   2.125   8
  5       ?       1.203   1.423   1       1.576   1.941   2.125   1.142   7
  6       ?       ?       1.471   1.039   1.5     1.789   1.9     2       5
  7       ?       ?       ?       1.089   1.474   1.652   1.533   2.143   7
  8       ?       ?       ?       ?       1.462   1.773   1.375   1.778   9
  9       ?       ?       ?       ?       ?       1.414   1.933   1.667   9
 10       ?       ?       ?       ?       ?       ?       2.09    1.572   7

order to estimate the total arrivals in the future [4]. This corresponds to forecasting all the unknown values (indicated by question marks) in Table (1) for all the days in the forecasting period. The forecasting phase proceeds column by column according to the following equation:

$$f_j = \sum_{i=s}^{e} \omega_{ij} C_{ij}, \quad j = 0, 1, \ldots, h, \tag{5}$$

where:

$s$: the index of the start value in the column to be used. It represents the date of the first day for which we have reservation data.
$e$: the index of the end value. In the classical pickup, $e$ is constant and represents the index of the most recent completed arrival date. For the advanced pickup, $e$ represents the index of the last known value in the current column.
$C_{ij}$: the value in the cell of Table (2) or Table (3) that corresponds to row (arrival date) $i$ and column (days before) $j$.
$\omega_{ij}$: the weight that represents the degree of influence of $C_{ij}$ on the forecast. If the weights of all the elements $C_{ij}$ are set to be equal, the forecasting technique corresponds to the simple average method.
$f_j$: the forecasted value used to estimate the unknown values in column $j$.

It is also possible to fill only the first unknown value in a column, then use it together with the known cells above it to forecast the next cell underneath it, and so on. This makes no difference if simple averaging is used.

The pickup variations existing in the literature mainly use two types of forecasting techniques: Simple and Weighted. We suggest the use of other forecasting techniques. Every column can be considered a one-dimensional time series that contains some unknown values. Exponential smoothing, Holt's and Winters' methods are good candidates as forecasting techniques. A thorough, state-of-the-art survey of these methods can be found in [7]. Figure (1) illustrates the different proposed pickup method combinations.

3 Data and Experiments

The reservation pattern in a given hotel is affected by many components.
A hotel reservation data simulator was built to model these components by different distributions, adding a random part to represent the uncertainty. These components include trend, seasonality, the booking curve, cancellation dynamics and length of stay. Trend is represented by a randomized exponential distribution. Length of stay was modeled by a normal distribution with mean equal to the most frequent length of stay, which is 4, and a variance of 3. High season peaks are represented by weighted sums of Gaussian functions. The width of the Gaussian function represents the duration of the high season period, and the weight (or height) represents the relative strength of arrivals during that period. A function consisting of a weighted sum of two gamma functions was used to model booking curves. This function, after being adjusted for seasonality, gives the rate of bookings as a function of time before arrival. The cancellation rate was modeled by an exponentially decreasing curve. Bernoulli drawings are used to draw actual bookings and cancellations according to the values of the booking curve (or rate) and the cancellation rate.

[Figure 1: Proposed Pickup method variations. Tree: Pickup Methods; branches Additive and Multiplicative; each with sub-branches Classical and Advanced.]

Different hotel reservation datasets were generated by running the simulator with different parameters. Each generated dataset contains the cumulative booking matrix, which holds the daily buildup of bookings for four consecutive years. The booking horizon was set to 60 days, i.e. guests can make reservations only during the 60-day period before the corresponding arrival date. The outputs were evaluated and reviewed by a domain expert, who accepted the output of three datasets based on their closeness to actual hotel data. The three datasets were used to compare the different variations of the pickup method explained above.
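To make the compared variations concrete, the following Python sketch applies the column-by-column forecast of Equation (5) to the net additive matrix of Table (2). It is a minimal illustration under stated assumptions, not the implementation used in the experiments: the smoothing constant `alpha = 0.3`, the initialization of the smoothed level with the first observation, and all function names are hypothetical choices that the paper does not specify.

```python
# Net additive booking matrix from Table (2); None marks bookings still to come.
NET_ADDITIVE = [
    [10, 15, 21, 0, 20, 14, 10, 8, 6],
    [12, 14, 24, 4, 21, 15, 11, 5, 7],
    [11, 16, 22, -5, 14, 16, 9, 0, 4],
    [12, 16, 24, -4, 18, 16, 8, 9, 8],
    [None, 15, 22, 0, 19, 16, 9, 1, 7],
    [None, None, 25, 2, 17, 15, 9, 5, 5],
    [None, None, None, 5, 18, 15, 8, 8, 7],
    [None, None, None, None, 18, 17, 6, 7, 9],
    [None, None, None, None, None, 12, 14, 6, 9],
    [None, None, None, None, None, None, 12, 4, 7],
]
COMPLETED_ROWS = 4  # arrival dates 1-4 have complete booking curves


def simple_average(values):
    """Equal weights in Equation (5)."""
    return sum(values) / len(values)


def exponential_smoothing(values, alpha=0.3):
    """Simple exponential smoothing; alpha and the initial level are assumptions."""
    level = values[0]
    for value in values[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def column_forecasts(matrix, advanced, technique):
    """One forecast f_j per column, from completed rows (classical) or all known cells (advanced)."""
    rows = matrix if advanced else matrix[:COMPLETED_ROWS]
    forecasts = []
    for j in range(len(matrix[0])):
        known = [row[j] for row in rows if row[j] is not None]
        forecasts.append(technique(known))
    return forecasts


def forecast_arrivals(matrix, advanced=False, technique=simple_average):
    """Additive pickup: known increments plus forecasted increments for the unknown cells."""
    f = column_forecasts(matrix, advanced, technique)
    return [
        sum(f[j] if cell is None else cell for j, cell in enumerate(row))
        for row in matrix
    ]


print(forecast_arrivals(NET_ADDITIVE))
print(forecast_arrivals(NET_ADDITIVE, advanced=True))
print(forecast_arrivals(NET_ADDITIVE, technique=exponential_smoothing))
```

A multiplicative counterpart would follow the same pattern on Table (3), replacing the row sum with a product of the known and forecasted ratios.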
The goal of the experiments conducted in this research is to compare the relative performance of the different pickup method variations across different step-ahead forecasts. Experiments were conducted on every dataset separately and for different steps ahead (7, 15, 30, and 60). For each dataset, one year of booking data was assumed to be available. The current day was set to the first day after the first year, and the final arrivals for a given step-ahead interval were forecasted. The current day was then shifted by a week, the next interval was forecasted, and so on until the end of the dataset. Different error measures were calculated and tabulated. Details of the error measures are described next.

3.1 Error Measures

Our comparative study uses 6 different error measures to assess the performance of the different implemented pickup variations. Let

1. $A_t$: the actual value in the time series at time $t$.
2. $F_t$: the forecasted value in the time series at time $t$.
3. $n$: the length of the time series.

The following error measures are calculated:

Mean Absolute Error (MAE): Measures the average absolute deviation of the forecast from the actual value.

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |A_t - F_t|. \tag{6}$$

Mean Absolute Percentage Error (MAPE): The average ratio of the absolute error to the actual value. Accuracy is expressed as a percentage.

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|A_t - F_t|}{A_t}. \tag{7}$$

Mean Square Error (MSE): The average of the squared difference between the actual and the forecast.

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} (A_t - F_t)^2. \tag{8}$$

Root Mean Square Error (RMSE): Expresses the variance plus the squared bias of the estimator (standard squared error):

$$\mathrm{RMSE} = \sqrt{V + B^2}, \tag{9}$$

where

$$B = \frac{1}{n} \sum_{t=1}^{n} (A_t - F_t), \tag{10}$$

$$V = \frac{1}{n-1} \left[ \sum_{t=1}^{n} (A_t - F_t)^2 - n B^2 \right]. \tag{11}$$

Minimum Absolute Error Ratio (MAE Ratio): The number of times each variation had the lowest absolute error over all the forecasted days, divided by the total number of forecasted arrival points. In case of a tie, the counters of all the variations with the minimum absolute error are incremented by one.

Root Mean Square Error Ratio (RMSE Ratio): This is a customized error measure. It is calculated as follows:

1. For every variation, generate an RMSE buffer whose length equals the current step ahead.
2. Fill the first cell in that buffer with the RMSE of the first values in all the forecasted steps ahead calculated in the current dataset.
3. Repeat until all the cells of the buffer are full.

4 Results and Discussion

As mentioned before, we conducted our experiments on three datasets (Dataset1, Dataset2, Dataset3). For every dataset we calculated 6 different error measures for 8 different combinations of the pickup method. We repeated the experiments for the different steps ahead (7, 15, 30, 60). Tables (4-7) list the results for Dataset1. Table (8) summarizes the results obtained for all datasets.
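For reference, the point error measures of Section 3.1 that populate these tables can be sketched in Python as follows. This is an illustrative sketch only: the function names and the example forecasts are hypothetical, the RMSE is taken as the square root of the variance plus the squared bias (an assumption about the radical in Equation (9)), and the MAE Ratio is returned as a fraction rather than a percentage.

```python
from math import sqrt


def error_measures(actual, forecast):
    """MAE, MAPE, MSE and RMSE for equally long series (actual values must be non-zero for MAPE)."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    squared_sum = sum(e * e for e in errors)
    bias = sum(errors) / n                               # B, Equation (10)
    variance = (squared_sum - n * bias ** 2) / (n - 1)   # V, Equation (11)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(e) / a for e, a in zip(errors, actual)) / n,
        "MSE": squared_sum / n,
        "RMSE": sqrt(variance + bias ** 2),              # assumed form of Equation (9)
    }


def min_abs_error_ratio(actual, forecasts_by_variation):
    """Share of forecast points where each variation attains the lowest absolute error (ties count for all)."""
    counts = {name: 0 for name in forecasts_by_variation}
    for t, a in enumerate(actual):
        abs_errors = {name: abs(a - f[t]) for name, f in forecasts_by_variation.items()}
        best = min(abs_errors.values())
        for name, err in abs_errors.items():
            if err == best:
                counts[name] += 1
    return {name: count / len(actual) for name, count in counts.items()}


# Final arrivals of dates 1-4 in Table (1) against two hypothetical forecast series.
actual = [104, 113, 87, 107]
forecasts = {"variation_a": [100, 115, 90, 105], "variation_b": [104, 110, 90, 100]}
print(error_measures(actual, forecasts["variation_a"]))
print(min_abs_error_ratio(actual, forecasts))
```

Because ties increment every tied variation, the ratios across variations can sum to more than one, which is also visible in the ratio columns of the result tables.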
Table (8) lists the three best variations for each dataset and step ahead, in order. The three best variations were chosen based on the corresponding values of the different error measures. Studying Table (8), we can conclude the following:

1. Multiplicative variations seem to outperform additive variations in taking first place, while additive variations generally appear to be more robust.
2. Classical pickup variations outperform advanced variations. Advanced variations do not appear in this table at all and have shown poor performance.
3. Although exponential smoothing variations mostly take the lead and appear much more often than simple average variations, the error measures of the simple average variations are comparable with those of the exponential smoothing variations.

5 Conclusions and Future Work

In this paper, we have presented 8 variations of the pickup method. Experiments were conducted on 3 simulated datasets of hotel reservation data. Each variation was evaluated with 6 different error measures and for different step-ahead forecasts. Our study shows that classical pickup variations outperform the advanced pickup methods. In addition, the Multiplicative, Classical, Exponential Smoothing variation has been identified as the best technique.

In the future, we intend to use other forecasting techniques such as Winters' and Holt's methods. We also plan to investigate other reservation-based forecasting models and compare their performance to the pickup method. Combining the results of different pickup variations is also to be investigated. It would also be valuable to compare the performance of the pickup method to models that rely on historical data only, without taking the reservation information into account. Above all, we intend to verify the pickup method variations with real reservation data.
Table 4: Error measures for Dataset 1, step ahead: 7 days

Variation                               MAE      MAPE     MSE        RMSE     MAE Ratio  RMSE Ratio
Additive, Classical, Simple             5.221    0.056    53.105     6.730    44.872     14.286
Additive, Classical, Smoothing          5.222    0.056    53.081     6.728    44.689     28.571
Additive, Advanced, Simple              5.510    0.059    67.759     7.113    46.520     0
Additive, Advanced, Smoothing           5.507    0.059    67.604     7.112    46.612     0
Multiplicative, Classical, Simple       5.2710   0.0580   53.183     6.778    45.421     28.571
Multiplicative, Classical, Smoothing    5.222    0.057    51.760     6.720    47.802     71.429
Multiplicative, Advanced, Simple        5.505    0.059    67.056     7.117    40.018     0
Multiplicative, Advanced, Smoothing     5.489    0.059    66.295     7.099    40.201     0

Table 5: Error measures for Dataset 1, step ahead: 15 days

Variation                               MAE      MAPE     MSE        RMSE     MAE Ratio  RMSE Ratio
Additive, Classical, Simple             6.806    0.0720   91.031     8.777    38.624     13.333
Additive, Classical, Smoothing          6.799    0.072    90.907     8.770    38.323     53.333
Additive, Advanced, Simple              8.399    0.083    200.380    11.145   34.925     0
Additive, Advanced, Smoothing           8.391    0.083    199.750    11.137   34.925     0
Multiplicative, Classical, Simple       7.020    0.074    100.472    9.024    35.957     13.333
Multiplicative, Classical, Smoothing    6.940    0.073    96.108     8.912    37.032     33.333
Multiplicative, Advanced, Simple        8.557    0.085    209.552    11.360   28.860     0
Multiplicative, Advanced, Smoothing     8.457    0.084    200.812    11.213   28.645     0

Table 6: Error measures for Dataset 1, step ahead: 30 days

Variation                               MAE      MAPE     MSE        RMSE     MAE Ratio  RMSE Ratio
Additive, Classical, Simple             8.903    0.095    148.901    11.887   32.266     16.667
Additive, Classical, Smoothing          8.885    0.094    148.217    11.860   31.634     46.667
Additive, Advanced, Simple              13.196   0.125    677.172    19.876   25.730     0
Additive, Advanced, Smoothing           13.188   0.125    676.199    19.872   25.621     0
Multiplicative, Classical, Simple       9.136    0.100    160.261    12.115   31.002     26.667
Multiplicative, Classical, Smoothing    8.891    0.094    161.724    11.861   31.569     16.667
Multiplicative, Advanced, Simple        13.743   0.130    753.810    20.803   20.850     0
Multiplicative, Advanced, Smoothing     13.546   0.128    731.840    20.550   21.525     0

Table 7: Error measures for Dataset 1, step ahead: 60 days

Variation                               MAE      MAPE     MSE        RMSE     MAE Ratio  RMSE Ratio
Additive, Classical, Simple             14.225   0.153    404.836    19.423   27.072     30
Additive, Classical, Smoothing          14.222   0.153    404.012    19.362   24.842     38.333
Additive, Advanced, Simple              22.355   0.238    1425.581   34.498   19.077     0
Additive, Advanced, Smoothing           22.409   0.239    1433.300   34.537   18.277     0
Multiplicative, Classical, Simple       17.970   0.206    727.839    23.130   26.227     10
Multiplicative, Classical, Smoothing    14.747   0.163    456.488    19.791   30.338     25
Multiplicative, Advanced, Simple        23.890   0.254    1730.211   37.705   15.946     0
Multiplicative, Advanced, Smoothing     23.867   0.252    1723.458   37.012   16.340     0

Table 8: Summary of the winning variations

Step ahead  Rank  Dataset 1        Dataset 2        Dataset 3
7 Days      1     Mul, Clas, Exp   Mul, Clas, Exp   Add, Clas, Sim
            2     Add, Clas, Exp   Mul, Clas, Sim   Add, Clas, Exp
            3     Add, Clas, Sim   Add, Clas, Exp   Mul, Clas, Exp
15 Days     1     Add, Clas, Exp   Mul, Clas, Exp   Mul, Clas, Exp
            2     Add, Clas, Sim   Add, Clas, Exp   Mul, Clas, Sim
            3     Mul, Clas, Exp   Add, Clas, Sim   Add, Clas, Sim
30 Days     1     Add, Clas, Exp   Mul, Clas, Exp   Mul, Clas, Exp
            2     Mul, Clas, Exp   Add, Clas, Exp   Mul, Clas, Sim
            3     Add, Clas, Sim   Mul, Clas, Sim   Add, Clas, Exp
60 Days     1     Add, Clas, Exp   Mul, Clas, Exp   Add, Clas, Exp
            2     Add, Clas, Sim   Add, Clas, Exp   Mul, Clas, Exp
            3     Mul, Clas, Exp   Add, Clas, Sim   Add, Clas, Sim

Acknowledgements

This work is part of the Data Mining for Improving Tourism Revenue in Egypt research project within the Egyptian Data Mining and Computer Modeling Center of Excellence. We would also like to acknowledge the useful discussions with Dr Hisham El-Shishiny (IBM Center for Advanced Studies in Cairo) and the continuous effort and suggestions of Professor Ali Hadi (American University of Cairo and Cornell University) to improve the work in this paper.

References

[1] L. R. Weatherford and S. E. Kimes, A comparison of forecasting methods for hotel revenue management, International Journal of Forecasting, vol. 19, pp. 401-415, 2003.
[2] A. Ingold, U. McMahon-Beattie, and I. Yeoman (Eds.), Yield Management. Continuum, 2nd ed., 2003.
[3] M. B. Ghalia and P. Wang, Intelligent system to support judgmental business forecasting: the case of estimating hotel room demand, IEEE Transactions on Fuzzy Systems, vol. 8, August 2000.
[4] K. T. Talluri and G. J. van Ryzin, The Theory and Practice of Revenue Management. Springer Science+Business Media, Inc., 2005.
[5] E. L'Heureux, A new twist in forecasting short-term passenger pickup, in Proceedings of the 26th Annual AGIFORS Symposium, 1986.
[6] R. H. Zeni, Improved Forecast Accuracy in Airline Revenue Management by Unconstraining Demand Estimates from Censored Data. Ph.D. diss., Graduate School-Newark, Rutgers, The State University of New Jersey, October 2001.
[7] E. S. Gardner, Exponential smoothing: The state of the art - Part II, June 2005.