An Improved Dynamic Programming Decomposition Approach for Network Revenue Management
Dan Zhang
Leeds School of Business, University of Colorado at Boulder
May 21, 2012
Outline
- Background
- Network revenue management formulation
- Classical dynamic programming decomposition
- An improved dynamic programming decomposition
- Numerical results
- Summary
Network RM Formulation (Gallego and van Ryzin, 1997; Gallego et al., 2004; Liu and van Ryzin, 2008)
- $m$ resources with capacity $c$ (an $m$-vector); the capacity of resource $i$ is $c_i$.
- $n$ products, $N = \{1, \dots, n\}$; the fare for product $j$ is $f_j$.
- Product consumption matrix $A = [a_{ij}]$.
- Finite time horizon of length $\tau$.
- In each period, one customer arrives with probability $\lambda$, and no customer arrives with probability $1 - \lambda$.
- Given a set of offered products $S \subseteq N$, a customer chooses product $j$ with probability $P_j(S)$.
- No-purchase probability: $P_0(S) = 1 - \sum_{j \in S} P_j(S)$.
- Objective: maximize expected total revenue.
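The slide leaves the choice probabilities $P_j(S)$ general. As a minimal sketch, assuming a multinomial logit (MNL) model, they could be computed as below. The function name `mnl_choice_probs`, the preference weights, and the no-purchase weight `w0` are illustrative assumptions and do not come from the slides.

```python
def mnl_choice_probs(offer_set, weights, w0=1.0):
    """Return ({j: P_j(S)}, P_0(S)) under an assumed MNL choice model."""
    denom = w0 + sum(weights[j] for j in offer_set)
    probs = {j: weights[j] / denom for j in offer_set}
    return probs, w0 / denom

# Hypothetical preference weights for products 1..3 (illustrative values only).
weights = {1: 1.0, 2: 1.0, 3: 0.5}

# Offer S = {1, 3}; product 2 is not offered and gets no probability.
probs, p0 = mnl_choice_probs({1, 3}, weights)
```

By construction the purchase probabilities and the no-purchase probability sum to one, which matches the definition of $P_0(S)$ on the slide.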
Applications

| Industry | Resources | Products |
|---|---|---|
| Airlines | Scheduled flights | O-D itineraries at certain fare levels |
| Hotels | Room-days | Single(multi)-day stays at certain rates |
| Car rentals | Car-days | Single(multi)-day rentals at certain rates |
| Air cargo | Scheduled flights (weight); scheduled flights (volume) | O-D shipments at certain rates |
Dynamic Programming Formulation
DP optimality equations:
$$v_t(x) = \max_{S \subseteq N(x)} \Big\{ \sum_{j \in S} \lambda P_j(S) \big( f_j + v_{t+1}(x - A^j) \big) + \big( \lambda P_0(S) + 1 - \lambda \big) v_{t+1}(x) \Big\}, \quad \forall t, x,$$
$$v_{\tau+1}(x) = 0, \quad \forall x.$$
Notation:
- $v_t(x)$: DP value function
- $A^j$: resource incidence vector of product $j$
- $N(x) = \{ j \in N : x \ge A^j \}$
Curse of dimensionality: the state space grows exponentially with the number of resources.
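The optimality equations can be evaluated directly only on very small instances. The sketch below does so by brute force over all offer sets; the two-resource, three-product instance, the MNL weights, and all function names are hypothetical and chosen only for illustration.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical toy instance (not from the slides): 2 resources, 3 products.
FARES = (100.0, 120.0, 180.0)      # f_j
A = ((1, 0, 1),                    # a_ij: row i = resource, column j = product
     (0, 1, 1))
WEIGHTS = (1.0, 1.0, 0.8)          # assumed MNL preference weights
W0 = 1.0                           # assumed no-purchase weight
LAM = 0.9                          # arrival probability per period
TAU = 6                            # number of periods
CAP = (2, 2)                       # capacity vector c

def choice_probs(S):
    """Assumed MNL model: returns ({j: P_j(S)}, P_0(S))."""
    denom = W0 + sum(WEIGHTS[j] for j in S)
    return {j: WEIGHTS[j] / denom for j in S}, W0 / denom

def feasible_products(x):
    """N(x): products whose resource requirements fit in remaining capacity x."""
    return [j for j in range(len(FARES))
            if all(x[i] >= A[i][j] for i in range(len(x)))]

def all_subsets(items):
    for r in range(len(items) + 1):
        yield from combinations(items, r)

@lru_cache(maxsize=None)
def v(t, x):
    """Exact value function v_t(x), maximizing over every offer set S in N(x)."""
    if t == TAU + 1:
        return 0.0
    best = float("-inf")
    for S in all_subsets(feasible_products(x)):
        p, p0 = choice_probs(S)
        value = (LAM * p0 + 1 - LAM) * v(t + 1, x)
        for j in S:
            x_next = tuple(x[i] - A[i][j] for i in range(len(x)))
            value += LAM * p[j] * (FARES[j] + v(t + 1, x_next))
        best = max(best, value)
    return best

optimal_revenue = v(1, CAP)
```

The memoized table has one entry per (period, capacity vector) pair, so its size grows with the product of the capacities plus one, which is the exponential growth the slide refers to.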
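Choice-based Deterministic Linear Program (CDLP)
$$z_{CDLP} = \max_{h} \sum_{S \subseteq N} \lambda R(S) h(S)$$
subject to
$$\sum_{S \subseteq N} \lambda Q(S) h(S) \le c \quad \text{(resource constraint)}$$
$$\sum_{S \subseteq N} h(S) \le \tau \quad \text{(time constraint)}$$
$$h(S) \ge 0, \ \forall S \subseteq N \quad \text{(non-negativity)}$$
- Replace stochastic demand with a deterministic fluid with rate $\lambda$.
- Given offer set $S \subseteq N$:
  - Total time $S$ is offered: $h(S)$
  - Revenue from unit demand: $R(S) = \sum_{j \in S} f_j P_j(S)$
  - Consumption of resource $i$ from unit demand: $Q_i(S) = \sum_{j \in S} a_{ij} P_j(S)$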
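To make the CDLP coefficients concrete, the following sketch computes $R(S)$ and $Q_i(S)$ and evaluates a given fluid plan $h$ against the three constraints. It does not solve the LP (that would need an LP solver, typically with column generation over offer sets). The instance data, the MNL choice model, and the helper names are assumptions for illustration only.

```python
# Hypothetical toy data (not from the slides).
FARES = {1: 100.0, 2: 120.0, 3: 180.0}        # f_j
USAGE = {1: (1, 0), 2: (0, 1), 3: (1, 1)}     # column A^j of the consumption matrix
WEIGHTS = {1: 1.0, 2: 1.0, 3: 0.8}            # assumed MNL preference weights
W0 = 1.0                                      # assumed no-purchase weight

def choice_probs(S):
    """Assumed MNL model: P_j(S) for each offered product j."""
    denom = W0 + sum(WEIGHTS[j] for j in S)
    return {j: WEIGHTS[j] / denom for j in S}

def unit_revenue(S):
    """R(S) = sum_j f_j P_j(S)."""
    p = choice_probs(S)
    return sum(FARES[j] * p[j] for j in S)

def unit_consumption(S):
    """Q(S) as a tuple with Q_i(S) = sum_j a_ij P_j(S)."""
    p = choice_probs(S)
    m = len(next(iter(USAGE.values())))
    return tuple(sum(USAGE[j][i] * p[j] for j in S) for i in range(m))

def evaluate_plan(h, lam, capacity, horizon):
    """h maps offer sets (frozensets) to offered time; returns (objective, feasible)."""
    objective = sum(lam * unit_revenue(S) * time for S, time in h.items())
    used = [sum(lam * unit_consumption(S)[i] * time for S, time in h.items())
            for i in range(len(capacity))]
    feasible = (all(u <= c + 1e-9 for u, c in zip(used, capacity))
                and sum(h.values()) <= horizon + 1e-9
                and all(time >= 0 for time in h.values()))
    return objective, feasible

everything = frozenset({1, 2, 3})
objective, feasible = evaluate_plan({everything: 4.0}, lam=0.9, capacity=(2, 2), horizon=6)
```

In this toy setting, offering all three products for 4 periods stays within capacity, while offering them for all 6 periods would exceed it, so the resource constraint is what forces the plan to ration.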
CDLP (Gallego et al., 2004; Liu and van Ryzin, 2008)
- CDLP can be solved efficiently for certain classes of choice models.
- The vector of dual values $\pi$ associated with the resource constraints can be used as bid-prices for resources.
- $z_{CDLP}$ provides an upper bound on revenue.
- Some recent references:
  - Talluri (2010): concave programming formulation
  - Gallego, Ratliff, Shebalov (2011): efficient reformulation
Classical Dynamic Programming Decomposition
For each $i$, approximate the DP value function with
$$v_t(x) \approx \underbrace{v_{t,i}(x_i)}_{\text{value of the } i\text{-th resource}} + \underbrace{\sum_{k \ne i} \pi_k x_k}_{\text{value of all other resources}}, \quad \forall t, x.$$
Using the approximation in the DP recursion leads to
$$v_{t,i}(x_i) = \max_{S \subseteq N(x_i, c_{-i})} \Big\{ \sum_{j \in S} \lambda P_j(S) \Big( \underbrace{f_j - \sum_{k \ne i} a_{kj} \pi_k}_{\text{fare proration}} + v_{t+1,i}(x_i - a_{ij}) \Big) + \big( \lambda P_0(S) + 1 - \lambda \big) v_{t+1,i}(x_i) \Big\}, \quad \forall t, x_i.$$
Compute offer sets dynamically using the approximate value function
$$v_t(x) \approx \sum_i v_{t,i}(x_i), \quad \forall t, x.$$
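The single-leg recursion can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used for the reported experiments: the instance, the MNL choice model, the terminal condition $v_{\tau+1,i} = 0$, and in particular the bid prices `PI` are hypothetical (on the slides, $\pi$ comes from the CDLP duals).

```python
from itertools import combinations

# Hypothetical toy instance (not from the slides): 2 resources, 3 products.
FARES = (100.0, 120.0, 180.0)
A = ((1, 0, 1), (0, 1, 1))         # a_ij: row i = resource, column j = product
WEIGHTS = (1.0, 1.0, 0.8)          # assumed MNL preference weights
W0 = 1.0                           # assumed no-purchase weight
LAM, TAU, CAP = 0.9, 6, (2, 2)
PI = (40.0, 50.0)                  # hypothetical bid prices standing in for CDLP duals

def choice_probs(S):
    denom = W0 + sum(WEIGHTS[j] for j in S)
    return {j: WEIGHTS[j] / denom for j in S}, W0 / denom

def single_leg_values(i):
    """Return v[t][x_i] for t = 1..TAU+1 (row 0 is unused) for resource i."""
    m, n = len(CAP), len(FARES)
    # Fare proration: subtract the bid-price value of capacity used on other legs.
    prorated = [FARES[j] - sum(A[k][j] * PI[k] for k in range(m) if k != i)
                for j in range(n)]
    v = [[0.0] * (CAP[i] + 1) for _ in range(TAU + 2)]
    for t in range(TAU, 0, -1):
        for x in range(CAP[i] + 1):
            # N(x_i, c_{-i}): fits on leg i now and within full capacity elsewhere.
            allowed = [j for j in range(n) if A[i][j] <= x
                       and all(A[k][j] <= CAP[k] for k in range(m) if k != i)]
            best = float("-inf")
            for r in range(len(allowed) + 1):
                for S in combinations(allowed, r):
                    p, p0 = choice_probs(S)
                    value = (LAM * p0 + 1 - LAM) * v[t + 1][x]
                    value += sum(LAM * p[j] * (prorated[j] + v[t + 1][x - A[i][j]])
                                 for j in S)
                    best = max(best, value)
            v[t][x] = best
    return v

leg_values = [single_leg_values(i) for i in range(len(CAP))]

def approx_value(t, x):
    """Separable approximation sum_i v_{t,i}(x_i) used to compute offer sets."""
    return sum(leg_values[i][t][x[i]] for i in range(len(x)))
```

Each table has only `TAU * (CAP[i] + 1)` meaningful entries, so the work grows linearly in the number of resources, at the price of seeing the rest of the network only through the prorated fares.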
Classical Dynamic Programming Decomposition
- A DP with an $m$-dimensional state space is reduced to $m$ one-dimensional DPs, one for each resource.
- Example (assume 100 seats per flight): $101^4$ states for the network DP versus $4 \times 101$ states for the decomposed DPs.
- Variants of the approach are widely used in practice. Review: Talluri and van Ryzin (2004a).
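The state counts in the example follow from simple arithmetic, shown here for four flights with 100 seats each (remaining capacity on a flight can be 0 through 100). The variable names are illustrative.

```python
seats_per_flight = 100
flights = 4

states_per_leg = seats_per_flight + 1           # remaining capacity 0..100
network_states = states_per_leg ** flights      # joint state space: 101^4
decomposed_states = flights * states_per_leg    # four separate legs: 4 * 101

print(network_states, decomposed_states)        # prints: 104060401 404
```

These are per-period counts; multiplying by the number of periods gives the size of the full value table in each case.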
DP Decomposition Bounds
Proposition (Zhang and Adelman, 2009). The following relationships hold:
(i) $v_t(x) \le \min_{l = 1, \dots, m} \big\{ v_{t,l}(x_l) + \sum_{k \ne l} \pi_k x_k \big\} \le v_{t,i}(x_i) + \sum_{k \ne i} \pi_k x_k, \quad \forall i, t, x$;
(ii) $v_1(c) \le v_{1,i}(c_i) + \sum_{k \ne i} \pi_k c_k \le z_{CDLP}, \quad \forall i$.
- The decomposition value for each leg provides an upper bound on revenue.
- Decomposition bounds are tighter than the bound from CDLP.
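One way to see why each leg gives an upper bound in (i) is a backward induction on $t$. The sketch below is a reconstruction, not the proof from Zhang and Adelman (2009); it assumes $\pi \ge 0$, $0 \le x \le c$, and the terminal condition $v_{\tau+1,i}(\cdot) = 0$.

```latex
% Induction hypothesis at t+1:  v_{t+1}(x) \le v_{t+1,i}(x_i) + \sum_{k \ne i} \pi_k x_k.
% Base case t = \tau + 1:       0 \le \sum_{k \ne i} \pi_k x_k, since \pi \ge 0 and x \ge 0.
\begin{align*}
v_t(x)
 &= \max_{S \subseteq N(x)} \Big\{ \sum_{j \in S} \lambda P_j(S)\big(f_j + v_{t+1}(x - A^j)\big)
    + \big(\lambda P_0(S) + 1 - \lambda\big) v_{t+1}(x) \Big\} \\
 &\le \max_{S \subseteq N(x)} \Big\{ \sum_{j \in S} \lambda P_j(S)\Big(f_j - \sum_{k \ne i} a_{kj}\pi_k
    + v_{t+1,i}(x_i - a_{ij})\Big)
    + \big(\lambda P_0(S) + 1 - \lambda\big) v_{t+1,i}(x_i) \Big\}
    + \sum_{k \ne i} \pi_k x_k \\
 &\le v_{t,i}(x_i) + \sum_{k \ne i} \pi_k x_k .
\end{align*}
% Second line: apply the hypothesis to both v_{t+1} terms and use
%   \sum_{j \in S} \lambda P_j(S) + \lambda P_0(S) + 1 - \lambda = 1
%   to pull \sum_{k \ne i} \pi_k x_k outside the maximum.
% Third line: N(x) \subseteq N(x_i, c_{-i}) because x \le c, so the single-leg
%   recursion maximizes the same expression over a larger collection of offer sets.
```

Taking the minimum over legs then gives the middle term of (i). The comparison with $z_{CDLP}$ in (ii) additionally relies on $\pi$ being optimal CDLP dual values and is not covered by this sketch.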
Linear Programming Formulation of DP (Adelman, 2007)
$$\min_{\{v_t(\cdot)\}_t} \ v_1(c)$$
subject to
$$v_t(x) \ge \sum_{j \in S} \lambda P_j(S) \big( f_j + v_{t+1}(x - A^j) \big) + \big( \lambda P_0(S) + 1 - \lambda \big) v_{t+1}(x), \quad \forall t, x, S \subseteq N(x).$$
- Huge number of decision variables and constraints
- Functional approximation idea: use a parameterized representation of the value function to reduce the number of decision variables
The Affine Functional Approximation (Zhang and Adelman, 2009)
The affine approximation is given by
$$v_t(x) \approx \theta_t + \sum_i V_{t,i} x_i, \quad \forall t, x. \qquad (1)$$
Using (1) in the linear programming formulation leads to
$$\min_{\theta, V} \ \theta_1 + \sum_i V_{1,i} c_i$$
subject to
$$\theta_t + \sum_i V_{t,i} x_i \ge \sum_{j \in S} \lambda P_j(S) \Big( f_j + \theta_{t+1} + \sum_i V_{t+1,i} (x_i - a_{ij}) \Big) + \big( \lambda P_0(S) + 1 - \lambda \big) \Big( \theta_{t+1} + \sum_i V_{t+1,i} x_i \Big), \quad \forall t, x, S \subseteq N(x).$$
The Affine Functional Approximation
The dual program is given by
$$z_{P1} = \max_{Y} \sum_{t,\, x,\, S \subseteq N(x)} \Big( \sum_{j \in S} \lambda P_j(S) f_j \Big) Y_{t,x,S}$$
subject to
$$\sum_{x,\, S \subseteq N(x)} x_i Y_{t,x,S} = \begin{cases} c_i, & \text{if } t = 1, \\ \sum_{x,\, S \subseteq N(x)} \Big( x_i - \sum_{j \in S} \lambda P_j(S) a_{ij} \Big) Y_{t-1,x,S}, & t = 2, \dots, \tau, \end{cases} \quad \forall i, t,$$
$$\sum_{x,\, S \subseteq N(x)} Y_{t,x,S} = \begin{cases} 1, & \text{if } t = 1, \\ \sum_{x,\, S \subseteq N(x)} Y_{t-1,x,S}, & t = 2, \dots, \tau, \end{cases}$$
$$Y \ge 0.$$
Due to the large number of columns, solving the linear program above still requires considerable computational effort.
Functional Approximation Approaches for Network RM

| Citation | Choice model | Functional approximation | Solution strategy |
|---|---|---|---|
| Adelman (2007) | Independent demand | Affine | Column generation |
| Zhang and Adelman (2009) | MNLD | Affine | Column generation |
| Zhang (2011) | MNLD | Nonlinear non-separable | CDLP + Simultaneous DP |
| Liu and van Ryzin (2008) | MNLD | Separable (fare proration) | CDLP + DP Decomposition |
| Miranda Bront et al. (2009) | MNLO | Separable (fare proration) | CDLP + DP Decomposition |
| Farias and Van Roy (2008) | Independent demand | Separable concave | Constraint sampling |
| Meissner and Strauss (2012) | MNLD | Separable concave | Column generation |
| Kunnumkal and Topaloglu (2011) | MNLD | Separable (fare proration) | Convex programming + DP Decomposition |
| Tong and Topaloglu (2011) | Independent demand | Affine | Reduction + Constraint generation |
| Vossen and Zhang | Independent demand + MNLD | Affine | Reduction + Dynamic disaggregation |

MNLD: Multinomial logit model with disjoint consideration sets
MNLO: Multinomial logit model with overlapping consideration sets
Research Questions
- Computational cost: ADP (affine or separable concave approximation) $\gg$ classical DP decomposition
- How can we balance solution quality with solution time?
- Can we improve the classical DP decomposition?
A Strong Functional Approximation (Zhang, 2011)
$$v_t(x) \approx \min_i \Big\{ \hat v_{t,i}(x_i) + \sum_{k \ne i} \pi_k x_k \Big\}, \quad \forall t, x.$$
- Nonlinear and non-separable functional approximation
- Each value $v_t(x)$ is approximated by a single value across legs
- Motivated by the decomposition bounds (Zhang and Adelman, 2009)
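A Nonlinear Optimization Problem
Using the new functional approximation leads to
$$z_{NLP} = \min_{\{\hat v_{t,i}(\cdot)\}_{t,i}} \ \min_i \Big\{ \hat v_{1,i}(c_i) + \sum_{k \ne i} \pi_k c_k \Big\}$$
subject to
$$\min_i \Big\{ \hat v_{t,i}(x_i) + \sum_{k \ne i} \pi_k x_k \Big\} \ge \sum_{j \in S} \lambda P_j(S) \Big[ f_j + \min_l \Big\{ \hat v_{t+1,l}(x_l - a_{lj}) + \sum_{k \ne l} \pi_k (x_k - a_{kj}) \Big\} \Big] + \big( \lambda P_0(S) + 1 - \lambda \big) \min_l \Big\{ \hat v_{t+1,l}(x_l) + \sum_{k \ne l} \pi_k x_k \Big\}, \quad \forall t, x, S \subseteq N(x).$$
The problem is a nonlinear optimization problem with a huge number of nonlinear constraints.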
A Restricted Optimization Problem
- Step 1: Write each constraint as $m$ equivalent constraints.
- Step 2: Restrict the constraints so that each constraint involves only one resource.
The restricted problem provides a relaxed bound:
Proposition. The objective value of the restricted program, $\bar z_{NLP}$, is no smaller than $z_{NLP}$.
An Equivalent Simultaneous Dynamic Program
$$\hat v_{t,i}(x_i) = \max_{S \subseteq N(x_i, c_{-i})} \Big\{ \sum_{j \in S} \lambda P_j(S) \Big( f_j + \min \Big\{ \hat v_{t+1,i}(x_i - a_{ij}) - \sum_{k \ne i} \pi_k a_{kj}, \ \min_{l \ne i} \Big\{ \max_{0 \le y_l \le c_l - a_{lj}} \big[ \hat v_{t+1,l}(y_l) - y_l \pi_l \big] - \sum_k a_{kj} \pi_k + \pi_i x_i \Big\} \Big\} \Big)$$
$$\qquad + \big( \lambda P_0(S) + 1 - \lambda \big) \min \Big\{ \hat v_{t+1,i}(x_i), \ \min_{l \ne i} \Big\{ \max_{0 \le y_l \le c_l} \big[ \hat v_{t+1,l}(y_l) - \pi_l y_l \big] + \pi_i x_i \Big\} \Big\} \Big\}, \quad \forall i, t, x_i.$$
- The DP recursion for resource $i$ involves values from all other resources.
- The dynamic program
  - is equivalent to the restricted nonlinear program;
  - can be solved efficiently via a simultaneous dynamic programming algorithm;
  - leads to tighter revenue bounds.
New Bounds
Proposition (Zhang, 2011). Let $\{\hat v_{t,i}(x_i)\}_{t,i,x_i}$ be the optimal solution from the simultaneous dynamic program. The following results hold:
(i) $\hat v_{t,i}(x_i) \le v_{t,i}(x_i), \quad \forall t, i, x_i$;
(ii) $v_1(c) \le z_{NLP} \le \bar z_{NLP} = \min_i \big\{ \hat v_{1,i}(c_i) + \sum_{k \ne i} \pi_k c_k \big\} \le \min_i \big\{ v_{1,i}(c_i) + \sum_{k \ne i} \pi_k c_k \big\} \le z_{CDLP}$.
The simultaneous dynamic program provides tighter bounds on revenue than the classical decomposition.
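The simultaneous recursion and the ordering in (i) and (ii) can be explored numerically on a toy instance. The sketch below is a reading of the recursion on the slides, not the author's code: the instance, the MNL choice model, the terminal condition of zero, and the bid prices `PI` (standing in for CDLP duals) are all hypothetical. Setting `coupled=False` drops the minimum over other legs, which recovers the classical decomposition for comparison.

```python
from itertools import combinations

# Hypothetical toy instance (not from the slides): 2 resources, 3 products.
FARES = (100.0, 120.0, 180.0)
A = ((1, 0, 1), (0, 1, 1))         # a_ij: row i = resource, column j = product
WEIGHTS = (1.0, 1.0, 0.8)          # assumed MNL preference weights
W0 = 1.0                           # assumed no-purchase weight
LAM, TAU, CAP = 0.9, 6, (2, 2)
PI = (40.0, 50.0)                  # hypothetical nonnegative bid prices (not CDLP duals)
M, N = len(CAP), len(FARES)

def choice_probs(S):
    denom = W0 + sum(WEIGHTS[j] for j in S)
    return {j: WEIGHTS[j] / denom for j in S}, W0 / denom

def leg_values(coupled):
    """v[t][i][x_i]; coupled=True is the simultaneous DP, False the classical one."""
    v = [[[0.0] * (CAP[i] + 1) for i in range(M)] for _ in range(TAU + 2)]
    for t in range(TAU, 0, -1):
        nxt = v[t + 1]
        # best[l][b] = max over 0 <= y <= b of nxt[l][y] - PI[l] * y
        best = [[max(nxt[l][y] - PI[l] * y for y in range(b + 1))
                 for b in range(CAP[l] + 1)] for l in range(M)]
        for i in range(M):
            others = [l for l in range(M) if l != i]
            fits_others = [j for j in range(N)
                           if all(A[k][j] <= CAP[k] for k in others)]
            for x in range(CAP[i] + 1):
                allowed = [j for j in fits_others if A[i][j] <= x]  # N(x_i, c_{-i})
                stay = nxt[i][x]                  # value if no sale occurs
                if coupled:
                    for l in others:
                        stay = min(stay, best[l][CAP[l]] + PI[i] * x)
                gain = {}
                for j in allowed:
                    cost_all = sum(A[k][j] * PI[k] for k in range(M))
                    g = nxt[i][x - A[i][j]] - (cost_all - A[i][j] * PI[i])
                    if coupled:
                        for l in others:
                            g = min(g, best[l][CAP[l] - A[l][j]] - cost_all + PI[i] * x)
                    gain[j] = FARES[j] + g
                top = float("-inf")
                for r in range(len(allowed) + 1):
                    for S in combinations(allowed, r):
                        p, p0 = choice_probs(S)
                        val = (LAM * p0 + 1 - LAM) * stay
                        val += sum(LAM * p[j] * gain[j] for j in S)
                        top = max(top, val)
                v[t][i][x] = top
    return v

def revenue_bound(v):
    """min_i { v_{1,i}(c_i) + sum_{k != i} pi_k c_k }."""
    return min(v[1][i][CAP[i]] + sum(PI[k] * CAP[k] for k in range(M) if k != i)
               for i in range(M))

classical = leg_values(coupled=False)
improved = leg_values(coupled=True)
```

Comparing `revenue_bound(improved)` with `revenue_bound(classical)` illustrates the middle inequality in (ii) on this instance. Because all legs at period `t + 1` must be available before any leg at period `t` can be updated, the legs are solved together period by period, which is what makes the dynamic program simultaneous.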
Recap
- High dimensional dynamic program
- Large scale linear program
- Large scale nonlinear program with nonlinear constraints
- Restricted nonlinear program with nonlinear constraints
- Simultaneous dynamic program
Comparison: Classical vs. Improved Approaches
Classical dynamic programming decomposition:
- Solve CDLP → static bid-prices → fare proration → prorated fares → solve $m$ single-leg DPs
- Network effects only captured through fare proration
Improved dynamic programming decomposition:
- Solve CDLP → static bid-prices → solve one simultaneous DP
- Network effects captured during DP recursion!
Computational Study: Problem Instances
- Randomly generated hub-and-spoke instances
- Number of non-hub locations (flights) in the set {4, 8, 16, 24}
- Number of periods in the set {100, 200, 400, 800}
- Two products for each possible itinerary
- Multinomial logit choice model with disjoint consideration sets (MNLD)
- Largest problem instance: 24 non-hub locations (flights), 336 products, 800 periods
Numerical Study: Policies
- DCOMP1: the new decomposition approach, where the approximation
  $$v_t(x) \approx \sum_{i=1}^{m} \hat v_{t,i}(x_i), \quad \forall t, x$$
  is used to compute control policies
- DCOMP: the classical dynamic programming decomposition
- CDLP: static bid-price policy based on the dual values of the resource constraints in CDLP
- CDLP10: a version of CDLP that re-solves 10 times at equally spaced intervals
- Each policy is simulated 20,000 times
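The slides do not spell out how a separable approximation is turned into an offer set. One plausible reading, sketched below as an assumption, plugs the approximation into the right-hand side of the DP recursion, so that each product's fare is reduced by the sum of per-leg opportunity costs before the best offer set is chosen. The instance data, the MNL choice model, the `next_values` table, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data (not from the slides).
FARES = (100.0, 120.0, 180.0)
A = ((1, 0, 1), (0, 1, 1))         # a_ij: row i = resource, column j = product
WEIGHTS = (1.0, 1.0, 0.8)          # assumed MNL preference weights
W0 = 1.0                           # assumed no-purchase weight
LAM = 0.9

def choice_probs(S):
    denom = W0 + sum(WEIGHTS[j] for j in S)
    return {j: WEIGHTS[j] / denom for j in S}

def best_offer_set(x, next_values):
    """next_values[i][x_i] approximates leg i's value at period t+1."""
    m = len(x)
    feasible = [j for j in range(len(FARES))
                if all(A[i][j] <= x[i] for i in range(m))]

    def margin(j):
        # Fare minus the approximate opportunity cost of the capacity consumed.
        cost = sum(next_values[i][x[i]] - next_values[i][x[i] - A[i][j]]
                   for i in range(m))
        return FARES[j] - cost

    best_set, best_val = (), 0.0   # offering nothing yields zero expected margin
    for r in range(1, len(feasible) + 1):
        for S in combinations(feasible, r):
            p = choice_probs(S)
            val = sum(LAM * p[j] * margin(j) for j in S)
            if val > best_val:
                best_set, best_val = S, val
    return best_set

# Hypothetical per-leg value tables for period t+1 (illustrative numbers only).
next_values = [[0.0, 90.0, 150.0], [0.0, 110.0, 170.0]]
offer_with_slack = best_offer_set((2, 2), next_values)
offer_when_tight = best_offer_set((1, 1), next_values)
```

With these illustrative tables, the two-leg product (index 2) is offered when two seats remain on each leg but is closed when only one seat remains, because its fare no longer covers the combined opportunity cost.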
Computational Time

| Case # | Parameters | Capacity per leg | Load factor | CPU seconds: CDLP | CPU seconds: DCOMP | CPU seconds: DCOMP1 | (DCOMP1 − DCOMP)/DCOMP |
|---|---|---|---|---|---|---|---|
| A1 | (100,4,4,16) | 10 | 1.17 | 0.16 | 2.03 | 2.75 | 35.38% |
| A2 | (200,4,4,16) | 20 | 1.27 | 0.23 | 7.89 | 10.88 | 37.82% |
| A3 | (400,4,4,16) | 40 | 1.19 | 0.16 | 31.92 | 43.73 | 37.00% |
| A4 | (800,4,4,16) | 80 | 1.28 | 0.20 | 127.66 | 174.48 | 36.68% |
| A5 | (100,8,8,48) | 5 | 1.43 | 1.52 | 5.75 | 7.47 | 29.89% |
| A6 | (200,8,8,48) | 10 | 1.36 | 0.72 | 22.83 | 29.58 | 29.57% |
| A7 | (400,8,8,48) | 20 | 1.35 | 1.61 | 91.67 | 118.92 | 29.73% |
| A8 | (800,8,8,48) | 40 | 1.21 | 0.72 | 362.84 | 471.73 | 30.01% |
| A9 | (100,16,16,160) | 2 | 1.65 | 4.64 | 15.09 | 19.42 | 28.67% |
| A10 | (200,16,16,160) | 5 | 1.45 | 4.69 | 75.84 | 96.97 | 27.85% |
| A11 | (400,16,16,160) | 10 | 1.29 | 2.92 | 303.66 | 388.97 | 28.10% |
| A12 | (800,16,16,160) | 20 | 1.40 | 3.64 | 1218.67 | 1560.19 | 28.02% |
| A13 | (100,24,24,336) | 1 | 1.45 | 3.69 | 24.72 | 31.81 | 28.70% |
| A14 | (200,24,24,336) | 2 | 1.35 | 4.59 | 98.52 | 127.36 | 29.28% |
| A15 | (400,24,24,336) | 5 | 1.29 | 4.39 | 492.73 | 630.84 | 28.03% |
| A16 | (800,24,24,336) | 10 | 1.38 | 4.23 | 1978.14 | 2532.20 | 28.01% |
Bound Performance

| Case # | CDLP bound | DCOMP bound | DCOMP1 bound | Bound improvement: %-CDLP | Bound improvement: %-DCOMP | %-difference across legs: DCOMP | %-difference across legs: DCOMP1 |
|---|---|---|---|---|---|---|---|
| A1 | 24078.90 | 23985.56 | 22900.49 | 5.15% | 4.74% | 4.46% | 0.00% |
| A2 | 48367.58 | 48328.43 | 47588.56 | 1.64% | 1.55% | 1.87% | 0.36% |
| A3 | 89312.44 | 87576.49 | 86729.90 | 2.98% | 0.98% | 2.36% | 0.00% |
| A4 | 213102.50 | 211854.85 | 211087.37 | 0.95% | 0.36% | 0.58% | 0.00% |
| A5 | 32521.30 | 31029.90 | 30726.17 | 5.84% | 0.99% | 3.18% | 0.05% |
| A6 | 70541.63 | 68760.67 | 68617.41 | 2.80% | 0.21% | 2.18% | 0.22% |
| A7 | 107831.01 | 106339.36 | 106153.32 | 1.58% | 0.18% | 1.09% | 0.00% |
| A8 | 216080.83 | 212915.61 | 212848.05 | 1.52% | 0.03% | 1.49% | 0.00% |
| A9 | 26347.76 | 24953.24 | 24764.75 | 6.39% | 0.76% | 4.69% | 0.00% |
| A10 | 60629.35 | 58489.12 | 58118.33 | 4.32% | 0.64% | 2.95% | 0.03% |
| A11 | 101616.47 | 100069.27 | 99771.63 | 1.85% | 0.30% | 1.52% | 0.01% |
| A12 | 224780.69 | 222558.53 | 222231.72 | 1.15% | 0.15% | 0.94% | 0.00% |
| A13 | 13074.04 | 11845.73 | 10386.38 | 25.88% | 14.05% | 10.37% | 0.00% |
| A14 | 26296.19 | 24926.41 | 24373.33 | 7.89% | 2.27% | 5.50% | 0.00% |
| A15 | 74112.13 | 72089.14 | 71617.55 | 3.48% | 0.66% | 2.80% | 0.03% |
| A16 | 131457.79 | 129589.28 | 129273.91 | 1.69% | 0.24% | 1.44% | 0.00% |
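The slides do not state how the bound improvement columns are computed. The tabulated values for cases A1 and A13 are consistent with measuring the gap relative to the DCOMP1 bound, which is the assumption behind this short check; the function name is illustrative.

```python
def pct_improvement(reference_bound, dcomp1_bound):
    """Percentage by which the reference bound exceeds the DCOMP1 bound (assumed formula)."""
    return 100.0 * (reference_bound - dcomp1_bound) / dcomp1_bound

# Case A1 from the table above.
cdlp, dcomp, dcomp1 = 24078.90, 23985.56, 22900.49
print(f"{pct_improvement(cdlp, dcomp1):.2f}% {pct_improvement(dcomp, dcomp1):.2f}%")
# prints: 5.15% 4.74%
```

Since all three quantities are upper bounds on revenue, a larger percentage means DCOMP1 tightens the bound more.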
Bounds from Individual Legs
[Figure: bounds from individual legs for DCOMP and DCOMP1; horizontal axis: leg (1 to 24); vertical axis: bound, from 128000 to 132000]
DCOMP1 bounds are more homogeneous across legs.
A Hub-and-spoke Network with 2 Non-Hub Locations

| Case # | τ | Load factor | Capacity per leg | DCOMP1 REV | DCOMP1 OPT-GAP | Revenue gains: %-CDLP | Revenue gains: %-CDLP10 | Revenue gains: %-DCOMP |
|---|---|---|---|---|---|---|---|---|
| B1 | 100 | 2.40 | 4 | 5775.47 | -3.17% | 182.90% | 8.56% | 2.74% |
| B2 | 200 | 2.13 | 9 | 13262.92 | -2.04% | 209.79% | 5.74% | 4.11% |
| B3 | 400 | 2.13 | 18 | 25456.41 | -6.53% | -1.00% | -0.33% | 0.03% |
| B4 | 800 | 2.13 | 36 | 53946.59 | -1.20% | 212.87% | 4.41% | 7.17% |
| B5 | 100 | 1.60 | 6 | 8034.27 | -7.25% | 48.88% | 3.00% | 0.09% |
| B6 | 200 | 1.60 | 12 | 17318.40 | -2.31% | 13.42% | 3.62% | 5.77% |
| B7 | 400 | 1.60 | 24 | 35472.06 | -1.25% | 311.96% | 4.78% | 7.97% |
| B8 | 800 | 1.60 | 48 | 65618.95 | -9.20% | 42.16% | -4.49% | 0.00% |
| B9 | 100 | 1.37 | 7 | 9269.56 | -5.85% | 13.27% | 1.33% | 1.94% |
| B10 | 200 | 1.28 | 15 | 20521.01 | -3.70% | 15.05% | 0.98% | 4.65% |
| B11 | 400 | 1.28 | 30 | 42471.91 | -2.10% | 15.46% | 1.70% | 8.14% |
| B12 | 800 | 1.28 | 60 | 86841.58 | -1.15% | 15.47% | 2.45% | 5.52% |
| B13 | 100 | 1.07 | 9 | 11107.51 | -1.68% | 2.89% | 0.47% | 0.34% |
| B14 | 200 | 1.07 | 18 | 23268.75 | -0.80% | 4.84% | 1.14% | 0.11% |
| B15 | 400 | 1.07 | 36 | 47824.97 | -0.27% | 22.11% | 2.22% | 0.04% |
| B16 | 800 | 1.07 | 72 | 96993.79 | -0.08% | 7.92% | 2.14% | 0.01% |
| B17 | 100 | 0.96 | 10 | 11854.02 | -0.85% | 23.82% | 1.46% | 0.18% |
| B18 | 200 | 0.91 | 21 | 25259.70 | -0.07% | 2.61% | 1.24% | 0.04% |
| B19 | 400 | 0.91 | 42 | 51593.10 | 0.03% | 30.16% | 2.48% | 0.01% |
| B20 | 800 | 0.91 | 84 | 104376.37 | 0.01% | 31.46% | 2.60% | 0.00% |
DCOMP1 Percentage Revenue Gain vs. Load Factor
[Figure: DCOMP1 percentage revenue gain (%-CDLP10 and %-DCOMP) versus load factor; horizontal axis: load factor, 0.5 to 2.5; vertical axis: percentage revenue gain, -5 to 10]
Higher load factor → higher revenue gains
DCOMP1 Percentage Revenue Gain vs. Number of Periods
[Figure: DCOMP1 percentage revenue gain (%-CDLP10 and %-DCOMP) versus number of periods; horizontal axis: number of periods, 0 to 900; vertical axis: percentage revenue gain, -5 to 10]
Significant revenue gains for problems with long selling horizons!
A Hub-and-spoke Network with 4 Non-Hub Locations

| Case # | τ | Load factor | Capacity per leg | DCOMP1 REV | DCOMP1 OPT-GAP | Revenue gains: %-CDLP | Revenue gains: %-CDLP10 | Revenue gains: %-DCOMP |
|---|---|---|---|---|---|---|---|---|
| C1 | 100 | 1.99 | 6 | 16795.66 | -5.08% | 14.60% | 0.46% | 0.10% |
| C2 | 200 | 1.99 | 12 | 35028.96 | -2.63% | 52.79% | 1.95% | 1.32% |
| C3 | 400 | 1.99 | 24 | 70163.84 | -3.35% | 12.35% | -0.11% | 0.03% |
| C4 | 800 | 1.99 | 48 | 143921.85 | -1.34% | 52.31% | 1.36% | 0.66% |
| C5 | 100 | 1.49 | 8 | 21860.34 | -4.23% | 44.57% | 1.71% | 1.96% |
| C6 | 200 | 1.49 | 16 | 45171.83 | -2.54% | 14.18% | 1.72% | 2.67% |
| C7 | 400 | 1.49 | 32 | 88532.95 | -5.29% | 29.01% | -2.27% | -0.81% |
| C8 | 800 | 1.49 | 64 | 184410.85 | -1.79% | 1.02% | 0.49% | 0.18% |
| C9 | 100 | 1.19 | 10 | 26270.03 | -5.00% | 4.98% | 1.46% | 2.67% |
| C10 | 200 | 1.19 | 20 | 54509.68 | -3.02% | 4.18% | 1.61% | 4.11% |
| C11 | 400 | 1.19 | 40 | 111520.14 | -1.79% | 3.44% | 1.51% | 4.43% |
| C12 | 800 | 1.19 | 80 | 226059.91 | -1.05% | 57.07% | 1.92% | 3.08% |
| C13 | 100 | 1.00 | 12 | 29208.18 | -5.00% | 1.30% | -0.90% | 0.21% |
| C14 | 200 | 1.00 | 24 | 61175.75 | -3.02% | 2.18% | 0.21% | 0.08% |
| C15 | 400 | 1.00 | 48 | 125854.79 | -1.79% | 6.38% | 0.74% | 0.00% |
| C16 | 800 | 1.00 | 96 | 256236.11 | -1.00% | 2.66% | 0.89% | -0.06% |
| C17 | 100 | 0.85 | 14 | 32057.27 | -2.55% | 2.50% | -0.81% | 0.44% |
| C18 | 200 | 0.85 | 28 | 66527.87 | -1.28% | 3.05% | -0.03% | 0.20% |
| C19 | 400 | 0.85 | 56 | 135897.71 | -0.51% | 3.24% | 0.26% | 0.05% |
| C20 | 800 | 0.85 | 112 | 274817.89 | -0.17% | 3.21% | 0.41% | 0.02% |
DCOMP1 Percentage Revenue Gain vs. Load Factor
[Figure: DCOMP1 percentage revenue gain (%-CDLP10 and %-DCOMP) versus load factor; horizontal axis: load factor, 0.5 to 2.5; vertical axis: percentage revenue gain, -3 to 5]
DCOMP1 Percentage Revenue Gain vs. Number of Periods
[Figure: DCOMP1 percentage revenue gain (%-CDLP10 and %-DCOMP) versus number of periods; horizontal axis: number of periods, 0 to 900; vertical axis: percentage revenue gain, -3 to 5]
Summary and Future Directions
- The functional approximation approach is promising for solving large-scale stochastic dynamic programs. However, implementations of the approach often incur very high computational cost.
- The first nonlinear non-separable functional approximation for the network RM problem:
  - Novel approximation architecture
  - Better revenue bounds
  - Improved heuristic policies
  - Moderate computational cost
- Current work:
  - Exploiting special structures of the LP formulations of dynamic programs in value function approximation (Vossen and Zhang, 2012)
  - Applications with real data (Zhang and Weatherford, 2012)
Thank you! Questions? Comments?