Markov Decision Processes for Ad Network Optimization


Flávio Sales Truzzi¹, Valdinei Freire da Silva², Anna Helena Reali Costa¹, Fabio Gagliardi Cozman³

¹Laboratório de Técnicas Inteligentes (LTI), Universidade de São Paulo (USP), Av. Prof. Luciano Gualberto tv. 3, 158, Caixa Postal, São Paulo, SP, Brazil
²Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), Av. Arlindo Béttio, 1000, Ermelino Matarazzo, Caixa Postal, São Paulo, SP, Brazil
³Decision Making Lab, Universidade de São Paulo (USP), Av. Prof. Luciano Gualberto tv. 3, 158, Caixa Postal, São Paulo, SP, Brazil

{flavio.truzzi,valdinei.freire,fgcozman,anna.reali}@usp.br

Abstract. In this paper we examine a central problem in a particular advertising scheme: matching marketing campaigns, which produce advertisements ("ads"), to impressions, where an impression is a general term for any space on the internet that can display an ad. We propose a new take on the problem by resorting to planning techniques based on Markov Decision Processes, and to plan generation techniques that have been developed in the AI literature. We present a detailed formulation of the Markov Decision Process approach and results of simulated experiments.

1. Introduction

Internet marketing has skyrocketed in recent years. Most companies adopt rudimentary forms of marketing, while others resort to sophisticated ideas based on collaborative, or even social, recommendations [Kiang et al. 2000, Nagurney 2010]. Sources report that Internet advertising revenues in the United States of America totaled $31.7 billion for the full year of 2011, with a compound annual growth rate (CAGR) of 20.3% since 2002 [Pwc 2012].

In this paper we examine a central problem in a particular advertising scheme. We are concerned with matching marketing campaigns to impressions. A campaign offers products through advertisements, termed ads. An impression is any space on the internet that can be occupied by an ad; for example, the space for a banner in a blog page is an impression. Campaigns have ads to be displayed, and publishers (blogs and websites alike) issue impressions that request ads to be displayed. Campaigns pay publishers to display ads; for instance, if a blog carries an ad for a car maker, the car maker will pay the blogger either a fixed amount per impression, or a fixed amount per actual sale based on the impression; these are contractual arrangements between the campaign and the publisher.

Clearly there is interest in placing ads correctly, so that a campaign is presented to the right audience, thus increasing the effect of the campaign and the revenue of the publisher. A simple example: we should expect a blog specialized in motherhood to carry campaigns geared toward women.

To make matters concrete, we consider a fixed set of campaigns, a stream of impressions that request ads to be selected, and a broker, normally referred to as an ad network, that matches campaigns and impressions. The revenue of the broker comes from the campaigns, and again it can vary depending on contractual obligations; we present details about this later.

The problem of matching campaigns and impressions has received attention, as our references, discussed later, indicate. The problem is normally modeled as a constrained optimization problem that maximizes the revenue subject to constraints such as budget limits and inventory availability. The state of the art is to solve the optimization problem off-line as a linear programming problem and to serve ads using this optimal solution as guidance for the online ad allocation [Chen et al. 2011]. However, such a solution ignores the obvious fact that impressions are generated dynamically, and the whole problem is one of dynamic, sequential decision making; that is, a planning problem. In this paper we propose a new take on the problem by resorting to planning techniques based on Markov Decision Processes, and to plan generation techniques that have been developed in the AI literature.

Section 2 defines our problem formally, setting the stage for our solution. Our strategy was to build a simulator that can reproduce the behavior of campaigns, publishers, and the ad network. Even though the simulator is not a contribution in itself, we describe it in some detail in Section 3 because its behavior encodes some of the assumptions we are making concerning ad networks and internet advertising in general. Sections 4 and 5 respectively present the state-of-the-art solution for ad-impression matching, based on linear programming, and our own proposal using Markov Decision Processes. Finally, Section 6 shows our proposal in action, while Section 7 concludes the paper.

2. Problem Definition: Basic Concepts and Assumptions

An Ad Network is a network of sites that promotes the distribution of ads from advertisers to publishers. There are rules and constraints concerning the distribution of advertisements, and there is potential revenue for advertisers and publishers as users act in response to advertisement. This paper considers the revenue model based on cost per click (CPC) [Devanur and Kakade 2009]. This model returns a fixed monetary value each time a user clicks on an ad. The effective cost per impression (ecpi) is the revenue produced by the revenue model expressed in terms of impressions.

Important concepts are [Muthukrishnan 2009]: Advertisers produce ads to be presented and allocate them into campaigns. Publishers are sites containing ads sent by the advertisers, and each ad spot in a publisher is a source. Campaigns are designed by the advertisers and are mainly characterized by a set of ads, a starting date, an ending date, and a revenue model.

It is also common that an advertiser contracts a minimum volume of impressions and that, if CPC is the revenue model, a campaign is usually constrained by a maximum budget to be spent¹. Ads contain the material to be distributed, and the manifestation of an ad is considered an Impression from a campaign (advertiser) to a site (publisher). Requests are issued by sources (sites, publishers) that need ads to be presented.

¹ It is also common that campaigns constrain which publishers may receive ads, for instance by defining main and secondary categories, but we consider this to be irrelevant in this work.

The main objective of an Ad Network is to maximize ecpi across a period of time; in other words, to find an optimal policy to distribute ads from campaigns to publishers, so as to fulfill requests and to satisfy constraints on minimum volume and budget. We assume that the ad network operates only at the level of campaigns, i.e., ads within a campaign are treated equally. Moreover, we assume that the advertiser of a campaign does not interfere in the allocation of ads to publishers. We also merge sites into groups, such that sites in the same group present similar responses to impressions from the same campaign. The response from a group i to an impression of a campaign j is modeled by the Click-Through Rate (CTR), the probability p_ij that some ad is clicked given the group and the campaign. Note that sites in the same group are assumed to have the same CTR regarding ads in the same campaign. We also consider that ads in each campaign are available before the beginning of the requests, and that the ad network receives a stream of requests which must be filled with such ads.

This problem allows trivial solutions in some cases. If there is no budget limit, then the optimal solution considers only the ecpi, i.e., it chooses the campaign with the largest ecpi for each impression. On the other hand, if campaigns have no ending date, then the optimal solution is to choose campaigns with positive ecpi. Common solutions in the literature are obtained in two steps. The off-line step assumes that an estimate of the stream of requests is known a priori and obtains an optimal solution for it. The on-line step adapts the off-line optimal solution to allocate ads to requests as they appear in the stream of requests.

3. Simulator

We have built a simulator that generates a stream of requests from publishers and a pool of ads from campaigns; from these it drives a policy generator, which can be for instance an MDP-based policy, and produces statistics concerning performance and revenue. The simulator was coded in the Python language and Matlab, and even though it is not by itself an innovative piece, it encodes a number of assumptions that are worth enumerating and justifying. First, we do not impose constraints on the minimum volume of impressions; these are not the most important constraints and we may introduce them in future work. Additionally, the following assumptions were adopted:

1. the ad network does not differentiate publishers, and therefore it is indifferent between considering publishers or groups (of publishers), provided that the click response to impressions is adequately captured by the group abstraction;
2. although campaigns originate from advertisers, the ad network can treat requests at the campaign level, with no concern for the description of advertisers;
3. future requests are not influenced by requests that occurred in the past;
4. time is discrete; and
5. campaigns start at the beginning of the simulation and finish at the end of the simulation.

A simulator can be defined by a tuple ⟨τ, G, C, P_CTR, P_G, P_request⟩, where:
1. τ ∈ ℕ specifies the length of the simulation;
2. G defines the group set and C defines the campaign set (which includes CPC and budget);
3. P_CTR : G × C → [0, 1] specifies the CTR for all group-campaign pairs, and it is considered that the CTR does not change over time; and
4. P_G : G → [0, 1] specifies the probability of an impression belonging to a group i, with Σ_{i∈G} P_G(i) = 1.

For each discrete time step t < τ the simulator may generate one impression following a Bernoulli process with success probability P_request, and the request is associated with one of the possible groups according to the probability P_G. The association of a request with a group is denoted by a vector I = [I_1, I_2, ..., I_|G|], where I_i = 1 if the request is associated with group i, and I_i = 0 otherwise. In this formulation we consider that only one request can occur per time step; this simplification can be made without loss of generality.

After receiving the request vector I, the ad network must allocate impressions by choosing a campaign for each request, responding with a |G| × |C| matrix A ≥ 0 such that, for all 1 ≤ i ≤ |G|,

    Σ_{j=1}^{|C|} A_ij ≤ I_i.

After receiving the allocation matrix A, the simulator generates the click event that indicates whether the impression allocated to a campaign has been clicked at that simulation time, according to a Bernoulli distribution with success probability equal to the P_CTR of the campaign-group pair. The simulator dynamics for each discrete time step t is:
1. generate the impression vector I;
2. receive an allocation matrix A; and
3. generate the click event for the campaign-group pair.
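To make these dynamics concrete, the following is a minimal Python sketch of one simulation step under the assumptions above. The parameter values, the variable names, and the greedy policy shown here are illustrative assumptions only; they are not the simulator code nor the settings used in the experiments of Section 6.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters of the tuple <tau, G, C, P_CTR, P_G, P_request>.
tau = 100                               # length of the simulation
P_request = 0.4                         # probability of a request at each step
P_G = np.array([0.50, 0.25, 0.25])      # probability of each group issuing the request
P_CTR = np.array([[0.02, 0.05, 0.01],   # P_CTR[i, j]: CTR of group i for campaign j
                  [0.03, 0.01, 0.04],
                  [0.01, 0.02, 0.06]])
q = np.array([1.0, 2.0, 1.5])           # CPC of each campaign
budgets = np.array([10, 10, 10])        # clicks each campaign can still pay for

def simulate_step(policy):
    """One discrete time step: request -> allocation -> click event."""
    I = np.zeros(len(P_G), dtype=int)
    if rng.random() < P_request:            # Bernoulli request event
        i = rng.choice(len(P_G), p=P_G)     # group associated with the request
        I[i] = 1
        j = policy(budgets, I)              # campaign chosen by the ad network (None = no allocation)
        if j is not None and budgets[j] > 0:
            if rng.random() < P_CTR[i, j]:  # Bernoulli click event
                budgets[j] -= 1             # budget, expressed in clicks, is consumed by the click
                return q[j]                 # CPC revenue generated by the click
    return 0.0

# Example: a greedy policy that picks the campaign with the largest ecpi and remaining budget.
def greedy(budgets, I):
    i = int(np.argmax(I))
    ecpi = P_CTR[i] * q
    ecpi[budgets == 0] = -1.0
    j = int(np.argmax(ecpi))
    return j if budgets[j] > 0 else None

total_revenue = sum(simulate_step(greedy) for _ in range(tau))
```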

4. Policy Generation Through Linear Programming

The state-of-the-art solution for policy generation in ad networks seems to be the proposal by Chen et al. [Chen et al. 2011], which we summarize here to fix notation and to motivate our own approach, presented in the next section. We have that:
1. i indexes the |G| groups, and j indexes the |C| campaigns;
2. P_CTR(i, j) denotes the CTR of a group i impression assigned to campaign j, q_j denotes the CPC for campaign j (how much a campaign is paying for a click), and

       v_ij = P_CTR(i, j) q_j    (1)

   is the expected cost per impression (ecpi) of such an assignment;
3. g_j is the budget for campaign j;
4. h_i denotes the request availability constraint for group i; and
5. x_ij denotes the number of impressions from group i allocated to campaign j.

Ad allocation can be reduced to a linear program if we assume that all impressions are fixed at the beginning; in this case we have to solve:

    max_x  Σ_{i,j} v_ij x_ij
    s.t.   Σ_i v_ij x_ij ≤ g_j   for all j,
           Σ_j x_ij ≤ h_i        for all i,
           x_ij ≥ 0              for all i, j.    (2)

The solution of Expression (2) is optimal, but it requires knowledge of h_i (the number of requests available to group i). However, h_i is a random variable: in each run (day, week, month) the number of requests for impressions is different. Under the assumption that the probability distribution of h_i is independent of the allocation made in that run, it is possible to estimate that distribution, and then to find an off-line solution using the mean value h̄_i of that distribution. Chickering and Heckerman (2000) suggest using the estimate h̄_i to turn the problem in Expression (2) into a well-posed problem and to obtain a solution x*_ij. Getting back to our simulator, h̄_i can be easily calculated by

    h̄_i = τ P_request P_G(i).

Then, at any time t, after receiving a request from group i, a campaign j is allocated to that request with probability

    x*_ij / Σ_{j'} x*_ij'.

The problem with such a solution is that it considers only h̄_i, instead of considering the whole probability distribution of h_i. In the next section we model the ad network problem as a Markov Decision Process, which considers the whole dynamics of requests from groups.
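As an illustration of this off-line/on-line scheme, the sketch below solves the linear program of Expression (2) with scipy.optimize.linprog and then normalizes the solution into serving probabilities. The numeric inputs and the names h_bar and x_star are our own illustrative assumptions; this is a sketch of the idea, not the implementation used by Chen et al.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative inputs (3 groups, 3 campaigns); not the values used in the paper's experiments.
P_CTR = np.array([[0.02, 0.05, 0.01],
                  [0.03, 0.01, 0.04],
                  [0.01, 0.02, 0.06]])      # CTR of group i for campaign j
q = np.array([1.0, 2.0, 1.5])               # CPC of each campaign
g = np.array([5.0, 5.0, 5.0])               # budget g_j of each campaign
tau, P_request = 100, 0.4
P_G = np.array([0.50, 0.25, 0.25])
h_bar = tau * P_request * P_G               # mean request availability per group, h_bar_i

n_g, n_c = P_CTR.shape
v = P_CTR * q                               # ecpi of each group-campaign pair, Eq. (1)

# Budget constraints: sum_i v_ij x_ij <= g_j (one row per campaign j).
A_budget = np.zeros((n_c, n_g * n_c))
for j in range(n_c):
    A_budget[j, j::n_c] = v[:, j]
# Availability constraints: sum_j x_ij <= h_bar_i (one row per group i).
A_avail = np.zeros((n_g, n_g * n_c))
for i in range(n_g):
    A_avail[i, i * n_c:(i + 1) * n_c] = 1.0

# linprog minimizes, so the objective of Expression (2) is negated; x >= 0 via bounds.
res = linprog(c=-v.ravel(),
              A_ub=np.vstack([A_budget, A_avail]),
              b_ub=np.concatenate([g, h_bar]),
              bounds=(0, None))
x_star = res.x.reshape(n_g, n_c)

def serving_probabilities(i):
    """On-line step: for a request from group i, campaign j is drawn with
    probability x*_ij / sum_j' x*_ij' (all zeros if group i received no allocation)."""
    row = x_star[i]
    return row / row.sum() if row.sum() > 0 else row
```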

5. Markov Decision Process Formulation

We now present our own proposal for ad allocation. We start by reviewing the basics of Markov Decision Processes (MDPs). MDPs offer a general framework for sequential decision problems. An MDP is defined by a tuple ⟨S, A, T, R, τ⟩ [Puterman 1994] as follows: S is the set of all states of the process, A is the set of all possible actions to be executed at each state s ∈ S, T : S × A × S → [0, 1] is the transition function, R : S × A → ℝ is the reward function, and τ ∈ ℕ is a finite horizon.

The dynamics of an MDP is as follows. At any time t < τ: (i) the process is at state s_t ∈ S, (ii) the action a_t ∈ A is executed, (iii) the process generates reward r_t = R(s_t, a_t), and (iv) the process transits to some state s' ∈ S with probability P(s_{t+1} = s' | s_t = s, a_t = a) = T(s, a, s'). The solution for an MDP is a policy π : S → A that maps each state into an action. An optimal policy maximizes, for every state s ∈ S, the value function

    V^π(s) = E[ Σ_{t=0}^{τ-1} r_t | s_0 = s, π ],

which defines the expected accumulated reward over the horizon τ.

We now model the ad network problem as an MDP by considering states, actions, transitions, and rewards.

Ad Network as an MDP

States. The state is modelled as s = [B_1, B_2, ..., B_|C|, I_1, I_2, ..., I_|G|], where I_i was defined in Section 3 and B_j is the number of impressions that campaign j can still afford given its budget. For example, considering 5 campaigns and 3 groups, a possible state is

    s = [10, 3, 4, 2, 3, 0, 0, 1],

where the first five entries are the campaign information and the last three are the group information. The campaign information tells how many impressions each campaign can still afford in that state: in the example above, campaign 1 can afford 10 impressions, campaign 2 can afford 3 impressions, and so on. The group information indicates which group has generated a request: in the example above, groups 1 and 2 have not generated any request, and group 3 has generated one request that should be allocated.

Actions. In this model an action is the allocation of an ad from the campaign set C to an impression from the group set G in a decision epoch. Given our problem definition, the set of actions is A = {0, 1, ..., |C|}, with the following meaning:

    a = j,    (3)

where j > 0 is the index of the campaign that receives the impression. It is also possible to not allocate the impression to any campaign, by assigning a = 0.
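A small sketch of how this state and action space can be enumerated in Python is given below. The problem size (3 campaigns with initial budget 10 and 3 groups, matching the experiments of Section 6) and the variable names are illustrative assumptions, not code from the paper; even this tiny instance already yields thousands of states, which anticipates the state-space growth discussed in Section 7.

```python
from itertools import product

# Illustrative problem size: 3 campaigns with initial budget 10 and 3 groups.
initial_budgets = (10, 10, 10)
n_groups = 3

# Group component of the state: either no request, or exactly one group requesting.
request_vectors = [tuple(0 for _ in range(n_groups))] + \
                  [tuple(int(k == i) for k in range(n_groups)) for i in range(n_groups)]

# A state s = [B_1, ..., B_|C|, I_1, ..., I_|G|] is encoded as a flat tuple;
# the action set is A = {0, 1, ..., |C|}, where a = j > 0 allocates to campaign j.
states = [b + r
          for b in product(*(range(B0 + 1) for B0 in initial_budgets))
          for r in request_vectors]
actions = tuple(range(len(initial_budgets) + 1))

state_index = {s: k for k, s in enumerate(states)}  # dense index, useful for tabular solvers
print(len(states))  # (10 + 1)**3 * (3 + 1) = 5324 states for this small instance
```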

Transitions. For all actions a and all states s and s', the function T must obey the following requirements: 0 ≤ T(s, a, s') ≤ 1 and Σ_{s'∈S} T(s, a, s') = 1, i.e., the function defines a proper probability distribution over the possible next states.

The variables in the vector I do not depend on the previous state (I is described in Section 3), while the B component of the state depends only on the previous B and on the occurrence of click events. Given s = [B_1, B_2, ..., B_|C|, I_1, I_2, ..., I_|G|] and s' = [B'_1, B'_2, ..., B'_|C|, I'_1, I'_2, ..., I'_|G|], the transition function T is

    T(s, a, s') = P_t(I') Π_{j∈C} P(B'_j | B_j, a, I),    (4)

where P(B'_j | B_j, a, I) is equal to

    P(B'_j | B_j, a, I) =
        1,                  if B'_j = B_j and (a ≠ j or I_i = 0 or B_j = 0)
        P_CTR(i, j),        if B'_j = B_j - 1 and (a = j and I_i = 1 and B_j > 0)
        1 - P_CTR(i, j),    if B'_j = B_j and (a = j and I_i = 1 and B_j > 0)
        0,                  otherwise,    (5)

and

    P_t(I') =
        1 - P_request,        if I'_i = 0 for all i ∈ G
        P_request P_G(i),     if I'_i = 1 and I'_i' = 0 for all i' ≠ i, i' ∈ G
        0,                    otherwise.

Rewards. The reward function R is defined as R : S × A → ℝ. It attaches a reward to each state given an action, i.e., a value is obtained by performing an action on the current state. In our model it is defined as follows:

    R(s, a) = E[ Σ_{j∈C} q_j (B_j - B'_j) ]
            = q_j P_CTR(i, j),    if a = j, I_i = 1 and B_a > 0,
              0,                  otherwise,    (6)

where q_j and P_CTR(i, j) were defined in Section 4 and denote respectively the CPC for campaign j and the CTR between campaign j and group i. The intuition behind the reward function is that it represents a local evaluation; in the case of ad networks, this local evaluation happens to be the ecpi.
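The sketch below spells out Equations (4) through (6) in Python for a state represented as a (budgets, request vector) pair. Campaign indices follow the 1-based convention of the action set (a = 0 means no allocation); the function names and the state encoding are our own illustrative choices, not the paper's code.

```python
def p_budget(B_next_j, B_j, j, a, I, P_CTR):
    """P(B'_j | B_j, a, I) of Eq. (5). j and a are 1-based campaign indices (a = 0: no allocation)."""
    i = next((k for k, x in enumerate(I) if x == 1), None)   # requesting group, if any
    requested = i is not None
    if B_next_j == B_j and (a != j or not requested or B_j == 0):
        return 1.0
    if a == j and requested and B_j > 0:
        if B_next_j == B_j - 1:
            return P_CTR[i][j - 1]          # the impression is clicked and consumes one unit of budget
        if B_next_j == B_j:
            return 1.0 - P_CTR[i][j - 1]    # the impression is shown but not clicked
    return 0.0

def p_request(I_next, P_request, P_G):
    """P_t(I') for the next request vector: at most one group requests per step."""
    ones = [i for i, x in enumerate(I_next) if x == 1]
    if not ones:
        return 1.0 - P_request
    if len(ones) == 1:
        return P_request * P_G[ones[0]]
    return 0.0

def reward(budgets, I, a, q, P_CTR):
    """Expected immediate reward of Eq. (6): the ecpi of the chosen allocation."""
    i = next((k for k, x in enumerate(I) if x == 1), None)
    if a > 0 and i is not None and budgets[a - 1] > 0:
        return q[a - 1] * P_CTR[i][a - 1]
    return 0.0

def transition(s, a, s_next, P_request, P_G, P_CTR):
    """T(s, a, s') of Eq. (4): the request term times the product of the per-campaign budget terms."""
    (B, I), (B_next, I_next) = s, s_next     # each state is a (budgets, request vector) pair
    prob = p_request(I_next, P_request, P_G)
    for j in range(1, len(B) + 1):
        prob *= p_budget(B_next[j - 1], B[j - 1], j, a, I, P_CTR)
    return prob
```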

Generating Policies Using the MDP

Finding a solution to an MDP consists of computing a policy that maximizes the accumulated reward. A non-stationary deterministic policy π : S × {0, 1, ..., τ-1} → A specifies which action is executed at any state s ∈ S and at any time t < τ. Under a policy π, at any time t < τ every state can be associated with a value, namely the accumulated reward induced by the Markov Decision Process. The expected total reward of a policy π at time t is defined for any state s ∈ S as

    V^π(s, t) = E[ Σ_{i=t}^{τ-1} R(s_i, π(s_i, i)) ].    (7)

The value function V*(·) of an optimal policy can be defined recursively for any state s ∈ S and time t < τ by

    V*(s, t) = max_{a∈A} { R(s, a) + Σ_{s'∈S} T(s, a, s') V*(s', t+1) },    (8)

where V*(s, τ) = 0 for any state s ∈ S. Given the optimal value function V*(·), an optimal policy can be chosen for any state s ∈ S and time t < τ by

    π*(s, t) = arg max_{a∈A} { R(s, a) + Σ_{s'∈S} T(s, a, s') V*(s', t+1) }.    (9)
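A minimal sketch of the backward induction implied by Equations (8) and (9) is given below. It assumes a sparse transition(s, a) that returns the (s', probability) pairs with positive probability, for example one built from the functions sketched in the previous subsection; the names are illustrative, not the paper's implementation.

```python
def backward_induction(states, actions, transition, reward, tau):
    """Finite-horizon optimal values and policy, Eqs. (8) and (9).
    transition(s, a) -> iterable of (s_next, prob) with prob > 0; reward(s, a) -> float."""
    V = {(s, tau): 0.0 for s in states}          # boundary condition: V*(s, tau) = 0
    policy = {}
    for t in range(tau - 1, -1, -1):             # sweep backwards over decision epochs
        for s in states:
            best_a, best_value = None, float("-inf")
            for a in actions:
                value = reward(s, a) + sum(p * V[(s_next, t + 1)]
                                           for s_next, p in transition(s, a))
                if value > best_value:
                    best_a, best_value = a, value
            V[(s, t)] = best_value               # Eq. (8)
            policy[(s, t)] = best_a              # Eq. (9)
    return V, policy
```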

6. A Case Study

In this section we present the results of our MDP-based solution and compare them with the probabilistic approach presented in Section 4 and with a greedy approach that allocates each impression to the campaign with the highest ecpi. All graphs that show the results of the experiments were made by averaging 1000 executions.

Experiment 1

In this experiment we want to analyze the effect of using a different time limit τ on each approach: greedy, probabilistic, and MDP. The following parameters were set: |G| = 3, |C| = 3, and B(C_i) = 10 for all i ∈ C; P_CTR is defined in Table 1, P_G is defined in Table 2, P_request = 0.4, and we considered three different values of τ: 100, 150, and 200.

Table 1. CTR values P_CTR(i, j) for groups G1-G3 (rows) and campaigns C1-C3 (columns).

Table 2. P_G values
    Group   P_G
    G1      0.50
    G2      0.25
    G3      0.25

The parameters P_CTR, P_request, and P_G were chosen in order to show the effect of a different time limit on each one of the three methods: greedy, probabilistic, and MDP. In the top-left graph of Figure 1, τ was set to 100. In this case the probabilistic and the MDP approaches outperformed the greedy one. The greedy approach starts by allocating requests to the campaign with the highest ecpi (C2), and after the budget of this campaign is exhausted its performance begins to degrade; this effect can be seen in the decrease of the curve slope near t = 40, while the MDP and the probabilistic approaches maintain almost the same results. In the top-right graph of Figure 1, τ was set to 150, and the effect of time can be seen on both the MDP and the probabilistic methods. The probabilistic method maintains an almost straight line until the budgets are almost spent, when its performance starts to degrade. The MDP approach starts to outperform the greedy approach near t = 40, for the same reason explained above. When τ is set to 200, as shown in the bottom graph of Figure 1, the MDP agent spends all the budget of the campaigns near t = 185, outperforming both the greedy and the probabilistic approaches, even though all methods obtain almost the same revenue in the end for this τ.

Figure 1. Revenue over time. Top left: the simulation runs until τ = 100. Top right: the simulation runs until τ = 150. Bottom: the simulation runs until τ = 200.

Experiment 2

The objective of this experiment is to evaluate the methods under slightly more realistic conditions. In the first experiment the probabilities were chosen so as to force the greedy method to exhaust the campaigns with higher ecpi first, instead of reserving them for the more specific group-campaign pairs, which are usually harder to satisfy. In this experiment the following parameters were set: |G| = 3, |C| = 3, and B(C_i) = 10 for all i ∈ C; P_CTR is defined in Table 3, P_G is defined in Table 4, P_request = 1, and τ ranges from 100 to 300. The values of P_CTR were chosen at random and with low values to simulate more realistic conditions.

Table 3. CTR values P_CTR(i, j) for groups G1-G3 (rows) and campaigns C1-C3 (columns).

Table 4. P_G values
    Group   P_G
    G1      0.35
    G2      0.20
    G3      0.45

In this experiment, for τ = 100 the three methods perform almost equally, as shown in the top-left graph of Figure 2. This effect comes from the fact that there are still campaigns in the request stream with a high probability of clicks that have not spent all their budget. When τ is set to 200, as shown in the top-right graph of the same figure, both the probabilistic and the MDP approaches outperform the greedy method, whose curve slope starts to decrease near t = 120. This effect can be explained because the greedy approach burns out the higher-ecpi campaign faster, not taking advantage of requests from groups that have a more specific CTR for some campaign. In the bottom graph of Figure 2, when τ = 300, almost all budget is spent, and the probabilistic and greedy methods perform nearly the same. However, we can see that the MDP approach is still better than the other two approaches.

Figure 2. Revenue over time. Top left: the simulation runs until τ = 100. Top right: the simulation runs until τ = 200. Bottom: the simulation runs until τ = 300.

7. Conclusion and Discussion

In this paper we proposed a new method for ad allocation subject to budget constraints, solving the ad allocation problem with planning techniques in an MDP framework. We believe that this approach is a major improvement over state-of-the-art solutions, because the MDP can capture dynamics of the problem that other methods cannot tackle. In our experiments neither the greedy nor the probabilistic method could outperform our method, and the experiments indicate that modeling the problem as an MDP is better than the other approaches, even though there are still many scenarios and other methods to be investigated.

Although our proposal can outperform the other methods, the MDP has a serious flaw: the state space is usually huge, even for small scenarios, sometimes making the transition probability function intractable even when sparse matrices are used to represent it. However, we believe that the hugeness of the state space is an opportunity to explore new formalisms, alternative techniques, and better representations of states. One way to tackle the hugeness of the state space is to use a factored representation of the MDP [Boutilier et al. 1999, Guestrin et al. 2003], which can represent large structured MDPs compactly. In this representation the state is implicitly described by state variables and the transition model is given by a dynamic Bayesian network, allowing an exponential reduction in the representation size of the MDP. Moreover, there are algorithms that can automatically discover and exploit the hidden structure of factored MDPs [Kolobov et al. 2012] in order to solve MDPs faster and with less memory.

The next step of this work is to start adding more constraints to the problem, such as a minimum number of impressions per campaign and category restrictions, as well as a reward function that is not based only on revenue, in favor of creating a more homogeneous distribution of ads from different campaigns.

The presented simulator may also have its rules relaxed to better capture the dynamics of the system being modeled. One may want the system to handle multiple requests from different groups at the same time. In order to do this, the P_request probability would be dropped, P_G would be replaced by the rate λ of a Poisson distribution [Haight 1967], and the request events would be drawn from such distributions. This change would modify the I vector as follows: I = [I_1, I_2, ..., I_|G|], where I_i = X if group i has issued X requests, and any group i ∈ G would be able to generate requests at the same simulation time.
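As a small illustration of this relaxation, the snippet below (with an assumed rate vector) draws a multi-group request vector from independent Poisson distributions; it is only a sketch of the suggested change, not part of the presented simulator.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([0.20, 0.10, 0.10])   # assumed per-group arrival rates replacing P_request and P_G

def draw_requests():
    """Multi-request extension: I_i ~ Poisson(lambda_i), so several groups (and several
    requests per group) can appear in the same simulation time step."""
    return rng.poisson(lam)          # e.g. array([1, 0, 2])
```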

The planning solution proposed in this paper uses a fixed CTR to create the plan, as do the other methods in the literature, despite the fact that in the real world the CTR is always changing. It is therefore natural to investigate the possibility of adding reinforcement learning techniques [Sutton and Barto 1998], with the purpose of exploring more realistic scenarios and increasing performance when the CTR is changing. We believe that the combination of real-time CTR estimation and MDP planning working together is the holy grail for solving the problem of ad allocation.

There are many open research opportunities in this area, motivated not only by its intrinsic business application, but also by the improvements such research can bring to planning, MDPs, reinforcement learning, and other branches of artificial intelligence.

Acknowledgements

Anna Helena Reali Costa and Fábio Gagliardi Cozman are partially supported by CNPq. Flávio Sales Truzzi is supported by CAPES. The work reported here has received substantial support through FAPESP grant 2008/ and FAPESP grant 2011/.

References

Boutilier, C., Dearden, R., and Goldszmidt, M. (1999). Stochastic dynamic programming with factored representations. Artificial Intelligence, 121:2000.

Chen, Y., Berkhin, P., Anderson, B., and Devanur, N. R. (2011). Real-time bidding algorithms for performance-based display ad allocation. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '11, New York, NY, USA. ACM.

Chickering, D. M. and Heckerman, D. (2000). Targeted advertising with inventory management. In Proceedings of the 2nd ACM Conference on Electronic Commerce, EC '00, New York, NY, USA. ACM.

Devanur, N. R. and Kakade, S. M. (2009). The price of truthfulness for pay-per-click auctions. In Proceedings of the Tenth ACM Conference on Electronic Commerce, EC '09, page 99.

Guestrin, C., Koller, D., Parr, R., and Venkataraman, S. (2003). Efficient solution algorithms for factored MDPs. Journal of Artificial Intelligence Research, 19.

Haight, F. A. (1967). Handbook of the Poisson Distribution. Wiley, New York.

Kiang, M. Y., Raghu, T. S., and Shang, K. H.-M. (2000). Marketing on the Internet - who can benefit from an online marketing approach? Decision Support Systems.

Kolobov, A., Mausam, and Weld, D. S. (2012). Discovering hidden structure in factored MDPs. Artificial Intelligence, 189.

Muthukrishnan, S. (2009). Ad exchanges: Research issues.

Nagurney, A. (2010). An integrated framework for the design of optimal web banners.

Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley-Interscience.

Pwc, U. S. (2012). IAB Internet Advertising Revenue Report: Full Year (April).

Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, USA, 1st edition.
