Fast Demand Learning for Display Advertising Revenue Management




Dragos Florin Ciocan, Vivek F. Farias

April 30, 2014

DFC is with the Sloan School of Management, MIT. VFF is with the Operations Research Center and Sloan School of Management, MIT. Emails: {ciocan, vivekf}@mit.edu

Abstract

The present paper is motivated by the network revenue management problems that occur in online display advertising. In this setting, each impression (demand) type corresponds to a vector of $d$ user features; consequently, the overall number of demand types that need to be forecast is exponential in $d$. Our main contribution is to show that such high dimensional demand spaces can still be estimated efficiently. In particular, using a number of demand samples that scales linearly in $d$ and quadratically in the number of advertisers, we construct a demand estimator that informs a simple bid-price allocation policy. We show that this policy garners at least a $(1 - \epsilon)$ fraction of the optimal revenues that could be achieved with perfect a priori information of the demand type distribution.

1 Introduction

The explosion in recent years in web and mobile online advertising volume has brought to the fore a host of dynamic allocation problems in the same genre as the core RM problem of network revenue management (NRM). One such modern instantiation of NRM occurs in online display advertising markets and is commonly referred to in the literature as the Ad-Display problem. The commercial value that is tied to solving this problem in practice is on the order of $50 billion globally [1]; moreover, the growth in display ad spend is significantly outpacing that of more traditional advertising channels, as online advertising captures increasing market share with its value proposition of precise, specific customer targeting.

Informally, the Ad-Display problem is formulated as follows: over some finite time horizon, an ad network receives an online sequence of user arrivals called impressions, each associated with a vector of features in some $d$-dimensional space $\mathcal{X}$. Each vector $x \in \mathcal{X}$ identifies an impression type. Upon the arrival of an impression, the network must decide whether to allocate it to one of $m$ competing advertisers. Each advertiser $a$ specifies a bid $r_a$ for any impression belonging to a set of compatible types $\mathcal{X}_a \subseteq \mathcal{X}$, and a budget $B_a$ which equals the maximum number of impressions the advertiser is contractually obliged to pay for.

Footnote 1: http://www.zenithoptimedia.com/zenithoptimedia-releases-new-ad-forecasts-global-advertising-continues-to-grow-despite-eurozone-fears/

The ad network's goal is to maximize its overall revenues from allocating impressions to advertisers, subject to not exceeding their ad budgets. We note that impression arrivals constitute the analog of customer arrivals in more traditional RM applications, and as such we will use the terms impressions and demand interchangeably.

The present paper departs from classical NRM models by addressing the challenges that arise due to the extremely heterogeneous, high-dimensional nature of the impressions/demand space in Ad-Display applications. When an impression arrives, the ad network has access to detailed information describing user location, demographics, past browsing behavior and various other user attributes; in fact, it is precisely this information which allows advertisers to set very specific and precise customer targeting conditions. An example of such user data is a feature vector like:

(country=us, city=atlanta, device=iphone5, keywords=movies, age_group=35-44, relationship_status=single, education=college, hhi=>100K, demographic=home_buyer, ...)

The implication is that $\mathcal{X}$ corresponds to a $d$-dimensional feature space. The demand space is thus inherently exponentially sized ($|\mathcal{X}| = O(\exp(d))$) and, moreover, $d$ could be anywhere from tens to hundreds of different attributes, depending on the specific ad network. Compared to the airline or hospitality applications of NRM, the number of demand types that must be modeled and forecast here is many orders of magnitude higher. Additionally, the rate at which a particular type arrives is driven by highly idiosyncratic factors, making it unlikely that one could employ demand forecasting strategies that aggregate demand types.

Due to the difficulty in building demand models of such extreme dimensionality, many online advertising systems currently in use completely eschew the usage of demand forecasts. Instead, their approach is to assume that impressions arrive adversarially and design allocation algorithms which admit constant factor competitive ratios versus the worst case input. The most successful such approaches are primal-dual schemes: the classical primal-dual result is a $1 - 1/e$ approximation (Buchbinder and Naor [2009], Mehta et al. [2005]) for AdWords allocation, an online advertising model that differs from Ad-Display in the fact that advertiser budgets are denominated in terms of dollar budgets instead of impression counts [2]. More recently, Feldman et al. [2009] and Bhalgat et al. [2012] design primal-dual schemes that achieve $1 - 1/e$ for the Ad-Display problem in the free disposal model; under the free disposal assumption, advertisers may receive more impressions than their budgets specify, without being charged for the excess allocations. While such schemes achieve constant bounds against adversarial demand, they are also excessively weighted towards building robustness against unrealistic worst case demand scenarios and their allocations are overly conservative in practice.

Footnote 2: AdWords is the prevailing model for sponsored search, as opposed to the display advertising setting we consider in this paper.

In contrast, algorithms that make use of historical data to build demand forecasts stand to achieve significant revenue gains versus the above adversarial approaches. Such an approach that relies on a demand forecast is the classical fluid model for the network revenue management problem (Gallego and van Ryzin [1997], Talluri and van Ryzin [1998]); under the assumption that the rates of arrival of each demand type are deterministic and known ahead of time, these papers show that a simple linear program provides an approximately optimal policy, as the number of demand arrivals and the resource capacities grow simultaneously. However, these rates are unknown and must be estimated in practice, and moreover, in the context of online advertising may be changing at even the scale of one hour (see Ciocan and Farias [2012] for a model predictive control scheme that generalizes the fluid model approach to a setting where the arrival rates are themselves a stochastic process). Hence, for the fluid approach to be practical, one needs to build a high-fidelity estimate of the current shape of demand extremely quickly.

The question to ask is then: is it possible to learn an exponentially sized demand space with a relatively small (poly-logarithmic in the size) number of demand samples? Note that, with such a stringent condition on the sample complexity, it is hopeless to arrive at a pin-point sharp resolution over all of $\mathcal{X}$. Instead, our learning goal is to arrive at an estimate over $\mathcal{X}$ that we can use to calculate an approximately optimal control to the allocation problem; put another way, we require a forecast that we can plug into the Gallego and van Ryzin [1997] LP and achieve roughly the same objective value we would have garnered with knowledge of the true forecast.

One natural candidate scheme to accomplish this is to sample $N$ impressions and plug in the resulting empirical distribution on $\mathcal{X}$ to obtain estimated bid-prices; these estimates can then be used as controls to drive the allocation decisions. This algorithm has been analyzed in a sequence of papers (Devanur and Hayes [2009], Feldman et al. [2010], Agrawal et al. [2009] and Molinaro and Ravi) in the context of the AdWords problem, and subsequently, the general NRM problem. This line of results considers the random permutation model of demand arrivals, which assumes that the total number of arriving impressions is known in advance, but the order of the arrivals is chosen at random over all possible permutations. We note that, while more general than the iid model we will consider in our model, the random permutation model is still limited to describing stationary demand distributions [3]. Their result is that, with $N = \epsilon |\mathcal{X}|$ samples, the bid-price estimates provide a $(1 - \epsilon)$ approximation of OPT, as long as $\mathrm{OPT}/r_{\max}$ scales like $m^2 \log(|\mathcal{X}|/\epsilon)/\epsilon^3$. However, the condition on OPT proves to be quite restrictive: in order to maintain a realistic scaling of OPT with respect to $m$ and $|\mathcal{X}|$, $\epsilon$ must be chosen to be constant, so the sample complexity is $N = O(|\mathcal{X}|)$.

Footnote 3: In fact, de Finetti's theorem establishes that the exchangeable distributions of demand in a random permutation model are independent conditioned on a latent variable.

Such learning complexity is still impractical in the display advertising context for two reasons. First, even assuming that the demand types arrive from a distribution that stays fixed throughout the entire lifetime of the allocation problem, having to observe $N = O(|\mathcal{X}|)$ samples in order to learn the optimal control is quite wasteful given the magnitude of $|\mathcal{X}|$. Secondly, as alluded to before, the demand model is constantly changing and, in order to react to these changes in arrival rates, it is necessary to periodically update the demand forecast and re-solve the fluid model LP. In the context of online advertising, over the hour during which we may expect the arrival rates to stay constant, the ad network may not even receive $O(|\mathcal{X}|)$ impressions; therefore, the ability to learn at a much faster speed than $O(|\mathcal{X}|)$ samples is critical.

Given the above challenges, the question we pose is the following: assuming that impressions arrive iid from a fixed, but unknown distribution $\mu$ on $\mathcal{X}$, can we learn a simple policy that garners revenues within a $(1 - \epsilon)$ factor of a clairvoyant optimum with a priori knowledge of $\mu$, using only $N = \mathrm{poly}(\log |\mathcal{X}|, m)$ samples? We answer this question in the affirmative by showing that, with a number of samples that scales like $m \max\{m, \log(m|\mathcal{X}|)\}$, we can compute a $(1 - \epsilon)$-optimal bid-price policy that is built upon a simple empirical estimator $\hat\mu_N$ of $\mu$. We note that one interpretation of our result is that $\hat\mu_N$, which has $|\mathrm{Supp}(\hat\mu_N)| \ll |\mathrm{Supp}(\mu)|$, recovers a latent low dimensional representation of the real distribution $\mu$. Moreover, we show that, up to a factor of $\left(\frac{r_{\max}}{r_{avg}}\right)^2 \log(m|\mathcal{X}|)$, this sample complexity is the best one could achieve using the empirical estimator.

The algorithm we employ is essentially identical to the one used in the random permutation model literature cited above. However, we restrict our analysis to the Ad-Display problems described above rather than the general network revenue management problem. The particular pricing structure (a single advertiser bid for all compatible impressions) that is idiosyncratic to Ad-Display will be leveraged in the analysis in order to achieve our logarithmic scaling in terms of $|\mathcal{X}|$.

2 Model and Algorithm

Impressions model: We consider a discrete $T$-time period model in which, at each time step $1 \le i \le T$, exactly one impression arrives to the ad network. Let $\mathcal{X} \subset \mathbb{R}^d$ be a discrete feature space with each point in $\mathcal{X}$ describing an impression type [4]. We assume that there exists an unknown distribution $\mu : \mathcal{X} \to [0, 1]$ from which each of the arriving $T$ impressions is sampled iid; upon arrival, each impression is assigned a type in $\mathcal{X}$ according to $\mu$. Let $X_1, \dots, X_T$ be the sequence of random variables denoting the types of the arriving impressions.

Footnote 4: It is not necessary to constrain $\mathcal{X}$ to be discrete. Alternatively, $\mathcal{X}$ could be any Borel measurable set and $d$ would correspond to the metric entropy of $\mathcal{X}$.

Advertiser model: There are $m$ advertisers with budgets $BT \in \mathbb{R}^m_+$; $B$ can be interpreted as the budget per unit of impression. Each advertiser $a$ is endowed with a characteristic set $\mathcal{X}_a \subseteq \mathcal{X}$ such that $a$'s bid for impression type $x$ is $r_a \mathbb{1}(x \in \mathcal{X}_a)$ for some positive $r_a$. In order to provide exact constants in our bounds, we will assume $m \ge 4$ throughout the paper.

Let $\mathrm{OPT}(T, B)$ be the maximum revenues achievable in this system. In principle, this optimum could be calculated in the following way. Let $B(t)$ be the $m$-dimensional random variable that describes the remaining budgets of advertisers (with the boundary condition $B(0) = BT$), and define the admissible control set at time $t$ to be:

$$\mathcal{O}_t = \left\{ o_t : [m] \to \{0, 1\} \ \text{s.t.} \ \mathbf{1}^\top o_t \le 1, \ o_t \le B(t) \le B(t-1) \right\}$$

Then,

$$\mathrm{OPT}(T, B) = \max_{o_t \in \mathcal{O}_t,\ t \in [T]} \mathbb{E}\left[ \sum_{t=1}^{T} \sum_a r_a \mathbb{1}(X_t \in \mathcal{X}_a)\, o_t(a) \right]$$

In the analysis of our algorithm, we will work with a natural upper bound on this optimum value. Let us define the following unit time optimization problem, together with its dual formulation:

$$\begin{aligned} LP_\mu = \max \quad & \sum_a \sum_{x \in \mathcal{X}} r_a \mathbb{1}(x \in \mathcal{X}_a)\, z(x, a)\, \mu(x) \\ \text{subject to} \quad & \sum_{x \in \mathcal{X}} z(x, a)\, \mu(x) \le B_a, \quad \forall a \\ & \sum_a z(x, a) \le 1, \quad \forall x \in \mathcal{X} \\ & z \ge 0 \end{aligned}$$

$$\begin{aligned} \text{D-}LP_\mu = \min \quad & \sum_{x \in \mathcal{X}} \alpha(x) + \sum_a B_a \beta(a) \\ \text{subject to} \quad & \alpha(x) + \mu(x) \beta(a) \ge r_a \mathbb{1}(x \in \mathcal{X}_a)\, \mu(x), \quad \forall x \in \mathcal{X},\ \forall a \\ & \alpha, \beta \ge 0 \end{aligned}$$

One can interpret $LP_\mu$ as the long run unit time revenue a clairvoyant could achieve with a priori knowledge of $\mu$ as $T \to \infty$. More rigorously, $LP_\mu$ provides an upper bound on $\mathrm{OPT}(T, BT)/T$, as stated in the following lemma whose proof is delayed to the appendix. Our analysis will make it convenient to use $LP_\mu$ as the benchmark to measure the performance of our learning algorithm.

Lemma 1. $LP_\mu \ge \dfrac{\mathrm{OPT}(T, BT)}{T}$.

Furthermore, we make the well-known observation that the dual of $LP_\mu$ gives rise to a vector of bid prices on advertisers which can implement a control for the primal problem.

Definition 1. Let $\beta \in \mathbb{R}^m_+$ be a vector of shadow prices for the $m$ advertisers. The bid-price control associated to $\beta$ is a map $z_\beta : \mathcal{X} \times [m] \to \{0, 1\}$ such that

$$z_\beta(x, a) = \begin{cases} 1, & \text{if } a \in \arg\max_j \{ r_j \mathbb{1}(x \in \mathcal{X}_j) - \beta(j) \} \text{ and } r_a \mathbb{1}(x \in \mathcal{X}_a) - \beta(a) > 0 \\ 0, & \text{otherwise} \end{cases}$$

Note that in our definition, we allow for allocating the same impression doubly because of ties. In Section 2.1, we will show that the impact of how we deal with ties is negligible.

We are now ready to describe our learning algorithm. As with previous approaches such as Devanur and Hayes [2009], we will give the algorithm a burn-in period to observe $N$ training impressions and estimate a control policy. This policy will then be used to decide the allocation for the impressions that arrive over the following $T$ period horizon [5].

Learning algorithm:

1. Sample $N$ impressions from $\mu$ and calculate the empirical distribution $\hat\mu_N$:
$$\hat\mu_N(x) = \frac{1}{N} \sum_{1 \le i \le N} \mathbb{1}(X_i = x)$$
2. Compute an extreme point $\hat\beta \in \arg\min \text{D-}LP_{\hat\mu_N}$.
3. Use the control $z_{\hat\beta}$ to allocate impressions $X_1, \dots, X_T$.

Note that the size of $\text{D-}LP_{\hat\mu_N}$ in step 2 of the algorithm only depends on $|\mathcal{X}|$ artificially. In fact, since at most $N$ points of $\hat\mu_N$ have nonzero density, the impression dimension of D-LP is at most $N$. Put another way, the benefits of our algorithm are two-fold: besides learning a control policy with small sample complexity, the computational complexity of the underlying control problem is also reduced by an equal amount.

We end this section by defining two pieces of notation that we will use throughout the rest of the paper. For some bid-price $\beta$, let:

The set of $x$'s which get allocated to advertiser $a$:
$$Z^a_\beta = \{ x \in \mathcal{X} : z_\beta(x, a) > 0 \}$$

The revenues of using policy $\beta$ when impressions arrive from measure $\nu$:
$$\mathrm{Rev}_\nu(\beta) = \sum_a r_a \min\left\{ B_a, \int_{x \in \mathcal{X}} \mathbb{1}(x \in \mathcal{X}_a)\, z_\beta(x, a)\, \nu(x)\, dx \right\} = \sum_a r_a \min\left\{ B_a, \nu\left(Z^a_\beta\right) \right\}$$

Footnote 5: For simplicity, we assume our training data comes from samples arriving at times $-N + 1, \dots, -1, 0$. However, our analysis could be modified to instead observe the first $N$ impressions $1, \dots, N$ from the $T$ impression sequence.
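As an illustration of steps 1 and 3 of the learning algorithm and of the revenue notation $\mathrm{Rev}_\nu(\beta)$, the following is a minimal Python sketch. It assumes the bid-price vector is already given (solving the dual LP of step 2 is not shown), and all function names, the toy impression types "A" to "D", and the numeric bids and budgets are hypothetical choices made for this example only.

```python
from collections import Counter


def empirical_distribution(samples):
    """Step 1: mu_hat(x) = (1/N) * #{i : X_i = x} for the observed samples."""
    n = len(samples)
    return {x: count / n for x, count in Counter(samples).items()}


def bid_price_control(x, bids, targets, beta):
    """Advertisers a with z_beta(x, a) = 1 under the bid-price control.

    An advertiser is chosen if it maximises r_a * 1(x in X_a) - beta(a)
    and that margin is strictly positive; ties return several advertisers.
    """
    margins = [bids[a] * (x in targets[a]) - beta[a] for a in range(len(bids))]
    best = max(margins)
    if best <= 0:
        return set()
    return {a for a, margin in enumerate(margins) if margin == best}


def revenue(nu, bids, targets, budgets, beta):
    """Rev_nu(beta) = sum_a r_a * min(B_a, nu(Z_beta^a))."""
    mass = [0.0] * len(bids)
    for x, p in nu.items():
        for a in bid_price_control(x, bids, targets, beta):
            mass[a] += p
    return sum(r * min(b, z) for r, b, z in zip(bids, budgets, mass))


if __name__ == "__main__":
    # Hypothetical toy instance: two advertisers, three observed types.
    mu_hat = empirical_distribution(["A", "A", "B", "C"])
    bids = [2.0, 1.0]
    targets = [{"A", "B"}, {"B", "C"}]
    budgets = [0.5, 0.5]
    print(revenue(mu_hat, bids, targets, budgets, [0.0, 0.0]))  # 1.25
    print(revenue(mu_hat, bids, targets, budgets, [1.5, 0.0]))  # 1.5
```

In this toy instance, raising the first advertiser's bid price from 0 to 1.5 diverts type "B" to the second advertiser, so both budgets are exactly used and the revenue rises from 1.25 to 1.5.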

2.1 Optimality of Bid-Prices

In this section, we describe a generic condition under which using the bid-price controls defined above closely approximates the value of using the optimal primal control. For notational convenience, let

$$\rho = \frac{LP_\mu}{r_{\max}}$$

This quantity (by definition between 0 and 1) corresponds to the ratio between the optimal long run revenues per impression and the highest achievable revenue per impression. In Section 4.1, we give bounds on $\rho$ for various families of instances of display advertising problems. The following definition describes the level of granularity of a distribution on $\mathcal{X}$:

Definition 2. A distribution $\nu : \mathcal{X} \to [0, 1]$ is $\epsilon$-good if $\|\nu\|_\infty \le \dfrac{\epsilon \rho}{m}$.

For our result to hold, we make the following assumption on the distribution $\mu$ from which impressions arrive.

Assumption 1. The true distribution $\mu$ is $\frac{\epsilon}{m}$-good.

We note that this assumption is without loss of generality: if there exist heavy mass impression types that violate this assumption, we can divide them into several artificial impression types such that each individual point $x \in \mathcal{X}$ has mass less than $\frac{\epsilon \rho}{m^2}$.

The following two lemmas state the following:

(a) For $\epsilon$-good distributions, the bid-price control approximates the optimal primal control to within an $\epsilon LP_\mu$ factor.

(b) For $N = \mathrm{poly}(\log |\mathcal{X}|, m)$, the empirical distribution $\hat\mu_N$ is with high probability $\epsilon$-good.

The purpose of these lemmas, whose proofs are delayed to the Appendix, is to guarantee that the bid-price controls $\beta^* \in \arg\min \text{D-}LP_\mu$ and $\hat\beta \in \arg\min \text{D-}LP_{\hat\mu_N}$ are approximately optimal.

Lemma 2. For any $\nu$ that is $\epsilon$-good,
$$LP_\nu - \mathrm{Rev}_\nu(\beta^*_\nu) \le \epsilon LP_\mu,$$
where $\beta^*_\nu \in \arg\min \text{D-}LP_\nu$.

Lemma 3. Let $X_1, \dots, X_N$ be iid draws from the distribution $\mu$ and for all $x \in \mathcal{X}$, $\hat\mu_N(x) = \frac{1}{N} \sum_{1 \le i \le N} \mathbb{1}(X_i = x)$. For
$$N \ge \frac{4m}{\rho \epsilon \log m} \left( \log |\mathcal{X}| + \log \frac{1}{\delta} \right),$$
$\hat\mu_N$ is $\epsilon$-good with probability at least $1 - \delta$ as long as $\mu$ satisfies Assumption 1.

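Before moving on, we note that there is a loss of a factor of $m$ in the granularity of $\hat\mu_N$ versus $\mu$. We could instead only assume $\mu$ is $\epsilon$-good and still get the same guarantee for $\hat\mu_N$ via the Dvoretzky-Kiefer-Wolfowitz inequality, but at a cost of $m$ in $N$.

Lastly, we give the following bound on the total number of possible bid prices which can arise as the solution to the dual linear program over all possible distributions $\nu$. We will later use this lemma in Section 4, as proving that $\hat\beta$ is approximately optimal will involve taking a union bound over all possible bid price controls that our algorithm could output.

Lemma 4. Let $\mathcal{B} = \{ \beta \in \mathbb{R}^m \ \text{s.t.} \ \exists \text{ distribution } \nu \text{ for which } \beta \text{ is an extreme point of } \text{D-}LP_\nu \}$. Then
$$|\mathcal{B}| \le \binom{m |\mathcal{X}|}{m}$$

Proof. Note that for any distribution $\nu$, $\beta$ is calculated as a solution to $\text{D-}LP_\nu$, or, by the transformation $\tilde\alpha(x) = \alpha(x)/\nu(x)$:

$$\begin{aligned} \min \quad & \sum_{x \in \mathcal{X}} \nu(x) \tilde\alpha(x) + \sum_a B_a \beta(a) \\ \text{subject to} \quad & \tilde\alpha(x) + \beta(a) \ge r_a, \quad \forall a,\ \forall x \in \mathcal{X}_a \\ & \tilde\alpha, \beta \ge 0 \end{aligned}$$

The feasible set of the dual LP under the above change of variable does not depend on $\nu$, so we are only left with counting the total number of $\beta$s that can form its extreme points, of which there are at most $\binom{m |\mathcal{X}|}{m}$.

3 A Lower Bound on N

In this section, we exhibit a lower bound on the number of samples our algorithm requires to find a near optimal $\hat\beta$. In order to do this, we ask a simpler question: for a fixed price policy $\beta$, how many samples are needed to estimate the revenues of that policy (i.e. bound the difference between $\mathrm{Rev}_{\hat\mu_N}(\beta)$ and $\mathrm{Rev}_\mu(\beta)$)? In fact, as we show in this section, the estimator $\mathrm{Rev}_{\hat\mu_N}(\beta)$ is biased for finite $N$:

Lemma 5. Fix any bid-price $\beta$ and consider a family of instances where we set $B_a = \mu\left(Z^a_\beta\right) = \frac{1}{m}$, $\forall a$. Then,
$$\left| \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] - \mathrm{Rev}_\mu(\beta) \right| \ge \frac{1}{\sqrt{2\pi}}\, r_{avg} \sqrt{\frac{m}{N}} - 3 r_{avg} \frac{m}{N},$$
where $r_{avg} = \frac{1}{m} \sum_a r_a$.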
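The downward bias of the plug-in estimator can be checked numerically for a single advertiser. The sketch below is a hedged illustration under stated assumptions, not part of the original analysis: it treats the empirical mass of one allocation set as a Binomial($N$, $p$) count divided by $N$, sets the budget equal to $p$, and computes the expectation of the truncated mean exactly. The function names are hypothetical.

```python
from math import comb


def expected_truncated_mean(n, p, b):
    """Exact E[min(b, K/n)] where K ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) * min(b, k / n)
        for k in range(n + 1)
    )


def plug_in_bias(n, p):
    """min(p, p) - E[min(p, K/n)]: the shortfall when the budget equals p."""
    return p - expected_truncated_mean(n, p, p)


if __name__ == "__main__":
    # The shortfall is strictly positive and shrinks roughly like 1/sqrt(n):
    # quadrupling n should roughly halve it.
    for n in (100, 400, 1600):
        print(n, plug_in_bias(n, 0.25))
```

Because truncation at the budget only ever removes mass from the upper tail, the plug-in revenue underestimates the true revenue in expectation, and the gap decays at the $\sqrt{1/N}$ rate that drives the lower bound.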

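Proof. The bound is a consequence of part 2 of Lemma 9 in the Appendix with $\bar{X}_N = \hat\mu_N\left(Z^a_\beta\right)$:

$$\begin{aligned} \left| \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] - \mathrm{Rev}_\mu(\beta) \right| &= \left| \mathbb{E}\left[ \sum_a r_a \min\left\{ B_a, \hat\mu_N\left(Z^a_\beta\right) \right\} \right] - \sum_a r_a \min\left\{ B_a, \mu\left(Z^a_\beta\right) \right\} \right| \\ &= \left| \mathbb{E}\left[ \sum_a r_a \min\left\{ \hat\mu_N\left(Z^a_\beta\right) - \tfrac{1}{m},\ 0 \right\} \right] \right| \\ &= \sum_a r_a \left| \mathbb{E}\left[ \min\left\{ \hat\mu_N\left(Z^a_\beta\right) - \tfrac{1}{m},\ 0 \right\} \right] \right| \\ &\ge \sum_a r_a \left( \frac{1}{\sqrt{2\pi}}\, \sigma\left( \hat\mu_N\left(Z^a_\beta\right) \right) - \frac{3\left(1 - 2\mathbb{E}\left[\hat\mu_N\left(Z^a_\beta\right)\right]\right)}{N} \right) \\ &\ge \frac{1}{\sqrt{2\pi}}\, r_{avg} \sqrt{\frac{m}{N}} - 3 r_{avg} \frac{m}{N} \end{aligned}$$

A consequence of Lemma 5 is the following theorem describing the minimum sample complexity required for our algorithm to work in expectation. We note that the actual sample complexity we prove in the next section is stronger: the result will hold true with high probability rather than in expectation.

Theorem 1. The algorithm requires drawing at least
$$N = \Omega\left( \left( \frac{r_{avg}}{LP_\mu} \right)^2 \frac{m}{\epsilon^2} \right)$$
samples to guarantee that $\mathrm{Rev}_\mu(\beta^*) - \mathbb{E}[\mathrm{Rev}_\mu(\hat\beta)] \le \epsilon LP_\mu$.

Proof. We exhibit a simple instance for which a large estimation gap in the value of the optimal bid-price policy $\beta^*$ implies a large optimality gap for the approximate bid-price policy $\hat\beta$. Let us fix a bid price control $\beta$ and $B_a = \mu(Z^a_\beta) = 1/m$ and, additionally, construct the bids such that for every $x \in \mathcal{X}$, there exists a unique advertiser $a$ such that $x \in \mathcal{X}_a$ (in other words, each impression type can only go to one advertiser). Note that this implies that $\beta^*_a = \beta_a = 0$, for all $a$, and that $\mathcal{X}_a = Z^a_\beta$. Let us assume that the estimation gap for $\beta$ is such that:

$$\mathrm{Rev}_\mu(\beta) - \mathbb{E}\left[\mathrm{Rev}_{\hat\mu}(\beta)\right] \ge \Delta$$

and we shall prove by contradiction that this implies:

$$\mathrm{Rev}_\mu(\beta) - \mathbb{E}\left[\mathrm{Rev}_\mu(\hat\beta)\right] \ge \Delta$$

Assume the contrary; since $\mathrm{Rev}_\mu(\beta)$ uses all the budgets of the advertisers and is thus greater or equal to both $\mathbb{E}[\mathrm{Rev}_{\hat\mu}(\beta)]$ and $\mathbb{E}[\mathrm{Rev}_\mu(\hat\beta)]$, it must then be that

$$\mathbb{E}\left[\mathrm{Rev}_{\hat\mu}(\beta)\right] + \Delta \le \mathrm{Rev}_\mu(\beta) < \mathbb{E}\left[\mathrm{Rev}_\mu(\hat\beta)\right] + \Delta$$

which implies that $\mathbb{E}[\mathrm{Rev}_\mu(\hat\beta)] > \mathbb{E}[\mathrm{Rev}_{\hat\mu}(\beta)]$, or, expanding the expressions,

$$\sum_a r_a\, \mathbb{E}\left[ \min\left\{ \tfrac{1}{m}, \mu\left(Z^a_{\hat\beta}\right) \right\} - \min\left\{ \tfrac{1}{m}, \hat\mu\left(Z^a_\beta\right) \right\} \right] > 0$$

Since we have constructed the instance such that no impression types can go to two advertisers, it follows that $\hat\beta_a$ can only take two values, namely

$$\hat\beta_a = \begin{cases} r_a, & \text{if } \hat\mu(\mathcal{X}_a) < \mu(\mathcal{X}_a) \\ 0, & \text{otherwise} \end{cases}$$

and, since this implies we accept impression type $x$ iff $\hat\mu(\mathcal{X}_a) < \mu(\mathcal{X}_a)$,

$$\mathbb{E}\left[ \min\left\{ \tfrac{1}{m}, \mu\left(Z^a_{\hat\beta}\right) \right\} \right] = \frac{1}{m}\, P\left[ \hat\mu(\mathcal{X}_a) \ge \mu(\mathcal{X}_a) \right]$$

and

$$\begin{aligned} \mathbb{E}\left[ \min\left\{ \tfrac{1}{m}, \hat\mu\left(Z^a_\beta\right) \right\} \right] &= \frac{1}{m}\, P\left[ \hat\mu(\mathcal{X}_a) \ge \mu(\mathcal{X}_a) \right] + \mathbb{E}\left[ \hat\mu\left(Z^a_\beta\right) \,\middle|\, \hat\mu(\mathcal{X}_a) < \mu(\mathcal{X}_a) \right] \left( 1 - P\left[ \hat\mu(\mathcal{X}_a) \ge \mu(\mathcal{X}_a) \right] \right) \\ &= \mathbb{E}\left[ \min\left\{ \tfrac{1}{m}, \mu\left(Z^a_{\hat\beta}\right) \right\} \right] + \underbrace{\mathbb{E}\left[ \hat\mu\left(Z^a_\beta\right) \,\middle|\, \hat\mu(\mathcal{X}_a) < \mu(\mathcal{X}_a) \right] \left( 1 - P\left[ \hat\mu(\mathcal{X}_a) \ge \mu(\mathcal{X}_a) \right] \right)}_{> 0} \end{aligned}$$

Hence $\mathbb{E}\left[ \min\left\{ \tfrac{1}{m}, \mu\left(Z^a_{\hat\beta}\right) \right\} - \min\left\{ \tfrac{1}{m}, \hat\mu\left(Z^a_\beta\right) \right\} \right] < 0$ and, summing over all $a$, we get the desired contradiction. To complete the proof, note that to get $\Delta = \Theta\left( r_{avg} \sqrt{\frac{m}{N}} \right) \ge \epsilon LP_\mu$, we need to set $N = \Omega\left( \left( \frac{r_{avg}}{LP_\mu} \right)^2 \frac{m}{\epsilon^2} \right)$.

Theorem 2. Any bid-price algorithm requires drawing at least
$$N = \Omega\left( \left( \frac{r_{avg}}{LP_\mu} \right)^2 \frac{m}{\epsilon^2} \right)$$
samples to guarantee that $\mathrm{Rev}_\mu(\beta^*) - \mathbb{E}[\mathrm{Rev}_\mu(\hat\beta)] \le \epsilon LP_\mu$.

4 Sample Complexity

The key step in our sample complexity analysis will be to find a uniform bound on the estimation error $|\mathrm{Rev}_{\hat\mu_N}(\beta) - \mathrm{Rev}_\mu(\beta)|$, over all bid-prices $\beta \in \mathcal{B}$. We state this key lemma below.

Lemma 6. For $N \ge \dfrac{64}{\rho^2 \epsilon^2} \left( m \log(m|\mathcal{X}|) + \log \dfrac{1}{\delta} \right)$,
$$P\left[ \exists \beta \in \mathcal{B} \ \text{s.t.} \ \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathrm{Rev}_\mu(\beta) \right| \ge \epsilon LP_\mu \right] \le \delta$$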

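In order to prove the above, we will proceed in two stages.

(a) We will first bound the estimation error for a fixed $\beta \in \mathcal{B}$. We will break up this error into two components, which we bound in Lemmas 7 and 8, respectively.

(b) Having bounded the error for a fixed $\beta$, we will prove the above lemma by taking a union bound over all possible bid-prices, whose cardinality we have upper bounded in Lemma 4.

As alluded to above, given a particular $\beta$, we use the triangle inequality to split the estimation error into two components:

$$\left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathrm{Rev}_\mu(\beta) \right| \le \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] \right| + \left| \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] - \mathrm{Rev}_\mu(\beta) \right| \qquad (1)$$

We bound the two terms in equation 1 separately:

(a) The first component is probabilistic and we control it using a concentration of measure argument.

(b) The second component is precisely the expected bias we lower bounded in Section 3; in the following, we provide a uniform matching upper bound on the magnitude of this bias, allowing us to calculate the rate at which $\mathbb{E}[\mathrm{Rev}_{\hat\mu_N}(\beta)]$ approaches $\mathrm{Rev}_\mu(\beta)$.

Lemma 7. For a fixed $\beta \in \mathcal{B}$,
$$P\left[ \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] \right| \ge \epsilon LP_\mu \right] \le 2 \exp\left( -\frac{\epsilon^2 \rho^2 N}{8} \right)$$

Proof. Let us view our estimate as a function of the $N$ samples $X = (X_1, \dots, X_N)$ drawn from $\mu$ to form the empirical distribution, i.e. $\mathrm{Rev}_{\hat\mu_N}(\beta) = g(X)$. We begin by showing that $g$ satisfies a bounded difference property. Consider two particular sequences of observations, $s = (x_1, \dots, x_i, \dots, x_N)$ and $s' = (x_1, \dots, x'_i, \dots, x_N)$, inducing empirical distributions $\hat\mu_N$ and, respectively, $\hat\mu'_N$. Since there is a single sample on which $s$ and $s'$ differ, it follows that:

$$\begin{aligned} \hat\mu_N(x) &= \hat\mu'_N(x), & \forall x \in \mathcal{X} \setminus \{x_i, x'_i\} \\ \left| \hat\mu_N(x) - \hat\mu'_N(x) \right| &\le \frac{1}{N}, & \forall x \in \{x_i, x'_i\} \end{aligned}$$

and, consequently,

$$\left| g(s) - g(s') \right| = \left| \sum_a r_a \min\left\{ B_a, \hat\mu_N\left(Z^a_\beta\right) \right\} - \sum_a r_a \min\left\{ B_a, \hat\mu'_N\left(Z^a_\beta\right) \right\} \right| \le \frac{2 r_{\max}}{N}$$

Using the Bounded Differences Inequality (Proposition 1), it follows that:

$$P\left[ \left| g(X) - \mathbb{E}[g(X)] \right| \ge \epsilon LP_\mu \right] \le 2 \exp\left( -\frac{\epsilon^2 \rho^2 N}{8} \right)$$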
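The bounded difference property used in the proof of Lemma 7 can be checked by brute force on a small instance. The following Python sketch is illustrative only: it assumes the allocation sets of different advertisers are disjoint (no ties), and the function names and the toy data are hypothetical. It swaps each sample for every possible type and records the largest change in the plug-in revenue, then compares it with $2 r_{\max}/N$. The tail bound function simply evaluates the right-hand side stated in Lemma 7.

```python
from math import exp


def plug_in_revenue(samples, bids, budgets, alloc_sets):
    """g(s) = sum_a r_a * min(B_a, mu_hat(Z_a)) for the empirical mu_hat of s."""
    n = len(samples)
    total = 0.0
    for r, b, z in zip(bids, budgets, alloc_sets):
        mass = sum(1 for x in samples if x in z) / n
        total += r * min(b, mass)
    return total


def max_single_swap_change(samples, types, bids, budgets, alloc_sets):
    """Largest |g(s) - g(s')| over all s' that differ from s in one sample."""
    base = plug_in_revenue(samples, bids, budgets, alloc_sets)
    worst = 0.0
    for i in range(len(samples)):
        for x in types:
            swapped = samples[:i] + [x] + samples[i + 1:]
            changed = plug_in_revenue(swapped, bids, budgets, alloc_sets)
            worst = max(worst, abs(changed - base))
    return worst


def lemma7_tail_bound(eps, rho, n):
    """Right-hand side of Lemma 7: 2 * exp(-eps^2 * rho^2 * n / 8)."""
    return 2.0 * exp(-(eps**2) * (rho**2) * n / 8.0)


if __name__ == "__main__":
    samples = list("AABBCCDD")           # N = 8 hypothetical observations
    bids, budgets = [3.0, 1.0], [0.5, 0.5]
    alloc_sets = [{"A", "B"}, {"C"}]     # disjoint allocation sets
    worst = max_single_swap_change(samples, "ABCD", bids, budgets, alloc_sets)
    print(worst, 2 * max(bids) / len(samples))
```

With disjoint allocation sets, one swapped sample moves mass $1/N$ out of at most one set and into at most one other, which is where the factor of 2 in the bound comes from.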

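We now focus on the second term of equation 1, $\left| \mathbb{E}[\mathrm{Rev}_{\hat\mu_N}(\beta)] - \mathrm{Rev}_\mu(\beta) \right|$, and prove an upper bound on this expected bias.

Lemma 8. For any $\beta \in \mathbb{R}^m$ and $N \ge m$,
$$\left| \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] - \mathrm{Rev}_\mu(\beta) \right| \le 4 r_{\max} \sqrt{\frac{m}{N}}$$

Proof. The expected bias is

$$\begin{aligned} \left| \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] - \mathrm{Rev}_\mu(\beta) \right| &= \left| \mathbb{E}\left[ \sum_a r_a \min\left\{ B_a, \hat\mu_N\left(Z^a_\beta\right) \right\} \right] - \sum_a r_a \min\left\{ B_a, \mu\left(Z^a_\beta\right) \right\} \right| \\ &= \left| \mathbb{E}\left[ \sum_a r_a \min\left\{ B_a, \hat\mu_N\left(Z^a_\beta\right) \right\} \right] - \sum_a r_a \min\left\{ B_a, \mathbb{E}\left[\hat\mu_N\left(Z^a_\beta\right)\right] \right\} \right| \\ &\le \sum_a r_a \left| \mathbb{E}\left[ \min\left\{ B_a, \hat\mu_N\left(Z^a_\beta\right) \right\} - \min\left\{ B_a, \mathbb{E}\left[\hat\mu_N\left(Z^a_\beta\right)\right] \right\} \right] \right| \\ &\le \sum_a r_a \left( \frac{1}{\sqrt{2\pi}}\, \sigma\left( \hat\mu_N\left(Z^a_\beta\right) \right) + \frac{3\left(1 - 2\mu\left(Z^a_\beta\right)\right)}{N} \right) \\ &\le \frac{1}{\sqrt{2\pi}} \sum_a r_a\, \sigma\left( \hat\mu_N\left(Z^a_\beta\right) \right) + 3 \frac{m}{N} r_{\max} \end{aligned}$$

where the first equality follows from linearity of expectations, the second from the fact that $\mathbb{E}\left[\hat\mu_N\left(Z^a_\beta\right)\right] = \mu\left(Z^a_\beta\right)$, the first inequality is an application of the triangle inequality, the second inequality follows from Lemma 9 with $\bar{X}_N = \hat\mu_N\left(Z^a_\beta\right)$, and in the last inequality we assumed $N \ge m$.

For ease of notation, let us call $p_a = P\left[ x \in Z^a_\beta \right]$, such that

$$\sigma\left( \hat\mu_N\left(Z^a_\beta\right) \right) = \sqrt{\frac{p_a (1 - p_a)}{N}}$$

In order to find a uniform bound on the expected bias $\left| \mathbb{E}[\mathrm{Rev}_{\hat\mu_N}(\beta)] - \mathrm{Rev}_\mu(\beta) \right|$ (up to constants), we can now simply optimize the above bound over all possible probabilities $p$:

$$\begin{aligned} \left| \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] - \mathrm{Rev}_\mu(\beta) \right| &\le \frac{1}{\sqrt{2\pi N}} \max_{p \ge 0,\ \mathbf{1}^\top p \le 1} \sum_a \sqrt{p_a (1 - p_a)}\, r_a + 3 \frac{m}{N} r_{\max} \\ &\le \frac{r_{\max}}{\sqrt{2\pi N}} \max_{p \ge 0,\ \mathbf{1}^\top p \le 1} \sum_a \sqrt{p_a (1 - p_a)} + 3 \frac{m}{N} r_{\max} \end{aligned}$$

The last optimization problem is maximized when $p_a = \frac{1}{m}$, $\forall a$, yielding the bound

$$\left| \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] - \mathrm{Rev}_\mu(\beta) \right| \le \frac{1}{\sqrt{2\pi}}\, r_{\max} \sqrt{\frac{m}{N}} + 3 r_{\max} \frac{m}{N} \le 4 r_{\max} \sqrt{\frac{m}{N}}$$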

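Before moving on to bounding the error over all possible bid-prices, we note that the result above heavily uses the special structure of the Ad-Display problem by simplifying the bias to a sum of truncated random variables as in Lemma 9; such a simplification would not be possible had we used a more general price structure where advertisers might bid different amounts over the space $\mathcal{X}_a$ of compatible impression types.

Proof of Lemma 6. We show that, if $N \ge \frac{64}{\rho^2 \epsilon^2} \left( m \log(m|\mathcal{X}|) + \log \frac{1}{\delta} \right)$,

$$P\left[ \exists \beta \in \mathcal{B} \ \text{s.t.} \ \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathrm{Rev}_\mu(\beta) \right| \ge \epsilon LP_\mu \right] \le \delta$$

For any $N \ge N_1 = \frac{64 m}{\rho^2 \epsilon^2}$, Lemma 8 guarantees that $\left| \mathbb{E}[\mathrm{Rev}_{\hat\mu_N}(\beta)] - \mathrm{Rev}_\mu(\beta) \right| \le \frac{\epsilon}{2} LP_\mu$, such that

$$P\left[ \exists \beta \in \mathcal{B} \ \text{s.t.} \ \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathrm{Rev}_\mu(\beta) \right| \ge \epsilon LP_\mu \right] \le P\left[ \exists \beta \in \mathcal{B} \ \text{s.t.} \ \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] \right| \ge \frac{\epsilon}{2} LP_\mu \right]$$

as a consequence of equation 1. To conclude our proof, we simply employ a union bound over $\beta \in \mathcal{B}$ (Lemma 4) and use Lemma 7 to show that

$$\begin{aligned} P\left[ \exists \beta \in \mathcal{B} \ \text{s.t.} \ \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] \right| \ge \frac{\epsilon}{2} LP_\mu \right] &\le \binom{m|\mathcal{X}|}{m} P\left[ \left| \mathrm{Rev}_{\hat\mu_N}(\beta) - \mathbb{E}\left[\mathrm{Rev}_{\hat\mu_N}(\beta)\right] \right| \ge \frac{\epsilon}{2} LP_\mu \right] \\ &\le \binom{m|\mathcal{X}|}{m}\, 2 \exp\left( -\frac{1}{32} \epsilon^2 \rho^2 N \right) \\ &\le (m|\mathcal{X}|)^m\, 2 \exp\left( -\frac{1}{32} \epsilon^2 \rho^2 N \right) \end{aligned}$$

The above probability is bounded by $\delta$ for $N \ge N_2 = \frac{64}{\rho^2 \epsilon^2} \left( m \log(m|\mathcal{X}|) + \log \frac{1}{\delta} \right)$. Therefore, taking $N \ge \max\{N_1, N_2\} = N_2$ yields the result.

The following theorem uses the above uniform bound on the estimation error over all bid-prices to show that the sampled problem, in which impressions arrive from $\hat\mu_N$, provides a close representation of the original problem.

Theorem 3. Let $\beta^* \in \arg\min \text{D-}LP_\mu$ and $\hat\beta \in \arg\min \text{D-}LP_{\hat\mu_N}$. With probability at least $1 - 2\delta$,
$$\left| \mathrm{Rev}_\mu(\beta^*) - \mathrm{Rev}_{\hat\mu_N}(\hat\beta) \right| \le \epsilon LP_\mu,$$
for $N \ge \dfrac{256}{\rho^2 \epsilon^2} \left( m \log(m|\mathcal{X}|) + \log \dfrac{1}{\delta} \right)$.

Proof. First we take a union bound over the events $\{ \hat\mu_N \text{ is not } \epsilon/2\text{-good with respect to } LP_\mu \}$ and $\{ \exists \beta, \ |\mathrm{Rev}_\mu(\beta) - \mathrm{Rev}_{\hat\mu_N}(\beta)| \ge \frac{\epsilon}{2} LP_\mu \}$ to show that for $N \ge \frac{256}{\rho^2 \epsilon^2} \left( m \log(m|\mathcal{X}|) + \log \frac{1}{\delta} \right)$ the following hold with probability at least $1 - 2\delta$:

$$LP_{\hat\mu_N} - \mathrm{Rev}_{\hat\mu_N}(\hat\beta) \le \frac{\epsilon}{2} LP_\mu, \qquad (2)$$
$$\left| \mathrm{Rev}_\mu(\beta^*) - \mathrm{Rev}_{\hat\mu_N}(\beta^*) \right| \le \frac{\epsilon}{2} LP_\mu, \qquad (3)$$
$$\left| \mathrm{Rev}_\mu(\hat\beta) - \mathrm{Rev}_{\hat\mu_N}(\hat\beta) \right| \le \frac{\epsilon}{2} LP_\mu. \qquad (4)$$

We have used Lemma 3 for (2) and Lemma 6 for (3) and (4). But then,

$$\begin{aligned} \mathrm{Rev}_\mu(\beta^*) &\ge \mathrm{Rev}_\mu(\hat\beta) - \frac{\epsilon}{4} LP_\mu \\ &\ge \mathrm{Rev}_{\hat\mu_N}(\hat\beta) - \frac{3\epsilon}{4} LP_\mu \\ &\ge \mathrm{Rev}_{\hat\mu_N}(\beta^*) - \frac{5\epsilon}{4} LP_\mu \\ &\ge \mathrm{Rev}_\mu(\beta^*) - \frac{7\epsilon}{4} LP_\mu, \end{aligned}$$

where the first inequality follows from applying Assumption 1 and Lemma 2, the second from equation 4, the third from equation 2, and the fourth from equation 3. It hence follows that

$$\mathrm{Rev}_\mu(\beta^*) + \epsilon LP_\mu \ge \mathrm{Rev}_{\hat\mu_N}(\hat\beta) \ge \mathrm{Rev}_\mu(\beta^*) - \epsilon LP_\mu,$$

or, equivalently, $\left| \mathrm{Rev}_\mu(\beta^*) - \mathrm{Rev}_{\hat\mu_N}(\hat\beta) \right| \le \epsilon LP_\mu$.

We are now ready to prove our main result, which is a direct consequence of the theorem above, and proves that the sampled bid-price control $\hat\beta$ gives a $1 - \epsilon$ approximation to the optimal primal control:

Theorem 4. With probability at least $1 - 2\delta$,
$$\mathrm{Rev}_\mu(\hat\beta) \ge (1 - 2\epsilon) LP_\mu$$
as long as $N \ge \dfrac{256}{\rho^2 \epsilon^2} \left( m \log(m|\mathcal{X}|) + \log \dfrac{1}{\delta} \right)$.

Proof. Using the same union bound over events as for Theorem 3,

$$\begin{aligned} LP_\mu - \mathrm{Rev}_\mu(\hat\beta) &\le \mathrm{Rev}_\mu(\beta^*) - \mathrm{Rev}_\mu(\hat\beta) + \frac{\epsilon}{4} LP_\mu \\ &\le \left| \mathrm{Rev}_\mu(\beta^*) - \mathrm{Rev}_{\hat\mu_N}(\hat\beta) \right| + \left| \mathrm{Rev}_{\hat\mu_N}(\hat\beta) - \mathrm{Rev}_\mu(\hat\beta) \right| + \frac{\epsilon}{2} LP_\mu \\ &\le 2 \epsilon LP_\mu, \end{aligned}$$

where we have used Lemma 2 in the first inequality, the triangle inequality in the second, and Theorem 3 and equation 4 in the third inequality.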
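To get a feel for the sample sizes in Lemma 6 and Theorem 4, the bounds can be evaluated numerically. The sketch below is a hedged worked example: the function names are hypothetical, the constants 64 and 256 are taken from the lemma and theorem statements, and the feature space is assumed to consist of $d$ binary features so that $\log |\mathcal{X}| = d \log 2$. The logarithm of $|\mathcal{X}|$ is passed in directly so that $|\mathcal{X}|$ itself never has to be formed.

```python
from math import ceil, log


def lemma6_samples(eps, rho, m, log_num_types, delta):
    """N >= 64 / (rho^2 eps^2) * (m * log(m |X|) + log(1 / delta))."""
    log_m_types = log(m) + log_num_types  # log(m * |X|)
    return ceil(64.0 / (rho**2 * eps**2) * (m * log_m_types + log(1.0 / delta)))


def theorem4_samples(eps, rho, m, log_num_types, delta):
    """N >= 256 / (rho^2 eps^2) * (m * log(m |X|) + log(1 / delta))."""
    log_m_types = log(m) + log_num_types
    return ceil(256.0 / (rho**2 * eps**2) * (m * log_m_types + log(1.0 / delta)))


if __name__ == "__main__":
    # Hypothetical parameters: 10 advertisers, eps = 0.1, rho = 0.5, delta = 0.05.
    for d in (100, 200, 300):
        n = theorem4_samples(0.1, 0.5, 10, d * log(2), 0.05)
        print(d, n)
```

The required number of samples grows linearly in $d$, while the number of impression types $2^d$ grows exponentially, which is the gap the analysis exploits.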

Finally, we relate our main theorem back to $\mathrm{OPT}(T, BT)$. Theorem 4 is a statement regarding the performance of our algorithm as $T \to \infty$. However, it is straightforward to establish that this result holds for finite $T$, as long as the horizon $T$ the control is applied to is at least approximately as large as the size of the learning horizon $N$.

Corollary 1. Let $N$ satisfy the condition in Theorem 4. Let $\mathrm{Rev}_T(\hat\beta)$ be the revenues garnered from using the resulting bid-price control on the following $T$ samples. Then, with probability $1 - 3\delta$,
$$\mathrm{Rev}_T(\hat\beta) \ge (1 - 3\epsilon)\, \mathrm{OPT}(T, BT),$$
for $T \ge \dfrac{64}{\rho^2 \epsilon^2} \left( m \log(m|\mathcal{X}|) + \log \dfrac{1}{\delta} \right)$.

Proof. Note that $\mathrm{Rev}_T(\hat\beta) = T\, \mathrm{Rev}_{\hat\mu_T}(\hat\beta)$. By applying a union bound and using Lemma 6 and Theorem 4, we can guarantee that, with probability at least $1 - 3\delta$,

$$\mathrm{Rev}_{\hat\mu_T}(\hat\beta) \ge \mathrm{Rev}_\mu(\hat\beta) - \epsilon LP_\mu$$

and

$$\mathrm{Rev}_\mu(\hat\beta) \ge (1 - 2\epsilon) LP_\mu$$

Hence,

$$\mathrm{Rev}_T(\hat\beta) \ge T \left( \mathrm{Rev}_\mu(\hat\beta) - \epsilon LP_\mu \right) \ge T (1 - 3\epsilon) LP_\mu \ge (1 - 3\epsilon)\, \mathrm{OPT}(T, BT),$$

where the last inequality follows from Lemma 1.

5 Conclusions

We have analyzed a class of NRM models, specifically Ad-Display allocation problems, in which the sample complexity of learning a high-dimensional demand object scales linearly with its underlying dimension, whereas previous results suggested the best dependence was exponential. Moreover, we have established a lower bound on this sample complexity that is tight with respect to our analysis up to a factor of $\left( \frac{r_{\max}}{r_{avg}} \right)^2 \log(m|\mathcal{X}|)$. There are several directions of future research that we find particularly tempting:

1. While the ratio $LP_\mu / r_{\max}$ in our sample complexity is inescapable due to our lower bound, there is merit in understanding how it scales with respect to $m$ and $|\mathcal{X}|$. Clearly, one can formulate adversarial instances where this ratio is arbitrarily large. However, we conjecture that for reasonable probabilistic models describing our Ad-Display instances, $LP_\mu$ scales approximately like $O(1)\, r_{avg}$.

2. One way to interpret our result is that we have made a low rank assumption on the problem structure which has resulted in a revenue function that is easier to learn than it would be in more general models. An interesting direction is to ask whether other low rank assumptions can yield similar results for a broader class of allocation problems.

3. Lastly, a natural extension to our model is to think of advertisers as also arriving iid from some distribution of features. It would be very interesting to see whether a bid-price policy can be built on sampling both advertisers and impressions, and whether such an approach could lead to better dependence on $m$ in the sample complexity.

A Proofs for Section 2

Proof of Lemma 1

Proof Consider an $\epsilon$-optimal control $o$ for $\mathrm{OPT}(T, B_T)$ (which must exist for arbitrary $\epsilon > 0$). Define the frequency counts for each impression type, $C_T(x) = |\{X_t,\ t \in [T] \text{ s.t. } X_t = x\}|$, as well as the linear program

$$\mathrm{LP}(T, B_T) = \max \sum_{a}\sum_{x \in X} r_a\,\mathbf{1}(x \in X_a)\,z(x, a)\,C_T(x)$$

subject to

$$\sum_{x \in X} z(x, a)\,C_T(x) \le B_T \quad \forall a, \qquad \sum_{a} z(x, a) \le 1 \quad \forall x \in X, \qquad z \ge 0.$$

Fix some realization $\omega$ of impression arrivals, and define $\hat z$ such that

$$\hat z(x, a) = \frac{\sum_{t: X_t = x} o_t(a)\,\min\left\{B_T, \sum_t o_t(a)\right\}}{C_T(x)\,\sum_t o_t(a)}.$$

Then $\hat z$ achieves the same revenues as $o$ in the original problem, while being feasible for $\mathrm{LP}_\mu(T, B_T)$. Taking expectations yields $\mathrm{OPT}(T, B_T) - \epsilon \le E[\mathrm{LP}_\mu(T, B_T)]$. Since the choice of $\epsilon$ was arbitrary, $\mathrm{OPT}(T, B_T) \le E[\mathrm{LP}_\mu(T, B_T)]$. The lemma follows from the fact that $E[\mathrm{LP}_\mu(T, B_T)] \le T\,\mathrm{LP}_\mu$ by Jensen's inequality.

Proof of Lemma 2

Proof Through randomly perturbing the allocation rewards by an arbitrarily small amount $\delta(x, a)$ before we solve the dual problem, we can guarantee that, with probability 1, there can be at most $m$ ties in the adjusted bids (this is a standard argument in the literature; for a more in-depth treatment of this issue, see Agrawal et al. [2009]). The loss from setting all allocations with ties to 0 is at most

$$\mathrm{LP}_\nu - \mathrm{Rev}_\nu(z(\beta_\nu)) \le m\,r_{\max}\,\max_{x \in X}\nu(x) \le \epsilon\,\mathrm{LP}_\mu.$$

Proof of Lemma 3

We make use in the proof of the following simple bound on the tails of a binomial random variable.

Fact 1 (Dudley [2002]) $P[\mathrm{Bin}(N, p) \ge k] \le \left(\frac{Np}{k}\right)^k e^{k - Np}$ if $k \ge Np$, where $\mathrm{Bin}(N, p)$ is a binomial distribution with $N$ trials and success probability $p$.
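Fact 1 can be sanity-checked numerically. The following Python sketch assumes the upper-tail reading of the bound, $P[\mathrm{Bin}(N, p) \ge k] \le (Np/k)^k e^{k - Np}$ for $k \ge Np$, and compares it against the exact binomial tail. The helper names `binom_upper_tail` and `fact1_bound` are illustrative only and do not appear in the paper.

```python
import math


def binom_upper_tail(n, p, k):
    """Exact P[Bin(n, p) >= k], summed directly from the binomial pmf."""
    return sum(
        math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
        for j in range(k, n + 1)
    )


def fact1_bound(n, p, k):
    """Assumed form of Fact 1: (n p / k)^k * exp(k - n p), valid for k >= n p."""
    if k < n * p or k < 1:
        raise ValueError("the bound is stated only for k >= n * p (and k >= 1)")
    return (n * p / k) ** k * math.exp(k - n * p)


if __name__ == "__main__":
    n, p = 200, 0.05  # mean n * p = 10
    for k in (10, 15, 20, 30):
        exact = binom_upper_tail(n, p, k)
        bound = fact1_bound(n, p, k)
        print(f"k={k:3d}  exact tail={exact:.3e}  bound={bound:.3e}")
```

At $k = Np$ the bound equals 1 and is therefore vacuous; it decays rapidly once $k$ exceeds the mean by a constant factor, which is the regime in which the proof of Lemma 3 applies it.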

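Proof Let us bound the probability that $\hat\mu$ is $\epsilon$-bad:

$$P\left[\|\hat\mu\|_\infty \ge \frac{\rho\epsilon}{m}\right] \le \sum_{x \in X} P\left[\hat\mu(x) \ge \frac{\rho\epsilon}{m}\right] \le |X|\,P\left[\mathrm{Bin}\left(N, \frac{\rho\epsilon}{m^2}\right) \ge \frac{N\rho\epsilon}{m}\right] \le |X|\left(\frac{1}{m}\right)^{\frac{N\rho\epsilon}{m}} e^{\frac{N\rho\epsilon}{m}\left(1 - \frac{1}{m}\right)} \le |X|\left(\frac{e}{m}\right)^{\frac{N\rho\epsilon}{m}},$$

where the first inequality is a union bound over $x \in X$ and the third inequality uses Fact 1. In order to make this probability lower than some $\delta$, one must hence choose

$$N = \frac{4m}{\rho\epsilon}\,\frac{\log\frac{|X|}{\delta}}{\log m} = \frac{4}{\rho}\,\frac{m}{\epsilon}\,\frac{\log\frac{|X|}{\delta}}{\log m}.$$

B Lemmas for Section 4

Lemma 9 Consider a random variable $X_N = \frac{1}{N}\sum_{i=1}^N Y_i$, where $Y$ is a sequence of iid Bernoulli r.v.'s with $E[Y_i] = \mu$, $\sigma(Y_i) = \sigma$, and a threshold $b \ge 0$. Then,

1. $\left|E\left[\min\{b, X_N\} - \min\{b, \mu\}\right]\right| \le \frac{\sigma}{\sqrt{2\pi N}} + \frac{3(1 - 2\mu)}{N}$.

2. In the special case that $b = \mu$, $\left|E\left[\min\{b, X_N\} - \min\{b, \mu\}\right]\right| \ge \frac{\sigma}{\sqrt{2\pi N}} - \frac{3(1 - 2\mu)}{N}$.

Proof We will prove both of these results by approximating $X_N$ with a Gaussian r.v. for which the computation of the bias is easy. For part 1, note that, by the triangle inequality,

$$\left|E\left[\min\{b, X_N\} - \min\{b, \mu\}\right]\right| \le \left|E\left[\min\{b, Z\} - \min\{b, \mu\}\right]\right| + \left|E\left[\min\{X_N - \mu, b - \mu\} - \min\{Z - \mu, b - \mu\}\right]\right|,$$

where $Z$ is a Gaussian r.v. with mean and variance identical to $X_N$'s. By Lemma 10, we can bound the second term from above by $\frac{3(1 - 2\mu)}{N}$. We will now give precise bounds on the first term (where we have replaced $X_N$ with $Z$), which will asymptotically dominate the second.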

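Case 1: $b \le \mu$. Then,

$$\left|E\left[\min\{b, Z\} - \min\{b, \mu\}\right]\right| = \left|E\left[(Z - b)^-\right]\right| \le \left|E\left[(Z - \mu)^-\right]\right| = E\left[(Z - \mu)^+\right] = \frac{\sigma(X_N)}{\sqrt{2\pi}} = \frac{\sigma}{\sqrt{2\pi N}},$$

where we have used the assumption that $b \le \mu$ in the first inequality, and the fact that, if $Z$ is a 0-mean Gaussian, $E[Z^+] = \frac{\sigma(Z)}{\sqrt{2\pi}}$.

Case 2: $b > \mu$. For this case,

$$\left|E\left[\min\{b, Z\} - \min\{b, \mu\}\right]\right| = \left|E\left[\min\{Z - \mu, b - \mu\}\right]\right| = \left|\frac{1}{2}E\left[Z - \mu \mid Z - \mu \le 0\right] + \frac{1}{2}E\left[\min\{Z - \mu, b - \mu\} \mid Z - \mu > 0\right]\right| \le \frac{1}{2}\left|E\left[Z - \mu \mid Z - \mu \le 0\right]\right| = \left|E\left[\min\{Z - \mu, 0\}\right]\right| = \frac{\sigma}{\sqrt{2\pi N}},$$

where we have used the fact that, in the first inequality, $E[\min\{Z - \mu, b - \mu\} \mid Z - \mu > 0]$ is positive due to $b > \mu$ and less than $|E[Z - \mu \mid Z - \mu \le 0]|$ by the symmetry of $Z$.

Proving part 2 of the lemma is similar, except that now we bound

$$\left|E\left[\min\{b, X_N\} - \min\{b, \mu\}\right]\right| = \left|E\left[(X_N - \mu)^-\right]\right| \ge \left|E\left[(Z - \mu)^-\right]\right| - \left|E\left[(X_N - \mu)^- - (Z - \mu)^-\right]\right| \ge \frac{\sigma}{\sqrt{2\pi N}} - \frac{3(1 - 2\mu)}{N},$$

where we have used the reverse triangle inequality in the first inequality, and for the second, Lemma 10.

Lemma 10 Consider a random variable $X_N = \frac{1}{N}\sum_{i=1}^N Y_i$, where $Y$ is a sequence of iid Bernoulli r.v.'s with $E[Y_i] = \mu$, $\sigma(Y_i) = \sigma$. Then there exists a Gaussian r.v. $Z$ such that $E[Z] = E[X_N]$, $\sigma(Z) = \sigma(X_N)$ and

$$\left|E\left[\min\{X_N - \mu, b - \mu\} - \min\{Z - \mu, b - \mu\}\right]\right| \le \frac{3(1 - 2\mu)}{N}.$$

Proof For notational convenience, let $f(x) = \min\{x - \mu, b - \mu\}$ and notice that $f$ is a 1-Lipschitz function. Moreover, let $W = \frac{1}{\sigma\sqrt{N}}\sum_{i=1}^N (Y_i - \mu)$ be the transformation of $X_N$ to an r.v. with standard normal mean and variance.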

The expression in the statement of the lemma becomes

$$\left|E\left[f(X_N) - f(Z)\right]\right| \le \sup_{h:\ \|h\|_{\mathrm{Lip}} \le 1}\left|E\left[h(X_N) - h(Z)\right]\right| = d_W(X_N, Z) = \frac{\sigma}{\sqrt{N}}\,d_W\left(W, \frac{\sqrt{N}}{\sigma}(Z - \mu)\right) \le \frac{\sigma}{\sqrt{N}}\cdot\frac{3E\left[(|Y_i - \mu|/\sigma)^3\right]}{\sqrt{N}} = \frac{3(1 - 2\mu)}{N},$$

where we have used the Lipschitz property of $f$ in the first inequality, used the definition of the Wasserstein metric in the first equality, applied Proposition 2 for the second inequality, and used the fact that $E[|Y_i - \mu|^3] = \mu(1 - \mu)^3 - (1 - \mu)\mu^3$ for the last equality.

We also state without proof the Bounded Differences Inequality (a proof of this inequality can be found in Motwani and Raghavan [1995]):

Proposition 1 (Bounded Differences Inequality) Suppose that $f: \mathbb{R}^n \to \mathbb{R}$ satisfies, for all $1 \le i \le n$,

$$|f(x_1, \ldots, x_i, \ldots, x_n) - f(x_1, \ldots, x_i', \ldots, x_n)| \le L.$$

Consider a vector $X = (X_1, \ldots, X_n)$ with independent components. Then,

$$P\left[|f(X) - E[f(X)]| \ge t\right] \le 2\exp\left(-\frac{t^2}{2nL^2}\right).$$

Finally, we state the following result, which establishes a finite-sample Central Limit Theorem convergence result under the Wasserstein metric; the result is derived using Stein's method and a proof can be found in Chen et al. [2011].

Proposition 2 Consider a sequence $X_1, \ldots, X_N$ of independent random variables with $E[X_i] = 0$, $E[X_i^2] = 1$ and $E[|X_i|^3] < \infty$. Then,

$$d_W(W, Z) \le \frac{3}{N^{3/2}}\sum_{i=1}^N E\left[|X_i|^3\right],$$

where $d_W$ is the Wasserstein distance defined as

$$d_W(U, V) = \sup_{h:\ \|h\|_{\mathrm{Lip}} \le 1}\left|E\,h(U) - E\,h(V)\right|,$$

$W = \frac{1}{\sqrt{N}}\sum_{i=1}^N X_i$ and $Z$ is a Gaussian r.v. with mean and variance equal to $W$'s.

References

Shipra Agrawal, Zizhuo Wang, and Yinyu Ye. A dynamic near-optimal algorithm for online linear programming, 2009.

Anand Bhalgat, Jon Feldman, and Vahab Mirrokni. Online allocation of display ads with smooth delivery. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '12, pages 1213-1221, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1462-6. doi: 10.1145/2339530.2339720. URL http://doi.acm.org/10.1145/2339530.2339720.

N. Buchbinder and J. Naor. Online primal-dual algorithms for covering and packing. Mathematics of Operations Research, 34:270-286, May 2009.

Louis H.Y. Chen, Larry Goldstein, and Qi-Man Shao. Normal Approximation by Stein's Method. Springer-Verlag, 2011. ISBN 978-3-642-15007-4.

D.F. Ciocan and Vivek Farias. Model predictive control for dynamic resource allocation. Mathematics of Operations Research, 37(3):501-525, 2012.

Nikhil R. Devanur and Thomas P. Hayes. The adwords problem: online keyword matching with budgeted bidders under random permutations. In Proceedings of the 10th ACM conference on Electronic commerce, EC '09, pages 71-78, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-458-4. doi: 10.1145/1566374.1566384. URL http://doi.acm.org/10.1145/1566374.1566384.

R.M. Dudley. Real Analysis and Probability. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2002. ISBN 9780521007542. URL http://books.google.com/books?id=7uut7uzvi0c.

Jon Feldman, Nitish Korula, Vahab Mirrokni, S. Muthukrishnan, and Martin Pál. Online ad assignment with free disposal. In Stefano Leonardi, editor, Internet and Network Economics, volume 5929 of Lecture Notes in Computer Science, pages 374-385. Springer Berlin Heidelberg, 2009. ISBN 978-3-642-10840-2. doi: 10.1007/978-3-642-10841-9_34. URL http://dx.doi.org/10.1007/978-3-642-10841-9_34.

Jon Feldman, Monika Henzinger, Nitish Korula, Vahab S. Mirrokni, and Cliff Stein. Online stochastic packing applied to display ad allocation. In Proceedings of the 18th annual European conference on Algorithms: Part I, ESA '10, pages 182-194, Berlin, Heidelberg, 2010. Springer-Verlag. ISBN 3-642-15774-2, 978-3-642-15774-5. URL http://dl.acm.org/citation.cfm?id=1888935.1888957.

G. Gallego and G. van Ryzin. A multiproduct dynamic pricing problem and its applications to network yield management. Operations Research, 45(1):24-41, 1997.

A. Mehta, A. Saberi, U. Vazirani, and V. Vazirani. Adwords and generalized on-line matching. In FOCS '05: Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pages 264-273. IEEE Computer Society, 2005.

Marco Molinaro and R. Ravi. In Artur Czumaj, Kurt Mehlhorn, Andrew M. Pitts, and Roger Wattenhofer, editors, ICALP (1), Lecture Notes in Computer Science, pages 701-713. Springer. ISBN 978-3-642-31593-0.

Rajeev Motwani and Prabhakar Raghavan. Randomized algorithms. Cambridge University Press, New York, NY, USA, 1995. ISBN 0-521-47465-5, 9780521474658.

Kalyan Talluri and Garrett Van Ryzin. An analysis of bid-price controls for network revenue management. Manage. Sci., 44(11):1577-1593, November 1998. ISSN 0025-1909. doi: 10.1287/mnsc.44.11.1577. URL http://dx.doi.org/10.1287/mnsc.44.11.1577.