Predicting Advertiser Bidding Behaviors in Sponsored Search by Rationality Modeling




Haifeng Xu
Centre for Computational Mathematics in Industry and Commerce
University of Waterloo
Waterloo, ON, Canada
haifeng.ustc@gmail.com

Diyi Yang
Dept. of Computer Science
Shanghai Jiao Tong University
Shanghai, 200240, P. R. China
yangdiyi@apex.sjtu.edu.cn

Bin Gao
Microsoft Research Asia
13F, Bldg 2, No. 5, Danling St
Beijing, 100080, P. R. China
bingao@microsoft.com

Tie-Yan Liu
Microsoft Research Asia
13F, Bldg 2, No. 5, Danling St
Beijing, 100080, P. R. China
tyliu@microsoft.com

ABSTRACT

We study how an advertiser changes his/her bid prices in sponsored search, by modeling his/her rationality. Predicting the bid changes of advertisers with respect to their campaign performances is a key capability of search engines, since it can be used to improve the offline evaluation of new advertising technologies and the forecast of future revenue of the search engine. Previous work on advertiser behavior modeling heavily relies on the assumption of perfect advertiser rationality; however, in most cases, this assumption does not hold in practice. Advertisers may be unwilling, incapable, and/or constrained to achieve their best response. In this paper, we explicitly model these limitations in the rationality of advertisers, and build a probabilistic advertiser behavior model from the perspective of a search engine. We then use the expected payoff to define the objective function for an advertiser to optimize given his/her limited rationality. By solving the optimization problem with Monte Carlo, we get a prediction of mixed bid strategy for each advertiser in the next period of time. We examine the effectiveness of our model both directly using real historical bids and indirectly using revenue prediction and click number prediction. Our experimental results based on the sponsored search logs from a commercial search engine show that the proposed model can provide a more accurate prediction of advertiser bid behaviors than several baseline methods.
Categories and Subject Descriptors
H.3.5 [Information Systems]: Information Storage and Retrieval - On-line Information Services

Keywords
Advertiser modeling, rationality, sponsored search, bid prediction.

* This work was performed when the first and the third authors were interns at Microsoft Research Asia.

Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2013, May 13-17, 2013, Rio de Janeiro, Brazil. ACM 978-1-4503-2035-1/13/05.

1. INTRODUCTION

Sponsored search has become a major means of Internet monetization, and has been the driving power of many commercial search engines. In a sponsored search system, an advertiser creates a number of ads and bids on a set of keywords (with certain bid prices) for each ad. When a user submits a query to the search engine, and if the bid keyword can be matched to the query, the corresponding ad will be selected into an auction process. Currently, the Generalized Second Price (GSP) auction [10] is the most commonly used auction mechanism, which ranks the ads according to the product of bid price and ad click probability^1 and charges an advertiser if his/her ad wins the auction (i.e., his/her ad is shown in the search result page) and is clicked by users [13].

Generally, an advertiser has his/her goal when creating the ad campaign. For instance, the goal might be to receive 500 clicks on the ad during one week. However, the way of achieving this goal might not be smooth. For example, it is possible that after one day, the ad has only received 10 clicks. In this case, in order to improve the campaign performance, the advertiser may have to increase the bid price in order to increase the opportunity for his/her ad to win future auctions, and thus to increase the chance for the ad to be presented to users and to be clicked.^2
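As a rough illustration of the GSP rule described above, the following Python sketch ranks ads by bid times predicted click probability and computes a per-click price for each shown ad. The per-click price used here (the next rank score divided by the winner's own click probability) is the standard GSP pricing rule that the paper states later in Section 3.1. The function name `gsp_auction`, the data layout, and the numbers in the usage note are hypothetical and are not taken from the paper; the reserve score is treated as a floor on the rank score.

```python
from typing import List, Sequence, Tuple


def gsp_auction(
    bids: Sequence[float],
    click_probs: Sequence[float],
    num_slots: int,
    reserve_score: float = 0.0,
) -> List[Tuple[int, float]]:
    """Return (advertiser index, price per click) for the shown ads, top slot first.

    Hypothetical helper: ads are ranked by bid * predicted click probability,
    and only ads whose rank score exceeds the reserve score are eligible.
    """
    scores = [b * s for b, s in zip(bids, click_probs)]
    eligible = [i for i in range(len(bids)) if scores[i] > reserve_score]
    ranked = sorted(eligible, key=lambda i: scores[i], reverse=True)

    results = []
    for pos, i in enumerate(ranked[:num_slots]):
        # The ad pays the smallest amount that keeps its rank: the score of the
        # next eligible ad (or the reserve score) divided by its own click probability.
        if pos + 1 < len(ranked):
            next_score = scores[ranked[pos + 1]]
        else:
            next_score = reserve_score
        results.append((i, next_score / click_probs[i]))
    return results
```

For example, `gsp_auction([2.0, 1.5, 1.0], [0.10, 0.20, 0.10], num_slots=2, reserve_score=0.05)` puts advertiser 1 in the top slot (its rank score of about 0.3 is the highest) and advertiser 0 in the second slot; each price is strictly below the payer's own bid, which is the second-price property.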
Predicting how the advertisers change their bid prices is a key capability of a search engine, since it can be used to deal with the so-called second order effect in online advertising [13] when evaluating novel advertising technologies and forecasting future revenue of search engines. For instance, suppose the search engine wants to test a novel algorithm for bid keyword suggestion^3 [7]. Given that the online experiments are costly (e.g., unsuccessful online experiments will lead to revenue loss of the search engine), the algorithm will usually be tested based on the historical logs first to see its effectiveness (a.k.a., offline experiment). However, in many cases, even if the algorithm works quite well in offline experiment, it may perform badly after being deployed online. One of the reasons is that the advertisers might change their bid prices in response to the changes of their campaign performances caused by the deployed new algorithm. Therefore, the experiments based on the historical bid prices will be different from those on online traffic. To tackle this problem, one needs a powerful advertiser behavior model to predict the bid price changes.

^1 Usually a reserve score is set and the ads whose scores are greater than the reserve score are shown.
^2 Note that the advertiser may also choose to revise the ad description, bid extra keywords, and so on. However, among these actions, changing the bid price is the simplest and the most commonly used method by advertisers. Please also note that since GSP is not incentive compatible, advertisers might not bid their true values and changing bid prices is their common behavior.
^3 The same thing will happen when we evaluate other algorithms like traffic estimation, ad click prediction, and auction mechanism.

In the literature, there have been a number of researches [4] [5] [22] [19] [2] [17] [3] that model how advertisers determine their bid prices, and how their bid strategies influence the equilibrium of the sponsored search system. For example, Varian [19] assumes that the advertisers bid the amount at which their value per click equals the incremental cost per click to maximize their utilities. The authors of [2] and [17] study how to estimate value per click, by assuming advertisers are on the locally envy-free equilibrium, and assuming the distributions of all the advertisers' bids are independent and identically distributed. Most of the above researches rely highly on the assumptions of perfect advertiser rationality and full information access^4, i.e., advertisers have good knowledge about their utilities and are capable of effectively optimizing the utilities (i.e., take the best response). However, as we argue in this paper, this is usually not true in practice. In our opinion, real-world advertisers have limitations in accessing the information about their competitors, and have different levels of rationality. In particular, an advertiser may be unwilling, incapable, or constrained to achieve his/her best response. As a result, some advertisers frequently adjust the bid prices according to their recent campaign performances, while some other advertisers always keep the bid unchanged regardless of the campaign performances; some advertisers have good sense of choosing the appropriate bid prices (possibly with the help of campaign analysis tools [14] or third-party ad agencies), while some other advertisers choose bid prices at random.
^4 Please note that some of these works take a Bayesian approach; however, they still assume that the priors of the value distributions are publicly known.

To better describe the above intuition, we explicitly model the rationality of advertisers from the following three aspects:

Willingness represents the propensity an advertiser has to optimize his/her utility. Advertisers who care little about their ad campaigns and advertisers who are very serious about the campaign performance will have different levels of willingness.

Capability describes the ability of an advertiser to estimate the bid strategies of his/her competitors and take the best-response action on that basis. An experienced advertiser is usually more capable than an inexperienced advertiser; an advertiser who hires a professional ad agency is usually more capable than an advertiser who adjusts bid prices by himself/herself.

Constraint refers to the constraints that prevent an advertiser from adopting a bid price even if he/she knows that this bid price is the best response for him/her. The constraint usually (although not only) comes from the lack of remaining budget.

With the above notions, we propose the following model to describe how advertisers change their bid prices, from the perspective of the search engine.^5 First, an advertiser has a certain probability to optimize his/her utility or not, which is modeled by the willingness function. Second, if the advertiser is willing to make changes, he/she will estimate the bid strategies of his/her competitors. Based on the estimation, he/she can compute the expected payoff (or utility) and use it as an objective function to determine his/her next bid price. This process is modeled by the capability function. By simultaneously considering the optimization processes of all the advertisers, we can effectively compute the best bid prices for every advertiser. Third, given the optimal bid price, an advertiser will check whether he/she is able to adopt it according to some constraints. This is modeled by the constraint function. Please note that the willingness, capability, and constraint functions are all parametric.
By fitting the output of our proposed model to the real bid change logs (obtained from commercial search engines), we will be able to learn these parameters, and then use the learned model to predict the bid behavior change in the future. We have tested the effectiveness of the proposed model using real data. The experimental results show that the proposed model can predict the bid changes of advertisers in a more accurate manner than several baseline methods.

To sum up, the contributions of our work are listed below. First, to the best of our knowledge, this is the first advertiser behavior model in the literature that considers different levels of rationality of advertisers. Second, we model advertiser behaviors using a parametric model, and apply machine learning techniques to learn the parameters in the model. This is a good example of leveraging machine learning in game theory to avoid its unreasonable assumptions. Third, our proposed model leads to very accurate bid prediction. In contrast, as far as we know, most of previous research focuses on estimating value per click, but not predicting bid prices. Therefore, our work has more direct value to search engines, given that bid prediction is a desired ability of search engines as aforementioned.

The rest of the paper is organized as follows. In Section 2, we introduce the notations and describe the willingness, capability, and constraint functions. We present the framework of the bid strategy prediction model in Section 3. In Section 4, we introduce the efficient numerical algorithm of the model. In Section 5, we present the experimental results on real data. We summarize the related work in Section 6, and in the end we conclude the paper and present some insights about future work in Section 7.

2. ADVERTISER RATIONALITY

As mentioned in the introduction, how an advertiser adjusts his/her bid is related to his/her rationality. In our opinion, there are three aspects to be considered when modeling the rationality of an advertiser: willingness, capability, and constraint. In this section, we introduce some notations for sponsored search auctions, and then describe the models for these rationality aspects.
2.1 Notations

We consider the keyword auction in sponsored search. For simplicity, we will not consider connections between different ad campaigns and we assume each advertiser only has one ad and bids on just one keyword for it. That is, the auction participants are the keyword-ad pairs. Advertisers are assumed to be risk-neutral.^6

^5 That is, the model is to be used by the search engine to predict advertisers' behavior, but not by the advertisers to guide their bidding strategies.
^6 This assumption will result in a uniform definition of utility functions for all the advertisers. However, our result can be naturally extended to the case where advertisers' different risk preferences are considered.

We use $i$ ($i = 1, \dots, I$) to index the advertisers, and consider advertiser $l$ as the default advertiser of our interest. Suppose in one auction the advertisers compete for $J$ ad slots. In practice, the search engine usually introduces a reserve score to optimize its revenue. Only those ads whose rank scores are above this reserve score will be shown to users. To ease our discussion, we regard the reserve score $r$ as a virtual advertiser in the auction. We use $a_{i,j}$ to denote the click-through rate (CTR) of advertiser $i$'s ad when it is placed at position $j$. Similar to the setting in [2][17], we assume $a_{i,j}$ to be separable. That is, $a_{i,j} = \gamma_i \alpha_j$, where $\gamma_i$ is the ad effect and $\alpha_j$ is the position effect. We let $\alpha_j = 0$ when $j > J$. The sponsored search system will predict the click probability [11] of an ad and use it as a factor to rank the ads in the auction. We use $s_i$ to denote the predicted click probability of advertiser $i$'s ad if it is placed in the first ad slot. Note that both $a_{i,j}$ and $s_i$ are random variables [2], since they may be influenced by many dynamic factors such as the attributes of the query and the user who issues the query.

We assume all the advertisers share the same bid strategy space $\Omega$, which consists of $B$ different discrete bid prices denoted by $b_n$, $n = 1, \dots, B$. Furthermore, we denote the strategy of advertiser $l$ as $\pi_l = (\pi_{l,1}, \dots, \pi_{l,B})$, which is a mixed strategy. It means that $l$ will use bid strategy $b_n$ with a probability of $\pi_{l,n}$, $n = 1, \dots, B$.

We assume advertiser $l$ will estimate both the configuration of his/her competitors and their strategies in order to find his/her own best response. We use $S_l$ (including $l$) to indicate the set of advertisers who are regarded by advertiser $l$ as the participants of the auction and use $S_{-l}$ (excluding $l$) to indicate the set of competitors of $l$. We denote $\pi_i^{(l)}$ as $l$'s estimated bid strategy for a competitor $i$ ($i \ne l$), and denote $l$'s own best-response strategy as $\pi_l^{(l)}$. Note that both $S_l$ and $\pi_i^{(l)}$ ($i \ne l$) are random:

(i) $S_l$ is a random set due to the uncertainty in the auction process: a) the set of participants of the auction is dynamic [17]; b) in practice $l$ never knows exactly who is competing with him/her since such information is not publicly available.

(ii) $\pi_i^{(l)}$ ($i \ne l$) is a random vector due to $l$'s incomplete information and our uncertainty on $l$'s estimation. More intuitions about $\pi_i^{(l)}$ will be explained in the modeling of the capability function (see Section 2.3).

To ease our discussion, we now transform the uncertainty of $S_l$ to the uncertainty in bid prices, as shown below. That is, we regard all the other advertisers as the competitors of $l$ and add the zero bid price (denoted by $b_0$) to extend the bid strategy space. The extended bid strategy space is represented by $\Omega' = \Omega \cup \{b_0\}$. If an advertiser is not a real competitor of $l$, we regard his/her bid price to be zero. According to the above discussion, $S_l$ will be the whole advertiser set with the set size $I$. Thus, we will only consider the uncertainty of bid prices in the rest of the paper.

2.2 Willingness

Willingness represents the propensity of an advertiser to optimize his/her utility, which is modeled as a probability. We model willingness as a logistic regression function $W_l(x_t^{(l)})$. Here the input $x_t^{(l)} = (x_{t,1}^{(l)}, \dots, x_{t,H}^{(l)})$ is a feature vector ($H$ is the number of features) extracted for advertiser $l$ at period $t$, and the output is a real number in $[0,1]$ representing the probability that $l$ will optimize his/her utility.^7 That is, advertiser $l$ with feature vector $x_t^{(l)}$ will have a probability of $W_l(x_t^{(l)})$ to optimize his/her utility, and a probability of $1 - W_l(x_t^{(l)})$ to take no action.

^7 Note that "willing to optimize" does not always mean a change of bid. Possibly, an advertiser attempts to optimize his/her utility, but finally finds that his/her previous bid is already the best choice. In this case, he/she will keep the bid unchanged but we still regard it as "willing to optimize".

In order to extract the feature vector $x_t^{(l)}$, we split the historical auction logs into $T$ periods (e.g., $T$ days). For each period $t$ ($t = 1, \dots, T$), $y_t^{(l)}$ indicates whether the bid was changed in period $t + 1$. If the bid was changed, $y_t^{(l)} = 1$; otherwise, $y_t^{(l)} = 0$. With this data, the following features are extracted:

(i) The number of bid changes before $t$. The intuition is that an advertiser who changed the bid more frequently in the past will also have a higher probability to make changes in the next period.

(ii) The number of periods that an advertiser has kept the bid unchanged until $t$. Intuitively, an advertiser who has kept the bid unchanged for a long time may have a higher probability to continue keeping the bid unchanged.

(iii) The number of different bid values used before $t$. The intuition is that an advertiser who has tried more bid values in the past may be regarded as a more active bidder, and we may expect him/her to try more new bid values in the future.

(iv) A Boolean value indicating whether there are clicks in $t$. The intuition is that if there is no click, the advertiser will feel unsatisfied and thus have a higher probability to make changes.

With the above features, we write the willingness function as

$$W_l(x_t^{(l)}) = \frac{1}{1 + e^{-\left(\beta_0^{(l)} + \sum_{n=1}^{H} \beta_n^{(l)} x_{t,n}^{(l)}\right)}}, \quad t = 1, \dots, T.$$

Here $\beta^{(l)} = (\beta_0^{(l)}, \dots, \beta_H^{(l)})$ is the parameter vector for $l$. To learn the parameter vector $\beta^{(l)}$, we minimize the sum of the first-order error $\sum_{t=1}^{T} |y_t^{(l)} - W_l(x_t^{(l)})|$ on the historical data using the classical Broyden-Fletcher-Goldfarb-Shanno algorithm (BFGS) [15]. Then we apply the learned parameter $\beta^{(l)}$ to predict $l$'s willingness of change in the future.

2.3 Capability

Capability describes the ability of an advertiser to estimate the bid strategies of his/her competitors and take the best-response action on that basis. A more experienced advertiser may have better capability in at least three aspects: information collection, utility function definition, and utility optimization. Usually, in GSP auctions, a standard utility function is used and the optimal solution is not hard to obtain. Hence, we mainly consider the capability in information collection, i.e., the ability in estimating competitors' bid strategies. Recalling that $l$ does not have any exact information on his/her competitors' bids, it is a little difficult to model how advertiser $l$ estimates his/her competitors' strategies, because different advertisers $l$ use different estimation techniques.

Before introducing the detailed model for the capability function, we would like to briefly describe our intuition. It is reasonable to assume that $l$'s estimation on $i$ is based on $i$'s market performance, denoted by $\mathrm{Perf}_i$. Then we can write $l$'s estimation as $\mathrm{Est}_l(\mathrm{Perf}_i)$, which means $l$ applies some specific estimation technique $\mathrm{Est}_l$ on $\mathrm{Perf}_i$. The market performance $\mathrm{Perf}_i$ is decided by all the advertisers' bid profiles due to the auction property. That is, $\mathrm{Perf}_i = \mathrm{Perf}_{\pi_{-i}}(\pi_i)$; here $\pi_i$ is $i$'s historical bid histogram. Note that we use $\pi_i$ and $\pi_{-i}$ because we believe the observed market performance $\mathrm{Perf}_i$ is based on the auctions during a previous period, rather than just one previous auction. However, we are mostly interested in profitable keywords, the auctions of which usually have so many advertisers involved that $\pi_{-i}$ can be regarded as a constant environment factor for any $i$. Therefore, $\mathrm{Perf}_i$ only depends on $\pi_i$, i.e., $\mathrm{Perf}_i = \mathrm{Perf}(\pi_i)$.
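For the willingness function of Section 2.2, the following Python sketch shows one way the four features and the logistic form could be computed from a bid history. It is only an illustration under stated assumptions: the helper names (`willingness_features`, `willingness`, `first_order_error`), the 0-based period indexing, and the reading of "before $t$" as "up to and including period $t$" are choices made here, not details given in the paper. The fitting step itself is omitted; the objective returned by `first_order_error` is the quantity that the paper minimizes with BFGS.

```python
import math
from typing import List, Sequence


def willingness_features(bids: Sequence[float], clicks: Sequence[int], t: int) -> List[float]:
    """Features (i)-(iv) of Section 2.2 for period t (0-based, assumed inclusive of t)."""
    # (i) number of bid changes observed so far
    changes = sum(1 for k in range(1, t + 1) if bids[k] != bids[k - 1])
    # (ii) number of consecutive periods the bid has stayed unchanged until t
    unchanged = 0
    k = t
    while k > 0 and bids[k] == bids[k - 1]:
        unchanged += 1
        k -= 1
    # (iii) number of different bid values used so far
    distinct = len(set(bids[: t + 1]))
    # (iv) whether there were clicks in period t
    has_clicks = 1.0 if clicks[t] > 0 else 0.0
    return [float(changes), float(unchanged), float(distinct), has_clicks]


def willingness(beta: Sequence[float], x: Sequence[float]) -> float:
    """Logistic willingness W_l(x) with beta = (beta_0, beta_1, ..., beta_H)."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    # Numerically stable evaluation of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def first_order_error(
    beta: Sequence[float], xs: Sequence[Sequence[float]], ys: Sequence[int]
) -> float:
    """Sum over periods of |y_t - W_l(x_t)|, the training objective of Section 2.2."""
    return sum(abs(y - willingness(beta, x)) for x, y in zip(xs, ys))
```

With a hypothetical history `bids = [1.0, 1.0, 1.2, 1.2, 1.2]` and `clicks = [3, 0, 2, 0, 1]`, the features for the last period are one bid change, two unchanged periods, two distinct bid values, and clicks present. A general-purpose optimizer (for example `scipy.optimize.minimize` with `method="BFGS"`, if SciPy is available) could then be applied to `first_order_error` to estimate `beta`.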

Thus, we have Est (Perf ) = Est (Perf(π)). T now, the probem becomes much easer: s bnd to π, but the search engne has a the nformaton of π. To know Est (Perf ), the search engne ony needs to mode the functon Est (Perf( )) gven that π s known. Specfcay, we denote the above Est (Perf(π)) as our capabty functon A (π). As descrbed n Secton 2.1, A (π) s denoted by π (). The reason that A s named as capabty functon s cear: Est, the technques uses for estmaton, refects hs/her capabty. The reason thatπ () = A (π)s modeed to be random s aso cear: the search engne does not know what Est, and thus aspred by the concept of type n Bayesan Game [12] whch s a descrpton of ncompete game settng, we regard Est as a type of and mode ts dstrbuton. For the same π, dfferent advertsers may have dfferent estmatons accordng to ther varous capabtes. To smpfy our mode, we gve the foowng assumpton on π (). We assume that s estmatons on other advertsers bd strateges are a pure strateges. That s,π () s a random Booean vector wth just one eement equa to1. 8 Gven a bd b n wth possbty π,n from the hstorca bd hstogram π, we assume s estmaton has a fuctuaton around b n. The fuctuaton can be modeed by a certan probabty dstrbuton such as Bnoma dstrbuton or Posson dstrbuton. The parameters of the dstrbuton can be used to ndcate s capabty. Here we use Bnoma dstrbuton to mode the fuctuaton due to the foowng reasons: () Theoretcay, Bnoma dstrbuton can convenenty descrbe the dscrete bds due to ts own dscrete nature. Furthermore, the two parameters n Bnoma dstrbuton can we refect the capabty eves: the tra tmes N can contro the fuctuaton range (N = 0 means a perfect estmaton) and the success possbty δ (0,1) can contro the bas of the estmatons. Specfcay, fδ > 0.5, t means the estmaton s on average arger than the true dstrbuton and vce versa. 
() Expermentay, we have compared Bnoma dstrbuton wth some other we-known dstrbutons such as Gaussan, Posson, Beta, and Gamma dstrbutons, and the experment resuts show that Bnoma dstrbuton performs the best n our mode. For sake of smpcty, we et the fuctuaton range be an nteger 2N Ω, and the success possbty be δ (0,1). Then (N,δ ) are s capabty parameters. The fuctuaton on b n n π s modeed by Pr(A (b n)=b n+m) = π,n ( 2N N +m ) δ (N +m) (1 δ ) (N m), (m = N,...,N ). In the above formua, ; the symbo = means the equvaence of strategy; ( 2N N +m) s the number of (N +m)-combnatons n a set wth 2N ntegers. Therefore, by consderng a the bd vaues nπ, we have, = Pr(π () =b n) = Pr(A (π)=b n) ( ) N 2N δ (N +m) (1 δ ) (N m). N +m m= N π,n m 2.4 Constrant Constrant refers to the factor that prevents an advertser from adoptng a bd prce even f he/she knows that ths bd prce s the 8 Our mode can be naturay extended to the mxed strategy cases, wth a bt more compcated notatons and computng agorthms. best response for hm/her. In practce, many factors (such as ack of remanng budget and the aggressve/conservatve character of the advertser) may mpact advertser s eventua choces. For exampe, an advertser who acks budget or has conservatve character may prefer to bd a ower prce than the best response. We mode constrant usng a functon C, whch transates the best response (whch may be a mxed strategy) to the fna strategy wth step (a.k.a., dfference) c () t. That s, f the best bd strategy s π () at perod t, then C (π () ) w be b n +c () t wth probabty π (),n. Smar to the proposa n the wngness functon, we mode the step c () t usng a regresson mode. The dfference s that ths tme we use near regresson snce c () t s n nature a transaton dstance but not a probabty. Here we use the remanng budget as the feature x () t and bud the foowng functon form: c () t = β 1, +β 2, x () t x () x (), wherex () = t T x() t. 
T In the above formua, T s the set of perods for tranng and x () t s s remanng budget n perod t. In the tranng data, we use ( B n=1 bnπ(),n ) b() t as the abe for c () t. Here b () t s s rea bd at perod t; β 1, and β 2, are the parameters for the near regresson. Note that β 1, s ony reated to hmsef/hersef. Ths parameter reveas s nterna character on whether he/she s aggressve or not. One can ntutvey magne that for aggressve advertsers, β 1, w be postve because such advertsers are radca and they woud ke to overbd. Moreover, we normaze the budget n the formua because the amounts of budget vary argey across dfferent advertsers. The normazaton w hep to bud a unform mode for a advertsers. 3. ADVERTISER BEHAVIOR MODEL After expanng the advertser ratonaty n terms of wngness, capabty, and constrant, we ntroduce a new advertser behavor mode. Suppose advertser has a utty functon U. The nputs of U are s estmatons on hs/her compettors bd strateges, whch are gven by the capabty functon A. The goa of advertser s to fnd a mxed strategyπ () to maxmze ths utty,.e., argmax π () = argmax π () U (A (π ), = 1,,I) U (π (), = 1,,I, ). If we further consder the changng possbty W, the constrant functon C, and the randomness of A, we can get the genera advertser behavor mode that expans how advertser may determne hs/her bd strategy for the next perod of tme: π = W E A (C (argmaxu (π (), = 1,,I, )))+ π () (1 W )(0,..0,1,0...0) T,. (1) Here (0,..0, 1, 0...0) s the unchanged B-dmenson bd strategy where the ndex of the one (and the ony one) equas n f the bd n the prevous perod s b n. argmax outputs a B-dmenson mxed strategy of ;E A means the expectaton on the randomness of A (π j); W s the possbty that decdes to optmze hs/her utty. We want to emphass that equaton (1) s a genera expresson under our ratonaty assumptons. Though we have provded the detas of the mode n Secton 2 about W, A, C and we w

ntroduce the detas about U n the next subsecton, one can certany propose any other forms of the mode for a these functons. 3.1 Utty Functon To make the above mode concrete, we need to defne and cacuate the utty functon U for every advertser. Reca our assumpton that π () = A (π) s a pure strategy; that s, ony one eement n π () s one and a the other eements are zeros. Suppose the bd vaue that corresponds to the one n π () s o (o Ω and ). In ths case, the bd confguraton s o = (o 1,,o I), n whch a the advertsers bds are fxed. Pease note that the representatons n terms of o and the orgna representatons n term ofπ () are actuay equvaent to each other, snce they encode exacty the same nformaton and ts randomness n the bd strateges of adverters. Then we ntroduce the form of U. Based on the bd prces n o and ad quaty scores s ( = 1,,I), we can determne the ranked st n the aucton accordng to the commony used rankng rues (.e., the product of bd prce and ad quaty score [13]) n sponsored search. Suppose s ranked n postonj andˆ s ranked n poston j + 1. Accordng to the prcng rue n the generazed second prce aucton (GSP) [10],shoud payoˆsˆ/s for each cck. As defned n Secton 2.1, the possbty for a user to cck s ad n postonj sa j = γ α j. Suppose the true vaue of advertser for a cck s v (whch can be estmated usng many technques, e.g., [9]), then we have, U = E γ,α j,sˆ,s {(γ α j(v oˆsˆ s ))} = γ α j(v oˆsˆ s ). As expaned n Secton 2, γ, α j, sˆ, s are a random varabes. Here γ, α j, sˆ, and s are ther means. Snce U s near and the above four random varabes are ndependent of each other, the outsde expectaton can be moved nsde and substtuted by the correspondng means. 3.2 Fna Mode Wth a the above dscussons, we are now ready to gve the fna form of the advertser mode. By denotng o = (o 1,,o 1, o +1,,o I) as the bd confguraton wthout s bd, we get the foowng expresson for = 1,,I: π = W E o {C [argmax(γ α j(v oˆsˆ/s ))]} π () +(1 W )(0,..0,1,0...0) T. 
Here the randomness of $A_l$ is specifically expressed by the randomness of $o_{-l}$. Note that $\bar{\gamma}_l$ is a constant for $l$ and it will not affect the result of the argmax. Therefore we can remove it from the above expression to further simplify the final model:

$$\pi_l = W_l\, E_{o_{-l}}\left\{C_l\left[\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j \left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)\right]\right\} + (1 - W_l)\,(0,\dots,0,1,0,\dots,0)^T, \quad \forall l. \qquad (2)$$

4. ALGORITHM

In this section we introduce an efficient algorithm to solve the advertiser model proposed in the previous sections. To ease our discussion, we assume that the statistics $\bar{\alpha}_j$, $\bar{s}_{\hat{l}}$, and $\bar{s}_l$ are all known (with sufficient data and knowledge about the market). Furthermore, we assume that the search engine can effectively estimate the true value $v_l$ in (2). Considering the setting of our problem, we choose to use the model in [9] for this purpose.

Our discussions in this section will be focused on the computational challenge of obtaining the best response for all the cases of bid configurations $o$ (corresponding to $o_{-l}$ in (2)). This is a typical combinatorial explosion problem with a complexity of $B^I$, which increases exponentially with the number of advertisers. Therefore, it is hard to solve the problem directly. Our proposal is to adopt a numerical approximation instead of giving an accurate solution to the problem. We can prove that the approximation algorithm converges to the accurate solution, with a small accuracy loss and much less running time.

Our approximation algorithm requires the use of an O-simulator, which is defined as follows.

DEFINITION 1. (O-simulator) Suppose there is a random vector $O = (O_1,\dots,O_I) \sim P(o)$, i.e., $P(o)$ is the distribution of $O$. Given $o \in \Omega^I$ and $P(o)$, an algorithm is called an O-simulator if the algorithm randomly outputs a vector $o$ with the probability $P(o)$.

As described above, the O-simulator actually simulates the random vector $O$ and randomly outputs its samples.

Table 1: O-simulator
  initialize o = (o_1, ..., o_I) = (0, 0, ..., 0)
  for i = 1, ..., I:
    f = random();   // random() uniformly outputs a random float number in [0,1].
    n = 0; sum = P(O_i = b_0);
    while (sum < f):
      n = n + 1; sum = sum + P(O_i = b_n);
    o_i = b_n;      // b_n is the first bid whose cumulative probability reaches f
  output o;
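The O-simulator of Table 1 amounts to inverse-CDF sampling applied independently to each advertiser. The following is a minimal Python sketch under the assumption that each distribution is given as a list of probabilities over the shared bid space; the names `sample_index` and `o_simulator` are illustrative and do not come from the paper.

```python
import random
from typing import List, Sequence


def sample_index(probs: Sequence[float], rng: random.Random) -> int:
    """Inverse-CDF sampling: return index n with probability probs[n]."""
    f = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    for n, p in enumerate(probs):
        cumulative += p
        if f < cumulative:
            return n
    # Guard against floating-point round-off when probs sums to slightly < 1.
    return len(probs) - 1


def o_simulator(dists: Sequence[Sequence[float]],
                bids: Sequence[float],
                rng: random.Random) -> List[float]:
    """Draw one bid per advertiser, independently (hypothetical helper).

    dists[i][n] plays the role of P(O_i = b_n); bids[n] plays the role of b_n.
    """
    return [bids[sample_index(p, rng)] for p in dists]


rng = random.Random(0)
sample = o_simulator([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]], [0.0, 0.5, 1.0], rng)
```

With the two degenerate distributions in the usage line, the sample is always `[0.5, 0.0]`; with non-degenerate distributions, repeated calls reproduce the product distribution discussed in the text.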
In general, it is difficult to simulate a random vector; however, in our case, all the $O_i$ are independent of each other and they have discrete distributions. Therefore, the simulation becomes feasible. In Table 1 we give a description of the O-simulator. Here we assume $O = (O_1,\dots,O_I)$ and $O_i \sim P_i(o_i)$, $o_i \in \Omega$. Furthermore, $\Omega = \{b_0, b_1,\dots,b_B\}$ is a discrete space shared by all $O_i$ (like the bid space in our model) and all $O_i$ are independent of each other. Note that $f$ is a uniformly random number from $[0,1]$; therefore the probability that $o_i$ equals $b_n$ is exactly $P(O_i = b_n)$. Thus, the probability of outputting $o = (o_1,\dots,o_I)$ is $\prod_{i=1}^{I} P(O_i = o_i)$, which is exactly what we want.

We then give the Monte Carlo Algorithm, as shown in Table 2, to calculate $E_{o_{-l}}\left\{\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)\right\}$ for a certain $l$. For simplicity, we denote $\Pr(\pi_i^{(l)} = b_n)$ as $q_{i,n}^{(l)}$, and thus $q_{i,0}^{(l)}$ is the probability that $i$ is not in the auction. In this algorithm, the historical bid histogram $\pi_i$ and $q_{i,0}^{(l)}$ are calculated from the auction logs by Maximum Likelihood Estimation. Given the rationality parameters $\delta_l$, $N_l$, and $q_{i,0}^{(l)}$, we initialize $q_{i,n}^{(l)}$ by the capability function. Then, with $o_{-l}$ generated by the O-simulator, we can calculate which ranked list is optimal for $l$ by solving $\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)$. Note that it is possible that different bids may lead to the same optimal ranked list (with the same utility). In this case, the inverse function $\arg\max_{\pi_l^{(l)}}$ will output a bid set $B_{o_{-l}}$ including all the equally optimal bids. By assuming that advertiser $l$ will take any bid in $B_{o_{-l}}$ with uniform probability, we allocate each bid in $B_{o_{-l}}$ with

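a weight of $1/|B_{o_{-l}}|$ each. Finally, we use the number of simulations $N$ to normalize the distribution and output it.

Table 2: Monte Carlo Algorithm
  for i = 0, ..., I, initialize $\pi_i$, $q_{i,0}^{(l)}$, $\bar{s}_i$;
  for j = 0, ..., J, initialize $\bar{\alpha}_j$;
  initialize $l$, $\delta_l$, $N_l$, $1/\bar{s}_l$; $\pi_{l,n}^{(l)} = 0$;
  for i = 1, ..., I ($i \neq l$) and n = 1, ..., B:
    $q_{i,n}^{(l)} = \left(1 - q_{i,0}^{(l)}\right) \sum_{m=-N_l}^{N_l} \pi_{i,n-m} \binom{2N_l}{N_l+m} \delta_l^{(N_l+m)} (1-\delta_l)^{(N_l-m)}$;
  Build an O-simulator with $P(O_{-l} = o_{-l}) = \prod_{i=1, i \neq l}^{I} q_{i,o_i}^{(l)}$, $\forall o_{-l}$;
  for t = 1, ..., N:
    O-simulator outputs a sample $o_{-l}$;
    Solve $\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)$ to get $B_{o_{-l}}$;
    for all $b_n \in B_{o_{-l}}$, $\pi_{l,n}^{(l)} = \pi_{l,n}^{(l)} + 1/|B_{o_{-l}}|$;
  for n = 1, ..., B, $\pi_{l,n}^{(l)} = \pi_{l,n}^{(l)} / N$;
  output $\pi_{l,n}^{(l)}$;

For the Monte Carlo Algorithm, we can prove its convergence to the accurate solution, which is shown in the following theorem.

THEOREM 1. Given $\pi_i$ and $q_{i,0}^{(l)}$, the output of the Monte Carlo Algorithm converges to $E_{o_{-l}}\left\{\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)\right\}$ as the number of simulations $N$ grows.

PROOF. We assume that the accurate solution is $\pi_l^0$, and thus we need to prove that $\forall n$ ($n = 1,\dots,B$), $\pi_{l,n}^{(l)} \to \pi_{l,n}^0$ as $N \to \infty$. For a certain player $l$, we construct the following map:

$$M : o_{-l} \mapsto B_{o_{-l}} = \{\text{all of } l\text{'s best bids in case } o_{-l}\}, \quad \forall o_{-l}.$$

According to the definition, we know that $\pi_{l,n}^0$ equals the $n$-th element of $E_{o_{-l}}\left\{\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)\right\}$, and then

$$\pi_{l,n}^0 = \sum_{\text{all } B_{o_{-l}} \text{ containing } b_n} \frac{P(o_{-l})}{|B_{o_{-l}}|}.$$

Here $P(o_{-l})$ is the probability of $o_{-l}$. In the Monte Carlo algorithm, we initialize $\pi_{l,n}^{(l)} = 0$, and suppose that $\pi_{l,n}^{(l)}$ increases by $\Delta_t$ in each step of the loop for $t = 1,\dots,N$. Therefore, the value of $\pi_{l,n}^{(l)}$ will finally be $\left(\sum_{t=1}^{N} \Delta_t\right)/N$. However, in each step $t$, for a sample $o_{-l}$, the expectation of $\Delta_t$ is

$$E(\Delta_t) = \sum_{\text{all } B_{o_{-l}} \text{ containing } b_n} \frac{P(o_{-l})}{|B_{o_{-l}}|}.$$

Hence, by the Law of Large Numbers, $\left(\sum_{t=1}^{N} \Delta_t\right)/N$ will converge to the expectation of $\Delta_t$, which exactly equals $\pi_{l,n}^0$, as $N$ grows. This finishes our proof of Theorem 1.

Besides the above theorem, we can also prove some properties of the proposed model. We describe the properties in the appendix for the readers who are interested in them.

5. EXPERIMENTAL RESULTS

In this section, we report the experimental results on the prediction accuracy of our proposed model. In particular, we first describe the data sets and the experimental setting.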
Then we investigate the training accuracy for the willingness, capability, and constraint functions, to show the step-wise results of the proposed method. After that, we test the performance of our model in bid prediction, which is the direct output of the advertiser behavior model. At last, we test the performance of our model in click number prediction and revenue prediction, which are important applications of the advertiser behavior model.

5.1 Data and Setting

In our experiments, we used the advertiser bid history data sampled from the sponsored search log of a commercial search engine. We randomly chose 160 queries from the most profitable 10,000 queries and extracted the related advertisers from the data. We sampled one auction per 30 minutes from the auction log within 90 days (from March 2012 to May 2012; see footnote 9), so there are in total 4,320 (90 × 24 × 2) auctions. For each auction, there are up to 14 ads displayed (4 on the mainline and 10 on the sidebar). We filtered out the advertisers whose ads were never displayed during these 4,320 auctions, and eventually kept 5,543 effective advertisers in the experiments.

For the experimental setting, we used the first 3,360 auctions (70 days) for model training, and the last 960 auctions (20 days) as test data for evaluation. In the training period, we used the first 2,400 auctions (50 days) to obtain the historical bid histogram $\pi_i$ ($i = 1,\dots,I$) and the true value $v_l$; we then used the remaining 960 auctions (20 days) to learn the parameters for the advertiser rationality. For clarity, we list the usage of the data in Table 3. Note that the three periods in the table are abbreviated as P1, P2, and P3.

5.2 Different Aspects of Advertiser Rationality

5.2.1 Willingness

First, we study the logistic regression model for willingness. We train the willingness function using the auctions in P2 according to the description in Section 2.2, and test its performance on the auctions in P3. In particular, for any auction $t$ in P3, we get the value of $y_t^{(l)}$ according to whether the bid was changed in the time interval $[t-1, t]$, and use it as the ground truth.
For the same time period, we apply the regression model to calculate the predicted value $\hat{y}_t^{(l)} \in [0,1]$ of $y_t^{(l)}$. We find a threshold in $[0,1]$ such that $\hat{y}_t^{(l)}$ is correspondingly converted to 0 or 1. Then we can calculate the prediction accuracy compared with the ground truth. Figure 1 shows the distribution of different prediction accuracies among advertisers when the threshold is set to 0.15. According to the figure, we can see that the willingness function gets a prediction accuracy of 100% for 39% (2,170 of 5,543) of the advertisers, and a prediction accuracy over 80% for 68% (3,773 of 5,543) of the advertisers. In this regard we say the proposed willingness model performs well on predicting whether the advertisers are willing to change their bids.

Footnote 9: In the search engine, only the latest 90 days of data are stored. To deal with seasonal or holiday effects, we can choose seasonal or holiday data from different years instead of the data in continuous time. We only consider the general cases in our experiments.
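The thresholding and accuracy computation for the willingness model can be sketched as follows. This is only an illustration: the function name `willingness_accuracy` is hypothetical, and mapping predictions at or above the threshold to 1 is an assumption, since the text does not say how a prediction exactly equal to the threshold is handled.

```python
from typing import Sequence


def willingness_accuracy(predicted: Sequence[float],
                         actual: Sequence[int],
                         threshold: float = 0.15) -> float:
    """Fraction of periods where the thresholded prediction matches the truth.

    predicted[t] is the regression output in [0, 1]; actual[t] is 1 if the
    bid changed in period t and 0 otherwise.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumption: predictions >= threshold are converted to 1, others to 0.
    hits = sum((p >= threshold) == bool(y) for p, y in zip(predicted, actual))
    return hits / len(actual)


accuracy = willingness_accuracy([0.05, 0.20, 0.50, 0.10], [0, 1, 1, 1])
```

In the usage line, three of the four thresholded predictions agree with the ground truth, so the accuracy for this hypothetical advertiser is 0.75. Figure 1 is a histogram of this per-advertiser quantity.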

Table 3: Data usage in the experiments
  Purpose       Training                                        Training                      Test
  Period        P1: Day 1 to Day 50                             P2: Day 51 to Day 70          P3: Day 71 to Day 90
  #auctions     2,400                                           960                           960
  Usage         (i) Get historical bid histogram                Learn rationality parameters  Test model
                (ii) Learn true value
  Information   bid price                                       bid price                     bid price
  required      ad quality score                                ad quality score              ad quality score
  Further entries of the "Information required" rows: ad position; click number; click number; budget; budget; pay per click

[Figure 1: Distribution of the prediction accuracy. Horizontal axis: Prediction Accuracy on Willingness (0 to 1); vertical axis: Number of Advertisers (0 to 2500).]

5.2.2 Capability

Second, we investigate the capability function. For this purpose, we set $C_l$ as an identity function, and only consider $W_l$ and $A_l$. In the capability function $A_l$, we discretely pick the parameter pair $(\delta_l, N_l)$ from the set $\{0, 0.1, \dots, 0.9, 1.0\} \times \{0, 1, \dots, 9, 10\}$ and judge which parameter pair is the best using the data in P2, as described in Section 2.3. We call the advertiser model with the learned willingness and capability functions (without considering the constraint function) the Rationality-based Advertiser Behavior model with Willingness and Capability (or RAB-WC for short). Its performance will be reported and discussed in Section 5.3.

5.2.3 Constraint

Third, the constraint function is implemented with a linear regression model trained on P2, using the remaining budget as the feature, according to the discussions in Section 2.4. By applying the constraint function, we get the complete version of the proposed model. We call it the Rationality-based Advertiser Behavior model with Willingness, Capability, and Constraint (or RAB-WCC for short). Its performance will be given in Section 5.3.

5.3 Bid Prediction

In this subsection, we compare our proposed advertiser model with six baselines in the task of bid prediction. The predicted bid prices are the direct outputs of the advertiser behavior models. The baselines are listed as follows:

Random Bid Model (RBM) refers to the random method of bid prediction. That is, we randomly select a bid in the bid strategy space as the prediction.

Most Frequent Model (MFM) refers to an intuitive method for bid prediction, which works as follows.
First, we get the historical bid histogram from the bid values in the training period, and then always output the historically most frequently used bid value for the test period. If there are several bid prices that are equally frequently used, we randomly select one of them.

Best Response Model (BRM) [5] refers to the model that predicts the bid strategy to be the best response, by assuming the advertisers know all the competitors' bids in the previous auction.

Regression Model (RM) [8] refers to the model that predicts the bid strategy using a linear regression function. In our experiments, we used the following 5 features as the input of this function: the average bid change in history, the bid change in the previous time period, click number, remaining budget, and revenue in the previous period.

RAB-WC refers to the model as described in the previous subsection.

RAB-WCC-D refers to the degenerated version of RAB-WCC. That is, we select the bid with the maximum probability in the mixed bid strategy output by RAB-WCC.

We adopt two metrics to evaluate the performances of these advertiser models. First, we use the likelihood of the test data as the evaluation metric [9]. Specifically, we denote a probabilistic prediction model as $M$ (see footnote 10), which outputs a mixed strategy of advertiser $l$ in period $t$ as $\pi_l^{[t]} = (\pi_{l,0}^{[t]}, \dots, \pi_{l,B}^{[t]})$ in the bid strategy space $\Omega$. Suppose the index of the real bid strategy of $l$ in period $t$ is $\omega_l^{[t]}$. Considering a period set $T$ and an advertiser set $I$, we define the following likelihood:

$$P_{T,I}(M) = \prod_{t \in T,\ l \in I} \pi_{l,\omega_l^{[t]}}^{[t]}.$$

$P_{T,I}(M)$ reflects the probability that model $M$ produces the real data $\omega_l^{[t]}$ for all $t \in T$ and all $l \in I$. To make the metric normalized and positive, we adopt the geometric average and a negative logarithmic function. As a result, we get

$$D_{T,I}(M) = -\ln\left(P_{T,I}(M)^{\frac{1}{|T||I|}}\right) = -\frac{\ln P_{T,I}(M)}{|T||I|}.$$

We call it the negative logarithmic likelihood (NLL). It can be seen that with the same $T$ and $I$, the smaller the NLL is, the better the prediction $M$ gives.

Footnote 10: Please note that some of the models under investigation are deterministic models. We can still compute the likelihood for them because deterministic models are special cases of probabilistic models.
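The NLL definition above can be sketched in a few lines of Python. The data layout (one `(mixed_strategy, real_index)` pair per advertiser and period) and the name `negative_log_likelihood` are assumptions made for illustration. Returning infinity when a model assigns zero probability to the real bid is also an assumption; the text does not say how such cases were handled for the deterministic baselines.

```python
import math
from typing import Sequence, Tuple


def negative_log_likelihood(
        observations: Sequence[Tuple[Sequence[float], int]]) -> float:
    """Average negative log probability assigned to the real bids.

    Each observation is (mixed_strategy, real_index): the predicted
    distribution over bids for one advertiser in one period, and the index
    of the bid that was actually placed.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    total = 0.0
    for strategy, real_index in observations:
        p = strategy[real_index]
        if p <= 0.0:
            return math.inf  # assumption: a zero-probability miss is unbounded
        total += math.log(p)
    # Summing logs equals the log of the product P_{T,I}(M).
    return -total / len(observations)


nll = negative_log_likelihood([([0.5, 0.5], 0), ([0.25, 0.75], 1)])
```

Summing logarithms instead of multiplying probabilities avoids numerical underflow when the period and advertiser sets are large, which is one practical reason to work with the logarithmic form.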

Second, we use the expected error between the predicted bid strategy and the real bid as the evaluation metric. Specifically, we define the metric as the aggregated expected error (AEE) on a period set $T$ and an advertiser set $I$, i.e.,

$$\sum_{t \in T} \sum_{l \in I} \sum_{i=0}^{B} \pi_{l,i}^{[t]} \left| b_i - b_{\omega_l^{[t]}} \right|. \qquad (3)$$

The average NLL and AEE over all the 160 queries for the above algorithms are shown in Table 4. We have the following observations from the table.

Our proposed RAB-WCC achieves the best performance compared with all the baseline methods.

RAB-WCC-D performs the second best among these methods, indicating that the bid with the maximum probability in RAB-WCC is already a very good prediction compared with most of the baselines.

RAB-WC performs the third best among these methods, showing that: a) the proposed rationality-based advertiser model can outperform the commonly used algorithms in bid prediction; b) the introduction of the constraint function to the rationality-based advertiser model can further improve its prediction accuracy.

RBM performs almost the worst, which is not surprising due to its uniform randomness.

BRM also performs very poorly. Our explanation is as follows. In BRM, we assume the advertisers know all the competitors' bids before selecting the bids for the next auction. However, the real situation is far from this assumption. So the best response will not be the real response in most cases.

MFM performs better than BRM. This is not difficult to interpret. MFM is a data-driven model without too many unrealistic assumptions. Therefore, it fits the data better than BRM.

RM performs better than MFM in terms of NLL (although its AEE is higher) but worse than RAB-WC, RAB-WCC-D, and RAB-WCC. RM is a machine learning model which leverages several features related to the advertiser behaviors; therefore it can outperform MFM, which is simply based on counting. However, RM does not consider the rationality levels in its formulation, and therefore it cannot fit the data as well as our proposed model. This indicates the importance of modeling advertiser rationality when predicting their bid strategy changes.
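The AEE metric in (3) can be sketched in the same style. The data layout (one `(mixed_strategy, real_index)` pair per advertiser and period, plus a shared list of bid values) and the name `aggregated_expected_error` are illustrative assumptions. The absolute value follows the reconstruction of (3) used here.

```python
from typing import Sequence, Tuple


def aggregated_expected_error(
        observations: Sequence[Tuple[Sequence[float], int]],
        bids: Sequence[float]) -> float:
    """Sum over all observations of the expected |predicted bid - real bid|.

    bids[i] is the bid value b_i; each observation holds the predicted
    distribution over those bids and the index of the real bid.
    """
    total = 0.0
    for strategy, real_index in observations:
        real_bid = bids[real_index]
        total += sum(p * abs(b - real_bid) for p, b in zip(strategy, bids))
    return total


aee = aggregated_expected_error(
    [([0.5, 0.5, 0.0], 1), ([0.0, 0.0, 1.0], 0)],
    bids=[0.0, 1.0, 2.0],
)
```

In the usage line, the first observation contributes 0.5 (half of the probability mass is one bid unit away from the real bid) and the second contributes 2.0 (a deterministic prediction two units away), so the total is 2.5. Unlike NLL, this metric stays finite for deterministic models that miss the real bid.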
In addition to the average results, we give some example queries and their corresponding NLL and AEE on the 960th auction in P3 in Table 5 and Table 6. The best scores are highlighted in the tables. At first glance, we see that RAB-WCC achieves the first position in most of the example queries, while RAB-WCC-D and RAB-WC achieve the first position for the remaining example queries. In most cases, RBM performs the worst, and RM performs moderately.

To sum up, we can conclude that the proposed RAB-WCC method predicts the advertisers' bid strategies with the best accuracy among all the models under investigation.

5.4 Click and Revenue Prediction

To further test the performance of our model, we apply it to the tasks of click number prediction and revenue prediction (see footnote 11). We compare our model with two state-of-the-art models on these tasks. The first baseline model is the Structural Model in Sponsored Search [2], abbreviated as SMSS-1. The second baseline model is the Stochastic Model in Sponsored Search [17], abbreviated as SMSS-2. SMSS-1 calculates the expected number of clicks and the expected expenditure for each advertiser by considering some uncertainty assumptions on the sponsored search marketplace. SMSS-2 assumes that all the advertisers' bids are independent and identically distributed, and learns the distribution by mixing all the advertisers' historical bids.

We use the relative error and absolute error, as compared to the real click numbers and revenue in the test period, as the evaluation metrics. Specifically, suppose the value output by the model and the ground truth value are $\varphi$ and $\phi$ respectively; then the absolute error and the relative error are calculated as $|\varphi - \phi|$ and $|\varphi - \phi| / \phi$ respectively.

The performance of all the models under investigation is listed in Table 7. According to the table, RAB-WCC performs better than SMSS-1 on all four measures and better than SMSS-2 on all measures except the relative error on click number (0.11 for SMSS-2 versus 0.19 for RAB-WCC). The absolute errors on click number and revenue made by SMSS-1 are very large compared to the other methods. The relative errors made by SMSS-1 are larger than 50% for both click number and revenue prediction, which is not good enough for practical use.
The relative error made by SMSS-2 for revenue prediction is even larger than 80%. In contrast, our proposed RAB-WCC method generates relative errors of no more than 20% for both click and revenue prediction (and the absolute errors are also small). Although the results might need further improvement, a 20% prediction error already provides quite a good reference for the search engine to make decisions.

6. RELATED WORK

Besides the randomized bid strategy and the strategy of selecting the most frequently used bid, there are a number of works on advertiser modeling in the literature. Early work studies some simple cases in sponsored search, such as auctions with only two advertisers and auctions in which the advertisers adjust their bids in an alternating manner [1] [21] [18]. Later on, greedy methods were used to model advertiser behaviors. For example, in the random greedy bid strategy [4], an advertiser chooses a bid for the next round of auction that maximizes his/her utility, by assuming that the bids of all the other advertisers in the next round will remain the same as in the previous round. In the locally-envy free bid strategy [10] [16], each advertiser selects the optimal bid price that leads to a certain equilibrium called locally-envy free equilibrium. In [6], the advertiser bid strategies are modeled using the knapsack problem. The competitor-busting greedy bid strategy [22] assumes that an advertiser will bid as high as possible while retaining his/her desired ad slot, in order to make the competitors pay as much as possible and thus exhaust their advertising resources. Other similar work includes the low-dimensional bid strategy [20], the restricted balanced greedy bid strategy [4], and the altruistic greedy bid strategy [4]. In [5], a model that predicts the bid strategy to be the best response is proposed, by assuming the advertisers know all the competitors' bids in the previous auction. In [8], a linear regression model is used, based on a group of advertiser behavior features.
In addition, a bid strategy based on incremental cost per click is discussed in [19] [2], which proves that an advertiser's utility is maximized when he/she bids the amount at which his/her value per click equals the incremental cost per click (see footnote 12).

However, please note that most of the above works assume that the advertisers have the same rationality and intelligence in choosing the best response to optimize their utilities. Therefore they differ significantly from our work. Actually, to the best of our knowledge, there is no work on advertiser behavior modeling that considers different aspects of advertiser rationality.

Footnote 11: After outputting the bid prediction, we simulated the auction process based on those bids and estimated the revenue and clicks according to the simulation results.

Table 4: Prediction performance
  Model   RBM      MFM      BRM      RM       RAB-WC   RAB-WCC-D   RAB-WCC
  NLL     3.939    1.420    2.154    1.289    1.135    1.056       1.018
  AEE     35.392   34.748   77.526   40.397   14.616   10.553      8.876

Table 5: Prediction performance on some example queries (NLL; the best score per query is marked with *)
  Model           RBM     MFM     BRM     RM      RAB-WC   RAB-WCC-D   RAB-WCC
  car insurance   3.067   1.198   1.777   2.468   0.995    0.975*      0.975*
  disney          2.169   0.541   2.592   0.300   0.130*   0.140       0.130*
  ipad            4.457   1.288   2.075   0.747   0.315    0.325       0.310*
  jcpenney        2.089   0.511   3.213   0.487   0.263    0.351       0.262*
  medicare        3.649   1.466   1.750   2.866   1.125    1.127       1.121*
  stock market    5.068   1.711   2.100   1.839   1.373    1.349*      1.362

7. CONCLUSIONS AND FUTURE WORK

In this work, we have proposed a novel advertiser model which explicitly considers different levels of rationality of an advertiser. We have applied the model to real data from a commercial search engine and obtained better accuracy than the baseline methods in bid prediction, click number prediction, and revenue prediction.

As for future work, we plan to work on the following aspects.

First, in Section 2.1, we have assumed that the auctions for different keywords are independent of each other. However, in practice, an advertiser will bid on multiple keywords simultaneously and his/her strategies for these keywords may be dependent. We will study this complex setting in the future.

Second, we will study the equilibrium in the auction given the new advertiser model. Most previous work on equilibrium analysis is based on the assumption of advertiser rationality. When we change this foundation, the equilibrium needs to be re-investigated.
Third, we will apply the advertiser model in the function modules of sponsored search, such as bid keyword suggestion, ad selection, and click prediction, to make these modules more robust against the second-order effect caused by advertiser behavior changes.

Fourth, we will consider the application of the advertiser model in auction mechanism design. That is, given the advertiser model, we may learn an optimal auction mechanism using a machine learning approach.

8. ACKNOWLEDGMENTS

We thank Wei Chen, Tao Qin, Di He, Wenkui Ding, and Xinxin Yang for their valuable suggestions and comments on this work, and thank Pingguang Yuan for his help on the data preparation for the experiments.

Footnote 12: Incremental cost per click is defined as the advertiser's average cost of additional clicks received at a better ad slot.

APPENDIX

In the appendix, we discuss some properties of the proposed model. Firstly, we give a theorem on the relationship between true value and bid. Secondly, we give a theorem related to the estimation accuracy of the true value.

A. RELATIONSHIP

We discuss the relationship between the true value $v_l$ and our predicted bid strategy. Note that we will mainly focus on the results from the capability function, because both the willingness and constraint functions are not affected by the true value $v_l$ according to their definitions. For this purpose, by setting $W_l = 1$ and $C_l$ as the identity function in $\pi_l$, we define:

$$F(v_l) = E_{o_{-l}}\left\{\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)\right\},$$

$$E(v_l) = (b_1, b_2, \dots, b_B)\,(F(v_l))^T.$$

Here $F(v_l)$ is a $B$-dimension strategy vector and $E(v_l)$ is the average bid of the strategy $F(v_l)$. Under the very common assumption that the ad position effect $\bar{\alpha}_j$ decreases with the slot index $j$, Theorem 2 shows that an advertiser with a higher true value will generally set a higher bid to optimize the utility, which is consistent with intuition. This conclusion shows the consistency of our model in the capability part.

THEOREM 2. Assume $\bar{\alpha}_j$ decreases in $j$; then $E(v_l)$ is monotone nondecreasing in $v_l$.

PROOF.
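To prove that $E(v_l)$ is monotone nondecreasing, we only need to prove that $\forall o_{-l}$ and $\epsilon > 0$,

$$(b_1, b_2, \dots, b_B)\left(\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l(1+\epsilon) - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)\right) \geq (b_1, b_2, \dots, b_B)\left(\arg\max_{\pi_l^{(l)}}\left(\bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)\right)\right), \qquad (4)$$

and then the $\geq$ will remain unchanged under the expectation over $o_{-l}$. We denote $j^{\epsilon}$ and $j^0$ as the best rank of $l$ for the cases in which the true values are $v_l(1+\epsilon)$ and $v_l$ respectively. Here $o_{-l}$ is fixed and best rank means the rank that leads to the optimal utility.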

Table 6: Prediction performance on some example queries (AEE; the best score per query is marked with *)
  Model           RBM      MFM      BRM       RM        RAB-WC   RAB-WCC-D   RAB-WCC
  car insurance   89.459   89.883   305.703   107.335   33.207   22.188      12.760*
  disney          5.019    4.895    9.703     0.217     0.297    0.171       0.140*
  ipad            16.355   15.428   30.856    0.662     0.975    0.458       0.385*
  jcpenney        5.036    5.145    16.337    1.476     1.411    1.165       0.209*
  medicare        98.206   99.014   225.774   111.248   20.221   16.695      3.744*
  stock market    37.576   38.360   72.640    97.035    5.824    4.137       1.486*

We denote $\hat{l}^{\epsilon}$ and $\hat{l}^0$ as the advertisers who rank at $(j^{\epsilon} + 1)$ and $(j^0 + 1)$ respectively. Note that for a fixed $o_{-l}$, $j^{\epsilon}$ and $j^0$ can be different due to the different true values of $l$. If we are able to prove $j^{\epsilon} \leq j^0$ (that is, the best rank does not get worse), then the inequality (4) will be valid, since a nondecreasing best ranking yields a nondecreasing best bid strategy. As $j^0$ is the best rank for the true value $v_l$, we have

$$\bar{\alpha}_{j^0}\left(v_l - o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} / \bar{s}_l\right) \geq \bar{\alpha}_{j^{\epsilon}}\left(v_l - o_{\hat{l}^{\epsilon}}\, \bar{s}_{\hat{l}^{\epsilon}} / \bar{s}_l\right). \qquad (5)$$

Assuming $j^{\epsilon} > j^0$, so that $\bar{\alpha}_{j^{\epsilon}} < \bar{\alpha}_{j^0}$, we have

$$\bar{\alpha}_{j^0}\, \epsilon v_l > \bar{\alpha}_{j^{\epsilon}}\, \epsilon v_l. \qquad (6)$$

By adding (5) and (6), we get

$$\bar{\alpha}_{j^0}\left(v_l(1+\epsilon) - o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} / \bar{s}_l\right) > \bar{\alpha}_{j^{\epsilon}}\left(v_l(1+\epsilon) - o_{\hat{l}^{\epsilon}}\, \bar{s}_{\hat{l}^{\epsilon}} / \bar{s}_l\right).$$

This inequality reveals that $j^0$ is a better rank than $j^{\epsilon}$ for the true value $v_l(1+\epsilon)$, so $j^{\epsilon}$ should not be the best rank for that value, which contradicts the definition of $j^{\epsilon}$. Therefore, the assumption $j^{\epsilon} > j^0$ is not valid, which finishes our proof of this theorem.

B. ESTIMATION ACCURACY

As discussed in Section 4, we choose the model in [9] for the true value prediction. Usually, the estimation is not perfect and there might be some errors. Fortunately, we can prove a theorem which guarantees that the solution of this model will remain accurate if the estimation errors are not very large. This holds true because the payment rule of GSP is discrete and it tolerates small perturbations of the true value.

Before introducing the theorem, we give some notations first. For a fixed $o_{-l}$ and true value $v_l$, $l$'s best rank is denoted as $BR_{o_{-l}}$ (Best Rank), the optimal utility is denoted as $BU_{o_{-l}}$, and the ranking score of $\hat{l}$ (the one ranked next to $l$) in the optimal case is denoted as $BS_{o_{-l}}$. To describe the theorem, we also denote the second optimal utility as $SU_{o_{-l}}$ (Second Utility), which is the largest utility less than $BU_{o_{-l}}$ for the fixed $o_{-l}$.

THEOREM 3.
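We assume that $\bar{\alpha}_j$ decreases in $j$ and set $\theta = \max_{o_{-l}}\left(\frac{SU_{o_{-l}}}{BU_{o_{-l}}}\right)$, $\rho = \max_{o_{-l}}(BR_{o_{-l}})$, and $\omega = \max_{o_{-l}}(BS_{o_{-l}})$, with $v_l - \omega/\bar{s}_l > 0$. Let $v_l$ increase by $\epsilon v_l$ ($\epsilon \in \mathbb{R}$); then $F(v_l)$ will remain unchanged if

$$\epsilon \leq \frac{\bar{\alpha}_{\rho}}{\bar{\alpha}_1}\,(1-\theta)\left(1 - \frac{\omega}{\bar{s}_l v_l}\right),$$

where $\bar{\alpha}_1$ is the CTR at the first position.

In order to prove that this bound on $\epsilon$ keeps $F(v_l)$ unchanged, we prove the following lemma instead.

LEMMA 1. If $\epsilon$ satisfies $\epsilon \leq \frac{\bar{\alpha}_{\rho}}{\bar{\alpha}_1}(1-\theta)\left(1 - \frac{\omega}{\bar{s}_l v_l}\right)$, then $\forall o_{-l}$ we have

$$\arg\max_{\pi_l^{(l)}} \bar{\alpha}_j\left(v_l(1+\epsilon) - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right) = \arg\max_{\pi_l^{(l)}} \bar{\alpha}_j\left(v_l - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right).$$

The proof of Theorem 3 will be finished at once after we sum up all the cases of $o_{-l}$ in Lemma 1.

Table 7: Prediction performance in applications
  Model                      SMSS-1   SMSS-2   RAB-WCC
  Relative Error (Click)     0.52     0.11     0.19
  Absolute Error (Click)     2.02     0.71     0.23
  Relative Error (Revenue)   0.54     0.83     0.18
  Absolute Error (Revenue)   659.06   124.80   25.75

PROOF. Since a change of $\arg\max_{\pi_l^{(l)}}$ is equivalent to a change of $BR_{o_{-l}}$, we consider the critical point at which the increase of $\epsilon$ makes the best rank transfer exactly from $j^0$ to $j^{\epsilon}$ ($j^0 \neq j^{\epsilon}$). Thus we have: $\exists j^{\epsilon}$ ($j^{\epsilon} \neq j^0$) such that $j^{\epsilon}$ and $j^0$ maximize $\bar{\alpha}_j\left(v_l(1+\epsilon) - o_{\hat{l}}\, \bar{s}_{\hat{l}} / \bar{s}_l\right)$ simultaneously, and then we get

$$\bar{\alpha}_{j^{\epsilon}}\left(v_l(1+\epsilon) - o_{\hat{l}^{\epsilon}}\, \bar{s}_{\hat{l}^{\epsilon}} / \bar{s}_l\right) = \bar{\alpha}_{j^0}\left(v_l(1+\epsilon) - o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} / \bar{s}_l\right). \qquad (7)$$

From equation (7) we have

$$\epsilon = \frac{\bar{\alpha}_{j^0}\left(v_l - o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} / \bar{s}_l\right) - \bar{\alpha}_{j^{\epsilon}}\left(v_l - o_{\hat{l}^{\epsilon}}\, \bar{s}_{\hat{l}^{\epsilon}} / \bar{s}_l\right)}{v_l\left(\bar{\alpha}_{j^{\epsilon}} - \bar{\alpha}_{j^0}\right)}. \qquad (8)$$

Assume there is a $\theta_0$ such that

$$\bar{\alpha}_{j^{\epsilon}}\left(v_l - o_{\hat{l}^{\epsilon}}\, \bar{s}_{\hat{l}^{\epsilon}} / \bar{s}_l\right) = \theta_0\, \bar{\alpha}_{j^0}\left(v_l - o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} / \bar{s}_l\right). \qquad (9)$$

Then equation (8) is transformed into

$$\epsilon = \frac{(1-\theta_0)\, \bar{\alpha}_{j^0}\left(v_l - o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} / \bar{s}_l\right)}{v_l\left(\bar{\alpha}_{j^{\epsilon}} - \bar{\alpha}_{j^0}\right)} = (1-\theta_0)\, \frac{\bar{\alpha}_{j^0}}{\bar{\alpha}_{j^{\epsilon}} - \bar{\alpha}_{j^0}} \left(1 - \frac{o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0}}{\bar{s}_l v_l}\right). \qquad (10)$$

Considering that $j^0$ is the best rank, from equation (9) we have

$$\theta_0 = \frac{\bar{\alpha}_{j^{\epsilon}}\left(v_l - o_{\hat{l}^{\epsilon}}\, \bar{s}_{\hat{l}^{\epsilon}} / \bar{s}_l\right)}{\bar{\alpha}_{j^0}\left(v_l - o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} / \bar{s}_l\right)} \leq \frac{SU_{o_{-l}}}{BU_{o_{-l}}} \leq \theta < 1. \qquad (11)$$

In addition, there holds

$$j^0 \leq \max_{o_{-l}}(BR_{o_{-l}}) = \rho, \quad \bar{\alpha}_{j^0} \geq \bar{\alpha}_{\rho}, \quad \frac{\bar{\alpha}_{j^0}}{\bar{\alpha}_{j^{\epsilon}} - \bar{\alpha}_{j^0}} > \frac{\bar{\alpha}_{\rho}}{\bar{\alpha}_1}, \qquad (12)$$

$$o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0} = BS_{o_{-l}} \leq \max_{o_{-l}}(BS_{o_{-l}}) = \omega. \qquad (13)$$

According to (10), (11), (12), and (13), we finally have

$$\epsilon = (1-\theta_0)\, \frac{\bar{\alpha}_{j^0}}{\bar{\alpha}_{j^{\epsilon}} - \bar{\alpha}_{j^0}} \left(1 - \frac{o_{\hat{l}^0}\, \bar{s}_{\hat{l}^0}}{\bar{s}_l v_l}\right) > \frac{\bar{\alpha}_{\rho}}{\bar{\alpha}_1}\,(1-\theta)\left(1 - \frac{\omega}{\bar{s}_l v_l}\right).$$

As $\epsilon$ is the critical point, for any fixed $o_{-l}$, if $\epsilon \leq \frac{\bar{\alpha}_{\rho}}{\bar{\alpha}_1}(1-\theta)\left(1 - \frac{\omega}{\bar{s}_l v_l}\right)$, then $BR_{o_{-l}}$ and $\arg\max_{\pi_l^{(l)}}$ will remain unchanged. This ends our proof of Lemma 1.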
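To make the argmax that appears throughout the appendix concrete, the sketch below enumerates candidate bids for one advertiser against a fixed competitor configuration and returns the set of equally optimal bids. It is an illustration under several assumptions that the text does not state: ties in rank score are resolved against the focal advertiser, an ad with no advertiser ranked below it pays nothing per click, and an ad ranked beyond the available slots earns zero utility. The names `gsp_utility` and `best_bids` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gsp_utility(bid: float, value: float, quality: float,
                rivals: Sequence[Tuple[float, float]],
                alphas: Sequence[float]) -> float:
    """Utility alpha_j * (v - o_next * s_next / s) for one candidate bid.

    rivals holds (bid, quality score) pairs of the competitors; alphas[j] is
    the position effect of slot j (0-based, decreasing in j).
    """
    score = bid * quality
    rival_scores = sorted((b * s for b, s in rivals), reverse=True)
    # Assumption: a rival with an equal rank score is placed above us.
    position = sum(1 for r in rival_scores if r >= score)
    if position >= len(alphas):
        return 0.0  # assumption: not displayed, so no clicks and no payment
    # Rank score of the advertiser directly below us (0 if there is none).
    next_score = rival_scores[position] if position < len(rival_scores) else 0.0
    price_per_click = next_score / quality
    return alphas[position] * (value - price_per_click)


def best_bids(candidates: Sequence[float], value: float, quality: float,
              rivals: Sequence[Tuple[float, float]],
              alphas: Sequence[float]) -> List[float]:
    """Return every candidate bid that attains the maximal utility."""
    utilities = [gsp_utility(b, value, quality, rivals, alphas) for b in candidates]
    best = max(utilities)
    return [b for b, u in zip(candidates, utilities) if math.isclose(u, best)]


rivals = [(3.0, 1.0), (1.0, 1.0)]
alphas = [0.3, 0.2, 0.1]
grid = [1.0, 2.0, 3.0, 4.0]
low_value = best_bids(grid, 4.0, 1.0, rivals, alphas)
high_value = best_bids(grid, 10.0, 1.0, rivals, alphas)
```

In this toy configuration the optimal set is `[2.0, 3.0]` for a true value of 4 and `[4.0]` for a true value of 10, so the average optimal bid rises with the true value, in line with Theorem 2. The two-element set in the first case is an instance of the tie among bids that lead to the same ranked list.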

C. REFERENCES

[1] K. Asdemir. Bidding patterns in search engine auctions. In Second Workshop on Sponsored Search Auctions (2006), ACM Electronic Commerce. Press, 2006.
[2] S. Athey and D. Nekipelov. A structural model of sponsored search advertising auctions, 2010. Available at http://groups.haas.berkeley.edu/marketing/sics/pdf_2010/paper_athey.pdf.
[3] A. Broder, E. Gabrilovich, V. Josifovski, G. Mavromatis, and A. Smola. Bid generation for advanced match in sponsored search. In Proceedings of the fourth ACM international conference on Web search and data mining, WSDM '11, pages 515-524, New York, NY, USA, 2011. ACM.
[4] M. Cary, A. Das, B. Edelman, I. Giotis, K. Heimerl, A. R. Karlin, C. Mathieu, and M. Schwarz. Greedy bidding strategies for keyword auctions. In EC '07 Proceedings of the 8th ACM conference on Electronic commerce. ACM Press, 2007.
[5] M. Cary, A. Das, B. G. Edelman, I. Giotis, K. Heimerl, A. R. Karlin, C. Mathieu, and M. Schwarz. On best-response bidding in GSP auctions, 2008. Available at http://www.hbs.edu/research/pdf/08-056.pdf.
[6] D. Chakrabarty, Y. Zhou, and R. Lukose. Budget constrained bidding in keyword auctions and online knapsack problems. In WWW '07 Proceedings of the 16th international conference on World Wide Web. ACM Press, 2007.
[7] Y. Chen, G.-R. Xue, and Y. Yu. Advertising keyword suggestion based on concept hierarchy. In WSDM '08 Proceedings of the international conference on Web search and web data mining. ACM Press, 2008.
[8] Y. Cui, R. Zhang, W. Li, and J. Mao. Bid landscape forecasting in online ad exchange marketplace. In KDD '11 Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM Press, 2011.
[9] Q. Duong and S. Lahaie. Discrete choice models of bidder behavior in sponsored search. In Workshop on Internet and Network Economics (WINE). Press, 2011.
[10] B. Edelman, M. Ostrovsky, and M. Schwarz. Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords. In The American Economic Review, 2007.
[11] T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. In Proceedings of the 27th International Conference on Machine Learning. ACM, 2010.
[12] J. C. Harsanyi. Games with incomplete information played by Bayesian players, I-III. In Management Science, 1967/1968.
[13] J. Jansen. Understanding Sponsored Search: Core Elements of Keyword Advertising. Cambridge University Press, 2011.
[14] B. Kitts and B. Leblanc. Optimal bidding on keyword auctions. In Electronic Markets. Routledge Press, 2004.
[15] D. C. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. In Journal of Mathematical Programming. Springer-Verlag New York, 1989.
[16] C. Nittala and Y. Narahari. Optimal equilibrium bidding strategies for budget constrained bidders in sponsored search auctions. In Operational Research: An International Journal. Springer Press, 2011.
[17] F. Pin and P. Key. Stochastic variability in sponsored search auctions: Observations and models. In EC '11 Proceedings of the 12th ACM conference on Electronic commerce. ACM Press, 2011.
[18] S. S. S. Reddy and Y. Narahari. Bidding dynamics of rational advertisers in sponsored search auctions on the web. In Proceedings of the International Conference on Advances in Control and Optimization of Dynamical Systems. Press, 2007.
[19] H. R. Varian. Position auctions. In International Journal of Industrial Organization, 2006.
[20] Y. Vorobeychik. Simulation-based game theoretic analysis of keyword auctions with low-dimensional bidding strategies. In UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.
[21] X. Zhang and J. Feng. Finding Edgeworth cycles in online advertising auctions. In Proceedings of the 26th International Conference on Information Systems ICIS. Press, 2006.
[22] Y. Zhou and R. Lukose. Vindictive bidding in keyword auctions. In ICEC '07 Proceedings of the ninth international conference on Electronic commerce. ACM Press, 2007.