A Survey on Sponsored Search Advertising in Large Commercial Search Engines

George Trimponias, Dimitris Papadias
Department of Computer Science and Engineering, Hong Kong University of Science and Technology
{trimponias, dimitris}@cse.ust.hk

Technical Report HKUST-CS13-03

Abstract. Large commercial search engines such as Google Search, Microsoft's Bing, and Yahoo! Search have recently emerged as information gateways for millions of Internet users. Their unique role as an intermediary between Internet users and the vast Web content has created exciting marketing opportunities for many commercial firms that wish to advertise their product or service. As a result, a new multi-billion dollar market for sponsored search has been established, where advertisers pay a fee determined by an auction in order to be displayed in highlighted textual content or alongside the Web search results. Motivated by its unprecedented proliferation and enormous success, we conduct a survey on the newly introduced paradigm of sponsored search advertising. Our contributions are twofold. On the one hand, we provide an extensive and self-contained survey on the sponsored search market, covering topics as diverse as the structure of sponsored search advertising, practical issues that major search engines need to deal with, and even a brief history of the market for paid links. On the other hand, we investigate several auction designs for the sponsored search market, and discuss their properties and mathematical underpinnings. We conclude with future directions and challenges.
1 Introduction

Over the past years, large commercial search engines¹ have emerged as information gateways for millions of Internet users. In response to a user's query, search engines generate a ranked list of results based on sophisticated information retrieval algorithms. These pages might represent commercial entities selling goods or services, recommendation sites, review sites, etc. Google Search alone, the dominant search engine on the World Wide Web, is estimated to index more than 40 billion web pages² in a Web that consists of over a trillion unique URLs³. Moreover, it serves more than one billion search requests a day⁴, many of which are related to decision-making tasks such as shopping or film reviews. It should then come as no surprise that modern search engines have considerable power in shaping Web users' actions.

This unique role as an intermediary between Internet users and the vast Web content has created exciting marketing opportunities for many commercial firms that wish to advertise their product or service. A new market for sponsored search has been established, where advertisers pay a fee to be displayed in highlighted textual content or alongside the Web search results. Usually, when a user enters a query, a limited number of paid (sponsored) links (slots) appear on top or to the right side of the unpaid (organic or algorithmic) search results. Advertisers who have expressed interest in this query compete for the paid positions; as supply and demand vary unpredictably across user queries, and the number of possible keywords is prohibitively large, the search engine relies on the market to determine the winners and prices by using auctions among the advertisers. Every time a user issues a query, the search engine runs an auction to determine the winners, i.e., the advertisers that will be displayed in the ad slots, and the price that each of them has to pay to the search engine.
Payments are based on the pay per click model, i.e., the advertiser only pays the specified price when the user actually clicks on their ad. Figure 1 depicts part of the first page for the query "melanoma treatment" using Google Search. The prominent highlighted result at the top of the page and the three results on the right side are sponsored ads, while the 5 results below the highlighted ad correspond to the top-5 organic results.

¹ The term "search engines" encompasses pure Web search engines (e.g., Google Search), information portals with search functionality (e.g., Yahoo!), metasearch engines (e.g., Metacrawler), niche search engines (e.g., CiteSeer), and comparison shopping engines (e.g., mySimon, Shopping.com) [31].
² See http://www.worldwidewebsize.com (accessed March 7, 2013).
³ See http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html (accessed March 7, 2013).
⁴ See http://www.nytimes.com/2011/03/06/weekinreview/06lohr.html?pagewanted=1&_r=1&hpw (accessed March 7, 2013).
Fig. 1. Top organic and sponsored results for the query "melanoma treatment" using Google Search.

Sponsored search is a large and rapidly growing source of revenue for all large search engines. Currently, the two most prominent players in the sponsored search market are Google's AdWords [73], and Microsoft's Bing Ads, which powers the sponsored links for both Microsoft Bing and Yahoo! Search [72]. Google's total revenue in fiscal year 2010 was $29.321 billion. Over 96% of it was related to advertising, while less than 4% came from licensing and other sources⁵. To get a better picture of the sponsored search market, note that, as of September 2011, Google boasted a market capitalization of $170.76 billion; the $31.88 billion of General Motors (the largest US automaker) paled in comparison⁶. Several other companies, including LookSmart, FindWhat, Espotting, InterActiveCorp (Ask Jeeves), and eBay (Shopping.com), earn hundreds of millions of dollars in sponsored search revenue annually [48]. Interestingly, this new advertising model has had a significant economic impact on business activity as a whole; in 2009 alone, Google generated a total of $54 billion of economic activity for American businesses, website publishers and non-profits⁷. Moreover, sponsored search advertising revenue has enabled large search companies to finance the very expensive infrastructure that is necessary to make the vast Internet information accessible, and to develop many free services such as spell checking, currency conversion, flight times, and desktop searching applications [41].

⁵ See http://investor.google.com/financial/2010/tables.html (accessed March 7, 2013).
⁶ See https://www.google.com/finance?client=ob&q=nasdaq:goog and http://www.google.com/finance?q=nyse%3agm (accessed March 7, 2013).
⁷ See http://googleblog.blogspot.com/2010/05/googles-us-economic-impact.html (accessed March 7, 2013).
Motivated by its unprecedented proliferation and enormous success, we conduct a survey on the newly introduced paradigm of sponsored search advertising. The field is still undergoing many changes and remains an area of intense research within the academic community. We have thus decided to focus on the major industrial aspects of sponsored search as well as the theoretical fundamentals of the auctions that large commercial search engines employ to sell paid links to advertisers. Our goal is for the survey to be as self-contained as possible, and to serve as a guide to interested readers who wish to introduce themselves to this exciting field. We emphasize that in some parts we have followed very closely the exposition of some excellent surveys or works, as we found them to be exceptionally structured and very clearly written; we will state it explicitly whenever this is the case since these parts do not represent original work of ours.

Section 2 investigates the structure of sponsored search advertising. Section 3 discusses practical issues that arise in sponsored search. Section 4 provides an overview of sponsored search auctions. Section 5 surveys several interesting properties of various auction designs for the sponsored search market, and demonstrates how they are structurally connected. Section 6 analyzes the GSP procedure with game-theoretic tools, and explores various types of equilibria. Finally, Section 7 concludes the paper with future directions and challenges.

2 Structure of Sponsored Search Advertising

Three distinct players define the dynamics of sponsored search advertising: the advertisers, the search engine, and the users. In this Section, we investigate the main characteristics and structure of each party, as well as how they interact with one another.

2.1 Advertisers

The advertiser seeks to place properly designed advertisements to promote their product or service.
They target users of interest by declaring to the search engine a list of keywords that a relevant user may search for. For each keyword, they additionally determine their maximum cost per click (maximum CPC), also known as maximum bid, which corresponds to the maximum amount of money that the advertiser is willing to pay per click to appear on the results page for a given keyword; it can be as low as $0.10. The actual CPC, on the other hand, refers to the actual amount of money that the advertiser is charged when the user clicks on their ad. On average, actual CPC ranges between $0.50 and $0.90, depending on the position⁸; for competitive markets, however, it can get much higher⁹. Note that bidding takes place continuously: advertisers can update their bids as often as they wish, although in reality the majority change their bids on a daily or weekly basis¹⁰.

To avoid missing a potential user, the advertiser tries to cover all possible keywords. Unavoidably, a user query may match several keywords, so the advertiser essentially ends up competing with itself. Coming up with the right keywords and bids in the presence of complex keyword interactions turns out to be a hard problem [65].

In addition to the bids, an advertiser may also declare a maximum daily budget. In practice, most advertisers have operating budgets or spending targets, and are not willing to spend arbitrarily for their marketing campaign. They report their budget constraints to the search engine, which is responsible for properly allocating the budget. Search engines usually impose limits on the maximum number of times an advertiser can update their bid during a single day; for Google AdWords, for instance, the maximum allowed number of updates per day is 10¹¹. Efficient budget allocation is actually one of the most extensively studied optimization problems for both the search engine and the advertiser, e.g., [1][12][55][56][59]. A particular challenge is that users arrive online in a largely unpredictable way; consequently, the optimal allocation is out of reach and approximate online algorithms are used instead [59].

In the pay per click model, the advertiser is naturally interested in maximizing the number of clicks that they receive. In reality, however, this assumption is rather restrictive: large and popular commercial firms may have a greater interest in getting an ad slot in a high position, and care less about the total number of clicks they receive. The reason is that the top ad slots are associated with an increased brand awareness effect, regardless of whether the ad is actually being clicked¹². As a result, branding advertisers would like to have direct control over the position of their ad. Special position-based auctions [3] have been introduced to address this issue.

Usually, the advertiser is interested in successful conversions rather than the total number of received clicks. These correspond to desired user actions on the landing page, such as product sales, membership registrations, newsletter subscriptions, software downloads, etc. Given this data, the advertiser can have a clear picture of the keywords that are worth more, and adapt its bidding behavior. Conversion rate maximization has been investigated in [45].

⁸ See http://www.adgooroo.com/has_google_changed_their_cpc_formula.php#more (accessed March 7, 2013).
⁹ According to wordstream.com, the most expensive keyword category at the close of 2010 was Insurance with a top CPC of $54.91, followed by Loans with a top CPC of $44.28. See http://www.wordstream.com/articles/most-expensive-keywords (accessed March 7, 2013).
¹⁰ See http://www.seroundtable.com/archives/022226.html (accessed March 7, 2013).
¹¹ See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=8761 (accessed March 7, 2013).
¹² Nielsen/NetRatings. Interactive advertising bureau (IAB) search branding study, August 2004. Commissioned by the IAB Search Engine Committee.

2.2 Search Engine

The search engine provides a suitable mechanism to enable the interaction of advertisers and users. Not surprisingly, there is an ongoing conflict for a search engine between revenue maximization and high-quality sponsored results. The fact that the web search market is largely dominated by only a few big players led by Google, Microsoft, and Yahoo! is of no help: empirical studies clearly demonstrate that when the quality of search falls below a certain threshold, the user is very likely to defect to another search engine [53]. A myopic strategy would be to increase revenue by increasing the number and prominence of ad slots; in the long run, however, this would take a toll on the quality of the search engine as perceived by the users [10] and on expected revenue [31]. In practice, the number of ad slots usually ranges anywhere between 0 and 12. In particular, according to the market research company AdGooRoo¹³, in 2008 Google AdWords displayed 5.5-6.0 ads per query, and Bing Ads only 3.85 ads per query (United States market only).

The search engines have at their disposal a variety of ways to measure the quality of an ad, and discard low-quality ads. The most prominent measure of ad quality is the clickthrough rate (CTR), which is the number of times an ad is clicked divided by the number of times the ad is shown. The CTR is computed for each ad and keyword, and is a strong indicator of the relevance of both the ad and the keyword. In general, an average CTR is in the neighborhood of 2%¹⁴, while a CTR of 1% or more is considered a good goal for new advertisers¹⁵. Other quality measures include: 1) relevance of the landing page to the declared keyword and the ad creative¹⁶ [16]; and 2) historical CTR for a query-ad pair over long time periods [38]. In practice, large search engines utilize a quality score that takes into account a variety of factors to measure how relevant the keyword is to the ad text and to a user's search query.
For instance, each time that a keyword matches a user query, Google AdWords calculates a quality score for the keyword based on the historical CTR of the keyword and the matched ad, the advertiser account history, the quality of the landing page¹⁷, the relevance of the keyword and the matched ad to the search query, the account's performance in the geographical region where the ad will be displayed, as well as other relevance factors¹⁸. Note that the exact formula for computing the score is one of the best kept secrets of a search engine.

Major search engines enforce reserve prices, dictating the minimum price that an advertiser can pay per click for a particular keyword. The minimum bid is usually bidder-specific and quality-based. For instance, Google AdWords specifies a minimum allowable bid of as low as US$0.01 for high quality keywords¹⁹. Reserve prices are necessary to discourage bidders from bidding aggressively on irrelevant keywords, and thus compromising the search engine's quality as perceived by the users. Besides, according to a well-known result in the theory of optimal auction design [47], setting suitable reserve prices may substantially raise revenues. Ostrovsky et al. [63] perform a large-scale field experiment on reserve prices for sponsored search auctions conducted by Yahoo! to sell advertisements, and show that reserve prices can have a significant positive effect on the search engine's revenue.

¹³ See http://succeed.adgooroo.com/q208_search_advertising_report.html (accessed March 7, 2013).
¹⁴ See http://www.google.com/support/forum/p/adwords/thread?tid=7aeb3290fd8feccb&hl=en (accessed March 7, 2013).
¹⁵ See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=107955 (accessed March 7, 2013).
¹⁶ To appreciate how important that is, note that it is estimated that more than 30% of all landing pages are only very remotely related to the ads [16].
¹⁷ According to the landing page and site quality guidelines, the landing page is expected to 1) feature relevant and original content, 2) be transparent, and 3) be easy to navigate. See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=46675 (accessed March 7, 2013).
¹⁸ See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=10215 (accessed March 7, 2013).
¹⁹ See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=18275 (accessed March 7, 2013).

2.3 Users

Through their actions, users constitute the commodity that the advertisers bid on. An advertiser expresses interest in those user queries that it deems relevant by declaring keywords to the search engine. The assumption that the user expresses interest in the product or service of the advertiser is implicit throughout this process. But taken from the user perspective, a different story emerges. Users view the search engine as a gateway to the vast information content of the Web. By posing textual search queries, they express to the search engine an intention that the search engine attempts to capture, sometimes without success. For instance, consider a user who enters the search query "apple" with the intention of getting information on the vitamin content of the apple fruit. Interestingly, as of 30 September 2010, the top-10 organic results by Google Search are all related to Apple Inc., which in this example is totally unrelated to the user's intention.

Unfortunately, even today there is a very limited understanding of user intention and behavior. However, an aspect of user behavior that has received considerable attention and research in recent years concerns the interplay between organic and sponsored results, as well as the question of how sponsored links are generally perceived. Ghose and Yang, for instance, find in [33] that retailer-specific and brand-specific information in a sponsored link increases the efficiency of online advertising; the former increases clickthrough rates, whereas the latter raises conversion rates. The same authors show in [71] that organic and sponsored results tend to be positively interdependent.
In particular, total clickthrough rates, conversion rates, and revenues in the presence of both paid and organic search listings are significantly higher than those in the absence of paid search advertisements. Reiley et al. investigate in [64] the externalities among the north ads, i.e., the sponsored results that appear above the organic listings, and find that rival north ads impose a positive, rather than negative, externality on existing north listings. In other words, the top north listing receives more clicks when additional sponsored results appear below it. Agarwal et al. show in [2] that although clickthrough rate decreases with position, the conversion rate first increases and then decreases with position for longer keywords. As a result, top positions in sponsored search advertisements are not necessarily the optimal positions for advertisers. Rutz and Bucklin study in [66] the interactions between generic and branded keywords, and find that generic keywords may induce positive spillovers on the clickthrough rate of branded keywords. By contrast, Jeziorski and Segal [42] and Chiou and Tucker [15] show the prevalence of negative externalities across ads: as many as 50% more clicks would occur in a hypothetical world in which each ad faces no competition. Finally, Edelman and Gilchrist [27] investigate how users perceive the labels of the sponsored results. Concretely, they show that relative to users receiving the "Sponsored link" or "Ad" labels, users receiving the "Paid Advertisement" label click 23% and 26% fewer advertisements, respectively.

3 Practical Issues in Sponsored Search Advertising

In this Section, we investigate several practical issues that arise in sponsored search advertising. We have tried to place particular emphasis on how major search engines deal with these issues in practice.

3.1 Ranking and Pricing Schemes

Two major market design questions are 1) how to allocate advertisements to slots, and 2) how to price the ads. Interestingly, all large search engines address these two issues in a very similar way. First, ads are sorted in decreasing order of their rank, where the ad rank is determined by both the bid placed by the advertiser on the keyword, and the quality of the ad. The ad with the highest rank appears in the first position, and so on down the page, until all slots have been filled. Note that the exact ad rank formula differs from one search engine to another. Google AdWords²⁰, for instance, has made public that their ad rank is determined by the product CPC bid × Quality Score. Bing Ads²¹, on the other hand, claims that its ad ranks are determined by various factors such as the bid amount, the ad CTR, and the ad relevance, but has kept the details of the ad rank formula secret.

Pricing takes place after the winning ads and their ranks have been specified, and determines the cost per click that the advertiser will be charged whenever a user clicks on their ad. The natural method would be to make bidders pay what they bid, but that leads to well-known race conditions and instabilities, as we will explore in Section 4.1. Instead, all large search engines²² currently employ a generalized second-price auction (GSP) [26][68]. A GSP auction charges an advertiser the minimum amount required to maintain their ad's position in search results, plus a tiny increment.
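The ranking and pricing rules described here can be illustrated with a minimal Python sketch. It assumes the AdWords-style rank (bid × quality score) and the GSP price b_{i+1} QS_{i+1} / QS_i that the survey gives for Google AdWords. The names `Bid` and `run_gsp`, the `increment` and `reserve` parameters, the cap of the price at the advertiser's own bid, and the treatment of the last-ranked ad (it pays the reserve) are illustrative assumptions, not a description of any search engine's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Bid:
    advertiser: str
    max_cpc: float        # maximum bid per click
    quality_score: float  # assumed strictly positive


def run_gsp(bids: List[Bid], num_slots: int,
            increment: float = 0.01,
            reserve: float = 0.0) -> List[Tuple[int, str, float]]:
    """Return (position, advertiser, price per click) for the winning ads.

    Ads are ranked by max_cpc * quality_score. Each winner pays the minimum
    bid that would keep it ahead of the next-ranked ad, plus a small increment.
    """
    ranked = sorted(bids, key=lambda b: b.max_cpc * b.quality_score, reverse=True)
    results = []
    for pos, bid in enumerate(ranked[:num_slots]):
        if pos + 1 < len(ranked):
            nxt = ranked[pos + 1]
            price = nxt.max_cpc * nxt.quality_score / bid.quality_score + increment
        else:
            # Assumption: with no competitor below, the ad pays the reserve price.
            price = reserve
        # Assumption: never charge below the reserve or above the ad's own bid.
        price = min(bid.max_cpc, max(reserve, price))
        results.append((pos + 1, bid.advertiser, price))
    return results


if __name__ == "__main__":
    bids = [Bid("A", 2.00, 0.5), Bid("B", 1.00, 0.8), Bid("C", 0.40, 1.0)]
    for position, name, price in run_gsp(bids, num_slots=2):
        print(position, name, f"{price:.2f}")
```

In this example the ad ranks are 1.0, 0.8 and 0.4, so A takes the first slot and pays about 1.61 per click, while B takes the second slot and pays about 0.51. Neither price depends on the winner's own bid, only on the bid and quality score of the ad ranked directly below.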
Though we discuss sponsored search auctions in detail in Section 5, let's briefly demonstrate how GSP auctions work in practice by considering the ad ranking of Google AdWords. Suppose K slots are available, and are numbered 1, ..., K, starting from the top and going down. Moreover, let the advertiser A_i at position i have a maximum bid b_i and a quality score QS_i. In GSP, the price for a click for A_i is determined by the advertiser A_{i+1} below them, and given by b_{i+1} QS_{i+1} / QS_i, which is the minimum that A_i would have needed to bid to attain their position. Note that in this pricing scheme, a bidder's payment does not take into consideration their own bid.

²⁰ See https://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=6111 (accessed March 7, 2013).
²¹ See http://advertise.bingads.microsoft.com/en-ca/producthelp/bingads/topic?query=moonshot_conc_whatisadposition.htm (accessed March 7, 2013).
²² For the pricing policy of Google AdWords, see http://support.google.com/adwords/answer/6297?hl=en&ref_topic=24937. For Bing Ads, see http://community.bingads.microsoft.com/ads/en/bingads/b/blog/archive/2009/10/28/sem-beginner-series-what-is-costper-click-cpc.aspx. All accessed March 7, 2013.

3.2 Account Structure and Ad Structure

Advertisers organize their ad campaigns through a specially designed structure, which is the same for the three major search engines. Concretely, the advertiser information is organized into three levels: account, campaign, and ad group. Depending on their marketing goals, an advertiser may have one or more accounts. An account is associated with a unique email address, password, and billing information, and contains one or more campaigns. A campaign has its own budget and targeting options (time zone, geographical area, etc.) to determine where its ads will appear. Typically, campaigns are created to achieve a clear marketing goal, and are made up of one or more ad groups. An ad group contains a set of ads and a keyword list that will trigger these ads to show. Bids can be applied to all the keywords in an ad group, or custom bids may be set for individual keywords.

Fig. 2. Example advertiser account and displayed ad creative: (a) account; (b) ad creative.

An ad (also known as creative by Google) is composed of three elements that are visible to the user on the results page. The headline is the first line of the ad, and acts as a link to the advertiser's website. It is considered good practice for advertisers to include at least one of the keywords in the headline. The lines of text are the two lines right after the headline, and describe the advertised product or service. Finally, the display URL is the last line, and is used to show the URL of the promoted website. Besides the three visible elements, search engines require advertisers to set a non-visible destination URL (also known as landing page), which is the exact page within the website that is most relevant to the product or service described in the ad.
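The account, campaign, ad group and ad hierarchy described in this subsection can be sketched as a handful of Python data classes. All class and field names below (`Account`, `Campaign`, `AdGroup`, `Ad`, `default_bid`, `custom_bids`, and so on) are hypothetical and chosen only to mirror the text; they do not correspond to the data model or API of any search engine.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ad:
    headline: str           # first line, links to the advertiser's website
    text_lines: List[str]   # the two descriptive lines below the headline
    display_url: str        # URL shown to the user
    destination_url: str    # landing page, not visible in the ad


@dataclass
class AdGroup:
    name: str
    default_bid: float                                      # applies to every keyword
    keywords: List[str] = field(default_factory=list)
    ads: List[Ad] = field(default_factory=list)
    custom_bids: Dict[str, float] = field(default_factory=dict)

    def bid_for(self, keyword: str) -> Optional[float]:
        """Custom bid if one is set, else the group default; None if not listed."""
        if keyword not in self.keywords:
            return None
        return self.custom_bids.get(keyword, self.default_bid)


@dataclass
class Campaign:
    name: str
    daily_budget: float
    targeting: Dict[str, str] = field(default_factory=dict)  # e.g. region, time zone
    ad_groups: List[AdGroup] = field(default_factory=list)


@dataclass
class Account:
    email: str
    campaigns: List[Campaign] = field(default_factory=list)


if __name__ == "__main__":
    group = AdGroup(
        name="notebooks",
        default_bid=0.50,
        keywords=["notebook computers", "cheap laptops"],
        custom_bids={"cheap laptops": 0.35},
    )
    campaign = Campaign("spring sale", daily_budget=20.0, ad_groups=[group])
    account = Account("advertiser@example.com", campaigns=[campaign])
    print(group.bid_for("notebook computers"), group.bid_for("cheap laptops"))
```

Keeping the default bid on the ad group and storing only the overrides per keyword reflects the statement that bids can apply to all keywords in an ad group or be set individually.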
Figure 2(a) shows an example²³ of an account that contains one campaign with one ad group containing four keywords and two ads, while Figure 2(b) demonstrates how the ad might display when a user searches with the term "notebook computers". Finally, we note that efficient indexing structures for both the advertiser accounts and the sponsored ads have been investigated in the literature, e.g., [9][46].

²³ The example was taken from the website of Yahoo! Search Marketing but is no longer online, as of March 13, 2013.

3.3 Click Probabilities

Commercial search engines compute the clickthrough rate of an ad without taking into account the ad position. Under this model, a given ad would have the same probability of being clicked, irrespective of whether it appears at the top or at the bottom of the sponsored results. However, one would normally assume that slots at the top receive more clicks. According to Accuracast²⁴, for example, the average CTR clearly varies across positions, with ads at the top getting more clicks²⁵. For this reason, prior research has focused on clickthrough models that incorporate the ad position. Aggarwal et al. [6] introduce a separable click model, which defines the click probability as the product of two components: an ad-specific clickthrough rate, and a position-specific visibility factor. Unfortunately, subsequent experimental studies [43][18] have failed to validate this model and have emphasized its inadequacies. The most significant one is that it completely discounts the effects of other ads shown on the same page, namely the ad externalities. Intuitively, a high-quality relevant ad placed at the top can detract from other ads; conversely, a very low-quality or offensive ad may lead the user to completely disregard all ads on the page. Recent work [5][44][18] has suggested Cascade models for the ad externalities. In addition to the ad-specific clickthrough rate, the basic Cascade model assumes an ad-specific continuation probability. This latter parameter describes the probability that a user will look at the ads below once they have looked at the current ad.
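To make the difference between the two click models concrete, the following Python sketch computes per-slot click probabilities under each. It is a simplified illustration that assumes the user always looks at the first slot, scans the ads from top to bottom, and, in the Cascade case, continues past an ad with that ad's continuation probability regardless of whether they clicked on it. The function names and the numbers in the usage example are made up for illustration.

```python
from typing import List


def separable_click_probs(ctrs: List[float], visibility: List[float]) -> List[float]:
    """Separable model: click probability = ad-specific CTR * slot visibility."""
    return [c * v for c, v in zip(ctrs, visibility)]


def cascade_click_probs(ctrs: List[float], continuation: List[float]) -> List[float]:
    """Basic cascade model: the chance of reaching a slot depends on the ads above it.

    ctrs[j] is the probability that the ad in slot j is clicked once it is looked at;
    continuation[j] is the probability that the user moves on to slot j + 1.
    """
    probs = []
    reach = 1.0  # assumption: the first slot is always looked at
    for ctr, cont in zip(ctrs, continuation):
        probs.append(reach * ctr)
        reach *= cont
    return probs


if __name__ == "__main__":
    ctrs = [0.10, 0.05, 0.02]
    print(separable_click_probs(ctrs, visibility=[1.0, 0.5, 0.25]))
    # A weak ad in slot 1 (low continuation) lowers the clicks of every ad below it.
    print(cascade_click_probs(ctrs, continuation=[0.5, 0.8, 0.9]))
    print(cascade_click_probs(ctrs, continuation=[0.9, 0.8, 0.9]))
```

In the separable model, swapping the ad in the first slot never changes the click probability of the ads below it. In the Cascade sketch it does, because the continuation probability of each ad scales the reach of all lower slots, which is one way to express the ad externalities mentioned above.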
²⁴ See http://knowledge.accuracast.com/articles/adwords-clickthrough.php (accessed March 7, 2013).
²⁵ Surprisingly, evidence suggests that conversion rates vary insignificantly with position. See http://adwords.blogspot.com/2009/08/conversion-rates-dont-vary-much-with-ad.html (accessed March 7, 2013).

3.4 Payment Models

Advertisers make payments to the search engine, as the latter provides the platform which enables the advertisers to deliver advertising material to the users. Interestingly, the two parties have conflicting views on when a payment should occur. From the search engine's point of view, the advertiser should pay on an impression basis, i.e., every time their ad is shown to the user. Indeed, the search engine provides the advertiser with the opportunity to be displayed, which is in its own right an advertising opportunity. Taken from the advertiser's point of view, however, it is the actual conversions that matter: even if the ad is displayed, this is of little value if the user does not actually proceed to an action that is valuable to the advertiser. The former perspective leads to a pay per impression model, whereas the latter leads to a pay per conversion (or pay per action) model. To reconcile the two opposing viewpoints, the two parties have agreed on a middle ground, namely, the pay per click model. Under this model, an advertiser makes a payment to the search engine only when a user actually clicks on their ad. Note that Google AdWords introduced a beta test for pay per action advertising in March 2007²⁶; however, it was subsequently retired in June 2008²⁷.

3.5 Parameter Estimation

Most work on sponsored search auctions takes for granted various parameters such as CTR and position visibility. It turns out, however, that estimating these parameters is a difficult problem [22][54]. Indeed, there is an inherent tradeoff between learning these parameters and applying them: one cannot know a priori that a given ad has low quality unless it is exposed to the user; but then it was a bad idea to display the ad in the first place. This is similar to the exploitation vs. exploration tradeoff in reinforcement learning, and has been discussed in [70].

3.6 Incomplete Knowledge

Both the search engine and the advertisers have incomplete knowledge at many different levels. The greatest challenge from the search engine's perspective is that it is not aware of the future query workload. This complicates the allocation of an advertiser's budget: without knowledge of future queries, budget allocation may be suboptimal and inefficient. Prior research has tackled this problem by modeling incomplete knowledge as online queries, and employing approximate online algorithms, e.g., [59][65][70][55][56][35][62]. Further parameters that are not known include CTR and visibility factors; in this case, parameter estimation techniques can be employed (see Section 3.5).

Advertisers are also in a hard situation. Not only are they unaware of the future queries, but they also do not know the other advertisers' bids, budgets, and CTRs.
As a result, they face a very complex optimization problem, e.g., [48][11][14][65][30]. Furthermore, in this uncertain environment, the advertiser must come up with a profitable keyword choice; this issue is addressed in [65] through an adaptive algorithm that looks at the historical keyword performance. Note that large search engines provide the advertisers with tools that facilitate their keyword choices. Google AdWords, for instance, has developed three tools in this direction: 1) the Keyword Tool²⁸ allows the advertiser to build extensive, relevant keyword lists; 2) the Bidding and Budget Tool²⁹ automatically adjusts all bids of an advertiser given its budget to get the most clicks possible; 3) the Traffic Estimator³⁰ provides the advertiser with keyword search traffic (such as local monthly searches), and various estimates (such as estimated average pay-per-click and average position of an ad).

²⁶ See http://adwords.blogspot.com/2007/03/pay-per-action-beta-test.html (accessed March 7, 2013).
²⁷ See http://adwords.blogspot.com/2008/06/we-are-retiring-pay-per-action-beta.html (accessed March 7, 2013).
²⁸ See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=147602 (accessed March 7, 2013).
²⁹ See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=113234 (accessed March 7, 2013). Note that advertisers may also use external bid management software for the automatic control of bids. In this case, search engines generally restrict the bidding information available to the software, and require a review of any automated bidding code [41].

3.7 Click Fraud

Click fraud is a type of Internet crime that occurs when a sponsored link is intentionally clicked with no intention of generating value. It can be done automatically by computer scripts or directly by humans, and can be attributed to a variety of reasons, including a competitor's desire to minimize the impact of an ad campaign, simple vandalism, or a desire by a publisher to increase their income [41]. Jansen estimates in [40] that an astonishing 1 percent of all search-engine visits result in an unidentifiable fraudulent click, which translates into hundreds of millions of dollars for both the search industry and the advertisers who end up paying for clicks that are of zero value to them. Unfortunately, identifying click fraud turns out to be surprisingly difficult, since it is hard to know who is behind a computer and what their intentions are. Common tools employed by search engines include aggressive monitoring and improved automated filters that use sophisticated data mining technology. Interestingly, a shift from the pay per click to the pay per action paradigm would partially alleviate this problem, since the advertiser would be charged only for successful conversions rather than clicks.

3.8 Keyword Match

A great challenge faced by the advertisers is to come up with the right set of keywords. Users searching for the keyword may use singular or plural forms, synonyms and other variations, may misspell, use extensions, or reorder the words. In fact, users may even search using terms that are not in the keyword (for instance, consider the keyword "tennis shoes" and the search query "US Open sneakers").
As a result, it is difficult or downright impossible for an advertiser to identify all possible variations of keywords that a user may use in their query. Major search engines address this issue by providing a structured bidding language. In general, four match types are supported 31: exact, phrase, broad, and negative. In exact match, the ad is eligible to appear when a user searches for the specific keyword, with the terms in this order and without any other terms in the search query. In phrase match, the ad is eligible to appear when a user searches on the keyword, with the terms in that order; it can also appear for searches that contain other terms as long as the query includes the exact keyword. In broad match, the ad is eligible to appear when a user's search query contains all terms in the keyword in any order, possibly along with other terms; it could also appear for the variations that we listed earlier. Finally, negative match ensures that the ad will not be shown upon occurrence of certain terms in the query.

30 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=8692 (accessed March 7, 2013).
31 For keyword matching options in Google AdWords, see http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=6100. For Microsoft's Bing Ads, see http://community.bingads.microsoft.com/ads/en/bingads/b/blog/archive/2008/04/07/keyword-match-types.aspx. All accessed March 7, 2013.

Broad match is the default option for large search engines. One of its primary benefits is that it helps the advertisers attract more traffic to their website; as a result, it boosts the number of clicks and conversions as well. In addition, broad match saves the advertisers time when constructing their campaigns, lets them take advantage of global search trends, and is cost-effective since they do not have to spend money on keywords that do not work 32. Exact match, on the other hand, is the least flexible match type, and likely leads to fewer impressions, clicks, or conversions, compared to broad match. However, if the advertiser carefully constructs a comprehensive keyword list, the traffic they receive may be more targeted to their product or service. Phrase match lies somewhere in the middle: it is more targeted than broad match, but more flexible than exact match. Finally, negative keywords are especially useful if the advertiser's account contains several broad match keywords.

3.9 Bidding Expressivity

Bidding expressivity concerns how to best translate advertiser needs into an appropriate bidding language. For example, a wine producer in Sacramento may want to target its ads only to users located in the state of California. Commercial search engines allow the advertisers to fine-tune their ads by targeting 1) specific locations, 2) days of the week, 3) time of day, 4) demographic (gender and age) groups, and 5) languages.
Obviously, a more expressive bidding language may be better tailored to the advertiser's needs, but it comes at a high cost in complexity for the auctions and the middleman software.

Recently, more expressive bidding languages have been investigated. Even-Dar et al. introduce in [28] context-based auctions where advertisers can bid on keywords that satisfy specific contexts such as gender, income, likely task, etc. They further show that under certain conditions the overall social welfare increases when moving from standard to context-based mechanisms. Martin et al. [57] propose multi-feature auctions that enable advertisers to express bids on multiple features, namely, clicks, conversions, and slot positions. For instance, an advertiser may express that they only wish to be placed in prominent positions; or, they may prefer their ads to be placed near the top or bottom of the list, but not in the middle; or, they may value purchases (conversions) but have zero valuations for clicks alone. In the multi-feature model, the advertiser declares a bid table that summarizes their valuation over different combinations of the three features; an efficient, scalable, and parallelizable infrastructure on the search engine's side is responsible for ad ranking and pricing. To account for ad externalities, Ghosh et al. augment in [34] the standard bidding language with an exclusivity feature: an advertiser's value depends on whether or not other ads are shown along with their ad, i.e., whether they are shown exclusively or not. They further introduce two GSP-like auction mechanisms with two types of outcomes: either a single ad is displayed exclusively, or multiple ads are simultaneously shown. Both mechanisms are usually characterized by high efficiency and revenue.

32 See http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=6137 (accessed March 7, 2013).

An interesting model, namely, the General Auction Mechanism (GAM), was recently proposed by Aggarwal et al. in [4]. GAM allows the bidders to specify both a valuation and a maximum price for each ad slot. The former is a measure of how much the slot is worth to them, whereas the latter is the maximum amount of money that they are willing to pay for the ad to appear in the slot. It turns out that GAM is a powerful model that includes several novel auction mechanisms at its core. Unfortunately, its expressiveness is rather limited for practical scenarios: 1) it cannot simultaneously handle both per-click and per-impression valuations, 2) it does not allow bidders to express how tight their budget is, and 3) it cannot model risk-averse bidders. Motivated by these shortcomings, Duetting et al. discuss in [23] a new powerful auction mechanism that additionally runs in polynomial time.

3.10 Strategic Bidder Behavior

An advertiser's strategy includes all actions they take to meet their marketing and campaign goals; for instance, a strategy dictates how the advertiser allocates their funds, and what keywords they declare. The auction mechanisms that were employed by search engines in the early years of sponsored search were particularly susceptible to certain bidding patterns [26] that led to market inefficiencies and instabilities. The most frequent patterns were bidding war cycles 33 [7] and gap jamming [13][32].
In gap jamming, advertisers raise their bids to a point just below their immediate competitors; as a result, the competitors pay the maximum CPC, and deplete their budget fast. In response, advertisers can protect themselves from such spiteful behaviors by gradually shading their bids or even investing in bid management software. The most prominent proposed solutions to vindictive bidding are stochastic auctions and contingent-payment auctions [58]. Stochastic auctions allow all bidders to win with some probability; under this scheme, vindictive bidders will periodically be forced to pay high amounts, so their incentive is reduced. Moreover, stochastic auctions imply that even low bidders can win occasionally, and the search engine can acquire a more accurate clickthrough estimate for all bidders, in effect trading off high immediate revenue for optimizing future decision making. Note that bidders with nearly equal slot valuations can share a slot [11]. Finally, in contingent-payment auctions, the winning bidder only pays when a bidder-specific contingency occurs; this makes it harder for a spiteful bidder to deplete a competitor's budget.

Modern search engines, on the other hand, shield the auction process from undesirable advertiser behavior by incorporating ad quality scores into the ranking and pricing schemes. Indeed, although the advertiser can still manipulate their declared maximum CPC, they have no control over any ad quality score, which is involved in both the ad allocation and pricing.

33 We discuss bidding war cycles in detail in Section 4.1.

4 An Overview of Sponsored Search Auctions

The market for sponsored search advertising had to undergo significant adjustments to address a number of structural shortcomings. The initial unstable mechanisms were replaced by more robust auction designs. This is evident in both the ad allocation and pricing schemes, which have evolved in steps over the years. Interestingly, changes in the market for sponsored search took place at a very fast pace. This could be due to the competitive pressures on mechanism designers, the much lower costs of entry and experimentation, advances in the understanding of market mechanisms, and improved technology [26]. In the following, we discuss how sponsored search auctions evolved over the years, based on the seminal work of Edelman, Ostrovsky and Schwarz [26].

4.1 Generalized First-Price Auctions

Beginning in 1994, early Internet advertising followed a pay per impression model, where advertisers paid a fee to show their ads a fixed number of times (typically one thousand impressions). In 1997, Overture (then GoTo; acquired by Yahoo! in 2003 for $1.63 billion 34) introduced a radically new model to sell Internet advertising. In the original Overture auction design, each advertiser submitted a maximum cost per click bid for a particular keyword. An auction would play out in an automated fashion every time a visiting user triggered the ad spot. Advertisers were ranked in decreasing order of their declared bids, so that the highest bids received the most prominent positions. Every time a user clicked on a sponsored link, the advertiser was charged an amount of money equal to their latest bid.
The underlying ranking and pricing schemes were reminiscent of the first-price auction, a well-studied paradigm in the theory of auctions. In a first-price auction, a single item is for sale; all bidders declare bids that correspond to their maximum willingness to pay for the item. The highest bidder wins, and pays an amount equal to their bid. In sponsored search, on the other hand, multiple items (the ad slots) are for sale; the items are inherently different: higher ad positions are worth more than lower ad positions. As before, the interested advertisers communicate their bids to the search engine, and are subsequently ranked in decreasing order of their bid. The highest bidder wins the first slot and pays its bid; the second-highest bidder wins the second position and pays its bid, and so on until all ad slots have been sold 35. The analogy to the classic single-item first-price auction led to the name generalized first-price auction.

34 See http://docs.yahoo.com/docs/pr/release1102.html (accessed March 7, 2013).
35 If the number of ad slots exceeds the number of advertisers, then we simply ignore all remaining slots.

Obviously, the newly introduced pay per click model allowed the advertisers to better target their ads: instead of paying for a banner ad that would be displayed to everyone visiting a website, advertisers could now explicitly specify which keywords (and, thus, users) were relevant to their advertising campaign. Moreover, the ease of use, the very low entry costs, and the transparency of the mechanism (see [26]) quickly led to the success of Overture. Indeed, major search engines including Yahoo! and MSN adopted Overture's platform as their advertising provider.

Unfortunately, the novel auction mechanism induced very unstable dynamics, in the sense that bidders would not state their true valuations, and would keep changing their bids in response to other bidders' behavior. For instance, in a keyword market with two advertisers and two ad slots, assume that a click is worth $1.00 to the first advertiser and $1.50 to the second. If the first advertiser bids $1.00, then the second will likely bid $1.01 (i.e., the other bid plus a minimal increment), thereby claiming the first slot, and paying as little as possible. But then, the first advertiser will try to lower its bid to the minimum bid (or reserve price), say $0.10, reducing its costs while still preserving the second slot, which is the best position it can get given the other advertiser's high bid. In response, the second advertiser will lower its bid, e.g., to $0.11, and the advertisers will outbid each other in small increments until the second advertiser's bid exceeds the first advertiser's valuation, and the first advertiser drops to the minimum bid again. Under these assumptions, the cyclical pattern will continue indefinitely. This instability is further exacerbated by automated bid management software. Figure 3 (taken from [25]) demonstrates this behavior. It presents top bids, in dollars, for a specific keyword, every 15 minutes from 12:15 AM to 2:15 PM on July 18, 2002.
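The two-advertiser example above can be replayed as a small simulation. The following is only a sketch under stated assumptions that go beyond the text: advertisers move in alternation, each replies myopically to the rival's current bid, the top slot is always preferred whenever it is affordable, and amounts are integer cents with a $0.01 increment and a $0.10 reserve price. The function names `best_response` and `simulate_gfp` are illustrative.

```python
def best_response(rival_bid, value, reserve=10, step=1):
    """Myopic reply in cents: outbid the rival by one step if that does not
    exceed our value per click, otherwise fall back to the reserve price."""
    target = rival_bid + step
    return target if target <= value else reserve


def simulate_gfp(values=(100, 150), start_bids=(100, 10), rounds=400):
    """Alternate best responses in a pay-your-bid auction and record the
    top bid after every move."""
    bids = list(start_bids)
    top_bids = []
    for t in range(rounds):
        mover = (t + 1) % 2  # the second advertiser (index 1) moves first
        bids[mover] = best_response(bids[1 - mover], values[mover])
        top_bids.append(max(bids))
    return top_bids


top = simulate_gfp()
# Moves at which the top bid collapses, i.e., the edges of the "teeth".
drops = [t for t in range(1, len(top)) if top[t] < top[t - 1]]
print("first top bids (cents):", top[:6])
print("moves at which the top bid collapses:", drops)
```

Under these assumptions the top bid climbs one cent at a time to $1.01 and then collapses, and the collapses recur at a fixed interval, which is the qualitative sawtooth shape described for Figure 3. Real bidders do not move in strict alternation, so the regular period is an artifact of the sketch.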
Clearly, a sawtooth pattern emerges in the bidding process, whereby bids rise gradually and then suddenly drop to lower values, forming successive teeth.

Fig. 3. Sawtooth bidding pattern: (a) 14 hours, (b) 1 week (taken from [25]).

Clearly, the generalized first-price auction is not truthful: the advertisers have no incentive to declare their true valuations to the search engine. Instead, they devote considerable time and resources to the constant manipulation of their bids, potentially paying less attention to ad quality and other campaign goals. Furthermore, it is not efficient: although an advertiser may value the top spot more than its competitors, it gets lower positions half or more of the time. Finally, the volatile prices may take a serious toll on the search engine's revenues, as empirically demonstrated in [25].

4.2 Generalized Second-Price and Vickrey-Clarke-Groves Auctions

Under the generalized first-price auction, the advertisers have an incentive to game the system in their favor, and engage in inefficient and endless bidding wars. Google was the first search engine to address these shortcomings by introducing its own pay per click system, AdWords Select, in February 2002. Google recognized that the i-th highest bidder would never be willing to pay more than the (i+1)-th highest bid plus a minimal increment. Its newly introduced generalized second-price (GSP) auction reflected this principle: advertisers were still ranked in decreasing order of their bids, but an advertiser in position i would now pay a price per click equal to the bid of the advertiser in position i+1 (plus an increment) 36. This price corresponds to the minimum amount of money that an advertiser must pay to retain their position, and is independent of their own bid. The new auction structure makes the market less susceptible to gaming, and is thus more robust. Recognizing these benefits, Yahoo!/Overture also switched to GSP.

Note that the GSP auction is a generalization of the second-price auction for single items. In the latter, bidders submit their bids for the single object; the one with the highest bid wins, but pays the second-highest bid. The winner's payoff is its valuation minus the price it has to pay, which is non-zero for distinct bids; the other bidders' payoff is zero. Under this pricing scheme, bidders have an incentive to bid their true valuations. Indeed, assume that bidder i submits a bid b_i that is different from its true valuation v_i: either b_i > v_i, or b_i < v_i. Consider the first case.
If i is the winner when bidding truthfully, then by bidding more it is still the highest bidder, and will pay the second-highest bid, as before; thus, i has no incentive to overbid. On the other hand, if i is not the winner when bidding truthfully, then there must be a winner j with a higher bid b_j > v_i. If i overbids but still bids less than b_j, then it still loses the auction, and faces the same situation as before; but if i bids more than b_j, then it wins the auction. In this case, it will have to pay b_j > v_i, effectively ending up paying more than what the object is worth to it. Thus, i has no incentive to overbid. Take now the second case. If i is the winner when bidding truthfully, two things can happen if it underbids. Either it is still the winner, in which case it pays the second-highest bid as before, or it ends up losing the auction, in which case it does not get the object and receives a zero payoff. In either case, i has no incentive to underbid. If i is not the winner, then by underbidding it has no chance to become the auction winner, so it faces the same situation as before; thus, it has no incentive to underbid.

Although the GSP auction is a straightforward generalization of the single-item second-price auction, it turns out that it does not maintain the property of truthfulness. We demonstrate this by citing an example from [26]. Consider three bidders, with values per click of $10, $4, and $2, and two positions. The first position has a clickthrough rate of 200 clicks per hour, whereas the second has 199 clicks per hour. If the bidders bid truthfully, then bidder 1's payoff is ($10 - $4) * 200 = $1200. If, instead, it underbids by bidding only $3 per click, then it will get the second position, but its payoff will be equal to ($10 - $2) * 199 = $1592 > $1200. Thus, bidder 1 has an incentive to underbid.

36 In fact, this was Overture's implementation; Google AdWords additionally considers quality scores for every ad, as described in Section 0.

The GSP auction does not maintain the truthfulness of the second-price auction, because it fails to capture accurately the underlying principle that characterizes the second-price auction. In particular, the GSP auction follows a pricing scheme where a bidder who wins an object pays an amount of money that equals the bid right below. Indeed, this constitutes a straightforward generalization of the second-price auction, which determines a price for the winner equal to the second-highest bid. However, generalizing the second-price auction in this direction yields an auction that is not truthful. In order to come up with an auction design that satisfies truthfulness, one must first get a better insight into the right interpretation of the second-price mechanism.

Revisiting the second-price auction, we can indeed provide an alternative interpretation: the winner is requested to pay an amount of money equal to the externalities that it imposes on the others, i.e., the decreases in the valuations of other bidders because of its presence. To understand what this means, consider for a moment two bidders who bid for one item. Now, imagine that the highest bidder 1 with valuation v_1 did not participate in the auction; then bidder 2 with valuation v_2 would win the auction. Because of bidder 1's presence, bidder 2 does not have the chance to acquire the item and realize its valuation v_2. In other words, bidder 1's presence imposes a negative externality on bidder 2 equal to v_2; to compensate for this, bidder 1 pays an amount equal to v_2.
So, the winner ends up paying the second-highest valuation, which is the standard second price. In the case of a single item, the two interpretations yield identical mechanisms, namely the second-price auction. However, for multiple items, the resulting auctions are very different. The more straightforward generalization yields the GSP auction as we saw above, while the externality-based interpretation yields the Vickrey-Clarke-Groves (VCG) auction, named after William Vickrey [69], Edward H. Clarke [17], and Theodore Groves [37]. Contrary to GSP, VCG gives bidders an incentive to bid their true value, and is socially optimal, i.e., the bidder with the highest valuation acquires the highest position, the bidder with the second-highest valuation receives the second-highest position, etc.

We illustrate how the two auctions work in sponsored search with an example taken from [26]. Suppose there are two slots on a page and three bidders. An ad in the first slot receives 200 clicks per hour, while the second slot gets 100. Bidders 1, 2, and 3 have values per click of $10, $4, and $2, respectively, and assume that they bid truthfully. Payments per click in GSP are $4 and $2 (plus a tiny increment), so the total payments of bidders 1 and 2 are $800 and $200, respectively. Let us now compute VCG payments for this example. The second bidder's payment is $200, as in GSP. However, the payment of the first advertiser is now $600: $200 for the externality that it imposes on bidder 3 (by forcing it out of position 2) and $400 for the externality that it imposes on bidder 2 (by moving it from position 1 to position 2 and thus causing it to lose (200 - 100) = 100 clicks per hour). The total revenues under GSP are $1000, whereas the total payments under VCG are $800. Indeed, it turns out that if advertisers bid truthfully, then revenues under GSP are always at least as high as under VCG [26].

4.3 Recent Market Development

Two striking features of the recent development in the sponsored search market are consolidation and convergence. In February 2010, Microsoft and Yahoo! announced a partnership agreement under the name Search Alliance, whereby both the algorithmic and the paid search result platforms of Yahoo! would be powered by Microsoft 37. This effectively leaves the US market with only two big players in the paid search advertising field. Moreover, although search engines are quite opaque about their auction protocols, it turns out that all major search engines have followed in Google's footsteps: they have now incorporated an ad quality score in both the ranking and pricing schemes, and they use GSP auctions to sell keywords. One could say that Google, as the market leader, has paved the way that others have followed. This is also evident in the various practical aspects that we explored in Section 3, such as account and ad structure, keyword match, bidding expressivity, etc. Both consolidation and convergence are observed in all big markets, so it should come as no surprise that this is also the case with the market for sponsored search advertising.

Interestingly, GSP rather than VCG is used in practice, even though the latter would (at least theoretically) diminish incentives for strategizing and facilitate the advertisers' task. Edelman et al. [26] attribute this to several reasons. First, VCG is hard to communicate to typical advertisers. Second, switching to VCG may entail substantial transition costs: VCG revenues are lower than GSP revenues for the same bids, and bidders might be slow to stop shading their bids. Third, the revenue consequences of switching to VCG are largely unpredictable.
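The revenue gap between the two mechanisms can be checked on the two-slot example of Section 4.2. The sketch below is illustrative rather than any search engine's implementation: it ignores the minimal bid increment and the ad quality scores, and the function names are our own. The VCG payment is computed directly from the externality description, i.e., the clicks that each lower-ranked bidder loses because of the bidder in slot i, valued at that lower bidder's bid.

```python
def gsp_payments(bids, ctrs):
    """Total payment per slot under GSP: slot i pays ctr_i times the next-highest bid."""
    b = sorted(bids, reverse=True) + [0]  # a bidder with nobody below pays 0
    return [ctrs[i] * b[i + 1] for i in range(min(len(ctrs), len(bids)))]


def vcg_payments(bids, ctrs):
    """Total payment per slot under VCG: the clicks every lower bidder loses
    because the bidder in slot i is present, valued at the lower bidder's bid."""
    b = sorted(bids, reverse=True)
    # Positions beyond the available slots receive no clicks.
    r = list(ctrs) + [0] * (len(b) - len(ctrs) + 1)
    slots = min(len(ctrs), len(b))
    return [
        sum(b[j] * (r[j - 1] - r[j]) for j in range(i + 1, len(b)))
        for i in range(slots)
    ]


bids = [10, 4, 2]   # truthful bids per click, in dollars
ctrs = [200, 100]   # clicks per hour for slots 1 and 2
gsp = gsp_payments(bids, ctrs)
vcg = vcg_payments(bids, ctrs)
print("GSP:", gsp, "total", sum(gsp))
print("VCG:", vcg, "total", sum(vcg))
```

For these inputs the sketch reproduces the payments quoted in the example: $800 and $200 under GSP, and $600 and $200 under VCG.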
We believe that the introduction of the ad quality score has also played a role in the wide adoption of GSP. Indeed, the ad quality scores are now an integral part of both the ranking and pricing protocols; even if advertisers manipulate their bids, it is very difficult to game the system since they have no control over the ad quality scores.

37 See http://www.searchalliance.com/apac/en/yahoo-and-microsoft-to-implement-search-alliance (accessed March 7, 2013).

5 Matching Markets, VCG, and the GSP Procedure

In the previous Section, we discussed how, after years of testing and experimentation, the sponsored search industry adopted the GSP protocol. We also contrasted GSP to the truthful VCG mechanism, and studied the underlying principles that characterize the two auction designs. In this Section, we will investigate in more depth the structure and properties of the VCG mechanism, and will explain in what ways GSP and VCG are structurally related. In this direction, we will introduce a very interesting and well-studied class of models, namely, matching markets. This Section is based on Chapters 10 and 15 of [24], which together provide an excellent treatment of markets and auctions.

5.1 Matching Markets

A market refers to any one of a variety of systems, institutions, procedures, social relations and infrastructures whereby parties engage in an exchange. We focus on two-sided markets with exactly two sets of agents: the sellers, each with an item for sale, and the buyers, each of whom wants to acquire an item. We consider that a buyer j has a valuation v_ij for the item held by seller i, with the subscripts i and j indicating that the valuation depends on the identities of both the seller i and the buyer j. We will also assume that each valuation is a nonnegative whole number. We assume that the sellers have a valuation of 0 for each item; they care only about receiving payments from the buyers.

To make the notion of payment clear, consider that each seller i puts its item up for sale for a price p_i ≥ 0. If a buyer j buys the item from seller i at this price, then its payoff is equal to its valuation for the item, minus the amount of money it had to pay to acquire the item, i.e., v_ij - p_i. Given a set of prices, one for each item, we assume that a buyer j wants to buy from the seller i for which its payoff is maximized. Two remarks are in order. First, if the maximum payoff is attained by several sellers, then the buyer is indifferent to the identity of the seller and may choose any of them. Second, if the payoff is negative for every choice of seller i, then the buyer does not transact and thus gets a payoff of 0. We call the seller or sellers that maximize the payoff for buyer j the preferred sellers of buyer j, provided the payoff from these sellers is nonnegative; otherwise, we say that buyer j has no preferred seller.
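The definition of preferred sellers translates almost directly into code. The following minimal sketch uses the valuations and the first set of prices from the Figure 4 example of this Section; the dictionary layout and the function name `preferred_sellers` are illustrative choices, not part of the model.

```python
def preferred_sellers(valuations, prices):
    """Map each buyer to the sellers that maximize valuation minus price.
    A buyer whose best payoff is negative has no preferred seller."""
    graph = {}
    for buyer, vals in valuations.items():
        payoffs = {seller: vals[seller] - prices[seller] for seller in prices}
        best = max(payoffs.values())
        if best < 0:
            graph[buyer] = []
        else:
            graph[buyer] = sorted(s for s, p in payoffs.items() if p == best)
    return graph


# valuations[j][i] is the valuation of buyer j for the item of seller i.
valuations = {
    "x": {"a": 12, "b": 4, "c": 2},
    "y": {"a": 8, "b": 7, "c": 6},
    "z": {"a": 7, "b": 5, "c": 2},
}
prices = {"a": 5, "b": 2, "c": 0}
graph = preferred_sellers(valuations, prices)
print(graph)
```

With these prices every buyer has exactly one preferred seller and no two buyers share one. Ties simply produce several entries in a buyer's list.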
For a set of prices, we define the preferred-seller graph on buyers and sellers by simply constructing an edge between each buyer and their preferred seller(s) in the corresponding bipartite graph.

Panel   Prices of a, b, c   Buyer x     Buyer y    Buyer z
(a)     (none)              12, 4, 2    8, 7, 6    7, 5, 2
(b)     5, 2, 0             7, 2, 2     3, 5, 6    2, 3, 2
(c)     2, 1, 0             10, 3, 2    6, 6, 6    5, 4, 2
(d)     3, 1, 0             9, 3, 2     5, 6, 6    4, 4, 2

Fig. 4. Preferred-seller graphs for different sets of prices (adapted from [24]). Panel (a) lists each buyer's valuations for the items of sellers a, b, and c; panels (b)-(d) list the resulting payoffs under the given prices.

Figures 4(b)-4(d) depict the preferred-seller graphs for three different sets of prices when the buyer valuations are as in Figure 4(a). First, observe in Figure 4(b) that if each buyer simply claims its preferred item, then each buyer ends up with a different item; there is thus no contention for items. We call such a set of prices market-clearing, since they cause each item to get bought by a different buyer. Prices in Figure 4(c) are not market-clearing, because buyers x and z both want the item offered by the single seller a. Note that prices in Figure 4(d) are market-clearing as well, in the sense that buyers can coordinate their actions so that each of them ends up with a different seller. Formally, a set of prices is market-clearing if the resulting preferred-seller graph has a perfect matching.

A natural question that arises when discussing market-clearing prices is whether they always exist for any set of buyer valuations. This question was answered affirmatively by the economists Demange, Gale, and Sotomayor in [20], who described an auction-like procedure that takes as input an arbitrary set of buyer valuations, and arrives at market-clearing prices. In fact, their method is equivalent to a construction of market-clearing prices discovered by the Hungarian mathematician Egerváry in 1916 [51]. A second natural question concerns the social welfare of the resulting assignment, i.e., the total valuation of the resulting assignment.
Interestingly, one can prove that for any set of market-clearing prices, a perfect matching in the resulting preferred-seller graph is socially optimal, i.e., has the maximum total valuation of any assignment of sellers to buyers.

5.2 Sponsored Search as a Matching Market

It is now straightforward to establish a connection between matching markets and the market for sponsored search. The advertising slots are the inventory that the search engine is trying to sell to the advertisers. One subtlety is that in the sponsored search market there is one seller (the search engine) that puts up several items (ad slots) for sale to potential buyers (advertisers), while the matching market model assumes several sellers, each of them associated with exactly one item. It turns out that this is not important: the market-clearing prices and the perfect matching in the preferred-seller graph are computed as before, while the unique seller (search engine) collects all payments.

From the advertiser's side, we assume that each advertiser j has a revenue per click v_j, i.e., the expected amount of money it receives per user who clicks on the ad. We further assume that this value is intrinsic to the advertiser and does not depend on what was being shown on the page when the user clicked on this ad; for instance, we do not take into account ad externalities. To complete the construction of the matching market, we need to determine the valuation v_ij that advertiser j has for slot i. This corresponds to the benefit that j receives from being shown in slot i, and it depends not only on the advertiser's revenue per click, but on the actual number of received clicks as well. Usually, we model the advertiser's valuation for slot i as v_ij = q_j r_i v_j, where r_i is the clickthrough rate of slot i, and q_j is a quality factor for advertiser j.

At this point, we have constructed a matching market with well-defined ingredients. The participants consist of a set of buyers and a seller that puts up several items for sale. Each buyer j has a valuation v_ij for item i, which depends on the identities of both the buyer and the item. The goal is to match up buyers with items in such a way that no buyer purchases two different items and no item is sold to two different buyers. According to the discussion in Section 5.1, this is always possible by constructing a set of market-clearing prices that produce a perfect matching in the preferred-seller graph. Moreover, the assignment of buyers to items in this perfect matching always maximizes the social welfare, i.e., the advertisers' total valuation for the items they get.
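As a small illustration of this construction, the sketch below builds the valuation matrix v_ij = q_j r_i v_j and finds a welfare-maximizing assignment by brute force. The clickthrough rates and values per click are borrowed from the example of Section 4.2, while the quality factors are hypothetical numbers chosen only to show that quality can change the ranking. Brute force is used for clarity and is only practical for a handful of advertisers; the function names are our own.

```python
from itertools import permutations


def slot_valuations(ctrs, quality, value_per_click):
    """v[i][j] = q_j * r_i * v_j: what slot i is worth to advertiser j."""
    return [[q * r * v for q, v in zip(quality, value_per_click)] for r in ctrs]


def best_assignment(v):
    """Welfare-maximizing assignment of slots to distinct advertisers.
    Returns (total valuation, tuple with the advertiser index for each slot).
    Assumes there are at least as many advertisers as slots."""
    n_slots, n_advertisers = len(v), len(v[0])
    return max(
        (sum(v[i][j] for i, j in enumerate(perm)), perm)
        for perm in permutations(range(n_advertisers), n_slots)
    )


ctrs = [200, 100]             # r_i: clicks per hour for slots 1 and 2
value_per_click = [10, 4, 2]  # v_j in dollars
quality = [0.25, 1.0, 1.0]    # q_j: hypothetical quality factors

v = slot_valuations(ctrs, quality, value_per_click)
welfare, assignment = best_assignment(v)
print("valuations:", v)
print("welfare:", welfare, "slot -> advertiser:", assignment)
```

With the assumed quality factors, the second advertiser (index 1) takes the top slot even though the first advertiser has the highest value per click, because the ranking depends on the product q_j v_j.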
5.3 VCG Prices and the Market-Clearing Property

Unfortunately, the construction of prices that we have just described can be carried out by the search engine only if it knows the valuations of the advertisers. In practice, the search engine does not have a way to find out these valuations, and has to rely on truthful reporting from the advertisers' side. The real challenge is then to design a price-setting procedure where the advertisers have an incentive to report truthfully, in the sense that they cannot receive a higher payoff by misreporting. And this is exactly where the VCG mechanism that we discussed in Section 4.2 comes into play.

Recall that under the VCG principle, we first assign items to buyers so as to maximize the total valuation (social welfare). Then, the price buyer j should pay for the item i it receives is equal to the externality that it imposes on the other buyers through its acquisition of this item. To make this clear, let S denote the set of sellers and B denote the set of buyers. Let V^{S}_{B} denote the maximum total valuation over all possible perfect matchings of sellers and buyers. Now, let S-i denote the set of sellers with seller i removed, and let B-j denote the set of buyers with buyer j removed. Then, if we give item i to buyer j, the best total valuation the rest of the buyers could get is V^{S-i}_{B-j}. On the other hand, if buyer j simply did not exist but item i were still an option for everyone else, then the best total valuation the rest of the buyers could get is V^{S}_{B-j}. Thus, the total harm caused by buyer j to the rest of the buyers is the difference between how they would do without j present and how they do with j present, i.e., the difference V^{S}_{B-j} - V^{S-i}_{B-j}. This is the VCG price that we charge buyer j for item i:

p_{ij} = V^{S}_{B-j} - V^{S-i}_{B-j}.

If items are assigned and prices computed according to the VCG mechanism, then two very interesting properties hold. First, the resulting assignment of buyers to sellers has the maximum total valuation over all perfect matchings of items and buyers; this is easy to justify, since the VCG mechanism is designed to maximize the total valuation. Second, truthfully announcing a valuation is a dominant strategy for each buyer, i.e., each buyer has an incentive to report truthfully irrespective of what all other buyers report. An immediate consequence in sponsored search auctions is that the advertisers do not have an incentive to strategize their bids, and the search engine can get to know the advertisers' true valuations.

To explain the VCG price construction, let us get back to the matching market of Figure 4(a). First, we determine that the matching that maximizes the total buyers' valuation assigns seller a to buyer x, b to z, and c to y. The matching of maximum valuation suggests how sellers and buyers match up. The second step is to compute the price every buyer has to pay for its assigned item. For instance, to determine the price that x must pay, first note that once x is assigned to a, the maximum total valuation among all matchings between the remaining sellers and buyers would be 11, by matching y to c and z to b. On the other hand, if x were not present, then the maximum total valuation possible would be 14, by matching y to b and z to a. The difference between these two quantities is the VCG price for buyer x: 14 - 11 = 3. Similarly, we can compute that the prices for buyers z and y are 1 and 0, respectively.
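The worked example can be verified mechanically. The sketch below is a brute-force illustration for the three-buyer market of Figure 4(a), not an efficient implementation: it enumerates all matchings, which is only feasible for tiny markets. The helper names `best_matching` and `vcg_prices` are our own.

```python
from itertools import permutations

# VALUATIONS[j][i]: valuation of buyer j for the item of seller i (Figure 4(a)).
VALUATIONS = {
    "x": {"a": 12, "b": 4, "c": 2},
    "y": {"a": 8, "b": 7, "c": 6},
    "z": {"a": 7, "b": 5, "c": 2},
}


def best_matching(buyers, sellers):
    """Maximum total valuation and one matching that attains it.
    Brute force; assumes there are no more buyers than sellers."""
    buyers = list(buyers)
    candidates = (
        (sum(VALUATIONS[b][s] for b, s in zip(buyers, perm)), dict(zip(buyers, perm)))
        for perm in permutations(sellers, len(buyers))
    )
    return max(candidates, key=lambda c: c[0])


def vcg_prices():
    """Return the welfare-maximizing assignment and the VCG price of each item."""
    buyers, sellers = list(VALUATIONS), ["a", "b", "c"]
    _, assignment = best_matching(buyers, sellers)
    prices = {}
    for j, i in assignment.items():
        others = [b for b in buyers if b != j]
        # Best total for the other buyers if j were absent and all items were available.
        without_j, _ = best_matching(others, sellers)
        # Best total for the other buyers once j holds item i.
        with_j, _ = best_matching(others, [s for s in sellers if s != i])
        prices[i] = without_j - with_j
    return assignment, prices


assignment, prices = vcg_prices()
print("assignment:", assignment)
print("VCG prices:", prices)
```

The computed prices are 3 for item a, 1 for item b, and 0 for item c, in agreement with the hand calculation above.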
Interestingly, this set of prices is the same as in Figure 4(d), where we can see that the prices are market-clearing, since they yield a perfect matching in the induced preferred-seller graph. Later, we will see that this is not a mere coincidence.

Comparing the market-clearing prices that we discussed in Section 5.1 with the VCG prices, we notice a crucial difference between them. The former are posted prices, in that the seller simply announces a price and is willing to charge it to any buyer who is interested. The VCG prices, on the other hand, are personalized prices: they depend on both the item being sold and the buyer to whom it is being sold. The VCG price $p_{ij}$ paid by buyer j for item i may well differ from the VCG price $p_{ik}$ that buyer k would pay for the same item.

It turns out, however, that market-clearing and VCG prices are in fact related. First, despite their definition as personalized prices, VCG prices are always market-clearing. That is, suppose we compute the VCG prices for a given matching market, first determining a matching of maximum total valuation, and then computing the corresponding VCG prices. Then, assume we go on to post the prices publicly: rather than requiring the buyers to follow the matching used in the VCG construction, we allow any buyer to buy any item at the indicated price. Despite this seemingly greater freedom, each buyer will in fact achieve the highest payoff by selecting the item it was assigned when the VCG prices were determined. In other words, VCG prices are market-clearing.

In principle, there could be a multitude of such market-clearing prices; it is then worth investigating which set of market-clearing prices the VCG prices actually correspond to. Interestingly, VCG prices form the minimum market-clearing prices, i.e., market-clearing prices of minimum total sum. Moreover, there is only one set of such prices. Summarizing, we get the following result that establishes the relationship between VCG prices and the matching-market model, proved by Leonard [52] and Demange [19]: in any matching market, the VCG prices form the unique set of market-clearing prices of minimum total sum.

5.4 Properties of the GSP Protocol

In Section 3.1, we introduced the GSP auction, and explained that it is the standard procedure that all large search engines have adopted to sell sponsored search advertising. Various GSP implementations have been proposed, but throughout this section we will focus on a simplified but insightful model. Concretely, we assume that each advertiser j announces a bid consisting of a single number $b_j$ (its maximum CPC). As usual, it is up to the advertiser whether or not its bid is equal to its true valuation per click, $v_j$. Assume that the bids per click are $b_1, b_2, \ldots$ in descending order, and that the clickthrough rate of slot i is $r_i$ (for simplicity, we omit the ad quality score in this discussion). The GSP protocol assigns bidder i to slot i for a price per click equal to the bid $b_{i+1}$. In other words, each advertiser pays a price per click equal to the bid of the advertiser just below it. The cumulative price for slot i is therefore $r_i b_{i+1}$. We can formulate this problem as a game, where each advertiser is a player, the bid is its strategy, and its payoff is its revenue minus the price it pays. In this game, we will consider Nash equilibria; these correspond to bids where no player has an incentive to deviate given that all other players stick to their strategies.
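The simplified GSP model can be sketched in a few lines. The advertiser names, bids, and clickthrough rates below are hypothetical and serve only to illustrate the ranking and pricing rule; ad quality scores are omitted, as in the model above, and ties between equal bids are not treated specially.

```python
def gsp_outcome(bids, ctrs):
    """Rank advertisers by bid and charge each winner the next-highest bid per click.

    bids: dict mapping advertiser name -> bid per click (hypothetical inputs).
    ctrs: clickthrough rates of the slots, listed from the top slot down.
    Returns a list of (advertiser, price_per_click, cumulative_price), one per slot.
    """
    ranked = sorted(bids, key=bids.get, reverse=True)
    outcome = []
    for slot, advertiser in enumerate(ranked[:len(ctrs)]):
        # The price per click is the bid of the advertiser ranked just below,
        # or 0 if nobody is ranked below.
        next_bid = bids[ranked[slot + 1]] if slot + 1 < len(ranked) else 0
        outcome.append((advertiser, next_bid, ctrs[slot] * next_bid))
    return outcome


# Hypothetical example: three advertisers competing for two slots.
example = gsp_outcome({"u": 3, "v": 2, "w": 1}, ctrs=[200, 100])
print(example)  # [('u', 2, 400), ('v', 1, 100)]
```

The third advertiser wins no slot and pays nothing, which corresponds to assigning it a fictitious slot with clickthrough rate 0.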
Given this model, it is worth investigating some basic properties of the GSP protocol. As we will show shortly, GSP suffers from various pathologies that VCG was designed to avoid. We will illustrate them by using the example in Figure 5, taken from [24]. As we can see in Figure 5(a), there are two slots for ads, with clickthrough rates 10 and 4. A third fictitious slot with clickthrough rate 0 has been added, so as to equalize the number of advertisers and slots. There are three advertisers x, y, and z, with values per click of 7, 6, and 1, respectively. Figure 5(b) shows the advertiser valuations for each slot, which are simply the product of the advertiser's revenue per click and the clickthrough rate of the slot.

Even an example as simple as Figure 5(a) can demonstrate one of the main shortcomings of the GSP procedure: truth-telling may not constitute a Nash equilibrium. Indeed, if each advertiser bids its true valuation, then advertiser x gets the top slot at a price per click of 6; since there are 10 clicks associated with this slot, x pays 10 × 6 = 60 for the slot. Advertiser x's valuation for the top slot is 7 × 10 = 70, so its payoff is 70 − 60 = 10. Now, if x were to lower its bid to 5 (lower than y's bid), then it would get the second slot for a price per click of 1, and would have to pay 4 × 1 = 4 for that slot. Since its valuation for the second slot is 4 × 7 = 28, it would receive a payoff of 28 − 4 = 24, which is an improvement over truthful bidding. So, advertiser x has an incentive to shade its bid downward, and truth-telling is not a Nash equilibrium.

(a) Setting
Slot   Clickthrough rate   Advertiser   Revenue per click
a      10                  x            7
b      4                   y            6
c      0                   z            1

(b) Valuations
Advertiser   Valuations for slots a, b, c
x            70, 28, 0
y            60, 24, 0
z            10, 4, 0

(c) VCG Prices
Slot   Price   Advertiser   Payoffs for slots a, b, c
a      40      x            30, 24, 0
b      4       y            20, 20, 0
c      0       z            -30, 0, 0

Fig. 5. An example of a set of advertisers and slots (adapted from [24]).

The example in Figure 5(a) illustrates a further complication of the GSP mechanism. In particular, GSP may admit multiple equilibria, some of which produce socially suboptimal assignments of advertisers to slots. First, suppose that advertiser x bids 5, advertiser y bids 4, and advertiser z bids 2. It is not difficult to see that this set of bids forms a Nash equilibrium, since no bidder has an incentive to overbid or underbid. This equilibrium produces a socially optimal allocation of advertisers to slots, since x gets slot a, y gets b, and z gets c. Now, consider the set of bids where x bids 3, y bids 5, and z bids 1. Again, it is not hard to verify that this set of bids is a Nash equilibrium. Note that, as opposed to before, this equilibrium is not socially optimal, since advertiser x is assigned slot b and y is assigned slot a.

The multiplicity of equilibria has significant implications for the search engine's revenue, since revenue depends on which equilibrium is actually played in the game. To better illustrate this, consider the two equilibria that we discussed in the example of Figure 5(a). In the former, the 10 clicks of slot a are sold for 4 per click, and the 4 clicks of slot b are sold for 2 per click, for a total revenue to the search engine of 48.
In the latter, the 10 clicks of slot a are sold for 3 per click, and the 4 clicks of slot b are sold for 1 per click, for a total revenue of 34. Different equilibria thus yield different revenues to the search engine.

It is instructive to see how these revenues compare to the revenue generated by the VCG mechanism. To determine the VCG prices, recall that we first need to find the socially optimal assignment of advertisers to slots; this is achieved by assigning slot a to advertiser x, and slot b to advertiser y. Then, we compute the prices for x and y by determining the harm (externality) that each advertiser causes to all others. With little effort, one can see that the price x must pay is 40 for the full set of clicks of the first slot, and the price y must pay is 4 for the full set of clicks of the second slot. Thus, the total revenue collected by the search engine is equal to 44. Figure 5(c) shows the preferred-seller graph and the corresponding perfect matching for these VCG prices. Interestingly, the VCG revenue is lower than the revenue of the first GSP equilibrium, but higher than that of the second one.
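The equilibrium and revenue claims for the Figure 5 example can be checked numerically. The sketch below is a sanity check rather than a proof: it tests unilateral deviations on a grid of non-integer bids, which suffices here because an advertiser's payoff depends only on the rank its bid achieves and all other bids are integers. The helper names (`gsp`, `payoff`, `revenue`, `is_nash`) and the grid are assumptions made for illustration.

```python
CTRS = [10, 4]                       # clickthrough rates of slots a and b
VALUES = {"x": 7, "y": 6, "z": 1}    # true values per click


def gsp(bids):
    """Map each advertiser to (clickthrough rate, price per click) under GSP."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    result = {}
    for pos, adv in enumerate(ranked):
        ctr = CTRS[pos] if pos < len(CTRS) else 0          # unranked = fictitious slot
        price = bids[ranked[pos + 1]] if pos + 1 < len(ranked) else 0
        result[adv] = (ctr, price)
    return result


def payoff(adv, bids):
    ctr, price = gsp(bids)[adv]
    return ctr * (VALUES[adv] - price)


def revenue(bids):
    return sum(ctr * price for ctr, price in gsp(bids).values())


# Deviation bids 0.25, 0.75, ..., 9.75: one point in every gap between integer bids.
GRID = [0.25 + 0.5 * k for k in range(20)]


def is_nash(bids):
    """True if no advertiser gains by unilaterally switching to a bid in GRID."""
    for adv in bids:
        current = payoff(adv, bids)
        if any(payoff(adv, {**bids, adv: alt}) > current for alt in GRID):
            return False
    return True


TRUTHFUL = {"x": 7, "y": 6, "z": 1}
FIRST = {"x": 5, "y": 4, "z": 2}
SECOND = {"x": 3, "y": 5, "z": 1}

print(is_nash(TRUTHFUL))                 # False: x gains by bidding below y
print(is_nash(FIRST), revenue(FIRST))    # True 48
print(is_nash(SECOND), revenue(SECOND))  # True 34
```

Both equilibrium revenues can then be compared with the VCG revenue of 44 stated in the text.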
In other words, whether GSP or VCG generates more revenue for the search engine depends on which equilibrium is actually reached under the GSP procedure.

5.5 GSP and Matching Markets

The previous discussion demonstrated some of the complex characteristics of the GSP mechanism. However, there exists an interesting connection between GSP and market-clearing prices. Since the VCG prices are market-clearing prices as well, this connection also holds between GSP and VCG. Notably, from a set of market-clearing prices for the matching market of advertisers and slots, we can always construct a set of bids in Nash equilibrium for the GSP procedure; moreover, this equilibrium generates a socially optimal assignment of advertisers to slots. Since there always exists a set of market-clearing prices for any matching market (see Section 5.1), this connection implies that there always exists a set of socially optimal equilibrium bids for the GSP auction.

To see how this is possible, assume first without loss of generality (see footnote 38) that we have an advertiser set labeled $1, 2, \ldots, n$ in decreasing order of their valuations per click, and a slot set labeled $1, 2, \ldots, n$ in decreasing order of their clickthrough rates, both of equal size n. Consider any set of market-clearing prices $p_1, \ldots, p_n$. Note that by $p_i$ we denote the (cumulative) price for slot i for the full set of clicks it receives. Since any set of market-clearing prices yields a socially optimal assignment of advertisers to slots, slot i will be assigned to advertiser i. Given this price set, we will construct a set of bids in the GSP auction that produces this same set of market-clearing prices, together with the same socially optimal matching of advertisers to slots. Then we will show that the set of bids forms a Nash equilibrium. First, we show how to construct a set of bids for the GSP auction.
Consider the price per click for slot j; this must be equal to the price for the full set of clicks divided by the clickthrough rate $r_j$ of slot j, i.e., $p_j^* = p_j / r_j$. Now, we argue that $p_1^* \ge p_2^* \ge \cdots \ge p_n^*$. Indeed, consider two slots j and k, where j < k. We will show that $p_j^* \ge p_k^*$. Since prices are market-clearing, advertiser k prefers slot k to slot j. In slot k, advertiser k's total payoff is its payoff per click times the clickthrough rate of slot k, i.e., $(v_k - p_k^*)\, r_k$; on the other hand, advertiser k's payoff in slot j would be $(v_k - p_j^*)\, r_j$. Since slot k is preferred to slot j, we have $(v_k - p_k^*)\, r_k \ge (v_k - p_j^*)\, r_j$. But $r_j \ge r_k$, as slots are labeled in decreasing order of their clickthrough rate and j < k. Thus, $v_k - p_k^* \ge v_k - p_j^*$, or, equivalently, $p_j^* \ge p_k^*$, which is what we wanted to show.

Now that we have decreasing prices per click, we can construct the set of bids for the GSP auction in the following manner. For i > 1, we have advertiser i place a bid equal to $p_{i-1}^*$, and we have advertiser 1 bid any amount greater than $p_1^*$. With these bids, and given that the prices per click are decreasing in slot position, we have that for each i, advertiser i is assigned to slot i (socially optimal matching), and is charged a price per click of $p_i^*$ (see footnote 39).

Next, we show why this set of bids forms a Nash equilibrium, by proving that no advertiser has an incentive to overbid or underbid. Consider advertiser j in slot j. If it were to raise its bid, then it would either remain in the same position, or it would get a higher slot, say i, with i < j. In the latter case, j would push advertiser i one slot down, and would need to pay advertiser i's current bid. Note that this is at least as large as what advertiser i was paying before for slot i, namely, the bid of advertiser i+1. So, j would get slot i at a price at least as high as the current price of slot i. Since prices are market-clearing, advertiser j either prefers slot j to slot i at these prices or is indifferent between the two slots; this means that j's payoff from getting slot j at its current price is at least as high as its payoff from getting slot i at its current price. But we just showed that advertiser j would need to pay at least slot i's current price to acquire it; its payoff will then be no higher than its payoff at the current price of slot i, and, thus, no higher than the payoff it enjoys from receiving slot j at its current price. Hence, advertiser j does not want to overbid.

On the other hand, if advertiser j were to lower its bid, either it would get slot j as before, or it would get a lower slot, say k, with k > j. In the latter case, j would need to bid just under advertiser k, and would pay an amount of money equal to the price that advertiser k is currently paying. But since prices are market-clearing, j would get at most the same payoff from slot k as it gets from slot j. As a result, it does not have an incentive to underbid.

Footnote 38: If the number of advertisers is higher than the number of slots, we can introduce additional fictitious slots to equalize the two numbers. If the number of advertisers is lower, we can ignore the superfluous slots.
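The bid construction can be sketched as follows, assuming cumulative market-clearing prices for the slots with positive clickthrough rate and one more advertiser than such slots (the last advertiser ends up in the fictitious slot). The function name, the `top_margin` parameter, and the use of the Figure 5 VCG prices (40 and 4) as the market-clearing input are choices made for illustration.

```python
def bids_from_market_clearing(cumulative_prices, ctrs, top_margin=1.0):
    """Construct GSP bids that reproduce a set of market-clearing prices.

    cumulative_prices[i]: price of slot i for its full set of clicks.
    ctrs[i]: clickthrough rate of slot i (decreasing, all positive).
    top_margin: how far above the top per-click price advertiser 1 bids
                (any positive amount works; hypothetical parameter).
    Returns (per_click_prices, bids), with bids listed for advertisers 1, 2, ...
    """
    per_click = [p / r for p, r in zip(cumulative_prices, ctrs)]
    # The construction relies on per-click prices being non-increasing in slot position.
    if any(a < b for a, b in zip(per_click, per_click[1:])):
        raise ValueError("per-click prices must be non-increasing in slot position")
    # Advertiser 1 bids above the top per-click price; advertiser i > 1 bids the
    # per-click price of slot i - 1.
    bids = [per_click[0] + top_margin] + per_click
    return per_click, bids


ctrs = [10, 4]
prices = [40, 4]  # VCG prices of slots a and b in the Figure 5 example
per_click, bids = bids_from_market_clearing(prices, ctrs)

# Under GSP, the advertiser in slot i pays the bid of the advertiser just below it.
gsp_charges = [ctrs[i] * bids[i + 1] for i in range(len(ctrs))]
print(per_click)    # [4.0, 1.0]
print(bids)         # [5.0, 4.0, 1.0]
print(gsp_charges)  # [40.0, 4.0]
```

The resulting bids (5, 4, 1) charge exactly the market-clearing prices, so the GSP revenue in this equilibrium coincides with the VCG revenue of 44.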
Summarizing, there always exists a set of bids that forms a socially optimal Nash equilibrium for the GSP procedure. We can construct such a set of bids from any set of market-clearing prices in the corresponding matching market.

Footnote 39: Under the GSP protocol, bidder n pays a price per click equal to 0; on the other hand, our initial set of market-clearing prices assumes for bidder n a price per click of $p_n^* = p_n / r_n$. This is, however, not a problem: in a matching market, we can always drop the price of the lowest seller to 0. Indeed, we can do this by simply reducing all seller prices by an amount equal to the lowest price. This does not change the edges in the preferred-seller graph.

6 Analysis of the GSP Procedure

In this section, we provide a formal analysis of the GSP auction using concepts and principles from game theory. In this direction, we have closely followed Lahaie's work [48][49][50] for its clear and structured exposition. In general, auction analysis depends not only on the ranking and pricing schemes, but also on the underlying assumptions about the game structure and the agents' knowledge. We account for these assumptions by designing proper models of the auction game. In order to predict the outcome of the auction, we utilize different equilibrium concepts that correspond to stable states where no agent benefits from changing its strategy.
6.1 Model

There are k positions to be allocated among $n \ge k$ bidders. We consider a separable clickthrough rate for advertiser i in slot j of the form $a_i c_j$, where $a_i \in [0,1]$ is the advertiser-specific visibility factor (or relevance), and $c_j \in [0,1]$ the position-specific factor. We further assume that slots in higher positions receive more clicks than slots in lower positions, i.e., $c_1 > \cdots > c_k$, and let $c_j = 0$ for j > k. We can interpret $c_j$ as the probability that an ad in position j will be noticed, and $a_i$ as the probability that it will receive a click when noticed.

Bidder (advertiser) i has a valuation $v_i$ per click. We assume that bidders have quasi-linear utilities, so that the utility per click at a price p is $v_i - p$, and the utility for all clicks at price p is $a_i c_j (v_i - p)$. The bidder's valuation is private; neither the auctioneer (search engine) nor the other bidders can observe it.

Bidder i has a weight $w_i$ that depends on its identity but not on the advertisers' valuations or bids. If agent i bids $b_i$, its reported score, or simply its score, will be $s_i = w_i b_i$; its true score, however, is $w_i v_i$. Obviously, an agent may not bid its true valuation if this choice does not maximize its utility. Bidders are ranked by score, so that the agent with the highest score gains the highest position, and so on.

6.2 Dominant Strategy

We have already demonstrated in Section 5.4 that truthful reporting is not a dominant-strategy equilibrium in the GSP procedure. We may then ask whether there exists a payment rule that, together with a given weighting scheme, makes it a dominant strategy for the bidders to report truthfully. To answer this question, we will utilize several results from the theory of auctions and, in particular, mechanism design [60][47]. We define agent i's value to be its type, and we denote it as $t_i$.
Each agent has a value function parameterized by its type, which gives its total utility derived from a position before any payments are issued:

$$v_i(j; t_i) = a_i c_j t_i.$$

Agent i's utility for position j at a cumulative price of q (price per click times clickthrough rate), parameterized by its type, is again quasi-linear:

$$u_i(j, q; t_i) = v_i(j; t_i) - q = a_i c_j t_i - q.$$

Let $z_i(b)$ be the position assigned to i when the vector of bids is $b = (b_i)_{1 \le i \le n}$, and similarly let $p_i(b)$ be agent i's total payment. Holmstrom's Lemma [39] describes the form of the payment rule p for an allocation rule z in order for truthful reporting to be a dominant strategy. It assumes individual rationality for all bidders, i.e., a bidder with value 0 per click does not pay anything, and states that there is a unique candidate payment rule that achieves dominant-strategy incentive compatibility for a given allocation rule. Let $V_i(t_i; b_{-i})$ be agent i's complete-information maximum utility when the others are bidding $b_{-i}$ and its own value per click is $t_i$:

$$V_i(t_i; b_{-i}) = \max_{b_i} \left[ v_i\big(z_i(b_i, b_{-i}); t_i\big) - p_i(b_i, b_{-i}) \right].$$

Then Holmstrom's Lemma can be stated as follows [60]:

Holmstrom's Lemma. Suppose $V_i$ is continuously differentiable in type. If truthful reporting is a dominant strategy for agent i, then its payment must satisfy the following equation:

$$p_i(t_i, b_{-i}) = -V_i(0; b_{-i}) + v_i\big(z_i(t_i, b_{-i}); t_i\big) - \int_0^{t_i} \frac{\partial v_i}{\partial t}\big(z_i(s, b_{-i}); s\big)\, ds. \qquad (1)$$

Proof: See Appendix.

Lahaie [49] applies Holmstrom's Lemma by splitting the interval of integration into $I_j = \left[ \frac{w_{j+1}}{w_i} b_{j+1},\ \frac{w_j}{w_i} b_j \right]$, $i \le j \le n$, where bidders are indexed in decreasing order of score and $b_{n+1} = 0$. The idea is that if bidder i's bid belongs to $I_j$, then it wins position j, since it has the j-th highest score. Then, for $s \in I_j$, we have $z_i(s, b_{-i}) = j$, thus $v_i\big(z_i(s, b_{-i}); s\big) = a_i c_j s$, and hence

$$\frac{\partial v_i}{\partial t}\big(z_i(s, b_{-i}); s\big) = a_i c_j.$$

Note also that $v_i\big(z_i(t_i, b_{-i}); t_i\big) = a_i c_i t_i$, and $V_i(0; b_{-i}) = 0$ (individual rationality). Thus, with $c_{n+1} = 0$, (1) becomes

$$p_i(t_i, b_{-i}) = a_i c_i t_i - \sum_{j=i}^{n} \int_{I_j} a_i c_j\, ds = a_i \sum_{j=i}^{n} (c_j - c_{j+1})\, \frac{w_{j+1}}{w_i}\, b_{j+1}. \qquad (2)$$

This is the total payment charged; to find the payment per click, we just need to divide by the clickthrough rate $a_i c_i$, thus getting:

$$\sum_{j=i}^{n} \frac{c_j - c_{j+1}}{c_i}\, \frac{w_{j+1}}{w_i}\, b_{j+1}. \qquad (3)$$

It is worth examining the weight $w_i$ that bidder i is associated with. The case where $w_i = 1$ corresponds to the rank by bid (RBB) allocation rule that assigns slots in order of bids. The case where $w_i = a_i$, on the other hand, corresponds to the rank by revenue (RBR) allocation rule that allocates slots according to the bidders' revenues. Thus, RBR maximizes social welfare under truth-telling, and achieves an efficient allocation. Since, by the Green-Laffont theorem [36], the VCG mechanism is the unique mechanism that is efficient, truthful, and individually rational, the RBR rule and payment rule (3) constitute exactly the VCG mechanism. Indeed, in the VCG mechanism, a bidder pays an amount equal to the externality that it imposes on others. This implies that agent i's payment will be the added utility that agents in positions i+1 and lower would receive if i were not present. But this is exactly what formula (3) describes.
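As a numerical illustration, the sketch below evaluates the truthful per-click payment, assuming it takes the form $\sum_{j \ge i} \frac{c_j - c_{j+1}}{c_i} \frac{w_{j+1}}{w_i} b_{j+1}$ for bidders indexed in decreasing order of score. The function name and the 0-based indexing are choices made here. With unit weights and the Figure 5 data (clickthrough rates 10 and 4 used as position factors, truthful bids 7, 6, 1), the result should match the VCG prices of 40 and 4 for the full sets of clicks.

```python
def truthful_price_per_click(i, bids, weights, pos_factors):
    """Per-click payment of the bidder in position i (0-based) under the truthful rule.

    bids, weights: listed in decreasing order of score w * b.
    pos_factors: position factors c_j, decreasing; missing positions count as 0.
    """
    n = len(bids)
    c = list(pos_factors) + [0] * (n + 1 - len(pos_factors))  # c_j = 0 beyond last slot
    b = list(bids) + [0]                                      # b_{n+1} = 0
    w = list(weights) + [1]                                   # weight of the dummy bid
    if c[i] == 0:
        return 0.0  # a bidder without a real slot pays nothing
    total = sum((c[j] - c[j + 1]) * w[j + 1] * b[j + 1] for j in range(i, n))
    return total / (c[i] * w[i])


bids = [7, 6, 1]        # truthful bids of x, y, z
weights = [1, 1, 1]     # unit weights (with a_i = 1, RBB and RBR coincide)
pos_factors = [10, 4]   # clickthrough rates of slots a and b

per_click = [truthful_price_per_click(i, bids, weights, pos_factors) for i in range(3)]
totals = [p * c for p, c in zip(per_click, pos_factors + [0])]
print(per_click)  # [4.0, 1.0, 0.0]
print(totals)     # [40.0, 4.0, 0.0]
```

Dividing once at the end, rather than term by term, keeps the arithmetic exact for integer inputs like these.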
6.3 Nash Equilibrium

In the previous section, we discussed what form the payments must have in the setting where bidders are ranked by their scores, if we want to ensure that truthful reporting forms an equilibrium in dominant strategies. On the other hand, the GSP procedure does not follow the payments we derived; instead, it makes bidder i pay the minimum amount per click needed to retain its position, i.e., $\frac{w_{i+1}}{w_i} b_{i+1}$. As we have previously shown, truth-telling is not, in general, an equilibrium for these payments. In this section, we will attempt to characterize the equilibria that the GSP auction yields.

First, note that we can think of sponsored search auctions as continuous-time or infinitely repeated games in which bidders originally have private information about their types, and in the process can adjust their bids repeatedly. Such games may have a large set of equilibria and are prohibitively complex to analyze. Instead, Edelman, Ostrovsky, and Schwarz suggest in [26] that we can focus on simple strategies, and then study the rest points of the bidding process: if the vector of bids stabilizes, at what bids does it stabilize? If the process stabilizes, then the result can be modeled as a Nash equilibrium in pure strategies of the static one-shot game of complete information, since each bidder will be playing a best response to all other bidders.

Without loss of generality, let's assume that there are as many slots as bidders, i.e., n = k. Now, consider the bids $(b_1, \ldots, b_n)$ such that

$$w_1 b_1 \ge w_2 b_2 \ge \cdots \ge w_n b_n. \qquad (4)$$

We further assume that $b_{n+1} = 0$. The term $w_i b_i$ corresponds to the declared revenue per click of bidder i, as opposed to $w_i v_i$, which represents the true revenue per click. According to the GSP procedure, bidder i will get assigned to slot i.
In order for the bid vector $(b_1, \ldots, b_n)$ to be a pure Nash equilibrium, each agent must prefer its own position to every other position given the bids, so the following set of inequalities must hold:

$$c_i (w_i v_i - w_{i+1} b_{i+1}) \ge c_j (w_i v_i - w_{j+1} b_{j+1}) \quad \text{for } j > i, \qquad (5)$$

$$c_i (w_i v_i - w_{i+1} b_{i+1}) \ge c_j (w_i v_i - w_j b_j) \quad \text{for } j < i. \qquad (6)$$

In the above system, $c_i (w_i v_i - w_{i+1} b_{i+1})$ can be interpreted as the total utility of the agent in position i (up to the positive factor $a_i / w_i$). To ensure individual rationality, we should also demand

$$c_i (w_i v_i - w_{i+1} b_{i+1}) \ge 0. \qquad (7)$$

In fact, this is implied by inequality (5) for j = n, since $b_{n+1} = 0$. Moreover, (4) and (7) imply $\frac{w_{i+1}}{w_i} b_{i+1} \le v_i$, i.e., the money per click that a bidder pays must not exceed its true revenue per click (individual rationality).

6.4 Symmetric Nash Equilibrium

Looking back at the set of inequalities (4)-(6), we note the following asymmetry. A bidder i who bids for a lower position j has to pay $\frac{w_{j+1}}{w_i} b_{j+1}$ per click to win that position; but a bidder i who bids for a higher position j has to pay $\frac{w_j}{w_i} b_j$ per click to win that position. This set of inequalities describes exactly what we call the Nash equilibrium of the game. Let's now focus on a particular type of Nash equilibrium, namely, equilibria in which bidders do not have an incentive to win a higher position j even if they have to pay $\frac{w_{j+1}}{w_i} b_{j+1}$ instead of $\frac{w_j}{w_i} b_j$:

$$c_i (w_i v_i - w_{i+1} b_{i+1}) \ge c_j (w_i v_i - w_{j+1} b_{j+1}) \quad \text{for all } i, j.$$

If we denote the true revenue of advertiser i as $r_i = w_i v_i$ and the declared revenue that sets the price of position i as $p_i = w_{i+1} b_{i+1}$, then the above system can be rewritten as:

$$c_i (r_i - p_i) \ge c_j (r_i - p_j) \quad \text{for all } i, j, \qquad (11)$$

$$r_i - p_i \ge 0 \quad \text{for all } i. \qquad (12)$$

The system (11)-(12) exactly describes this new type of equilibrium, namely, the symmetric equilibrium (see footnote 40). The sense in which equilibria that satisfy these inequalities are symmetric is that all bidders, when contemplating a bid for position k, expect to pay the same price $p_k$ for this position. If each bidder takes these prices as given and fixed, and picks the position that generates the largest payoff for it, then there will be exactly one bidder who wants to acquire each position, provided that ties are resolved correctly. Thus, the market for each slot clears: both demand and supply are equal to 1 (provided we resolve ties).

Interestingly, this set of prices corresponds to the set of market-clearing prices that we saw in Section 5.1. Indeed, the price vector $p = (p_1, \ldots, p_n)$ is a symmetric equilibrium if and only if $(p_1, \ldots, p_n)$ are market-clearing prices for the assignment problem that attempts to match bidders to slots. We call p the Walrasian equilibrium, or the competitive equilibrium, of the game. From Section 5.1, we know that such an equilibrium always exists and maximizes the social welfare. Moreover, it is easy to see that the symmetric equilibrium is a refinement of the Nash equilibrium [68].
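To illustrate the refinement, the sketch below tests the symmetric-equilibrium condition on the Figure 5 example, assuming the condition takes the form $c_i (r_i - p_i) \ge c_j (r_i - p_j)$ for all positions i and j, together with $r_i \ge p_i$, where $r_i$ is the true revenue per click of the bidder in position i and $p_j$ is the per-click price of position j (the bid of the bidder just below position j when weights equal 1). The function name and the unit weights are assumptions.

```python
def is_symmetric_equilibrium(revenues, prices, pos_factors):
    """Check c_i (r_i - p_i) >= c_j (r_i - p_j) for all i, j, and r_i >= p_i.

    revenues[i]: true revenue per click r_i of the bidder placed in position i.
    prices[j]: per-click price p_j of position j.
    pos_factors[j]: position factor c_j.
    """
    n = len(revenues)
    for i in range(n):
        if revenues[i] < prices[i]:
            return False  # individual rationality fails
        own = pos_factors[i] * (revenues[i] - prices[i])
        for j in range(n):
            # The bidder in position i must not prefer position j at price p_j.
            if own < pos_factors[j] * (revenues[i] - prices[j]):
                return False
    return True


pos_factors = [10, 4, 0]  # slots a, b and the fictitious slot c
revenues = [7, 6, 1]      # x, y, z in positions 1, 2, 3 (unit weights)

# Bids (5, 4, 1): position prices are the bids of the bidders just below.
print(is_symmetric_equilibrium(revenues, [4, 1, 0], pos_factors))  # True
# Bids (5, 4, 2): a Nash equilibrium of GSP, but y would envy slot a at price 4.
print(is_symmetric_equilibrium(revenues, [4, 2, 0], pos_factors))  # False
```

Under these assumptions, the bid profile (5, 4, 2) is a Nash equilibrium of the GSP game without being symmetric, which is consistent with the symmetric equilibrium being a strict refinement.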
To confirm the existence of a symmetric equilibrium, we consider the following linear program, where $r_i = w_i v_i$ and $x_{ij}$ indicates whether bidder i is assigned to position j:

$$\max \sum_{i=1}^{n} \sum_{j=1}^{n} c_j r_i x_{ij} \quad \text{s.t.} \quad \sum_{j} x_{ij} \le 1 \ \ \forall i, \qquad \sum_{i} x_{ij} \le 1 \ \ \forall j, \qquad x_{ij} \ge 0 \ \ \forall i, j.$$

Its dual program is:

$$\min \sum_{i=1}^{n} \pi_i + \sum_{j=1}^{n} q_j \quad \text{s.t.} \quad \pi_i + q_j \ge c_j r_i \ \ \forall i, j, \qquad \pi_i \ge 0 \ \ \forall i, \qquad q_j \ge 0 \ \ \forall j.$$

The above system corresponds to the linear programming formulation of a simple assignment problem which attempts to maximize the sum of the agents' revenues (scores), weighted by the position effects. According to a classic result by Shapley and Shubik [67], such programs always admit an integer optimal solution, which in this case explicitly describes the assignment of bidders to positions. If $x^*$ is an optimal assignment, and $(\pi^*, q^*)$ is an optimal dual solution, then by complementary slackness we have $\pi_i^* = c_j r_i - q_j^*$ whenever $x_{ij}^* = 1$. Combined with dual feasibility, this gives $c_j r_i - q_j^* \ge c_l r_i - q_l^*$ for every position l, so bidder i weakly prefers its assigned position j at the position prices $q^*$ (equivalently, at the per-click prices $p_j = q_j^* / c_j$). Assignment and prices thus constitute a symmetric equilibrium. Since the primal is obviously always feasible and bounded, the dual is as well, and there always exists a symmetric equilibrium.

Because the agents have unit-demand valuations, the set of dual solutions to the assignment problem forms a convex lattice [67]. The maximal and minimal elements of this lattice are the solutions that maximize and minimize the price component $\sum_j q_j$ of the objective, respectively. Concerning the minimal element (see, e.g., [68][50]), the payment per click of the bidder in position i is:

$$\frac{w_{i+1}}{w_i}\, b_{i+1} = \sum_{j=i}^{n} \frac{c_j - c_{j+1}}{c_i}\, \frac{w_{j+1}}{w_i}\, v_{j+1}. \qquad (13)$$

Note that payment (13) agrees exactly with payment (3) evaluated at truthful bids. This is, of course, not a coincidence. Recall our discussion in Section 5.3: the minimal market-clearing prices in the assignment problem coincide with the VCG payments. The symmetric equilibrium concept states that agent bids should constitute market-clearing prices. Hence, minimal symmetric equilibrium bids should coincide with VCG payments, or, in the case of a position auction, the weighted equivalents of VCG payments, which are given by Holmstrom's Lemma. As a result, minimal symmetric equilibria and truthful position auctions are revenue-equivalent.

Footnote 40: Edelman, Ostrovsky, and Schwarz [26] independently introduced this refinement and called it locally envy-free equilibrium, which requires that each player cannot improve its payoff by exchanging bids with the player above.

7 Conclusion

Sponsored search is nowadays a thriving market, which has entirely redefined the way advertisers interact with consumers to promote their product or service.
In little more than a decade, sponsored search has evolved into a multibillion-dollar industry, and major search engines enjoy a market capitalization that far exceeds that of long-established companies in other industries. There is no reason to believe that the upward growth trend will not continue over the next years. On the contrary, the dramatic technological advances that are currently taking place will keep shaping the search industry, and will turn out to be major driving forces for change and improvement.
In particular, the proliferation of mobile devices with embedded GPS functionality has created a market for location-based advertising with great potential. Mobile users searching for products or services are usually a much desired advertising target because the advertising material they receive can have a great impact on their decision. For instance, a mobile user who enters the query "pizza restaurant" is probably interested in having a pizza meal in a nearby restaurant, and their decision will very likely be influenced by the sponsored ads that will be displayed on their device. The advances in communications have effectively created opportunities for more relevance and reach, and that could translate into bigger profits for both major search engines and advertisers. Surprisingly, large search engines treat mobile users in exactly the same way as other users. In our opinion, an interesting direction for the search engines would be to better tailor their business model to the needs and peculiarities of mobile search. Ensuring high revenues as consumers switch their search activity from the wired Web to mobile, or at least complement search across channels, is going to be one of the major business concerns for large search engines in the coming years.

Similarly, the explosion of social networking sites and social media forums such as blogs has created new opportunities for personalized ads. Users are characterized by profiles based on the blogs they read, the tweets they follow, the history of queries they pose to the search engine, the preferences and interests of the users they are connected to through social networking sites, etc. It is a great challenge for search engines to exploit this information in order to better understand the users' needs, and match users with relevant ads.
Under the current paradigm, advertisers target consumers based on keyword queries that reveal certain intentions; a major shift would be for advertisers to target consumers based on their profiles. This is already happening with social networks such as Facebook that acquire significant revenues from personalized online advertising. Privacy concerns and the fear of being viewed as intrusive complicate this issue, and so it is unclear how far this new model will go. A simple solution would be for search engines to set up an opt-in mode of operation for interested users.

Note that the opportunities that arise from the structure of Web 2.0 do not stop here. There is accumulating evidence that social networking sites, blogs, product review sites, and other kinds of sites promoting wisdom of the crowds and social commerce play a key role in influencing the level of buzz about firms, brands, and products in the online world [21]. The levels of buzz and trends can in turn have a great impact on the keyword demand from advertisers. Indeed, the volume of content about keywords that is being created by users in various social media and social commerce forums is an indication of demand for those keywords and should translate into a direct economic value for both the search engine and the advertisers. Although modern search engines provide advertisers with keyword monitoring tools for hot keywords and keyword trends, mining the vast and diverse content of Web 2.0 will create even more opportunities for successful advertising.

Despite the great opportunities that lie ahead, there are still complex open problems for the sponsored search industry. Here, we emphasize only some of them. One of the greatest challenges is the task of parameter estimation. Given a log of search and ad traffic over a significant period of time, the question is how to design and validate efficient learning methods for estimating the relevant model parameters. In this direction, it would help to identify models that fit these data. In practice, this is a very hard problem since user behavior is affected by many factors including human physiology and psychology (e.g., what part of the screen does a user first look at?), which are very hard to model.

Another important line of future research deals with sponsored search auction design. The currently employed GSP protocol exhibits several disturbing pathologies, which can lead to inefficiencies and lower revenues. The VCG procedure is not affected by these problems, but it has not been adopted as the standard auction format by the search engine industry. Even the VCG auction, however, may not be sufficiently expressive for classes of advertisers that have budget constraints, non-linear valuations, etc. In this direction, new auction designs can be explored. Moreover, grand simulation platforms that can generate search traffic, ad inventories, ad clicks, and market specifics at the Internet scale could help both the search engine industry and the academic world develop intuition into the world of sponsored search auctions and the associated dynamics [29]. For instance, such platforms could help identify grand, unified models that capture the complex and subtle relationships between all three parties in paid search. On the other hand, we speculate that because of the hundreds of millions of auctions occurring on a daily basis, major search companies have the opportunity to experiment with very different auction designs, do empirical studies, and select the ones that are most suitable to them.

References

[1] Abrams, Z., Mendelevitch, O., Tomlin, J. Optimal delivery of sponsored search advertisements subject to budget constraints. ACM Electronic Commerce, 2007.
[2] Agarwal, A., Hosanagar, K., Smith, M.D. Location, location, location: An analysis of profitability of position in online advertising markets. Journal of Marketing Research, forthcoming, available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1151537.
[3] Aggarwal, G., Feldman, J., Muthukrishnan, S. Bidding to the top: VCG and equilibria of position-based auctions. In Proc. Workshop on Approximation and Online Algorithms, 2006.
[4] Aggarwal, G., Muthukrishnan, S., Pal, D., Pal, M. General auction mechanism for search advertising. ACM WWW, 2009.
[5] Aggarwal, G., Feldman, J., Muthukrishnan, S., Pal, M. Sponsored search auctions with Markovian users. In Proc. Workshop on Internet and Network Economics, 2008.
[6] Aggarwal, G., Goel, A., Motwani, R. Truthful auctions for pricing search keywords. ACM Electronic Commerce, 2006.
[7] Asdemir, K. Bidding patterns in search engine auctions. In Proc. 2nd Workshop on Sponsored Search Auctions, ACM Electronic Commerce, 2006.
[8] Barron, E. N. Game theory: An introduction. Wiley-Interscience, 2008.
[9] Bendersky, M., Gabrilovich, E., Josifovski, V., Metzler, D. The anatomy of an ad: structured indexing and retrieval for sponsored search. ACM WWW, 2010.
[10] Bhargava, H.K., Feng, J. The impact of sponsored results on the quality of information gatekeepers. International Conference on Electronic Commerce, 2007.
[11] Borgs, C., Chayes, J., Etesami, O., Immorlica, N., Jain, K., Mahdian, M. Dynamics of bid optimization in online advertisement auctions. ACM WWW, 2007.
[12] Borgs, C., Chayes, J., Immorlica, N., Mahdian, M., Saberi, A. Multi-unit auctions with budget-constrained bidders. ACM Electronic Commerce, 2005.
[13] Brandt, F., Weiss, G. Antisocial agents and Vickrey auctions. Intelligent Agents VIII, 2333:335-347, 2002.
[14] Cary, M., Das, A., Edelman, B., Giotis, I., Heimerl, K., Karlin, A.R., Mathieu, C., Schwarz, M. Greedy bidding strategies for keyword auctions. ACM Electronic Commerce, 2007.
[15] Chiou, L., Tucker, C. How does the use of trademarks by third-party sellers affect online search? NET Institute Working Paper, available at http://ssrn.com/abstract=1686438.
[16] Choi, Y., Fontoura, M., Gabrilovich, E., Josifovski, V., Mediano, M., Pang, B. Using landing pages for sponsored search ad selection. ACM WWW, 2010.
[17] Clarke, E.H. Multipart pricing of public goods. Public Choice, 11(1):17-33, 1971.
[18] Craswell, N., Zoeter, O., Taylor, M., Ramsey, B. An experimental comparison of click position-bias models. ACM Web Search and Data Mining, 2008.
[19] Demange, G. Strategyproofness in the assignment market game. Laboratoire d'Econometrie de l'Ecole Polytechnique, 1982.
[20] Demange, G., Gale, D., Sotomayor, M. Multi-item auctions. Journal of Political Economy, 94(4):863-872, 1986.
[21] Dhar, V., Ghose, A. Sponsored search and market efficiency. INFORMS J. on Information Systems Research, 21(4):760-772, 2010.
[22] Dominowska, E., Richardson, M., Ragno, R. Predicting clicks: Estimating the click-through rates for new ads. ACM WWW, 2007.
[23] Duetting, P., Henzinger, M., Weber, I. An expressive mechanism for auctions on the web. ACM WWW, 2011.
[24] Easley, D., Kleinberg, J. Networks, crowds, and markets: Reasoning about a highly connected world. Cambridge University Press, 2010.
[25] Edelman, B., Ostrovsky, M. Strategic bidder behavior in sponsored search auctions. Decision Support Systems, 43(1):192-198, 2007.
[26] Edelman, B., Ostrovsky, M., Schwarz, M. Internet advertising and the generalized second price auction: Selling billions of dollars worth of keywords. American Economic Review, 97(1):242-259, 2007.
[27] Edelman, B., Gilchrist, D.S. Sponsored links or Advertisements?: Measuring labeling alternatives in Internet search engines. Harvard Business School Working Paper, Number 11-048, available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1706121.
[28] Even-Dar, E., Kearns, M., Wortman, J. Sponsored search with contexts. In Proc. Workshop on Internet and Network Economics, 2007.
[29] Feldman, J., Muthukrishnan, S. Algorithmic methods for sponsored search advertising. In Performance Modeling and Engineering, pp. 91-124, 2008.
[30] Feldman, J., Muthukrishnan, S., Pal, M., Stein, C. Budget optimization in search-based advertising auctions. ACM Electronic Commerce, 2007.
[31] Feng, J., Bhargava, H., Pennock, D. Implementing sponsored search in web search engines: Computational evaluation of alternative mechanisms. INFORMS J. on Computing, 19(1):137-148, 2007.
[32] Ganchev, K., Kulesza, A., Tan, J., Gabbard, R., Liu, Q., Kearns, M. Empirical price modeling for sponsored search. In Proc. 3rd Workshop on Sponsored Search Auctions, ACM Electronic Commerce, 2006.
[33] Ghose, A., Yang, S. An empirical analysis of search engine advertising: Sponsored search in electronic markets. Management Science, 55(10):1605-1620, 2009.
[34] Ghosh, A., Sayedi, A. Expressive auctions for externalities in online advertising. ACM WWW, 2010.
[35] Goel, G., Mehta, A. Online budgeted matching in random input models with applications to AdWords. SODA, 2008.
[36] Green, J., Laffont, J.-J. Characterization of satisfactory mechanisms for the revelation of preferences for public goods. Econometrica, 45:427-438, 1977.
[37] Groves, T. Incentives in teams. Econometrica, 41:617-631, 1973.
[38] Hillard, D., Schroedl, S., Manavoglu, E., Raghavan, H., Leggetter, C. Improving ad relevance in sponsored search. ACM Web Search and Data Mining, 2010.
[39] Holmstrom, B. Groves schemes on restricted domains. Econometrica, 47(5):1137-1144, 1979.
[40] Jansen, B.J. Click fraud. Computer, 40(7):85-86, 2007.
[41] Jansen, B., Mullen, T. Sponsored search: An overview of the concept, history, and technology. Int. J. Electronic Business, 6(2):114-131, 2008.
[42] Jeziorski, P., Segal, I. What makes them click: Empirical analysis of consumer demand for search advertising. Working paper, available at http://www.stanford.edu/~isegal/ads.pdf.
[43] Joachims, T., Granka, L., Pan, B., Hembrooke, H., Radlinski, F., Gay, G. Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Transactions on Information Systems, 25(2), 2007.
[44] Kempe, D., Mahdian, M. A cascade model for externalities in sponsored search. In Proc. 4th International Workshop on Internet and Network Economics, 2008.
[45] Kitts, B., Laxminarayan, P., LeBlanc, B., Meech, R. A formal analysis of search auctions including predictions on click fraud and bidding tactics. In Proc. 1st Workshop on Sponsored Search Auctions, ACM Electronic Commerce, 2005.
[46] Koenig, A.C., Church, K., Markov, M. A data structure for sponsored search. ICDE, 2009.
[47] Krishna, V. Auction theory. Academic Press, 2002.
[48] Lahaie, S. An analysis of alternative slot auction designs for sponsored search. ACM Electronic Commerce, 2006.
[49] Lahaie, S. An overview of some basic properties of sponsored search auctions. 2008, available at http://www.cs.columbia.edu/coms6998-3/ssresults.pdf.
[50] Lahaie, S., Pennock, D.M. Revenue analysis of a family of ranking rules for keyword auctions. ACM Electronic Commerce, 2006.
[51] Lovasz, L., Plummer, M. Matching theory. North-Holland, 1986.
[52] Leonard, H.B. Elicitation of honest preferences for the assignment of individuals to positions. Journal of Political Economy, 91(3):461-479, 1983.
[53] Levene, M. An Introduction to Search Engines and Navigation. Addison Wesley, 2010.
[54] Mahdian, M., Immorlica, N., Jain, K., Talwar, K. Click fraud resistant methods for learning click-through rates. In Proc. Workshop on Internet and Network Economics, 2005.
[55] Mahdian, M., Nazerzadeh, H., Saberi, A. Allocating online advertisement space with unreliable estimates. ACM Electronic Commerce, 2007.
[56] Mahdian, M., Saberi, A. Multi-unit auctions with unknown supply. ACM Electronic Commerce, 2006.
[57] Martin, D.J., Gehrke, J., Halpern, J.Y. Toward expressive and scalable sponsored search auctions. ICDE, 2008.
[58] Meek, C., Chickering, D.M., Wilson, D.B. Stochastic and contingent-payment auctions. In Proc. 1st Workshop on Sponsored Search Auctions, ACM Electronic Commerce, 2005.
[59] Mehta, A., Saberi, A., Vazirani, U., Vazirani, V. AdWords and generalized online matching. FOCS, 2005.
[60] Milgrom, P. Putting Auction Theory to Work. Cambridge University Press, 2004.
[61] Milgrom, P., Segal, I. Envelope theorems for arbitrary choice sets. Econometrica, 70(2):583-601, 2002.
[62] Muthukrishnan, S., Pal, M., Svitkina, Z. Stochastic models for budget optimization in search-based advertising. In Proc. Workshop on Internet and Network Economics, 2007.
[63] Ostrovsky, M., Schwarz, M. Reserve prices in Internet advertising auctions: A field experiment. ACM Electronic Commerce, 2011.
[64] Reiley, D., Li, S., Lewis, R. Northern exposure: A field experiment measuring externalities between search advertisements. ACM Electronic Commerce, 2010.
[65] Rusmevichientong, P., Williamson, D. An adaptive algorithm for selecting profitable keywords for search based advertising services. ACM Electronic Commerce, 2006.
[66] Rutz, O., Bucklin, R.E. From generic to branded: A model of spillover in paid search advertising. Journal of Marketing Research, 48(1):87-102, 2011.
[67] Shapley, L.S., Shubik, M. The assignment game I: The core. International Journal of Game Theory, 1:111-130, 1972.
[68] Varian, H. Position auctions. International Journal of Industrial Organization, 25(6):1163-1178, 2007.
[69] Vickrey, W. Counterspeculation, auctions, and competitive sealed tenders. Journal of Finance, 16(1):8-37, 1961.
[70] Wortman, J., Vorobeychik, Y., Li, L., Langford, J. Maintaining equilibria during exploration in sponsored search auctions. In Proc. Workshop on Internet and Network Economics, 2007.
[71] Yang, S., Ghose, A. Analyzing the relationship between organic and sponsored search advertising: Positive, negative, or zero interdependence? INFORMS J. on Marketing Science, 29(4):602-623, 2010.
[72] http://advertise.bingads.microsoft.com/en-us/home
[73] http://adwords.google.com/support/aw/?hl=en
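APPENDIX

Proof of Holmstrom's Lemma

Recall that bidder $i$'s utility given its bid $b_i$ and the other agents' bids $b_{-i}$ is equal to

$$u_i(b_i, b_{-i}; t_i) = t_i \, \theta_{z_i(b)} - p_i(b), \qquad (1)$$

where $b = (b_i, b_{-i})$ is the bidding vector, $z_i(b)$ the allocation rule that assigns slot $z_i(b)$ to bidder $i$ according to the bidding vector, $\theta_{z_i(b)}$ the click-through rate of that slot, and $p_i(b)$ the pricing rule that determines the amount of money that agent $i$ needs to pay. Note that $i$'s utility is parameterized by its type $t_i$.

Suppose that the other bidders' bids $b_{-i}$ are fixed. Bidder $i$ will submit the bid that maximizes its utility, i.e.,

$$b_i^*(t_i) \in \arg\max_{b_i} u_i(b_i, b_{-i}; t_i). \qquad (2)$$

Its maximum possible payoff will thus be:

$$V_i(t_i) = \max_{b_i} u_i(b_i, b_{-i}; t_i) = u_i(b_i^*(t_i), b_{-i}; t_i). \qquad (3)$$

Note again that the maximum payoff $V_i$ is parameterized by agent $i$'s type $t_i$, while $b_{-i}$ is fixed. Equality (3) corresponds to a parameterized optimization problem [61], where the parameter is $i$'s type $t_i$. If truth-telling is a dominant strategy for agent $i$, then we must have that

$$b_i^*(t_i) = t_i \qquad (4)$$

irrespective of $b_{-i}$. To compute $V_i$, we resort to the envelope theorem by Milgrom and Segal [61]. Concretely, since $V_i$ is continuously differentiable:

$$V_i(t_i) = V_i(0) + \int_0^{t_i} \frac{\partial u_i}{\partial t_i}\big(b_i^*(s), b_{-i}; s\big)\, ds. \qquad (5)$$

Let us now explore the terms in (5). First, we have that

$$V_i(t_i) = t_i \, \theta_{z_i(t_i, b_{-i})} - p_i(t_i, b_{-i}), \quad \text{and in particular} \quad V_i(0) = -p_i(0, b_{-i}), \qquad (6)$$

because of (1), (3), and (4). For the second term we have:

$$\int_0^{t_i} \frac{\partial u_i}{\partial t_i}\big(b_i^*(s), b_{-i}; s\big)\, ds = \int_0^{t_i} \theta_{z_i(s, b_{-i})}\, ds, \qquad (7)$$

since $\frac{\partial u_i}{\partial t_i}(b_i, b_{-i}; t_i) = \theta_{z_i(b)}$ by (1), and $b_i^*(s) = s$ by (4). Putting (5), (6), (7) together, we get

$$p_i(t_i, b_{-i}) = p_i(0, b_{-i}) + t_i \, \theta_{z_i(t_i, b_{-i})} - \int_0^{t_i} \theta_{z_i(s, b_{-i})}\, ds. \qquad (8)$$

This completes the proof.⁴¹

⁴¹ In fact, Lemma 1 says something even stronger: the payments in any two incentive compatible mechanisms with the same allocation rule are equivalent up to a constant. Indeed, note that the last two terms on the right side of (8) are fixed for a given allocation rule. The only remaining freedom is therefore in the first term on the right side, which is a constant with respect to $t_i$. Note that the payment rule that charges 0 to a bidder with value 0, i.e., $p_i(0, b_{-i}) = 0$, corresponds to the VCG payments, and achieves prices of minimum total sum.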
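
The closing remark of the footnote, that charging 0 to a bidder with value 0 yields the VCG payments, can be checked numerically on a small instance. The Python sketch below is only an illustration under stated assumptions: it assumes a rank-by-bid position auction with separable click-through rates, so that a truthful bidder of type t pays t·θ(t) minus the integral of θ(s) over [0, t], where θ(s) denotes the click-through rate of the slot won with bid s. The function names, the click-through rates, and the rival bids are hypothetical and are not taken from the survey.

```python
from fractions import Fraction


def slot_ctr(bid, other_bids, ctrs):
    """CTR of the slot won with `bid` when slots are assigned by decreasing bid."""
    rank = sum(1 for b in other_bids if b > bid)  # number of rivals ranked above
    return ctrs[rank] if rank < len(ctrs) else Fraction(0)


def envelope_payment(t, other_bids, ctrs):
    """Payment t*theta(t) - integral_0^t theta(s) ds, normalized so p_i(0) = 0."""
    # theta(s) is a step function that only changes at rival bids below t,
    # so the integral is an exact sum over intervals between breakpoints.
    points = sorted({Fraction(0), t, *(b for b in other_bids if b < t)})
    integral = sum(
        ((hi - lo) * slot_ctr((lo + hi) / 2, other_bids, ctrs)
         for lo, hi in zip(points, points[1:])),
        Fraction(0),
    )
    return t * slot_ctr(t, other_bids, ctrs) - integral


def others_welfare(bids, skip_index, ctrs):
    """Total value of all bidders except `skip_index` under rank-by-bid allocation."""
    order = sorted(range(len(bids)), key=lambda k: bids[k], reverse=True)
    total = Fraction(0)
    for slot, k in enumerate(order):
        if k != skip_index and slot < len(ctrs):
            total += bids[k] * ctrs[slot]
    return total


def vcg_payment(t, other_bids, ctrs):
    """Externality that the bidder imposes on its rivals (the VCG payment)."""
    with_bidder = others_welfare([t] + other_bids, 0, ctrs)
    without_bidder = others_welfare(other_bids, None, ctrs)
    return without_bidder - with_bidder


# Hypothetical instance: three slots, three rivals, truthful bids, no ties.
ctrs = [Fraction(3, 10), Fraction(2, 10), Fraction(1, 10)]
rivals = [Fraction(5), Fraction(3), Fraction(1)]

print(envelope_payment(Fraction(4), rivals, ctrs))
print(vcg_payment(Fraction(4), rivals, ctrs))
```

On this instance both functions return 2/5. Exact rational arithmetic is used so that the two payments can be compared with equality rather than with a floating-point tolerance; ties between bids are excluded to keep the allocation unambiguous.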