Knowledge-Based Approaches to. Evgeniy Gabrilovich. gabr@yahoo-inc.com

Similar documents

An Overview of Computational Advertising

Computational Advertising Andrei Broder Yahoo! Research. SCECR, May 30, 2009

Web Advertising 1 2/26/2013 CS190: Web Science and Technology, 2010

Technical challenges in web advertising

GrammAds: Keyword and Ad Creative Generator for Online Advertising Campaigns

Automatic Generation of Bid Phrases for Online Advertising

GUIDE TO GOOGLE ADWORDS

Different Users and Intents: An Eye-tracking Analysis of Web Search

ADWORDS CONTINUED: CREATING MORE AD GROUPS

The 20-Minute PPC Work Week. Making the Most of Your PPC Account in Minimal Time. A WordStream Guide

Invited Applications Paper

The 8 Key Metrics That Define Your AdWords Performance. A WordStream Guide

Studying the Impact of Text Summarization on Contextual Advertising

Web Mining Seminar CSE 450. Spring 2008 MWF 11:10 12:00pm Maginnes 113

An Empirical Analysis of Sponsored Search Performance in Search Engine Advertising. Anindya Ghose Sha Yang

How to Use Google AdWords

How. B2B Companies. Can Generate More Demand and Better Leads at Less Cost

Here are our Pay per Click Advertising Packages:

Revenue Optimization with Relevance Constraint in Sponsored Search

Chapter 20: Data Analysis

Computational advertising

Google AdWords Handbook. for tour & activity companies

6 Digital Advertising. From Code to Product gidgreen.com/course

THE 20-MINUTE PPC WORK WEEK MAKING THE MOST OF YOUR PPC ACCOUNT IN MINIMAL TIME

Mobile Real-Time Bidding and Predictive

WSI White Paper. Prepared by: Ron Adelman Search Marketing Expert, WSI

DIGITAL MARKETING BASICS: PPC

A Beginner s Guide to the Google Display Network

SEARCH ENGINE MARKETING 101. A Beginners Guide to Search Engine Marketing

ROI-Based Campaign Management: Optimization Beyond Bidding

Understanding and Improving AdWords Quality Score. Larry Kim, Founder, CTO Will Eisner, VP, Product July 21, 2011

7 THINGS YOU NEED TO KNOW BEFORE TRYING GOOGLE ADWORDS

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval

Search Engine Optimization with Jahia

BUY. Mobile Retargeting for Retailers: Recapturing Customers Cross-Platform. February AMPUSH.1

Finding Advertising Keywords on Web Pages. Contextual Ads 101

CIBC Business Toolkit Grow and Manage Your Business Online. Part 5: Grow Online Worksheet

A survey on click modeling in web search

Internet Marketing Guide

INTERNET MARKETING. SEO Course Syllabus Modules includes: COURSE BROCHURE

Online Traffic Generation

Digital media glossary

Towards Inferring Web Page Relevance An Eye-Tracking Study

SPONSOREDUPDATES. user guide

Driving Online Traffic and Measuring Offline Phone Conversions

A/B SPLIT TESTING GUIDE How to Best Test for Conversion Success

A SIMPLE GUIDE TO PAID SEARCH (PPC)

Search Engine Marketing (SEM) with Google Adwords

DIGITAL MARKETING SERVICES

The ABCs of AdWords. The 49 PPC Terms You Need to Know to Be Successful. A publication of WordStream & Hanapin Marketing

Introduction. Regards, Lee Chadwick Managing Director

Pay Per Click Marketing Specialists

[Ramit Solutions] SEO SMO- SEM - PPC. [Internet / Online Marketing Concepts] SEO Training Concepts SEO TEAM Ramit Solutions

Introduction to Pay Per Click

Building a Question Classifier for a TREC-Style Question Answering System

Campaign and Ad Group Management. Google AdWords Fundamentals

Data Mining in Web Search Engine Optimization and User Assisted Rank Results

How To Run A Successful Linkedin Ad Campaign

Retargeting with Google AdWords

GrammAds: Keyword and Ad Creative Generator for Online Advertising Campaigns

Guide to Selling Google AdWords for Resellers Consultative, Solution Based Sales

1. Introduction to SEO (Search Engine Optimization)

AdWords Explained in 876 Words PPC Audit XMen

OUR ADWORDS PROPOSAL. 1 Church Street Epsom KT17 4PF info@clickrepublic.co.uk

Identifying Best Bet Web Search Results by Mining Past User Behavior

Introduction to Search Engine Marketing

Google AdWords PPC Advertising

HOW TO CHOOSE A DIGITAL MARKETING AGENCY

Pay Per Click Advertising

DIGITAL MARKETING BASICS: SEO

Search Engine Optimization Glossary

Search engine marketing

Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework

SEO 360: The Essentials of Search Engine Optimization INTRODUCTION CONTENTS. By Chris Adams, Director of Online Marketing & Research

How to make the most of search engine marketing (SEM)

8 Simple Things You Might Be Overlooking In Your AdWords Account. A WordStream Guide

ARE YOU SPENDING YOUR PPC BUDGET WISELY? BEST PRACTICES AND CREATIVE TIPS FOR PPC BUDGET MANAGEMENT

GOING FROM GOOD TO GREAT TOP 5 WAYS TO OPTIMIZE YOUR PPC PERFORMANCE

Creating a Landing Page to Achieve Maximum Results

PUTTING SCIENCE BEHIND THE STANDARDS. A scientific study of viewability and ad effectiveness

7 Tips for Google AdWords Optimisation. By Ferdie Bester

Tutorial 5. Pay per click for testing and profit

Digital Marketing: Strategies & Measurement

Digital Marketing Trends in the Education Market A Comprehensive Analysis of the School Year

CONTEXTUAL RETARGETING

Transcription:

Ad Retrieval Systems in vitro and in vivo: Knowledge-Based Approaches to Computational Advertising Evgeniy Gabrilovich gabr@yahoo-inc.com 1

Thank you! and the KSJ Award Panel My colleagues and collaborators My Ph.D. advisor, Prof. Shaul Markovitch 2

Outline Why advertising? Why computational o a? A brief history of approaches to Web advertising Research questions In vitro: adaptation of classical IR to ad retrieval In vivo: lessons learned from observing how advertisers create and how users interact with ads Future research directions Better targeting via modeling of user behavior Fully automated ad creation Technologies described might or might not be in actual use at Yahoo! 3

What changed in 100 years: measurability and reach Half the money I spend on advertising is wasted; the trouble is I don't know which half. -- John Wanamaker (attributed) [1838-1922] No more coupon codes Flexible ad targeting + conversion tracking Experimentation rules! 4

What is Computational Advertising? A new scientific sub-discipline that provides the foundation for building online ad retrieval platforms Find the optimal ad for a given user in a specific context Information retrieval Microeconomics Computational Advertising Statistical modeling & machine learning Large-scale text analysis 5

US online advertising spending (source: emarketer.com, November 2010) Year Online Online % of total media 2009 $22.7B 13.9% 2010 $25.8B 15.3% 2011 $28.5B 16.7% 2012 $32.6B 18.3% 2013 $36.0B 19.8% 2014 $40.5B 21.5% 6

Not all ads are commercial! Ray Bradbury (CHI 2009, Hoffmann et al.) 7

Ads as another source of content for enriching Web search results I do not regard advertising as entertainment or an art form, but as a medium of information. 8

Textual advertising Sponsored Search Ads driven by search keywords aka a.k.a. keyword driven ads or paid search Content Match Ads driven by the content of a web page a.k.a. context driven ads or contextual ads Textual advertising on the Web is strongly related to NLP and information retrieval 9

Sponsored search example Textual ads driven by keyword search 10

Content match example 11

Content match example #2 12

Anatomy of an ad Tutorial at SIGIR 2010 Information Retrieval Challenges in Computational Advertising research.yahoo.com/tutorials/sigir Title Creative Display URL Landing URL http://research.yahoo.com/sigir10_compadv com/sigir10 compadv Bid phrases: {SIGIR 2010, computational advertising, Evgeniy Gabrilovich,...} Bid: $0.10 13

Beyond keyword matching Matching ads is relatively simple for explicitly bid keywords Exact match Covering only those is not enough advertisers need volume! Broad match (or advanced match) Suppose your ad is Low prices on Seattle hotels Naïve approach: bid on all queries that contain the word Seattle Problems Seattle's Best Coffee Chicago Alaska cruises start point Ideally: bid on queries related to Seattle as a travel destination The system should facilitate concept-level ad matching 14

An ultra-brief history of approaches to Web advertising The old school: database-style ad matching Exact match (query = bid phrase) Broad match via query rewrites Content match: reduce the problem to exact match (extract bid phrases from pages) Essentially record lookup Learning to rank The new approach: knowledge-based ad retrieval Elaborate query expansion Ad indexing and scoring using all the info available Bid phrases, title, creative, URL, landing page, etc. Akin to document indexing in IR 2 nd pass relevance reordering (re-ranking) Using features not available to the 1 st pass model (e.g., setlevel features, click history) 15

Research questions How to index the ad corpus? How to select relevant ads? Should we show ads at all? What really happens after an ad click? What is the interplay between the organic and sponsored results? Can we generate entire ad campaigns automatically? 16

Feature generation for ad retrieval How to select relevant ads Problem: Ads are short, queries are even shorter (Future work: how to circumvent this inherent limitation) Solution: Expand them via feature construction Query expansion using Web search results SIGIR 2007, w. Broder et al.; ACM TWEB 2009, Gabrilovich i et al. Ad expansion (bid phrases as queries) SIGIR 2008, w. Radlinski et al. Next: Interaction-based features & more 17

Query expansion using Web search results How to select relevant ads Humans often find it hard to readily see what the query is about But looking at the search results makes things easy Let computers do the same thing Infer query intent from the top algorithmic search results Our approach: cross-corpora pseudo relevance feedback Construct additional features from Web search results to retrieve better ads 18

How to select relevant ads Example: Davy Byrne's CATEGORIES 1.Travel/Destinations/Europe/ Ireland/Dublin 2. Entertainment/Bars and Night Clubs/Brew Pubs 19

If we know it is about Dublin pubs, then ads are easy to find How to select relevant ads 20

How to select relevant ads 1 st pass retrieval Enriched feature space beyond the bag of words (SIGIR 2007, Broder et al.; CIKM 2008, w. Broder et al.) Feature generation to help bridge the lexical mismatch Candidate ads to be re-ranked 21

Learning when (not) to advertise (CIKM 2008, w. Broder et al.) We do not have to show ads! Roughly half of the queries have no ads Should we show ads at all Repeatedly showing non-relevant ads can have detrimental long-term effects Modeling actual (short- and long-term) costs of showing non-relevant ads is very difficult We want to predict when (not) to show ads 22

Should we show ads at all Two approaches: Thresholding vs. Machine Learning Global threshold on relevance scores of individual ads Only show ads with scores above the threshold Problem: Scores are not necessarily comparable across queries! Learn a binary yprediction model for sets of ads Features defined over sets of ads rather than individual ads Relevance (word overlap, cosine similarity between ad and query/page etc.) Result set cohesiveness (coefficient of variation of ad scores, result set clarity, entropy) 23

Incorporating click history (WSDM 2010, Hillard et al.) Should we show ads at all Binary classifier (relevant / non-relevant ads) Baseline: text overlap features (query/ad) Click history (query/ad) with back-off Click propensity in query/ad translation Bayes rule ) p ( D Q ) p ( Q D ) p ( D ) / p ( Q) p( Q trans( q m n IBM D) Model trans( 1 j d i ) j 0 i 0 q logs count( q logs q j d i ) count( q j d j i ) d Counting clicks for query/ad word pairs i ) Cold start (i.e., no click history) is OK Using click data to overcome synonymy Query = running gear Ad = Best jogging shoes Results Query coverage 9% Ads per query 12% CTR 10% Same # clicks on fewer ads 24

Should we show ads at all Incorporating multi-modal interaction data (SIGIR 2010, Guo & Agichtein) i Ready to buy or just browsing? Classifying research- and purchase-oriented sessions Inferring eye gaze position from observable bl actions Keystrokes, GUI (scroll/click), mouse movement, browser (new tab, close, back/forward) Research vs. purchase classification (in lab): F1 = 0.96 Ad clickthrough in sessions classified as Purchase > 2X compared to sessions classified as Research Predicting future ad clicks: F1 = 0.07 0.17 (+141%) 25

How to index the ad corpus Semantically-aware ad indexing and retrieval (WWW 2010, w. Bendersky et al.) 26

Structure of online ad campaigns: the ad schema How to index the ad corpus Advertiser Buy appliances on Black Friday New Year deals on lawn & garden tools Account 1 Account 2 Kitchen appliances Creatives Brand name appliances Compare prices and save money www.appliances-r-us.com Ad group 1 Ad Campaign 1 Ad group 2 Bid phrases { Miele, KitchenAid, Cuisinart, } Campaign 2 What is the right approach to document length normalization? Use information from higher levels to alleviate data sparsity at children nodes What is the appropriate p indexing unit? 27

How to index the ad corpus Possible indexing approaches 1. Term index (Cartesian product of all creatives and bid terms) Huge index, small focused documents 2. Creative index (a creative is coupled with all the bid terms in the ad group) Two-stage retrieval (first choose the creative, then pick the term) Bid terms are duplicated across creatives 3. Ad group index Indexing units are entire ad groups Three-stage retrieval first choose the ad group then the creative and finally pick the term Most compact index 28

How to index the ad corpus Retrieval e speed vs. relevance e Are we trading effectiveness ect e ess for efficiency? Term index yields most relevant ads, yet is least efficient (20x slower than the ad group index) Ad group index is most efficient (2x faster than creative index), yet least effective 29

How to index the ad corpus Using learning to rank techniques: structured re-ranking ranking Step 1: Retrieve an initial set of candidates using the ad group index Step 2: Re-rank the candidate set using structural features (don t ignore the structure by scoring creatives and terms independently) Ad group score, creative-term term pair score # bid terms in the ad group Unigram entropy (cohesiveness) of the ad group Ratio of query words covered by the ad group text Fraction of the titles / terms / URLs that contain at least one query term Other features are possible! More effective than the term index Still very efficient! 30

What happens after an ad click What happens after an ad click? Quantifying the impact of landing pages in Web advertising (CIKM 2009, w. Becker et al.) 31

What happens after an ad click Conceptually: context transfer Search engine result page Click! Landing page Conversion (e.g., purchase of the product or service being advertised) User s activity on the advertiser s Web site 32

What happens after an ad click All landing pages are not created equal (and neither are the corresponding conversion rates) Homepage Main page of the advertiser s site 25% Category Main page of a sub-section of the browse advertiser s site, which describes a 38% category of related products Search Search within the advertiser s site transfer OR on other Web pages 26% Other Terminal pages (e.g., promotion 11% pages or forms) 33

Homepage 25% Category browse 38% Search transfer 26% Other 11% What happens after an ad click Taxonomy of landing pages: examples 34

Homepage 25% Category browse 38% Search transfer 26% Other 11% What happens after an ad click Taxonomy of landing pages: examples 35

Homepage 25% Category browse 38% Search transfer 26% Other 11% What happens after an ad click Taxonomy of landing pages: examples 36

What happens after an ad click Using the landing page taxonomy Picking the right landing page type for each ad Improving the conversion rate Improving advertisers ROI! 37

What happens after an ad click Landing page type usage vs. conversion: breakdown by query frequency Observed conversion rates are in sharp contrast with usage frequency of the different page types! 38

What happens after an ad click Landing page type usage vs. conversion: breakdown by query price As the price goes up, so does the conversion rate Again, observed conversion rates are in sharp contrast with usage frequency across a range of query prices! 39

What is the interplay... Competing for users attention: On the interplay between organic and sponsored search results (WWW 2010, w. Danescu-Niculescu-Mizil et al.) 40

What is the interplay... Ads as another source of content for enriching Web search results 41

What is the interplay... The interplay between ads and organic results... What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it. Herbert Simon, Designing Organizations for an Information-Rich World, 1971 Is there competition for clicks between ads and organic results? Do users prefer ads that are similar to the organic results, or do they prefer diversity it? We found that the answers to these questions depend on the type of the query See also Ashkan et al. (ECIR 2009, CIKM 2009) 42

What is the interplay... Relation between the CTR of ads and the CTR of organic results Negative correlation (competition) Users are only willing to spend limited time and effort on each query Positive correlation (depends on the quality of results) Easy query ( online radio ) decent ads and organic results clicks on both Hard query ( what should I do today? ) poor results on both sides no clicks on either Independence (null hypothesis) Users consider ads and organic results as two independent sources of information o 43

What is the interplay... Findings: competition + positive correlation 44

Decoupling the forces What is the interplay... Users are willing to invest limited effort in each query competition In order to single out the competition effect, we tried to explicitly model the amount of effort the user is willing to invest Low effort = navigational i queries [Broder, 2002] (27% of queries) Pandora radio, Bank of America High effort = non-navigational queries Meaning of life, academia vs. industry 45

What is the interplay... Competition clearly exists for navigational queries We also examined different degrees of navigationality: the less navigational the query, the less competition t o we observeded 46

What is the interplay... Another viewpoint: Do users prefer ads that are more similar to the organic results or more diverse ads? Well, both... So we need to dig deeper again 47

Let s break down by navigationality again What is the interplay... 48

Counterintuitive? What is the interplay... 49

What is the interplay... Responsive and incidental ads Responsive ads Directly address the user s information need Likely to be similar to the organic results Incidental ads Tangentially related to the user s information need Unreasonable as organic results, but ok for ads Likely to be different from the organic results Example: query = free internet radio Responsive Incidental Pandora Internet Radio Discount Bose Computer Speakers 50

Now it all makes sense... What is the interplay... 51

Research frontier: towards automatic ad generation Title Tutorial SIGIR 2010 and Textatsummarization I f Information ti generation Retrieval R t iti l given Challenges Ch iin NL i ll Creative Rich Computational Advertising a rich product description product Full text indexing Display URL research.yahoo.com/tutorials/sigir (possibly personalized) feed based b d on the h entire i Landing URL product description Use mono-lingual MT to generate bid phrases from the product description [w. Ravi et al., 2010] Bid phrases: h {SIGIR 2010 2010, Bid generationadvertising, for advanced computational match on exact match Evgeniy g based y Gabrilovich,...}} bids andd conversion i data d t Bid: $0.10 [w. Broder et al., 2011] ECIR Yahoo! Research 2010 2011 http://research http://research.yahoo.com/sigir10_compadv compadv Matching i yahoo thecom/sigir10 landing i page to the query type ((to maximize conversion)) [w. Becker et al., 2009] 52

Key messages 1. Computational advertising = A principled way to find the best match between a user in a context and a suitable ad 2. Advertising is a form of information 3. Finding the best ad is an IR problem Classical IR needs significant adaptation 4. Plenty to be learned, problems are far from solved! Vibrant research area Numerous open research questions in other areas of Comp. Adv. audience selection, guaranteed delivery, personalization, more interactive ads,... 53

Thank you! gabr@yahoo-inc.com http://research.yahoo.com/~gabr 54

This talk is Copyright Yahoo! 2011. Yahoo! and the Author retain all rights, including copyright and distribution rights. No publication or further distribution in full or in part is permitted without explicit written permission. The opinions expressed herein are the responsibility of the author and do not necessarily reflect the opinion of Yahoo! Inc. This talk benefitted from the contributions of many colleagues and co-authors at Yahoo! and elsewhere. Their help is gratefully acknowledged. 55