A survey on click modeling in web search

A survey on click modeling in web search Lianghao Li Hong Kong University of Science and Technology

Outline
1. An overview of web search marketing
2. An overview of click modeling
3. A survey on click models
   Unbiased hypothesis
   Position bias hypothesis
   Depend on click pattern
   Depend on user intent
4. Future work

Search engine marketing

Generalized second-price auction

Search advertising demo

Why do we need click prediction?
Revenue is highly influenced by click probability prediction. Search engines rank ads by expected revenue:
$$E[\text{revenue}] = P_{ad}(\text{click}) \cdot \text{GSP}(ad)$$
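
The ranking rule above can be sketched in a few lines. This is only an illustration: the `Ad` record, the ad names, and the numbers are hypothetical, and `price_per_click` is an assumed stand-in for the GSP(ad) term, whose computation the slides do not spell out.

```python
from typing import List, NamedTuple


class Ad(NamedTuple):
    name: str               # hypothetical ad identifier
    p_click: float          # predicted click probability P_ad(click)
    price_per_click: float  # assumed stand-in for GSP(ad)


def expected_revenue(ad: Ad) -> float:
    """E[revenue] = P_ad(click) * GSP(ad)."""
    return ad.p_click * ad.price_per_click


def rank_by_expected_revenue(ads: List[Ad]) -> List[Ad]:
    """Order ads from highest to lowest expected revenue."""
    return sorted(ads, key=expected_revenue, reverse=True)


ads = [Ad("a", 0.10, 1.00), Ad("b", 0.02, 3.00), Ad("c", 0.05, 2.50)]
ranking = rank_by_expected_revenue(ads)
print([ad.name for ad in ranking])
```

Note that ad "b" has the highest price but ranks last, because its predicted click probability is low. An error in the click prediction therefore changes the ranking directly.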

How to predict click behavior? Click-through logs help! Figure: Ranking presented for the query support vector machine

How to predict click behavior?
Predict clicks by counting:
$$P_{ad}(\text{click}) = \frac{\#\text{ of clicks}}{\#\text{ of impressions}}$$
However, this is far from satisfactory:
- clicks are biased by user browsing behavior
- long tail and cold start problems
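
A minimal sketch of the counting estimator, with made-up log counts, shows where it breaks down. The ad names and numbers are hypothetical; returning `None` for an unseen ad is just one way to make the cold start case explicit.

```python
from typing import Optional


def naive_ctr(clicks: int, impressions: int) -> Optional[float]:
    """Estimate P_ad(click) as clicks / impressions."""
    if impressions == 0:
        return None  # cold start: no impressions, so no estimate
    return clicks / impressions


# Hypothetical log: ad -> (clicks, impressions)
log = {
    "popular ad": (120, 4000),  # enough data for a stable estimate
    "tail ad": (1, 3),          # long tail: estimate is mostly noise
    "new ad": (0, 0),           # cold start: undefined
}
estimates = {ad: naive_ctr(c, n) for ad, (c, n) in log.items()}
print(estimates)
```

The tail ad gets an estimate of about 0.33 from a single click, which illustrates why counting alone is unreliable for rare queries and ads.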

How to predict click behavior? Long tail and cold start problems

Long tail query demo: Google vs. Bing

A unified framework for click modeling
Problem definition
Definition 1 (Click modeling): Let the random variable u denote a user, q the query issued by the user, a an ad, and r the position of the ad. The binary variable c is 1 if the ad is clicked and 0 otherwise. Let L denote the impression list and S the click sequence. Click modeling aims to explain observed click events. The shorthand is: $P(c, q, a, u, r, L, S)$
Goals of click modeling
1. To estimate the actual ad relevance from biased click-through logs
2. To predict $P(c = 1 \mid q, a, u, r, L, S)$ for future impressions

An overview of click models
Hypotheses in click modeling
To model click events, we have to incorporate proper browsing hypotheses (i.e., a generative process). The main hypotheses include:
- Unbiased hypothesis: $P(c \mid q, a, u, r, L, S) = P(c \mid q, a)$
- Position bias hypothesis: $P(c \mid q, a, u, r, L, S) = P(c \mid q, a, r)$
- Depend on click pattern: $P(c \mid q, a, u, r, L, S) = P(c \mid q, a, r, S)$
- $P(c \mid q, a, u, r, L, S) = P(c \mid q, a, r, L)$
- Depend on user intent: $P(c \mid q, a, u, r, L, S) = P(c \mid q, a, u, r)$

Unbiased hypothesis: Basic hypothesis
In the basic hypothesis, there is no bias associated with the observed clicks. This leads to the simplest model:
$$P(c \mid q, a, u, r, L, S) = P(c \mid q, a)$$
Remark: In the basic hypothesis, the click probability depends only on the relevance between query q and ad a.

Position bias hypothesis: Examination hypothesis (WWW 07, Richardson et al.)
The examination hypothesis assumes that, to be clicked, an ad must be both examined (i.e., e = 1) and relevant:
$$\begin{aligned}
P(c = 1 \mid q, a, u, r, L, S) &= P(c = 1 \mid q, a, r) && \text{(independence assumption)} \\
&= \sum_{e \in \{0,1\}} P(c = 1 \mid e, q, a, r)\, P(e \mid q, a, r) \\
&= P(c = 1 \mid e = 1, q, a)\, P(e = 1 \mid r) && \text{(examination hypothesis)}
\end{aligned}$$
Novelty: The first attempt to model position bias

Examination hypothesis (WWW 07, Richardson et al.)
The position bias $P(e = 1 \mid r)$ can be measured experimentally by presenting users with the same ad at various positions on the page and observing their clicks.
Remark: In the examination hypothesis, the position bias is modeled by the query-independent examination probability $P(e \mid r)$ and is thereby removed from the relevance estimate.
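
Under the examination hypothesis, expected clicks equal relevance times the summed examination probabilities of the impressions. One simple way to use that (a moment-matching sketch, not necessarily the estimator used by Richardson et al.) is to divide observed clicks by expected examinations. The examination probabilities and the impression log below are hypothetical.

```python
from typing import Dict, List, Tuple


def debiased_relevance(
    impressions: List[Tuple[int, int]], exam_prob: Dict[int, float]
) -> float:
    """Estimate P(c=1 | e=1, q, a) from (position, clicked) pairs.

    Assumes P(c=1 | q, a, r) = P(c=1 | e=1, q, a) * P(e=1 | r), so that
    relevance ~= total clicks / sum of examination probabilities.
    """
    clicks = sum(clicked for _, clicked in impressions)
    expected_exams = sum(exam_prob[pos] for pos, _ in impressions)
    return clicks / expected_exams


# Hypothetical position bias: position 2 is examined half as often.
exam_prob = {1: 1.0, 2: 0.5}
# Hypothetical log for one (query, ad) pair: (position, clicked)
impressions = [(1, 1), (1, 0), (2, 1), (2, 0), (2, 0), (2, 0)]

naive = sum(c for _, c in impressions) / len(impressions)
relevance = debiased_relevance(impressions, exam_prob)
print(naive, relevance)
```

The naive CTR undercounts here because most impressions were at the less-examined position; the corrected estimate discounts those impressions instead of treating them as equally informative.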

Depend on click pattern: Cascade hypothesis (WSDM 08, Craswell et al.)
The cascade hypothesis assumes that a user scans the ads sequentially, without skipping any, until she clicks on an ad, and does not examine any further ads after the click:
$$P(e_1 = 1) = 1$$
$$P(e_i = 1 \mid e_{i-1} = 0) = 0$$
$$P(e_i = 1 \mid e_{i-1} = 1) = 1 - c_{i-1}$$
Novelty: The first attempt to model click pattern

Cascade hypothesis (WSDM 08, Craswell et al.)
The probability of a click sequence in which the kth ad is clicked is:
$$\begin{aligned}
P(c = 1 \mid r = k, q, a, u, L, S) &= P(c = 1 \mid r = k, q, a, L, S) && \text{(independence assumption)} \\
&= P(c = 1 \mid r = k, q, a) \prod_{i=1}^{k-1} P(c = 0 \mid r = i, q, a) && \text{(cascade hypothesis)}
\end{aligned}$$
Remark: This model is quite restrictive, since it allows at most one click per query session.
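
The cascade product can be evaluated position by position. The sketch below assumes each ad is summarized by a single relevance value, read as its click probability once examined; the values are hypothetical.

```python
from typing import List


def cascade_click_probs(relevance: List[float]) -> List[float]:
    """P(click at position k) under the cascade hypothesis.

    relevance[i] is the click probability of the ad at position i + 1
    given that it is examined.
    """
    probs = []
    p_reach = 1.0  # P(e_1 = 1) = 1
    for rel in relevance:
        probs.append(p_reach * rel)  # examined and clicked
        p_reach *= 1.0 - rel         # user continues only after a skip
    return probs


probs = cascade_click_probs([0.5, 0.4, 0.2])
print(probs, sum(probs))
```

The probabilities sum to at most 1, which reflects the remark above: the model allows at most one click per session.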

Depend on click pattern: Multiple-click model (WSDM 09, Guo et al.)
Novelty: Enables multiple clicks in a session by incorporating a decision phase for continuing to examine results.
Figure: The user model of the dependent click model

Multiple-click model (WSDM 09, Guo et al.)
The probability of examination and click is given by:
$$P(e = 1 \mid r = 1) = 1$$
$$P(c = 1 \mid r = i) = P(e = 1 \mid r = i)\, P(c = 1 \mid e = 1, r = i)$$
$$P(e = 1 \mid r = i + 1) = \lambda_i\, P(c = 1 \mid r = i) + P(c = 0 \mid r = i)$$
The probability of a click sequence in which the kth ad is clicked is:
$$\begin{aligned}
P(c = 1 \mid r = k, q, a, u, L, S) &= P(c = 1 \mid r = k, q, a, L, S) && \text{(independence assumption)} \\
&= P(c = 1 \mid r = k, q, a) \prod_{i=1}^{k-1} \big( \lambda_i\, P(c = 1 \mid r = i, q, a) + P(c = 0 \mid r = i, q, a) \big)
\end{aligned}$$
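
The decision phase can be illustrated by simulating one session. This sketch assumes the usual reading of the dependent click model: after a click at position i the user continues with probability λ_i, and after a skip she always continues (the coefficient 1 on the no-click term). The relevance and λ values are hypothetical.

```python
import random
from typing import List


def simulate_dcm_session(
    relevance: List[float], lam: List[float], rng: random.Random
) -> List[int]:
    """Sample a 0/1 click vector for one session.

    relevance[i]: click probability at position i + 1 given examination.
    lam[i]: probability of continuing after a click at position i + 1.
    """
    clicks = []
    examining = True  # P(e = 1 | r = 1) = 1
    for rel, lam_i in zip(relevance, lam):
        clicked = examining and rng.random() < rel
        clicks.append(int(clicked))
        if clicked:
            # Decision phase: keep examining with probability lambda_i.
            examining = rng.random() < lam_i
        # After a skip, an examining user moves on to the next position.
    return clicks


rng = random.Random(0)
session = simulate_dcm_session([0.5, 0.4, 0.2], [0.6, 0.6, 0.6], rng)
print(session)
```

Setting every λ_i to 0 recovers cascade-like behavior, where the session ends at the first click.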

Depend on click pattern: Dynamic Bayesian Network (WWW 09, Chapelle and Zhang)
Novelty: The first attempt to model post-click pattern
Key idea: Model both post-click and perceived relevance
Figure: The DBN used for click modeling.

Dynamic Bayesian Network (WWW 09, Chapelle and Zhang)
The following equations describe the model:
$$A_i = 1,\ E_i = 1 \Leftrightarrow C_i = 1$$
$$P(A_i = 1) = a_u$$
$$P(S_i = 1 \mid C_i = 1) = s_u$$
$$C_i = 0 \Rightarrow S_i = 0$$
$$S_i = 1 \Rightarrow E_{i+1} = 0$$
$$P(E_{i+1} = 1 \mid E_i = 1, S_i = 0) = \gamma$$
$$E_i = 0 \Rightarrow E_{i+1} = 0$$
where γ is the probability that a user examines the next result if she is not satisfied with the current result.
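
From these equations the marginal click probability at each position follows by a simple recursion: an examining user is satisfied with probability a_u s_u and otherwise continues with probability γ, so P(E_{i+1} = 1 | E_i = 1) = γ(1 − a_u s_u). The sketch below assumes the first position is always examined (E_1 = 1), which the slide does not state explicitly; the parameter values are hypothetical.

```python
from typing import List


def dbn_click_probs(
    attract: List[float], satisfy: List[float], gamma: float
) -> List[float]:
    """Marginal P(C_i = 1) for each position under the DBN equations.

    attract[i] is a_u (perceived relevance) and satisfy[i] is s_u
    (post-click relevance) of the result at position i + 1.
    """
    probs = []
    p_exam = 1.0  # assumption: E_1 = 1
    for a_u, s_u in zip(attract, satisfy):
        probs.append(p_exam * a_u)  # C_i = 1 iff examined and attracted
        # Continue only if not satisfied (prob 1 - a_u * s_u), then w.p. gamma.
        p_exam *= gamma * (1.0 - a_u * s_u)
    return probs


probs = dbn_click_probs([0.8, 0.5], [0.5, 0.5], gamma=0.9)
print(probs)
```

Unlike the cascade model, a click does not end the session here; only satisfaction does, so several clicks per session are possible.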

Dynamic Bayesian Network: Experiment
Data: 58,000,000 sessions and 682,000 unique queries from the click logs of the UK market
X-axis: # of training sessions that occurred at Position 1
Y-axis: MSE between the true CTRs and the predicted CTRs

Temporal click model (SIGIR 09, Xu et al.)
Key idea (externality): An ad may receive fewer clicks when co-displayed with high-quality ads.
Novelty: The first attempt to model ad externality
Data study 1
Data: Ad impression sequences with exactly two ads. Two data sets are constructed by collecting one month of ads (tens of millions) shown at north and south, respectively.
Ground truth: Empirical CTR as the measure of ad quality.
Experiment setting: Group impressions with similar ad quality at Position 1 into one bin and plot the average CTR at Position 2, and vice versa.

Temporal click model
Data study 2
Experiment setting: Group impressions with similar CTR at Position 1 into one bin and plot the percentage of events where the first click occurred at Position 2, and vice versa.

Temporal click model
Figure: The first click influenced by ad quality

Temporal click model (SIGIR 09, Xu et al.)
Positional rationality hypothesis:
1. Users examine both ads together to assess their qualities.
2. If the ad at Position 2 is much better than the ad at Position 1, users would click the ad at Position 2 first.

Temporal click model
The proposed method
Input: click-through log of ad impression sequences $A = \langle a_1, a_2 \rangle$.
Output: the predicted CTR of ads.
Generative process:

Temporal click model
Graphical model:
E: examination variable, $E \in \{0, 1\}$
R_a: ad quality variable, $R_a \in [0, 1]$
U_a: position bias variable, $U_a \in [0, 1]$
F: random variable for the first pick, $F \in \{a_1, a_2\}$
S: random variable for the re-pick, $S \in \{a_1, a_2\}$
C_i: click random variable for the ith click, $C_i \in \{0, 1\}$

Temporal click model
Experiments
Data set: 0.3 million unique queries and 0.1 billion sessions shown at north; 1.1 million queries and 0.65 billion sessions shown at south.
Evaluation: MSE between true CTRs and predicted CTRs.
Baselines: 1) Naive CTR statistics (NS), which estimates CTR by counting, and 2) the Bayesian browsing model (BBM) (KDD 09, Liu et al.)

Temporal click model
Experiment results
Both TCM and BBM are significantly better than NS for all query frequencies. TCM is noticeably better than BBM on less-frequent queries but shows similar performance on frequent ones.

Relational click prediction (WSDM 12, Xiong et al.)
Key idea: Click events are influenced by the similarity between co-displayed ads.
Novelty: The first attempt to model similarity influence
Figure: Two ad lists for the query "itunes account".

Relational click prediction
Data study
Data: 0.7 million unique queries and 0.6 million unique ads from one month of click logs
Experiment setting:
1. Group ads into a specific context, i.e., a triple $T = \langle q, a, r \rangle$, where q, a, and r represent the query, ad, and position, respectively.
2. Select a triple T that appears in multiple pageviews, i.e., $l = \langle q, \text{ad list} \rangle$.
3. Calculate the similarity between ad a and the other ads in each l.
4. Compute the empirical CTR of T in each l, and compare it with the average CTR of T over all pageviews.
Evaluation:
$$\Delta CTR_{T,l} = \frac{CTR_{T,l} - CTR_T}{CTR_T}$$
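
Steps 2 and 4 and the relative-change measure can be sketched as follows, assuming the measure is the CTR of T within an ad list minus its overall CTR, divided by the overall CTR. The event log, the list identifiers, and the function name are hypothetical; the similarity computation of step 3 is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def delta_ctr_by_list(events: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Relative CTR change of one fixed triple T in each ad list l.

    events: (list_id, clicked) pairs, one per pageview showing T.
    Assumes T received at least one click overall (nonzero average CTR).
    """
    clicks: Dict[str, int] = defaultdict(int)
    views: Dict[str, int] = defaultdict(int)
    for list_id, clicked in events:
        views[list_id] += 1
        clicks[list_id] += clicked
    overall = sum(clicks.values()) / sum(views.values())  # CTR_T
    return {
        l: (clicks[l] / views[l] - overall) / overall  # (CTR_{T,l} - CTR_T) / CTR_T
        for l in views
    }


# Hypothetical pageviews of the same (query, ad, position) in two ad lists.
events = [("A", 1), ("A", 1), ("A", 0), ("A", 0), ("B", 1)] + [("B", 0)] * 5
delta = delta_ctr_by_list(events)
print(delta)
```

A negative value means the triple is clicked less often in that ad list than on average, which is the quantity the data study relates to ad similarity.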

Relational click prediction
X-axis: the similarity between a and the other co-displayed ads.
Y-axis: the average $\Delta CTR_l$ over different triples.

Relational click prediction
Data study
$\Delta CTR_l$ is negatively correlated with the similarity between surrounding ads. The intuition is that when the surrounding ads are similar to the given ad in content (or topic), they are likely to distract the user's attention.

Relational click prediction
The proposed method
Key idea: Model the ads in an ad list together instead of treating them independently.
$$\big( P(c \mid q, a, u, L, r = 1), \ldots, P(c \mid q, a, u, L, r = n) \big)^T = F(X, R)$$
where $X = \{x_1, \ldots, x_n\}$ includes all the feature vectors $x_i$ extracted from $\langle q, a, u, L, r = i \rangle$, and R encodes the relation between ads.

Relational click prediction
Graphical model:
Figure: A continuous CRF model for relational click prediction.

Relational click prediction
Let $Y = \{y_1, \ldots, y_n\}$ denote the predicted CTRs of the ads. The probability distribution of output Y conditioned on input X is defined as
$$P(Y \mid X) = \frac{1}{Z(X)} \exp\Big\{ \sum_i \Big( h(y_i, X; w) + \beta \sum_{j > i} g(y_i, y_j, X) \Big) \Big\}$$
where h is the vertex feature function representing the dependence between the CTR and the input feature vectors, and g is the edge feature function representing the pairwise relationship between ads.

Relational click prediction
Individual modeling
For simplicity, they define the vertex feature function as follows:
$$h(y_i, X; w) = -\big( y_i - f(x_i; w) \big)^2$$
where $f(x_i; w)$ is the output of any conventional click model.
Relational modeling
As discussed, if two ads are very similar to each other, their click probabilities will both become lower. To encode this intuition, they define the edge feature function as below:
$$g(y_i, y_j, X) = s_{i,j}\,(y_i + y_j)$$
where $s_{i,j}$ is the term similarity between ads i and j.

Relational click prediction
The whole model
By combining all the feature functions, we obtain the overall conditional probability distribution:
$$P(Y \mid X) = \frac{1}{Z(X)} \exp\Big\{ \sum_i \Big( -\big( y_i - f(x_i; w) \big)^2 + \beta \sum_{j > i} s_{i,j}\,(y_i + y_j) \Big) \Big\}$$
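
To see what such a model does to the individual predictions, assume the exponent has the form −Σ_i (y_i − f(x_i; w))² + β Σ_i Σ_{j>i} s_{i,j}(y_i + y_j). It is then quadratic in Y, and setting its gradient to zero gives the most probable output in closed form: y_i = f(x_i; w) + (β/2) Σ_{j≠i} s_{i,j}. The sketch below evaluates that expression. The base predictions, the similarity matrix, and the negative value of β are all assumptions chosen so that similar ads lower each other's predicted CTR; the slides give neither the sign nor the value of β.

```python
from typing import List


def crf_mode(f: List[float], s: List[List[float]], beta: float) -> List[float]:
    """Most probable Y under the assumed quadratic exponent.

    f[i]: output of a conventional click model for ad i.
    s[i][j]: symmetric similarity between ads i and j.
    Each prediction is shifted by (beta / 2) * sum of its similarities.
    """
    n = len(f)
    return [
        f[i] + 0.5 * beta * sum(s[i][j] for j in range(n) if j != i)
        for i in range(n)
    ]


f = [0.10, 0.08, 0.05]   # hypothetical per-ad predictions
s = [
    [0.0, 0.8, 0.1],     # ads 0 and 1 are very similar
    [0.8, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
y = crf_mode(f, s, beta=-0.02)  # assumed sign and magnitude
print(y)
```

The two similar ads are pulled down more than the third one, which matches the intuition stated for the edge feature function.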

Relational click prediction
Experiments
Data set: 0.7 million unique queries and 0.6 million unique ads from one month of click logs
Extracted features: history COEC, relevance of ad to query, attractiveness of ad title and description, reputation of advertiser, etc.
Baselines: Logistic Regression (LOCAL) and a variant of the proposed method that uses no edge features.
Evaluation: MSE between true CTR and predicted CTR

Relational click prediction
The proposed method significantly outperforms the baselines. It performs better at lower positions than at higher positions, which is consistent with the cascade assumption.
Figure: Results of NMSE.

Relational click prediction
They further study the performance of click prediction with respect to different levels of similarity in the ad lists. When the similarity between ads increases, the performance of CRF also increases.
Figure: NMSE results at different similarity levels.

Depend on user intent: Task-centric click model (KDD 11, Zhang et al.)
Novelty: The first attempt to model user behavior across multiple query sessions.
Key ideas: Users tend to express their information needs incrementally, and to click fresh documents that have not appeared before.

Task-centric click model (KDD 11, Zhang et al.)
Figure: The macro model of TCM.

Task-centric click model (KDD 11, Zhang et al.)
Figure: The micro model of TCM.

Task-centric click model
Graphical model:

Depend on user intent: User intent demo

Future work
- Click modeling for long-tail queries: crowdsourcing?
- Automatic feature construction: deep learning?
- Evaluation metrics
- Click modeling for web search on mobile devices
  - very different user browsing behavior
  - may have a totally different business model

Reference
WWW 07, Richardson et al.: Predicting clicks: estimating the click-through rate for new ads
WSDM 08, Craswell et al.: An experimental comparison of click position-bias models
WSDM 09, Guo et al.: Efficient multiple-click models in web search
WWW 09, Chapelle and Zhang: A dynamic Bayesian network click model for web search ranking
SIGIR 09, Xu et al.: Temporal click model for sponsored search
WSDM 12, Xiong et al.: Relational click prediction for sponsored search
KDD 11, Zhang et al.: User-click modeling for understanding and predicting search-behavior

Thanks