Search Marketing Cannibalization
Analytical Techniques to Measure PPC and Organic Interaction
Search Overview
How People Use Search Engines
- Navigational
- Research
- Health/Medical
- Directions
- News
- Shopping
- Advice
Leveraging Search Demand Data
- General Trends
- Estimating effects of other media campaigns
- Measuring Brand Equity
- Aiding in creating Advertising Value Propositions
- Competitive Advantages/Disadvantages
- Shifts in Consumer Preferences
- Attitudinal Measurement: http://www.people.fas.harvard.edu/~sstephen/papers/racialanimusandvotingsethstephensdavidowitz.pdf
Search Demand Data Examples
- Media Effect Example: godaddy
- Consumer Preferences/Advertising Propositions
Source: Google Insights for Search
Search Demand: Competitive Measurement
[Chart: search demand for macys vs. dillards]
Paid vs. Organic Search Cannibalization
Cannibalization History
- Dates back to the ongoing struggle between Finance and Marketing
  - Finance: "Why should we pay for something we already get for free?!"
  - Marketing: "We need to be where our market is, regardless of the cost!"
- Have no fear! Digital Analytics are here!
- As you may have guessed, PPC does take some clicks that otherwise would have gone to SEO, a process known as cannibalization
- However, PPC typically provides incremental traffic and conversions above and beyond the cannibalized traffic
Measurement Methods
- Econometric modeling techniques
  - Measuring the expected vs. actual traffic
  - Incorporating pulse effects into the model
  - Leveraged when no control group was implemented (Paid search entirely turned off)
  - Typical scenario occurring in the real world
- In-Market Testing
  - Randomly assign markets to treatment (PPC) and control (no PPC)
  - T-tests/ANOVA vs. ANCOVA
  - Google's study leveraging Bayesian techniques: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37161.pdf
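For the in-market testing approach, the treatment vs. control comparison can be sketched as a two-sample test. The example below is a minimal illustration, not the method used in this analysis: the per-market visit totals are made up, and the helper name welch_t is hypothetical. It computes Welch's t statistic and its approximate degrees of freedom with the standard library only; a p-value would additionally need a t distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(treatment, control):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = variance(treatment), variance(control)  # sample variances (n - 1 denominator)
    se2 = v1 / n1 + v2 / n2                          # squared standard error of the difference
    t = (mean(treatment) - mean(control)) / sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical total search visits per market (illustrative numbers only)
ppc_markets = [1050, 1120, 980, 1100, 1075]   # treatment: PPC on
no_ppc_markets = [900, 950, 870, 940, 910]    # control: PPC off

t_stat, dof = welch_t(ppc_markets, no_ppc_markets)
lift = mean(ppc_markets) - mean(no_ppc_markets)  # average incremental visits per market
```

An ANCOVA version would add a pre-period covariate (for example, each market's visits before the test) to reduce noise from market size differences.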
Important Data Elements for Analysis
- Organic ranking is a very important component in the analysis:
  - If the organic position is below the fold, the analysis will commonly show that PPC is highly incremental
  - Google's study did not segment based on organic position
- Historical organic ranking information is commonly unavailable
  - Therefore, analysis typically focuses on Brand keywords as opposed to Non-Brand, because organizations typically rank in 1st position for their brand terms
ARIMA Forecasting
ARIMA Forecasting Process
- Familiarize yourself with the Time Series
  - Plot the data!
- Identifying Stationarity
  - Augmented Dickey-Fuller Tests
  - Differencing
- Leveraging Autocorrelation and Partial Autocorrelation Correlograms to identify the appropriate process
  - Common plots to identify Autoregressive vs. Moving Average components
- Evaluating Model Performance
  - Holdout time periods
  - Common diagnostic measures (AIC, SBC, MAPE)
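The differencing and correlogram steps can be illustrated with a small sketch. It assumes a made-up weekly series with a linear trend (not the data in this deck), and the function names difference and sample_acf are hypothetical. The trending series shows a slowly decaying ACF, the usual visual hint of non-stationarity, and the pattern changes once the series is differenced.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Return sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Hypothetical weekly visits: linear trend plus a small alternating wiggle
visits = [200 + 5 * t + (3 if t % 2 else -3) for t in range(30)]

acf_level = sample_acf(visits, 5)             # slow decay: the trend dominates
acf_diff = sample_acf(difference(visits), 5)  # trend removed by first differencing
```

In practice the visual check would sit alongside a formal test such as the Augmented Dickey-Fuller test, which this sketch does not implement.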
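Common SAS Syntax
    PROC ARIMA DATA=inputDSN;
        /* Identification: differencing, cross-correlations, ADF stationarity test, ESACF */
        IDENTIFY VAR=depvar(differencing options) crosscorr=(exogenous variables) stationarity=(adf) esacf;
        /* Estimation: AR order p, MA order q, transfer-function inputs, maximum likelihood */
        ESTIMATE p= q= input=(exogenous variables) method=ml;
        /* Forecast: hold back n periods and project n periods ahead */
        FORECAST back=n lead=n id=datevar out=outputdsn;
    RUN;
    QUIT;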
Plotting the Time Series
[Chart: Total Visits by Week, 2008 to 2011 (y-axis 220,000 to 290,000), with two periods annotated "Shock/Seasonal effect"]
- Plotting the time series reveals an apparent upward trend in the data
- Additionally, there appear to be external shocks and/or seasonal trends
Plotting the Time Series
[Chart: Trend of Paid and Organic; Visits_Paid and Visits_Organic, 1/12/2008 to 6/12/2011 (y-axis 0 to 300,000)]
- On average, Paid traffic appears to contribute ~70K visits per week (~30% of the overall traffic)
- When Paid was paused, there was an immediate decline
- However, visits did increase in the 3rd/4th week
- But then declined again in the 5th/6th week
Plotting the Time Series
[Chart: Trend of Visits w/media; Visits_Actual (200,000 to 290,000) and Media (0 to 160), 1/12/2008 to 6/12/2011]
- Leverage an econometric model to forecast the expected visits while controlling for trend and media influences
Identification
- Augmented Dickey-Fuller Test suggests that no differencing is required
- ACF/PACF suggests both Autoregressive and Moving Average elements
- Incorporate transfer functions to control for the exogenous variables of media and trend
- Forecast the expected visits prior to the test
- Additionally, rebuild the model after the test with an intervention effect identifying the period in which Paid Search is paused
Model Parameters Pre-test/Forecast
- Built ARIMA(X) model pre-test to establish the expected overall traffic from both Paid and Organic search prior to the test
- All parameter estimates significant
- Trend (Num1) and Media (Num2) are both significant
Model Parameters Pre-test/Forecast
[Chart: Forecast vs. Actual, Pre-Test Model; media, Visits_Actual and Forecast for Visits_Actual, 1/1/2011 to 5/1/2011]
- Forecast accuracy is very high, as shown by the Forecast vs. Actual comparison prior to the forecast period (4/9)
- ~11,000 visits lost per week when comparing the Actual vs. Forecast during the dark period
- Or roughly 16% of the Average Paid Traffic
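The arithmetic behind the 16% figure, plus a holdout accuracy measure, can be sketched as follows. The two inputs are the rounded figures quoted in this deck (~11,000 lost visits and ~70K paid visits per week). Reading the 16% as the incremental rate, and its complement as the cannibalization rate, is an interpretation consistent with the Adj CPC formula later in the deck. The mape helper and the sample holdout numbers are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Rounded figures quoted in the slides
lost_visits_per_week = 11_000       # Forecast minus Actual during the dark period
avg_paid_visits_per_week = 70_000   # average weekly Paid traffic before the pause

# Share of paid traffic that was NOT recovered by organic when paid was paused
incremental_rate = lost_visits_per_week / avg_paid_visits_per_week  # about 0.157
cannibalization_rate = 1 - incremental_rate                         # about 0.843

# Hypothetical holdout check: actual vs. forecast visits for two weeks
holdout_mape = mape([100, 200], [110, 180])  # 10% error in each week
```

Under this reading, roughly 84% of paid clicks on these keywords would have arrived through organic listings anyway.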
Actual Results w/Intervention Effect
- Subsequent to the dark period, the model was rebuilt with an intervention effect for the period of paused paid activity
- The parameter for the intervention effect (0/1) confirms the comparison of the Forecast vs. Actual
  - Significant, with a parameter estimate of -11,363
- Note: additional models were built and evaluated with log transformations on the visits and media
  - Model results were similar (the intervention effect interpretation is slightly different)
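To illustrate what a 0/1 intervention variable measures, here is a deliberately stripped-down sketch. It assumes made-up weekly visits and a regression with only an intercept and the indicator, with no ARMA error terms, trend, or media inputs, so it is far simpler than the ARIMA(X) model described above. In that simple case the coefficient equals the difference in mean visits between dark and non-dark weeks. With a log-transformed series the coefficient b is read as an approximate percentage change, exp(b) - 1. The name ols_dummy_effect is hypothetical.

```python
from math import exp, log
from statistics import mean

# Hypothetical weekly visits (in thousands); the last 4 weeks are the dark period
visits = [250, 255, 248, 252, 251, 254, 240, 238, 242, 239]
dark = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]  # 0/1 intervention indicator


def ols_dummy_effect(y, d):
    """OLS slope b in y = a + b*d for a 0/1 regressor d."""
    d_bar, y_bar = mean(d), mean(y)
    sxy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    sxx = sum((di - d_bar) ** 2 for di in d)
    return sxy / sxx


# Level model: effect is in visits per week (negative means visits were lost)
effect_level = ols_dummy_effect(visits, dark)

# Log model: effect is a proportional change
effect_log = ols_dummy_effect([log(v) for v in visits], dark)
pct_change = exp(effect_log) - 1
```

This is why the two model forms give similar conclusions but differently scaled coefficients: one is in visits, the other in percent.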
Applying Learnings
Implementing Learnings
- Is the incremental paid traffic worth the cost?
- CPCs can be adjusted upwards by taking into consideration the cannibalization rate, for example:
  - Adj CPC = Actual CPC / Incremental Rate, or
  - Adj CPC = Actual CPC / (1 - Cannibalization Rate)
- Bottom Line: Implement learnings into your bidding algorithms!
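The Adj CPC formula can be written as a small helper. This is a minimal sketch: the function name adjusted_cpc and the $0.50 CPC are hypothetical, and the 0.84 cannibalization rate is simply the complement of the ~16% figure quoted earlier.

```python
def adjusted_cpc(actual_cpc, cannibalization_rate):
    """Effective cost per incremental click: Actual CPC / (1 - Cannibalization Rate)."""
    if not 0 <= cannibalization_rate < 1:
        raise ValueError("cannibalization_rate must be in [0, 1)")
    incremental_rate = 1 - cannibalization_rate
    return actual_cpc / incremental_rate


# Hypothetical $0.50 actual CPC with 84% of paid clicks cannibalized from organic
example = adjusted_cpc(0.50, 0.84)  # about $3.13 per truly incremental click
```

A bidding algorithm could compare this adjusted figure, rather than the raw CPC, against the value of a click when deciding whether a brand keyword is worth buying.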
Appendix
Common ACF/PACF Correlogram Plots
[Charts, lags 1 to 10, y-axis -1.0 to 1.0: ACF of AR(1) phi>0; ACF of AR(1) phi<0]
Common ACF/PACF Correlogram Plots
[Charts, lags 1 to 10, y-axis -1.0 to 1.0: ACF of MA(1) theta>0; ACF of MA(2) theta1, theta2<0; PACF of MA(1) theta>0; PACF of MA(2) theta1, theta2<0]
ACF/PACF Cheat Sheet
Process | ACF | PACF
--------|-----|-----
ARIMA(0,0,0) | no significant spikes | no significant spikes
ARIMA(0,1,0) d=1 | slow attenuation | 1 spike at order of differencing
ARIMA(1,0,0) phi>0 | exponential decay, positive spikes | 1 positive spike at lag 1
ARIMA(1,0,0) phi<0 | oscillating decay, begins with negative spike | 1 negative spike at lag 1
ARIMA(2,0,0) phi>0 | exponential decay, positive spikes | 2 positive spikes at lags 1 and 2
ARIMA(2,0,0) phi1<0 phi2>0 | oscillating exponential decay | 1 negative spike at lag 1, 1 positive spike at lag 2
ARIMA(0,0,1) theta>0 | 1 negative spike at lag 1 | exponential decay of negative spikes
ARIMA(0,0,1) theta<0 | 1 positive spike at lag 1 | oscillating decay of positive and negative spikes
ARIMA(0,0,2) theta1,theta2>0 | 2 negative spikes at lags 1 and 2 | exponential decay of negative spikes
ARIMA(0,0,2) theta1,theta2<0 | 2 positive spikes at lags 1 and 2 | oscillating decay of positive and negative spikes
ARIMA(1,0,1) phi>0 theta>0 | exponential decay of positive spikes | exponential decay of positive spikes
ARIMA(1,0,1) phi>0 theta<0 | exponential decay of positive spikes | oscillating decay of positive and negative spikes
ARIMA(1,0,1) phi<0 theta>0 | oscillating decay | exponential decay of negative spikes
ARIMA(1,0,1) phi<0 theta<0 | oscillating decay of negative and positive spikes | oscillating decay of positive and negative spikes
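The sign patterns in the cheat sheet for the simplest cases follow from the textbook autocorrelation formulas, sketched below. The function names are hypothetical. The MA(1) formula assumes the sign convention y[t] = e[t] - theta * e[t-1], which is the convention under which a positive theta yields a negative spike at lag 1, as the table states.

```python
def ar1_acf(phi, max_lag):
    """Theoretical ACF of a stationary AR(1): rho_k = phi ** k for lags 1..max_lag."""
    if abs(phi) >= 1:
        raise ValueError("AR(1) is stationary only for |phi| < 1")
    return [phi ** k for k in range(1, max_lag + 1)]


def ma1_acf(theta, max_lag):
    """Theoretical ACF of MA(1) with y[t] = e[t] - theta * e[t-1], lags 1..max_lag."""
    rho1 = -theta / (1 + theta ** 2)       # single spike at lag 1
    return [rho1] + [0.0] * (max_lag - 1)  # cuts off after lag 1


ar_pos = ar1_acf(0.5, 4)    # exponential decay, all positive
ar_neg = ar1_acf(-0.5, 4)   # oscillating decay, begins with a negative spike
ma_pos = ma1_acf(0.5, 4)    # one negative spike at lag 1, then zeros
ma_neg = ma1_acf(-0.5, 4)   # one positive spike at lag 1, then zeros
```

Sample correlograms from real data will only approximate these shapes, so the table is best used as a guide for candidate models rather than a strict rule.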
Brian Conner VP, Analysis and Decision Support 412.319.3033 bconner@impaqt.com