Strategic Online Advertising: Modeling Internet User Behavior with




Patrick Johnston, Nicholas Kristoff, Heather McGinness, Phuong Vu, Nathaniel Wong, Jason Wright with William T. Scherer and Matthew W. Burkett

Abstract

We investigated how online advertising could be made more responsive to Internet users' needs, thereby improving its efficacy. Only three to five viewers in a thousand click on a given online banner advertisement. To improve this low response rate, marketers need to reach the right audience, which will yield a higher return on advertising dollars by eliminating wasted ads and maximizing campaign effectiveness. Users will see relevant ads based on their preferences and fewer ads that do not interest them. We identified and modeled user behavior and characterized a subpopulation of users to help predict advertising response. We then developed an exploratory optimization and targeting technology for use by Advertising.com that serves users ads probabilistically on the basis of their online behaviors. After preliminary algorithm validation, the increase in early user clicks indicates the potential effectiveness of these models in improving the online advertising response rate. The resultant potential increase in actions could lead to increased revenue.

Manuscript received April 14, 2006. P. Johnston (email: pnj9t@virginia.edu), N. Kristoff (email: nmk7f@virginia.edu), H. McGuinness (email: ham5c@virginia.edu), P. Vu (email: pdv7t@virginia.edu), N. Wong (email: nkw3y@virginia.edu), and J. Wright (email: jww8d@virginia.edu) are Bachelor of Science candidates with the Systems and Information Engineering Department, University of Virginia, Charlottesville VA 22904 USA. W. Scherer is an Associate Professor with the same department (email: wts@virginia.edu). M. Burkett is a Master's candidate with the same department (email: mwb5x@virginia.edu).

I. INTRODUCTION

The Internet is quickly becoming the preferred medium for advertising. It is currently one of the advertising industry's major challenges, as well as its greatest opportunity [4]. There is great potential on the Internet for advertising as the number of users online continues to increase dramatically. With the growth of the Internet, many new forms of advertising have emerged and the industry has evolved substantially. Information and data about users are now more accessible than ever. One of the current issues with online advertising is capturing this new wealth of information and creating effective and efficient ad campaigns.

Internet advertisers must assess their online ads and develop strategies for maximizing user response. They must determine the optimal position of ads to create the value the business needs. To achieve the best value for their online advertising dollar, companies must focus on both the individual consumer [and] the goals of the business [5]. This problem is indeed challenging. The Internet's dynamic nature and large volume of data traffic add to these difficulties. Only by implementing a solution that addresses these challenges will businesses be successful in their online advertising [5]. The solution is directed advertising. This technique narrows the scope of the marketing campaign, thus saving time and money [3]. By gaining knowledge about the users, Internet advertisers can direct ads accordingly.
Companies have developed several solutions to improve the Internet user response rate, including time-of-day optimization, geo-targeting, banner placement optimization, and content optimization. Time-of-day optimization is similar to television advertising: key placement is determined by prime-time viewing hours or by when audiences are more likely to respond to a particular ad [1]. Many major companies incorporate IP-based geo-location, which uses lookup tables to locate users in specific regions [1]. Geo-targeting can localize web content and cater to specific users. Banner placement optimization determines the best location of an advertisement banner based on user response [1]. Finally, content optimization matches user keyword searches with advertisements pertinent to the user's search. Each of these tools can better match users with a specific advertisement, thus improving user responsiveness.

II. RESEARCH GOALS

The goal of this project is to boost user responsiveness to online advertising by using data in different ways to create added lift. Since user behavior is highly complex, an understanding of behavior patterns is necessary to translate the data into business intelligence. The goal is to assess meaningful relationships in the data to leverage information on the users. This will be accomplished through these objectives:

1. Analyze and mine the data to determine meaningful relationships and trends
2. Develop mathematical models to predict probable responses to targeted marketing
3. Propose a methodology that will improve advertising strategies and increase user response rates
4. Evaluate the proposed methodology to verify and validate its effectiveness
5. Formally recommend an improved methodology that generates more efficient ad campaigns and boosts online user response to advertising

III. ADVERTISING.COM

A. Technology

Advertising.com, a provider of results-based interactive marketing services to advertisers and publishers, has developed a proprietary technology, AdLearn, that creates dynamic ad campaigns directly targeting users. This new technology is based on a complex mathematical algorithm that factors in properties such as ad placement, site performance, and user behavior. The technology combines ad and site-based data with anonymous user preferences. It is a real-time optimization solution that dynamically processes advertising campaigns and automatically refines ad placements to maximize results [1].

B. Business Practices

This project involves close interaction with Advertising.com. User data will be analyzed and any valuable information will be extracted and leveraged to increase conversion rates. A conversion is defined as the act of a user interacting with an advertisement and then providing information to the advertiser. This could take numerous forms, including a purchase of or registration for any service or good.
Advertising.com currently operates the industry's largest advertising network and purchases ad space inventory from web, search, and email publishers. It places a variety of ad campaigns on its large network of websites and traces user activity patterns. Unique, anonymous data are collected about the times ads are shown to a user and the subsequent actions performed by the same user.

The project will require various systems analysis tasks including problem formulation, mathematical modeling, and data analysis. Meaningful relationships in the data are assessed to leverage information on the users. When added to the existing technologies, this information can be harnessed to better predict advertising response. This could potentially aid Advertising.com's proprietary algorithms and could generate more conversions. Increased conversion rates ultimately translate into more revenue and profit for Advertising.com and its clients.

IV. ALTERNATIVES

A. Random Advertising

This is a low-tech method of advertising online. Random advertisements are displayed to users as they visit websites in the network. This technique is not efficient and results in wasted impressions. There is usually a lower conversion percentage and companies ultimately lose money. More efficient and effective alternatives are available.

B. Current AdLearn Algorithm

The current algorithm being used by Advertising.com is highly effective. By using a technology that combines ad and site-based data and individual user preferences, campaigns are optimized to provide the best results. Users are targeted and wasted clicks are reduced. The overall algorithm is highly successful, but there is always room for improvement.

V. SOLUTION AND METHODOLOGY

The primary objective of the project is to use behavioral data about online consumers to market ads more efficiently and boost user responsiveness to advertising.
Through better placement of advertisements to targeted audiences, Advertising.com could reduce the total number of ads served for the same or a better result, thus reducing costs. In addition, it could increase the number and efficiency of conversions for its clients, allowing it to remain a leader in the industry. This is accomplished through in-depth analysis and modeling of online user behavior based on website visitation. Once a user is characterized by a user and website grouping, an appropriate ad campaign will be served based on the greatest potential lift (Fig. 1). The results of the analysis will then be translated into actionable business intelligence and finally into a tested methodology that can be implemented by Advertising.com.
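One way to picture the cube model in Fig. 1 is as a table keyed by user cluster, website cluster, and campaign. The sketch below is illustrative only: the log records, cluster labels, and campaign names are made up, and lift is assumed here to mean a cell's click rate divided by the campaign's overall click rate, a definition the paper does not spell out.

```python
from collections import defaultdict

# Hypothetical impression log: (user_cluster, website_cluster, campaign, clicked).
LOG = [
    (1, 1, "A", 1), (1, 1, "A", 0), (1, 1, "B", 0), (1, 1, "B", 0),
    (2, 1, "A", 0), (2, 1, "A", 0), (2, 1, "B", 1), (2, 1, "B", 0),
]

def build_cube(log):
    """Return {(uc, wc): {campaign: lift}} under the assumed lift definition."""
    cell = defaultdict(lambda: [0, 0])      # (uc, wc, campaign) -> [clicks, impressions]
    overall = defaultdict(lambda: [0, 0])   # campaign -> [clicks, impressions]
    for uc, wc, campaign, clicked in log:
        cell[(uc, wc, campaign)][0] += clicked
        cell[(uc, wc, campaign)][1] += 1
        overall[campaign][0] += clicked
        overall[campaign][1] += 1
    cube = defaultdict(dict)
    for (uc, wc, campaign), (clicks, imps) in cell.items():
        base_clicks, base_imps = overall[campaign]
        base_rate = base_clicks / base_imps
        # Lift is undefined when a campaign is never clicked; report 0.0 in that case.
        cube[(uc, wc)][campaign] = (clicks / imps) / base_rate if base_rate else 0.0
    return dict(cube)

def best_campaign(cube, uc, wc):
    """Campaign with the greatest lift for a (user cluster, website cluster) pair."""
    lifts = cube[(uc, wc)]
    return max(lifts, key=lifts.get)

cube = build_cube(LOG)
print(cube[(1, 1)], best_campaign(cube, 1, 1))
```

Keeping the cube as a plain mapping makes the lookup for a given user and website grouping a single dictionary access followed by a maximum over campaigns.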

[Fig. 1. 3-Dimensional Cube Model (axes: User Cluster, Website Cluster, Ad Campaigns)]

[Fig. 3. Clustering Input Matrix (rows: users that visit one of the top 500 websites; columns: top 500 websites by highest frequency of visits; matrix values based on binary, frequency, or relative frequency (percentage))]

The methodology is as follows:

1. Classification of users

To identify subgroups of users, we employed clustering analysis tools within SAS, a statistical analysis software package. Clustering is a method of organizing data into groups called clusters. Clusters are collections of data objects that are similar or close to each other and dissimilar or far from objects in other clusters. We used clustering to categorize users on the basis of their behaviors so that appropriate advertisements could be served to them. We then extracted patterns in user behavior from each cluster. There are many types of clustering, including hierarchical and centroid clustering. Each clustering method uses different criteria to group data objects. We used centroid clustering because it can manage large data sets.

The inputs to clustering are user impression data. A small sample is shown in Fig. 2. In binary inputs, a value of one corresponds to a particular user encountering an advertisement on a given website. A zero value corresponds to a nonevent. Similarly, frequency inputs record the number of impressions a given user encounters on a particular website. Relative frequency is an extension of the frequency input that takes into account the total number of impressions an individual user encounters. This input is calculated as the number of impressions a given user encounters on an individual website, expressed as a percentage of the total number of impressions the user encounters across all 500 websites. Analysis of the user cluster distribution revealed that binary inputs differentiated clusters best. Next, we picked the number of user clusters, k1, and formed them using centroid clustering with k1 means.
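The three input encodings and the centroid clustering step described above can be sketched as follows. This is a minimal pure-Python stand-in, not the SAS procedure the authors used: the impression counts are invented, six websites stand in for the 500, and `kmeans` is a plain Lloyd's iteration written for illustration.

```python
import random

# Hypothetical impression counts: rows are users, columns are websites.
COUNTS = [
    [5, 2, 1, 0, 0, 0],
    [3, 1, 0, 0, 0, 0],
    [0, 0, 0, 4, 1, 2],
    [0, 0, 0, 0, 2, 6],
]

def binary(counts):
    """1 if the user saw at least one impression on the site, else 0."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]

def relative_frequency(counts):
    """Each count as a percentage of the user's total impressions."""
    out = []
    for row in counts:
        total = sum(row)
        out.append([100.0 * c / total if total else 0.0 for c in row])
    return out

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

labels, centroids = kmeans(binary(COUNTS), k=2)
print(binary(COUNTS)[0], relative_frequency(COUNTS)[1], labels)
```

On this toy matrix the first two users and the last two users end up in separate clusters; which cluster is numbered 0 depends on the random initialization.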
User clustering groups Internet customers on the basis of similar website visitation behavior (Fig. 4). This clustering approach minimizes the sum of squared errors between each individual observation and its cluster centroid, or multidimensional mean. The centroid can be thought of as the center of a sphere with radius r, where r is the minimum distance that encompasses all the observations in the cluster.

[Fig. 2. Sample Impression Data]

The top 500 websites were extracted on the basis of popularity, indicated by the highest frequency of user visits. These websites represented the sites with the vast majority of the user activity. Next, we extracted the users who had visited the top 500 websites. The intersection of these users and the websites formed the basis of the matrix (Fig. 3). Each cell represents a pairing of an individual online user and a single website. The matrix values can be based on three criteria: (1) binary, (2) frequency, (3) relative frequency.

[Fig. 4. User Clustering (rows: user clusters 1 to k1; columns: websites 1 to 500; matrix values based on binary)]

2. Classification of websites

In website classification, a similar method of clustering is used. First, we picked the number of website clusters, k2, and formed them using centroid clustering with k2 means. Clustering on websites will subgroup websites on the basis of similar user clusters. This will assign a particular subgroup of users to a certain subgroup of website clusters (Fig. 5). Thus, once a user is assigned to a user cluster, the user can be directed to one of the websites in the corresponding website cluster. This will better categorize users and websites, reducing wasted advertisements.

[Fig. 5. Website Clustering (rows: user clusters 1 to k1; columns: website clusters 1 to k2; matrix values based on binary)]

3. Integration of advertising campaigns and associated actions

Once user and website clusters were formed, we added a third dimension: campaigns. By integrating these fields, we determined what campaigns to show Internet users on the basis of the users' corresponding website and user clusters. The campaign dimension adds the notion of user response on the basis of the action or click data. Thus, a user can be assigned a particular ad on a given website that generates a higher user response. By targeting particular campaigns to certain users on a given website, more advertising dollars may be generated. Integrating these three dimensions may improve ad placement and user response, as shown earlier in the cube model (Fig. 1).

4. Implementation of the cube model

We developed a repeatable process for turning user subgroup response data into actionable business intelligence. The predictive models developed previously are the foundation for an optimization algorithm. The algorithm works dynamically using input details about an advertisement and past website visitation behavior. These inputs determine advertisement placement. When clustering, we considered how to account for changes in user behavior and how to encompass this within cluster assignment. The process works as described below (Fig. 6).

Advertising.com encounters a new user when the user is served an impression. The user is assigned to a website cluster corresponding to the visited website. The user is then assigned to a user cluster on the basis of prior probabilities and the user's history of website cluster visits. In this assignment, either a maximum probability or a randomized strategy can be used. The user and web cluster combination determines which campaign is served. Likewise, this assignment can use one of four choices: (1) the maximum ratio of perceived lift, (2) maximum expected revenue, (3) maximum expected click rate, or (4) a randomized strategy.

[Fig. 6. Progression of New Users. Flowchart boxes: New User; Based on Website Visited, Assign Web Cluster; Update/determine P[UserCluster=x] (based on prior probabilities and previous WC visits); Assign User Cluster based on criterion (A. Maximum probability, B. Randomized Strategy (with constraints)); Based on WC and UC, Assign Campaign (A. Maximum Ratio, B. Randomized Strategy (with constraints), C. Expected Click Rate, D. Expected Revenue Rate); User Moves]

5. Time of Day Analysis of Session Bins

The second general category of behavioral information, known as session behavior, covers the time a user spends on the Internet, how quickly users move around, and how often users browse. A session is the period of time during which a user is sitting in front of their computer and performing some activity on the Internet. The assumption behind session analysis is that people with different session behaviors are likely to respond differently to specific marketing campaigns.

There were several steps involved in performing an analysis of session behavior. The first was to define the session cutoff time. The cutoff time is the period of time required between impressions to assume that the user stopped browsing the Internet and then later returned to browsing, thus starting a new session. The session cutoff time for this analysis was thirty-three minutes. Given the cutoff time, the sessions and various attributes for each user were calculated from the impression data.
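The session calculation described above can be sketched as follows. The timestamps are hypothetical, and treating a gap of exactly thirty-three minutes as the start of a new session is an assumption, since the paper does not say how the boundary case is handled.

```python
from datetime import datetime, timedelta

CUTOFF = timedelta(minutes=33)  # session cutoff time reported in the paper

def split_sessions(timestamps, cutoff=CUTOFF):
    """Group impression times; a gap of at least `cutoff` starts a new session."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] < cutoff:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions

def session_attributes(sessions):
    """A few per-user attributes of the kind computed in the analysis."""
    durations = [(s[-1] - s[0]).total_seconds() for s in sessions]
    gaps = [(b[0] - a[-1]).total_seconds() / 60 for a, b in zip(sessions, sessions[1:])]
    return {
        "sessions": len(sessions),
        "avg_duration_s": sum(durations) / len(sessions),
        "avg_impressions": sum(len(s) for s in sessions) / len(sessions),
        "avg_minutes_between": sum(gaps) / len(gaps) if gaps else None,
    }

# Hypothetical impression times for one user on one day.
day = datetime(2006, 1, 10)
times = [day + timedelta(hours=9, minutes=m) for m in (0, 5, 20)]
times += [day + timedelta(hours=13, minutes=m) for m in (0, 1)]
attrs = session_attributes(split_sessions(times))
print(attrs)
```

Here the morning impressions form one session and the afternoon impressions another, because the gap between them far exceeds the cutoff.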
It was unclear which session behaviors would be most useful, so as many attributes as possible were considered and then only the most important attributes were used in categorizing the users. Some of the session attributes calculated from the data included the following: the number of sessions per day, the start time of the first session, the end time of the last session, the average duration of each session, the average number of impressions, the average number of unique creatives viewed, the average number of unique sites visited, the average time between sessions, and the average impressions per minute during the session. Combining some or all of these gives a clearer picture of a user's Internet browsing behavior, completely independent of what specific sites he or she visits.

Given the session attributes for each user, the next step was to use those data to again create subgroups of users based on similar behavior. The dataset was divided into groups through a series of two-way splits based on the value of an attribute. For example, the first split was between users that had one session per day and those that had more than one session per day. Each of those groups would then be split again based on other criteria. Eventually each user was assigned to a bin based on the ranges their attribute values fall into. This was done manually by analyzing the data graphically and seeing where the natural breaks in behavior were. The bins created are shown in Table I.

TABLE I
SUMMARY OF SESSION BINS
Average number of sessions per day: 1; > 1
Average duration: < 10 seconds; 10 seconds to 33 minutes; 33 minutes to 92 minutes; > 92 minutes
Average impressions per session: < 64; ≥ 64
Average time between sessions: < 133 minutes; ≥ 133 minutes

Once the session bins had been identified, it was necessary to check the validity of the groupings. This project assumes that the clusters will remain stationary from day to day. To check the validity of the bins, each individual day of the week was binned and analyzed. We analyzed the click behavior of the clusters. The same clusters provided similar response rates throughout the week. Therefore, the session bins are stationary. Two conclusions were made: the bin attributes will not change from day to day, and serving each session bin the campaigns that have the highest click-through rates will increase the total number of clicks. The method of analyzing session behavior is likely to help boost response rates and increase the efficiency of Internet advertising if implemented correctly.

6. Performance evaluation and validation

Several validation techniques were used to ensure data consistency. By comparing the user and website clusters with the primary categorizations of each website, we determined that the clusters approximated the given categories. We also used another tool to evaluate the centroids of the clusters. The centroids show the multidimensional means for each user and website cluster. The performance metrics are based on the percentage of websites that falls into a certain confidence interval of values. These values indicate the significance of an event or non-event. The overall percentage of websites that falls within this range will yield an accurate picture of the similarity of the users within the cluster.

7. Simulation testing methodology

We tested the algorithm first with simulation techniques. We ran the algorithm on past data and then analyzed the results of the recommended versus actual ad campaign assignments. A comparison of these assignments revealed the potential lift of the algorithm. We also tested the algorithm on other sets of data to validate the algorithm's stability. Using simulation, we changed inputs to approximate lower and upper bounds and a confidence interval on the lift the algorithm generated. After initial testing, we developed an experimental design for field testing for possible integration within Advertising.com's systems.

VI. RESULTS

The project explores improving overall ad campaign effectiveness and user response to online advertising through data analysis and modeling of online user behavior. The methodology was proposed to Advertising.com to improve its current ad campaign strategies. The strategies outlined in the methodology refined the placements of its current ads. In order to measure and quantify the impact of this methodology, a lift analysis was used to observe its overall performance.
The metric in this project is the lift that can be attained while utilizing the proposed methodology over current ad campaign strategies (Table II). In this lift, click-through and conversion rates for ad campaigns have been assessed. This determined how effective the methodology is in boosting user response rates.

TABLE II
LIFT FOR CAMPAIGNS BASED ON UC, WC
Given: User Cluster: 10, Web Cluster: 13
Campaign    Lift
49464       2.61 (max)
32105       0.00
84970       1.45
87033       0.00
76017       2.34
87430       0.00
85145       0.64
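The new-user progression from the methodology (Fig. 6), combined with a lift table such as Table II, can be sketched as follows. The lift values are taken from Table II; everything else is an assumption made for illustration: the second user cluster (11), the second web cluster (14), the prior, the conditional probabilities, and the use of Bayes' rule as one reading of "based on prior probabilities and previous WC visits".

```python
# Lift values from Table II for user cluster 10 and web cluster 13.
LIFT = {(10, 13): {49464: 2.61, 32105: 0.00, 84970: 1.45, 87033: 0.00,
                   76017: 2.34, 87430: 0.00, 85145: 0.64}}

# Hypothetical prior over two user clusters, and hypothetical
# P(web cluster | user cluster) values; none of these come from the paper.
PRIOR = {10: 0.5, 11: 0.5}
P_WC_GIVEN_UC = {10: {13: 0.6, 14: 0.4}, 11: {13: 0.2, 14: 0.8}}

def update(belief, web_cluster):
    """One Bayes-rule step: posterior is proportional to prior * P(web cluster | user cluster)."""
    unnorm = {uc: p * P_WC_GIVEN_UC[uc][web_cluster] for uc, p in belief.items()}
    total = sum(unnorm.values())
    return {uc: v / total for uc, v in unnorm.items()}

def assign_user_cluster(belief):
    """Maximum-probability criterion (a randomized strategy is the other option named)."""
    return max(belief, key=belief.get)

def pick_campaign(user_cluster, web_cluster):
    """Maximum-lift criterion; returns None when no lift data exist for the pair."""
    lifts = LIFT.get((user_cluster, web_cluster))
    return max(lifts, key=lifts.get) if lifts else None

belief = dict(PRIOR)
for wc in (13, 13):  # the user is seen twice on sites in web cluster 13
    belief = update(belief, wc)
uc = assign_user_cluster(belief)
campaign = pick_campaign(uc, 13)
print(belief, uc, campaign)
```

With these made-up probabilities, two visits to web cluster 13 push the belief toward user cluster 10, and the maximum-lift criterion then selects campaign 49464, the entry marked as the maximum in Table II.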

User behavior was modeled to characterize a subpopulation of users to predict advertising response. Through the iterative approach specified above, we produced an algorithm to predict the most responsive advertising campaigns to show a particular user on the basis of their past behaviors. We ran the algorithm on past data and then analyzed the results of the recommended versus the actual ad campaign assignments. While there was substantial lift, the algorithm rested upon the assumption that user website visitation behavior and user response remain stable over a finite period of time.

Assumptions were refined throughout the project. We believed that no assumptions could be made after a user makes their first click. It is unclear how a user's behavior will change after he or she clicks on an advertisement. While we hoped that the user would continue to click on the advertisements served, there were no past data that confirmed this assumption. Clicks can also be accidental, if users do not intend to click on a particular ad. Since no assumptions could be made after the first click, we sought to serve users pertinent ads as quickly as possible. If users click earlier on advertisements, the algorithm would eliminate wasted advertising.

Using the methodology described above, we investigated first-click analysis. The number of impressions it took before a given user clicked on an advertisement served was used as a method for comparing the algorithm with past data. Results show that there is potential lift. With the proposed algorithm, users are shown ads leading to clicks and action faster than with Advertising.com's current AdLearn method.

VII. CONCLUSION

Advertising.com and its clients may ultimately benefit from the outcome of this research effort. Advertising.com will be able to provide its clients with more effective ad campaigns while generating more revenue for both parties. It will also be able to improve the AdLearn algorithm and other related technologies.
This analysis can also be used to further improve other aspects of Advertising.com's business and may allow it to gain a competitive advantage in the growing market.

Future researchers should continue analysis of the clustering methodology. Since clustering is a means of organizing data, there are no correct assignments. This makes it hard to validate. Future researchers could determine reasonable metrics for assessing and validating clusters. Researchers should investigate the type of clustering method that best partitions the data. This may identify patterns and distributions within website access behaviors that could characterize users. The stability of the user response rate and of the user and website groupings needs further validation. This will determine how often the client will need to re-cluster the user and website groups.

Preliminary testing in a live environment will determine the true effectiveness of the algorithm. Significant IT efforts are required to achieve all the desired functionality, such as integration of business model constraints. We recommend that Advertising.com test the algorithm for one day, on limited customers and limited campaigns, and compare user response rates between the test day and the days before and after it. If the results show significant lift, then the algorithm has potential for integration into Advertising.com's business model. If it does not show significant lift, then with refinements the methodology may still improve user responsiveness to online advertising.

VIII. ACKNOWLEDGMENTS

We would like to acknowledge the following people for their contributions and support for this Capstone project:

Advertising.com
  o Scott Ferber, Chief Executive Officer
  o Mark Hrycay, Business Analysis
  o Tom Pomroy, Business Analysis
The Systems and Information Engineering Department at the University of Virginia

IX. REFERENCES

[1] Advertising.com. General Website. Retrieved September 3, 2005 from <www.advertising.com>.
[2] Arens, William. "Contemporary Advertising", 7th Edition; New York: Irwin McGraw Hill, 515, 1999.
[3] Copp, V. Reinventing Direct Marketing. Journal of Direct Marketing, vol. 11, no. 4, Fall 1997.
[4] Gretzel U., Yuan Y., and Fesenmaier D. (2000) Preparing for the New Economy: Advertising Strategies and Change in Destination Marketing Organizations
[5] "Growing Concern about Value of Online Advertising Underscores Need for Optimization," Business Wire, pp. 2721, July 12, 2000.