Improving the Customer Experience in Big Box Retail Stores




Tyler Bruns
Northwestern University
TylerBruns2014@u.northwestern.edu

Abstract-- With the growth and proliferation of big box retail stores, critics have voiced strong concerns about their business practices. Mom-and-pop stores, which provide a more personalized customer experience and were once more solidly rooted in the community, are displaced by the big box retailers. A big box retailer offers a homogenized customer experience that is similar in whichever store of the chain you visit. Furthermore, the big box store environment is that of a large warehouse. In a recent white paper, What's driving tomorrow's retail business, Motorola Solutions discusses the shift in attitude whereby improving customer service is now the primary driver for investment. We agree, and this paper begins to address common issues faced by customers in a large, warehouse-like big box retailer. Have you spent much more time traversing a big box retail store than you intended? Have you exited a big box retail store without retrieving the items that you intended to get when you entered? Have you forgotten to add a desired product to your shopping list? The objective of this project is to evaluate data mining techniques and algorithms that begin to address these questions from the customer's point of view.

I. Introduction

Competition is fierce between retailers to stock their shelves with products that their customers want. A retailer can gain a competitive advantage by understanding what its customers want. Strategic product placement and targeted marketing can increase cross-selling and up-selling of additional products. Big box retailers such as Walmart, Target, and Meijer offer lower prices by capitalizing on high-volume economies of scale. They mine data on many aspects of their business to eliminate inefficient practices and uncover profitable opportunities.

We note that these efforts are made primarily from the retailer's point of view. But what about improving the customer experience?

(1) Have you spent much more time traversing a big box retail store than you intended? See figure 1.
(2) Have you exited a big box retail store without retrieving the items that you intended to get when you entered?
(3) Have you forgotten to add a desired product to your shopping list?

The objective of this project is to evaluate data mining techniques and algorithms that begin to address these questions from the customer's point of view. These three questions define our business problem from the customer's perspective.

(1) Our initial plan to address the first question is to study shortest path problems, e.g. Dijkstra's algorithm [1]. With the rapid growth and ubiquitous use of smart phone technology, customers can communicate their shopping list to a central server, which can suggest the shortest pick path the customer might take to traverse the store and retrieve the desired products, in a manner similar to how Google Maps [2] routes car traffic, as depicted in figure 2.
(2) Our initial plan to address the second question is illustrated in figure 1. With knowledge of the shopping list, the picked products are scanned at checkout, and the shopping list can then be compared to the receipt to make sure that the products the customer intended to purchase were in fact purchased.
(3) Our initial plan to address the third question is the main topic of this paper. Figure 3 illustrates the idea that association rules generated by mining other customers' transactions can suggest products based on the products on the shopping list and/or products whose barcodes are scanned while in the store.

Our objective is to evaluate whether large datasets representative of big box stores can be effectively mined for these association rules.

II. Prior Research Work

In this section, we outline data mining applications and research efforts that are related to this paper. Shekar et al [3] discuss a smart phone application to assist grocery shoppers, and Forsblom et al [4] discuss a similar solution that includes the use of Dijkstra's algorithm to compute minimum paths to

Figure 1. (a) Have you entered a big box retail store with a list of products you intended to purchase but exited without retrieving some of the products that you intended to get when you entered? (b) Does your receipt (right) reflect your shopping list (left)?

Figure 2. (a) As customers enter the store, they can upload, e.g. text, their shopping list to a server. (b) Based on the shopping list, a traveling salesman-type algorithm can compute a shortest path through the big box retail store. Then, the path is suggested to the customer as illustrated in (a). At the end of the trip and once the customer has purchased their products, the transaction is added to the (end of the) transaction database (which can be mined again for new association rules once a certain number of new transactions have been collected).

Figure 3. Based on the products on the customer's shopping list and/or products identified by using their smart phone, the product info is uploaded, e.g. texted, to the central server (which contains the master product list of all n products that the retailer carries as well as the historical transaction table for all N transactions), which can return association rules that recommend products based on other customer transactions.

More recently, in What's driving tomorrow's retail business, Motorola Solutions [5] discusses the shift in attitude, espoused here, whereby improving customer service is now the primary driver for investment. Gertin [6] confirms in his thesis that "[m]any retail environments try to maximize the amount of time customers spend in their store," and then studies traveling salesman problems to locate products so as to maximize the time that the customer stays in the store. This approach is beneficial from the retailer's point of view, but it does not serve the customer well.
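The shortest pick path idea from the Introduction and Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the store grid, the entrance, and the product locations are hypothetical, walking distances come from Dijkstra's algorithm [1] on that grid, and the visiting order uses a simple nearest-neighbor heuristic rather than an exact traveling salesman solution or the method of any of the cited works.

```python
import heapq

# Hypothetical store layout: '#' is shelving, '.' is walkable floor.
STORE = [
    ".....",
    ".#.#.",
    ".#.#.",
    ".....",
]


def dijkstra(grid, start):
    """Return the shortest walking distance from start to every reachable cell."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry, a shorter route was already found
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                nd = d + 1  # every step between adjacent cells costs one unit
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist


def pick_path(grid, entrance, stops):
    """Order the stops greedily by nearest walking distance.

    This is a heuristic, not an optimal tour; stops are assumed reachable.
    """
    order, total, here = [], 0, entrance
    remaining = set(stops)
    while remaining:
        dist = dijkstra(grid, here)
        nxt = min(remaining, key=lambda cell: (dist[cell], cell))
        total += dist[nxt]
        order.append(nxt)
        remaining.remove(nxt)
        here = nxt
    return order, total


# Hypothetical product locations taken from a customer's shopping list.
order, total = pick_path(STORE, (0, 0), [(3, 4), (0, 2), (2, 2)])
print(order, total)
```

A nearest-neighbor ordering is cheap enough to compute per customer, but it can be noticeably longer than the optimal tour on larger lists, so it should be read as a baseline only.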
We advocate that the retailer will garner more benefit by serving the customer, which will create repeat, loyal customers. Hui et al [7] use the traveling salesman problem to define a baseline shortest route through a grocery store based on the products purchased. Of relevance to this paper, they note that "trips with high order deviation [from the TSP-path] tend to be longer trips with a greater number of product categories purchased and in-store time." Ballester et al [8] study how to lay out a retail facility based on shortest travel paths while evaluating the traffic density and travel distances. They compare and contrast pick path strategies in warehouses vs. shopper paths in retail stores. Jung and Kwon [9] use k-means clustering to study grocery customer behavior. Larson et al [10] use a k-medoid clustering algorithm to identify clusters of shoppers defined by their prototypical walking path through a store. While components of the project proposed here are discussed in the aforementioned studies, we have not yet found a reference that proposes a complete solution for questions 1 through 3.

III. Data Mining by Association Rules

We are interested in providing product recommendations based on the products that our customer has on their shopping list or those that they scan during their visit. Two data mining algorithms are suggested by this need.

(1) If product transactional data is provided, data mining by association deals with discovering rules in the transactional data such that some product is associated with another product. We note that each transaction is identified by a transaction ID (TID).
(2) If product preference data is provided, data mining by collaborative filtering deals with providing product recommendations based on the preferences collected from customers with similar tastes among many customers. Over the course of time, our retailer may be able to collect product preference data. We can explicitly collect product ratings from the customer, or we may, if allowed, implicitly infer ratings from recurring purchases of the same product by the same customer. We note that each customer may be identified by a customer ID (CID).

We recommend that a combination of both algorithms be implemented. However, at this time we have neither product preference data nor transaction data in which product purchases can be tied to a unique customer through their CID, so this aspect of our solution is left for future work. We do have transactional data, so we study whether association rules can be used to effectively suggest additional products to our customer from their shopping list based on the transactional history of many other customers.

We elaborate on association (or market basket) analysis. Data mining by association deals with discovering rules in transactional data such that some item/event/attribute is associated with some other item/event/attribute. For example, provided a dataset of N transactions with values for n attributes, i.e. $A_1, \ldots, A_n$, we can uncover patterns amongst the attributes through unsupervised learning, e.g. such that when we encounter attribute $A_i$ in a transaction we will tend to see attribute $A_j$ too. This can more formally be written as an association rule, e.g. $\{A_i\} \rightarrow \{A_j\}$. If a customer purchases a handheld electronic device, they most probably will need batteries, so as an example, $\{\text{electronic device}\} \rightarrow \{\text{batteries}\}$. So, the goal of data mining by association is to uncover attributes that appear to be correlated through their co-occurrence in transactions. It is important to recognize that association algorithms find correlations between the attributes, but correlation does not imply causation. These algorithms can be computationally expensive, and the associations that are uncovered may be spurious, so it is important that we, as data scientists, scrutinize the uncovered rules for usefulness and be mindful of the choice of transactional storage method and algorithm.

The goal of association rule discovery is to find all the rules in the transactional dataset whose support s and confidence c meet prescribed minimum threshold values, i.e. $s \geq s_{\min}$ and $c \geq c_{\min}$. Unfortunately, a brute-force approach to finding all of these rules is computationally intractable for all but small datasets, so we utilize two techniques to reduce the computational burden, i.e. the Apriori and FP-Growth algorithms.

IV. Data Understanding

The data required for association analysis is an $N \times n$ transaction list for N transactions (each associated with a unique TID) amongst the n products in the retailer's product list. For this study, we want to determine whether association rules can be mined from data that is representative of the product mix and the larger number of transactions of a big box retailer.
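As a rough illustration of the N x n transaction list and of the support and confidence thresholds described in Section III, the following Python sketch mines a handful of toy transactions. It is not the Weka workflow used in this study: the transactions, the function names, and the thresholds are all hypothetical, the frequent-itemset search is a plain level-wise (Apriori-style) pass, and only rules with a single product on the right-hand side are generated.

```python
from itertools import combinations

# Toy transactions, one per TID; hypothetical data, not retailer A or B.
TRANSACTIONS = [
    {"whole milk", "butter", "eggs"},
    {"whole milk", "butter"},
    {"whole milk", "eggs", "sugar"},
    {"whole milk", "butter", "eggs", "sugar"},
    {"bread"},
]


def to_matrix(transactions):
    """Build the N x n binary transaction list (rows: TIDs, columns: products)."""
    products = sorted(set().union(*transactions))
    return products, [[int(p in t) for p in products] for t in transactions]


def support(itemset, transactions):
    """Fraction of the N transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Apriori-style level-wise search; returns {itemset: support}."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    level = [s for s in level if support(s, transactions) >= min_support]
    found = {}
    while level:
        found.update((s, support(s, transactions)) for s in level)
        size = len(level[0]) + 1
        # Join frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune candidates with an infrequent k-subset, then count support.
        level = [
            c for c in candidates
            if all(frozenset(sub) in found for sub in combinations(c, size - 1))
            and support(c, transactions) >= min_support
        ]
    return found


def mine_rules(found, min_confidence):
    """Return (lhs, rhs, support, confidence) rules with one product on the RHS."""
    rules = []
    for itemset, s in found.items():
        for rhs in itemset:
            lhs = itemset - {rhs}
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            if lhs and s / found[lhs] >= min_confidence:
                rules.append((lhs, rhs, s, s / found[lhs]))
    return rules


def recommend(shopping_list, rules):
    """Suggest RHS products whose LHS is on the list and which are not yet on it."""
    return sorted({rhs for lhs, rhs, _, _ in rules
                   if lhs <= shopping_list and rhs not in shopping_list})


found = frequent_itemsets(TRANSACTIONS, min_support=0.4)
rules = mine_rules(found, min_confidence=0.9)
print(recommend({"butter", "eggs"}, rules))
```

The pruning step is what separates this from brute force: a candidate is counted against the transactions only if every one of its smaller subsets was already found to be frequent.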
We have collected two datasets [11, 12] that, although not representative of the full product list of big box retailers, capture a larger mix of transactions and products than our previous study and allow us to begin our investigation. We restrict our discussion here to the second dataset, for retailer B, which amounts to a 9835 x 169 ($N \times n$) transaction list. For retailer B, the minimum and maximum support counts are 1 and 2513, respectively, and the minimum and maximum supports are 0.0001 and 0.26, respectively, which results in a skewed support distribution (with associated issues discussed starting on page 386 of [13]).
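The reported supports follow from the support counts if we assume the usual definition, support = support count / N. The symbol $\sigma$ for the support count is our notation for this worked check and does not appear in the original text.

```latex
s_{\min} = \frac{\sigma_{\min}}{N} = \frac{1}{9835} \approx 0.0001,
\qquad
s_{\max} = \frac{\sigma_{\max}}{N} = \frac{2513}{9835} \approx 0.26
```

The two values differ by more than three orders of magnitude, which is the skew referred to above.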

Figure 4. (a) Partial snapshot of heat map illustrating density of product purchases from retailer B across 169 products (x-axis) for 9835 transactions (y-axis). (b) Associated support count distribution of products purchased. (Note: partial list of products shown.)

Figure 5. Top 10 association rules constructed for retailer B using (a) Apriori and (b) FP-Growth algorithms with s = 0.001 and the confidence metric.

V. Experimental Results and Analysis

We select Weka [14] as our data mining tool for the association rule analysis. Unfortunately, the raw data provided to us for retailers A and B do not conform to the tool's preferred format, i.e. the .arff format, and in addition, each raw dataset is in a different format. Although any scripting language could be used, we wrote a Fortran 90 code to translate these datasets to the .arff format suitable for reading directly into Weka.

We return to the goal of this report, i.e. to provide and evaluate a data-driven mechanism whereby product recommendations can be made to a customer, based on their shopping list or scanned items, from association rules mined from other customers' transactions. We mine retailer B's dataset for insight. As illustrated in figure 5(a) and (b), rules like $\{\text{cream cheese, sugar, domestic eggs}\} \rightarrow \{\text{whole milk}\}$ and $\{\text{domestic eggs, butter, soft cheese}\} \rightarrow \{\text{whole milk}\}$ are suggestive of common baking ingredients. We note that whole milk has the highest support count in figure 6, so we may want to recommend whole milk if it does not already appear on our customer's shopping list.

Figure 6. Products with highest support count for retailer B.

VI. Conclusion

We recall that the objective of this project is to evaluate data mining techniques and algorithms that begin to address three questions with an emphasis on the customer's point of view. We outlined initial plans and devoted this report to addressing the third question.
We studied association rules generated from relatively large datasets of various forms, i.e. size and skewness, to assess our ability to recommend products to our customers based on their shopping list and/or product scans. In our opinion, the results are mixed. While some interesting associations were generated, a large number of rules do not appear interesting. When the combination on the LHS of a rule appears in our customer's shopping list and the items on the RHS are absent, the rule suggests possible recommendations based on previous customer purchases. A deeper study of the rules generated, together with feedback from customers on receiving such recommendations, is necessary to complete the study, but the approach appears promising.

The approach studied here enables a more personalized customer experience. While our recommendations here have been generic, e.g. whole milk, eggs, etc., we envision that smart phone technology will allow the retailer to provide timely brand product information and coupons to the customer based on the recommendations and the customer's location in the store. Although our emphasis here has been on improving the customer's experience, we should mention that the techniques developed here are also useful to the retailer. That is, the same association rules can be viewed from the retailer's point of view to improve product placement, cross-selling, etc.

References

1. Dijkstra, E. W., A note on two problems in connexion with graphs, Numerische Mathematik, 1:269-271, 1959.
2. maps.google.com
3. Shekar, S., Nair, P., and Helal, A. (S.), iGrocer - a ubiquitous and pervasive smart grocery shopping system, Proceedings of the 2003 ACM symposium on Applied computing, 645-652, 2003.
4. Forsblom, A., Nurmi, P., Floréen, P., Peltonen, P., and Saarikko, P., Massive - an intelligent shopping assistant, Proceedings of the Workshop on Personalization in Mobile and Pervasive Computing, Trento, Italy, 2009.
5. White Paper: What's driving tomorrow's retail experience, Motorola Solutions, 2012. Available at www.zebra.com/content/dam/zebra/white-papers/enus/motorola-whitepaper-zc-en-us.pdf
6. Gertin, T., Maximizing the cost of shortest paths between facilities through optimal product category locations, M.S. thesis, George Mason University, 2012.
7. Hui, S. K., Fader, P. S., and Bradlow, E. T., The traveling salesman goes shopping: the systematic deviations of grocery paths from TSP optimality, Marketing Science, 28(3):566-572, 2009.
8. Ballester, N., Guthrie, B., Martens, S., Mowrey, C., Parikh, P. J., and Zhang, X., Effect of retail layout on traffic density and travel distance, Proceedings of the 2014 Industrial and Systems Engineering Research Conference, (Guan, Y. and Liao, H., eds), 2014.
9. Jung, I. C. and Kwon, Y. S., Grocery customer behavior analysis using RFID-based shopping paths data, World Academy of Science, Engineering and Technology, 5(11):834-838, 2011.
10. Larson, J. S., Bradlow, E. T., and Fader, P. S., An exploratory look at supermarket shopping paths, International Journal of Research in Marketing, 22:395-414, 2005. Available at papers.ssrn.com/sol3/papers.cfm?abstract_id=723821
11. Retailer A dataset, i.e. marketbasket.csv, downloaded from Mahmood, T., Data Mining, National University of Computer and Emerging Sciences (FAST-NU). Available at https://sites.google.com/a/nu.edu.pk/tariqmahmood/teaching-1/fall-12---dm
12. Retailer B dataset, i.e. groceries.csv, downloaded from Marafi, S., Market Basket Analysis with R. Available at http://www.salemmarafi.com/code/market-basketanalysis-with-r/
13. Tan, P.-N., Steinbach, M., and Kumar, V., Introduction to Data Mining, Dorling Kindersley (India) Pvt. Ltd., licensee of Pearson Education, Inc., 2014.
14. www.cs.waikato.ac.nz/ml/weka/