Tamr on Google Cloud Platform: E-Commerce Tutorial




Overview

In this tutorial, we'll work with sources from an e-commerce company's customer account database and web analytics data warehouse. In particular, we'll look at data that was collected while the company was running a promotion on athletic shoes. Using Tamr on Google Cloud Platform, we'll join these sources together, do some data cleansing, and finally push our new dataset via Google Dataflow to BigQuery, Google's fully managed, NoOps data analytics service. There, we will ask questions of the data that might inform how the company could optimize its marketing strategies. For example, if the join reveals that 18-25 year old customers most often come to the site via a social media promotion, and buy running shoes more than any other shoe type, the company might embed running shoe advertisements into social media.

Customer data from the warehouse includes the following fields, which come from forms filled in when site visitors (1) create an account and (2) make a purchase, and from their interactions on the site (ui):

+ first_name (1)
+ last_name (1)
+ birthdate (1)
+ age (1)
+ age_group (1)
+ billing_street (2)
+ billing_city (2)
+ billing_state (2)
+ billing_zip (2)
+ tracking_cookie (ui)
+ customer_id (ui)
+ date_last_visit (ui)
+ credit_card (2 -- value encrypted for security)
+ date_last_purchase (2)
+ is_premium_member -- has made a purchase (2)
+ is_senior_discount_member -- age group 65 and up (1)
+ num_visits (ui)

The company's web analytics data is derived directly from raw user interactions with their website, and contains the following fields:

+ session_id (unique identifier of each customer session)
+ tracking_cookie
+ IP_address
+ user_has_account
+ user_account_id
+ num_items_in_cart
+ items_in_cart
+ cart_id
+ product_department
+ product_id
+ product_category
+ Price
+ promotion_type
+ referral_site_id
+ return_prospect_rating
+ purchase_is_gift
+ shipping_category
+ billing_street
+ billing_zip

Signing Into Tamr & Google Cloud Platform

To get started, register with Tamr and sign into Google Cloud Platform at gcp-preview.tamr.com.

+ If you don't have an account with Google Cloud Platform, you can go through the Tamr portion of the tutorial, but will not be able to push your dataset to BigQuery.
+ If you don't have a Google Cloud Platform account but would like to register for one, select the Free Trial option at the bottom of the Google Cloud Platform sign-in page.

For more information on registration with Tamr and Google Cloud Platform, see [link to documentation section Getting Started ]

Viewing, Joining, and Cleansing with Tamr

Once signed in, a prompt to select a source will appear:

+ Select the Tamr Sample Data project and tamr-gcp-sample-data bucket
+ From this bucket, select the Customer.csv source

Once you have selected a source, you will land on a screen that lists all of the fields in the Customer.csv source. To view the data within each field, click Add All. This generates a preview of the data contained in each field, along with an indication of each field's density, shown by the green bar. The longer the green bar, the fewer empty records there are within that field.

Next, we add a source from the web analytics warehouse, called Purchases.csv.

+ Select Add a Source
+ Make sure the Tamr Sample Data project and tamr-gcp-sample-data bucket are still selected
+ From this bucket, select the Purchases.csv source

To join these sources together, Tamr needs a join key, which is a field that contains the same data across the two sources. A prompt will appear in which we must enter the join key. In this case, we will use the company's unique identifier for customers, called user_account_id in our Purchases source and customer_id in our Customer source.
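To make the idea of a join key concrete, here is a minimal Python sketch of the same kind of join performed outside Tamr. It is an illustration only: the sample rows are made up, `join_sources` is a hypothetical helper, and an inner join is assumed because the tutorial does not say which join type Tamr applies.

```python
import csv
import io

# Tiny made-up stand-ins for Customer.csv and Purchases.csv (not the real sample data).
CUSTOMER_CSV = """customer_id,age,age_group
c1,22,18-25
c2,41,36-45
"""

PURCHASES_CSV = """session_id,user_account_id,product_category,Price
s1,c1,running,79.99
s2,c1,hiking,120.00
s3,c9,running,65.00
"""


def join_sources(customers_text, purchases_text):
    """Inner-join purchases to customers on user_account_id == customer_id."""
    # Index customers by the join key so each purchase lookup is a dict access.
    customers = {
        row["customer_id"]: row
        for row in csv.DictReader(io.StringIO(customers_text))
    }
    joined = []
    for purchase in csv.DictReader(io.StringIO(purchases_text)):
        customer = customers.get(purchase["user_account_id"])
        if customer is None:
            continue  # no matching account; an inner join drops this row
        joined.append({**customer, **purchase})
    return joined


joined_rows = join_sources(CUSTOMER_CSV, PURCHASES_CSV)
for row in joined_rows:
    print(row["customer_id"], row["user_account_id"], row["product_category"])
```

Each joined row carries both key fields with identical values, which is the same alignment the tutorial uses to confirm that the join succeeded.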

To verify that the join worked, search for these two fields and drag and drop them into your new dataset. You will see that their records are aligned and identical, indicating a successful join.

From here, you may select any fields that you would like to include in your new dataset. Some interesting fields to use for queries in BigQuery include:

+ age
+ age_group
+ referral_site_id
+ product_category
+ Price
+ return_prospect_rating
+ promotion_type

To clean up this new dataset, you can use Tamr's transformation functionality to:

+ Remove blank rows
+ Replace all blank cells in an attribute with a certain value

An interesting transformation to try is removing rows that are blank for age and age_group. By doing this, we guarantee that there will be age information about all customers in our new dataset. To do this, select the two attributes and click Transform.

Once you are finished selecting and transforming attributes in your new dataset, select Apply Formatting to view the results. Then, you can push the dataset to BigQuery by clicking the Move my Data to Google BigQuery button.

Running a Google Cloud Dataflow Job

After you click the Move my Data to Google BigQuery button, your job will take a few minutes to start. Once it is running, you can see the progress of your Cloud Dataflow job within the Google Cloud Platform console by clicking the Submitted link that appears when you click Publish to BigQuery.
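For readers who want to see what the two cleanup transformations described above amount to, the following Python sketch reproduces them on a few made-up rows. The helper names are hypothetical and this is not Tamr's implementation; in particular, it assumes a row is dropped when either age or age_group is blank, which the tutorial does not spell out.

```python
def is_blank(value):
    """Treat None and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def drop_rows_blank_in(rows, attributes):
    """Keep only rows that have a value for every listed attribute."""
    return [
        row for row in rows
        if not any(is_blank(row.get(name)) for name in attributes)
    ]


def fill_blanks(rows, attribute, default):
    """Return copies of rows with blank cells in one attribute replaced by default."""
    return [
        {**row, attribute: default} if is_blank(row.get(attribute)) else dict(row)
        for row in rows
    ]


# Made-up rows for illustration only.
rows = [
    {"age": "22", "age_group": "18-25", "promotion_type": "social"},
    {"age": "", "age_group": "", "promotion_type": "email"},
    {"age": "41", "age_group": "36-45", "promotion_type": ""},
]

with_age = drop_rows_blank_in(rows, ["age", "age_group"])
cleaned = fill_blanks(with_age, "promotion_type", "none")
print(cleaned)
```

Dropping rows before filling blanks keeps the fill step from touching records that would be discarded anyway.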

Once the job has finished running, which you can verify by clicking View Logs on the right of the screen and watching the state of the job diagram, click BigQuery in the left menu.

Analysis in BigQuery

Once you arrive in BigQuery, find your new project and dataset in the left menu. Some questions you might look into include:

+ Did age groups tend to associate with particular promotion types or referral sites?
+ Which shoe type did each age group buy most?
+ Which age group had the most prospects with a high return_prospect_rating?

For help writing SQL queries in BigQuery, check out Google's BigQuery documentation.
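As a rough sketch of how the second question could be phrased in SQL, the Python snippet below runs a GROUP BY query against an in-memory SQLite table. Everything here is an assumption for illustration: the table name `joined_dataset` and the rows are made up, product_category is assumed to hold the shoe type, and the query runs on SQLite rather than BigQuery. The aggregate itself is plain SQL, so a similar statement should be adaptable to BigQuery by pointing it at your own dataset and table.

```python
import sqlite3

# Made-up rows standing in for the joined dataset; the table name is hypothetical.
ROWS = [
    ("18-25", "running", "social"),
    ("18-25", "running", "email"),
    ("18-25", "hiking", "social"),
    ("36-45", "hiking", "email"),
    ("36-45", "hiking", "search"),
]

# Count purchases per age group and product category, most frequent first.
QUERY = """
SELECT age_group, product_category, COUNT(*) AS purchases
FROM joined_dataset
GROUP BY age_group, product_category
ORDER BY age_group, purchases DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE joined_dataset "
    "(age_group TEXT, product_category TEXT, promotion_type TEXT)"
)
conn.executemany("INSERT INTO joined_dataset VALUES (?, ?, ?)", ROWS)
results = conn.execute(QUERY).fetchall()
conn.close()

for age_group, category, purchases in results:
    print(age_group, category, purchases)
```

Swapping product_category for promotion_type or referral_site_id in the SELECT and GROUP BY clauses gives the analogous breakdown for the first question.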