Retail POS Data Analytics Using MS BI Tools. Business Intelligence White Paper




Introduction

Overview
Businesses today are driven by data. Companies, big or small, put considerable effort into collecting large amounts of data from a wide range of sources and mediums, such as contact details, financial and operational data, buyer behavior, and even social media. With the help of this data, companies try to understand their own strengths and weaknesses, as well as those of their competitors, in order to make better business decisions.

In the retail sector, retailers often have to rely on store-level POS data, which is generated in huge quantities on a daily or real-time basis and is not well organized for analysis. Therefore, to understand their customers and provide them with the best service and shopping experience, store and retail operators need to convert raw retail store data into usable information. This paper describes how organizations can use business intelligence tools and technologies to derive meaningful information from large volumes of data.

Purpose
The purpose of this paper is to highlight an architectural and technical approach to optimizing the analysis of retail Point of Sale (POS) data.

Scope
The paper's scope is limited to the basic concepts, tools and technologies of Microsoft Business Intelligence.

Intended Audience
The target audience of this paper is decision makers, strategists, and top-level management professionals who take critical decisions at the strategic, tactical, and operational levels for their organizations.

Contata Solutions 2015

Problem Statement
In today's competitive retail landscape, some of the major challenges faced by retail and store operators worldwide are:
• Aligning the speed of data capture: recording data and converting it into information quickly enough to take the right decisions at the right time
• Breaking information silos
• Keeping data and information consistent at every level of the organization
• Trend / pattern analysis to support tactical and strategic decisions

Since critical information directly affects sales and profitability, retail and store operators need to make quick strategic, marketing and operational decisions. Unavailability of such information often has a serious business impact, such as:
• ineffective decision making due to unprocessed and incorrect data
• loss of time and money involved in extracting and compiling information from multiple locations / systems / subsystems
• a time gap between the availability of information and the communication needed to act on it
• misalignment among strategic, tactical, and operational decisions

Microsoft DWBI Tools and Technologies

SQL Server
SQL Server 2014 Standard vs. Enterprise Edition: SQL Server is used as the relational database to store transactional data as well as to define and store the data warehouse. Choosing the Enterprise Edition over the Standard Edition can improve performance significantly, because it provides features such as table partitioning and columnstore indexes (see Decision on SQL Server Edition).

SQL Server Integration Services (SSIS)
SSIS provides Extract, Transform and Load (ETL) capabilities for data import, data integration and data warehousing needs. Its GUI tools help to build workflows that extract data from various sources, query it, transform it and convert the processed data into the required shape. It can be used in day-to-day business operations as well as for data mining and data warehouse applications.

SQL Server Analysis Services (SSAS)
SSAS adds OLAP and data mining capabilities to SQL Server databases.

SQL Server Reporting Services (SSRS)
SSRS provides a server-based platform designed to support a wide variety of reporting needs. It delivers relevant information across the entire enterprise and helps in creating and managing both static and parameterized reports.

Technical Solution
Contata Solutions undertook a project to help a retail store analyze its POS data. The data was in CSV format, and the analysis had to cover customer segmentation, geography, product consolidation, and seasonality / trend analysis. The project was executed against the requirements given below.

Source Database
Source data was available as multiple CSV files.

Required Outcome
The following analysis outcomes were required:

Customer Analysis
• Customer segmentation on the basis of:
  o Number of days since last visit
  o Number of orders in past 12 months
  o Dollar value of transactions
• Polarity between high-value and low-value customers
• Customer loyalty

Store Analysis
• Total number of store visits on a daily, weekly, and monthly basis
• Total sales on a daily, weekly, and monthly basis
• High-selling products

Product Analysis
• Products commonly bought together
• Sales by product category
• Product consolidation strategy on the basis of high-value, low-cost products

Seasonality / Trend Analysis
• Average order value by month
• Sales in the festival season
• Increase in sales of a particular product during a baseball or football series

Hardware Infrastructure
The following hardware infrastructure was used for the project:
• Server 1 (DB server): 8-core processor, 64 GB RAM, 1.5 TB storage
• Server 2 (SSIS server): 8-core processor, 64 GB RAM, 0.5 TB storage

Decision on SQL Server Edition

Case 1: SQL Server 2014 Standard Edition
SQL Server 2014 Standard Edition was used initially, but the following issues were encountered:
• When the data was transferred from CSV into the SQL Server staging database, its size was approximately 100 GB, with 80% of the data concentrated in 2 main tables holding daily transactions and transaction line item details.
• Counting the number of records took 2 minutes. There were over 200 million records.
• Optimizing the database required features such as table partitioning and columnstore indexes.

However, Standard Edition did not have those features, so it was decided to move to SQL Server Enterprise Edition.
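
The three customer segmentation measures listed under Required Outcome (days since last visit, orders in the past 12 months, dollar value of transactions) can be sketched in a few lines. The following Python is a minimal illustration only, not the project's implementation, which the paper does not show. The `Transaction` record, its field names, the 365-day approximation of "12 months", and the choice to total dollar value over all supplied transactions are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Transaction:
    # Hypothetical record layout; the paper does not list the CSV columns.
    customer_id: str
    order_date: date
    amount: float  # dollar value of the order


def segment_customers(transactions, as_of):
    """Return {customer_id: (days_since_last_visit, orders_past_12_months, dollar_value)}."""
    # Assumption: "past 12 months" is approximated as the last 365 days.
    window_start = as_of - timedelta(days=365)
    summary = {}
    for t in transactions:
        if t.order_date > as_of:
            continue  # ignore anything dated after the reporting date
        last_visit, orders, value = summary.get(t.customer_id, (None, 0, 0.0))
        if last_visit is None or t.order_date > last_visit:
            last_visit = t.order_date
        if t.order_date > window_start:
            orders += 1
        # Assumption: dollar value is totalled over all supplied transactions.
        value += t.amount
        summary[t.customer_id] = (last_visit, orders, value)
    return {
        cid: ((as_of - last_visit).days, orders, value)
        for cid, (last_visit, orders, value) in summary.items()
    }


# Made-up sample data for illustration.
sample = [
    Transaction("C1", date(2014, 12, 20), 120.0),
    Transaction("C1", date(2013, 6, 1), 80.0),
    Transaction("C2", date(2014, 3, 15), 40.0),
]
result = segment_customers(sample, as_of=date(2014, 12, 31))
print(result["C1"])  # (11, 1, 200.0)
```

Each tuple can then be bucketed into segments (for example high-value versus low-value customers) using thresholds chosen by the business; the paper does not state which thresholds were used.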

Case 2: SQL Server 2014 Enterprise Edition
To improve performance, the following steps were taken:

1. SSIS optimization
To get the best results for data load, SSIS and the database were run on two separate servers. This was because SQL Server uses user-mode cooperative multitasking and resource control that assumes 100% ownership of the system, and therefore consumes all the available memory. In addition, lookups were cached to improve performance.

2. Table partitioning
Tables were partitioned on the basis of months:
a. A hard drive with 250 GB of storage was selected, keeping in mind future scalability for both the transaction database and the data warehouse.
b. A filegroup was created for each month, with the filegroups of each quarter mapped to the hard drive.
c. Data was partitioned by month, with the data for the first month of any year going to partition range 1.
d. A partition scheme was defined to map each partition range to a filegroup.
e. The table was associated with the partition scheme on the month field at table creation.

[Diagram: monthly data mapped to partition ranges]

SSIS packages read the data for each partition from the staging database and transferred it to the data warehouse. The main tables of both the staging database and the data warehouse were partitioned.

3. Clustered columnstore index
Since reports had to be defined from the data warehouse, a clustered columnstore index was used to improve query performance over traditional row-oriented storage. The gain comes from storing the data in a columnar format and compressing it relative to its uncompressed size. As a result, the time taken to count the total number of records improved from 2 minutes to 2 seconds.
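
The table partitioning steps b to e above correspond to a partition function, a partition scheme, and a table created on that scheme. The sketch below, in Python, models how a month value maps to a partition and assembles matching T-SQL statements as strings. It is an illustration under assumptions: the object names (`pfMonth`, `psMonth`, `FG_Month01` and so on, `dbo.FactTransaction`, `TransactionMonth`), the integer month column, and the RANGE LEFT boundary choice are hypothetical, since the paper does not give the actual DDL.

```python
from bisect import bisect_left

# Boundary values for a RANGE LEFT partition function on a 1-12 month column:
# eleven boundaries produce twelve partitions, one per calendar month.
MONTH_BOUNDARIES = list(range(1, 12))


def partition_number(month):
    """Partition that a month value lands in under RANGE LEFT semantics.

    With RANGE LEFT each boundary value belongs to the partition on its left,
    so month 1 maps to partition 1 and month 12 to partition 12.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return bisect_left(MONTH_BOUNDARIES, month) + 1


def partition_ddl(table="dbo.FactTransaction", column="TransactionMonth"):
    """Build T-SQL for the function, scheme and table (all names hypothetical).

    The filegroups are assumed to exist already in the database.
    """
    boundaries = ", ".join(str(b) for b in MONTH_BOUNDARIES)
    filegroups = ", ".join(f"FG_Month{m:02d}" for m in range(1, 13))
    return [
        f"CREATE PARTITION FUNCTION pfMonth (int) "
        f"AS RANGE LEFT FOR VALUES ({boundaries});",
        f"CREATE PARTITION SCHEME psMonth "
        f"AS PARTITION pfMonth TO ({filegroups});",
        f"CREATE TABLE {table} ({column} int NOT NULL, Amount money NOT NULL) "
        f"ON psMonth ({column});",
    ]


for statement in partition_ddl():
    print(statement)
print(partition_number(1), partition_number(12))  # 1 12
```

Because the partition key is the month alone, January data from every year shares partition 1, which matches the description in step c.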

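The reasoning behind the columnstore gain (columnar layout plus compression) can be illustrated with a toy example. The Python below stores a small table column by column and run-length encodes the month column, so that a row count only needs to sum the run lengths. Run-length encoding is used here as a simple stand-in for columnar compression; it is not a description of SQL Server's internal storage format, and the sample rows are made up.

```python
from itertools import groupby

# Made-up sample rows: (month, product, amount).
rows = [
    (1, "A", 10.0),
    (1, "B", 5.0),
    (1, "A", 7.5),
    (2, "A", 3.0),
    (2, "C", 4.0),
]

# Column-oriented layout: one list per column instead of one tuple per row.
months, products, amounts = (list(column) for column in zip(*rows))


def run_length_encode(column):
    """Compress a column into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def count_rows(encoded_column):
    """Count rows from the compressed column without expanding it."""
    return sum(length for _, length in encoded_column)


encoded_months = run_length_encode(months)
print(encoded_months)              # [(1, 3), (2, 2)]
print(count_rows(encoded_months))  # 5
print(sum(amounts))                # a total reads only the amount column
```

A query that touches one column reads only that column's data, and repeated values compress well, which is the intuition behind the faster record count reported above.
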
4. Query optimization
Queries were optimized as follows:
• Use actual column names in the SELECT statement instead of SELECT *
• Minimize the use of subqueries
• Create proper indexes on tables
• Avoid full table scans wherever possible
• Avoid GROUP BY over multiple keys
• Avoid fetching data through multiple left joins

5. Partial cube processing
To allow partial processing of the cube for incremental data, the cube was partitioned by month using views, with one view per month.

Summary
In summary, the following techniques were used to optimize the overall architecture and query performance:
• SSIS optimization: running SSIS on a separate server from the DB server
• SSIS optimization: using the lookup cache
• Query optimization
• Table partitioning
• Clustered columnstore index
• Cube partitioning

References
http://technet.microsoft.com/
http://msdn.microsoft.com/

Contata Solutions 2015
For more information visit: www.contata.com