SQL Server 2014 BI Lab 04
Enhancing an E-Commerce Web Application with Analysis Services Data Mining in SQL Server 2014
Terms of Use

© 2014 Microsoft Corporation. All rights reserved.

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

For more information, see Microsoft Copyright Permissions at http://www.microsoft.com/permission

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

The Microsoft company name and Microsoft products mentioned herein may be either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

This document reflects current views and assumptions as of the date of development and is subject to change. Actual and future results and trends may differ materially from any forward-looking statements. Microsoft assumes no responsibility for errors or omissions in the materials.
THIS DOCUMENT IS FOR INFORMATIONAL AND TRAINING PURPOSES ONLY AND IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, WHETHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT.
Contents

TERMS OF USE
CONTENTS
ABOUT THE AUTHOR
DOCUMENT REVISIONS
LAB OVERVIEW
EXERCISE 1: EMBEDDING DATA MINING RESULTS INTO A CUSTOM APPLICATION
   Task 1 Browsing the Adventure Works Online Shopping Application
   Task 2 Creating an Analysis Services Multidimensional and Data Mining Project
   Task 3 Creating the AdventureWorksDW2014 Data Source
   Task 4 Creating the Basket Analysis Data Source View
   Task 5 Configuring the Basket Analysis Data Source View
   Task 6 Creating the Basket Analysis Mining Model
   Task 7 Configuring the Basket Analysis Mining Model Algorithm Parameters
   Task 8 Processing the Basket Analysis Mining Model
   Task 9 Viewing the Basket Analysis Mining Model Content
   Task 10 Querying the Basket Analysis Mining Model
   Task 11 Enhancing the Adventure Works Online Shopping Application
   Task 12 Browsing the Enhanced Adventure Works Online Shopping Application
   Task 13 Finishing Up
SUMMARY
About the Author

This lab was designed and written by Peter Myers. Peter Myers has worked with Microsoft database and development products since 1997. Today, he specializes in all Microsoft BI products and provides mentoring, technical training, and education content authoring for SQL Server, Office, and SharePoint. Peter has a broad business background supported by a bachelor's degree in applied economics and accounting, and he extends this with solid experience backed by current MCSE and MCT certifications. He has been a SQL Server MVP since 2007.

Document Revisions

#  Date         Author       Comments
0  19-OCT-2014  Peter Myers  Initial release
Lab Overview

Introduction
In this lab, you will develop a data mining model that uses the Microsoft Association Rules algorithm to identify patterns of product models that are commonly purchased together. You will then enhance an existing ASP.NET web application to provide relevant purchasing suggestions to online customers.

Objectives
The objectives of this lab are to:
- Create a data source view
- Create a Microsoft Association Rules data mining model
- Explore the mining model content
- Query the mining model
- Embed mining model query results into a web application

Exercises
This hands-on lab comprises the following exercise:
1. Embedding Data Mining Results Into a Custom Application

Estimated time to complete this lab: 30 minutes
Exercise 1: Embedding Data Mining Results Into a Custom Application

In this exercise, you will develop a data mining model that uses the Microsoft Association Rules algorithm to identify rules about models commonly purchased together. This type of data mining is called market basket analysis. The patterns discovered by the data mining model will be used by the Adventure Works Online Shopping web application to cross-promote models by suggesting relevant models during the shopping cart checkout.

Task 1 Browsing the Adventure Works Online Shopping Application

In this task, you will explore the Adventure Works Online Shopping web application to understand how it presently delivers suggestions during checkout.

1. To open Visual Studio, on the taskbar, click the Visual Studio shortcut.
Figure 1 Selecting the Visual Studio Shortcut
2. To open the existing application, on the File menu, select Open Project/Solution.
3. In the Open Project window, navigate to the D:\SQLServerBI\Lab04\Assets folder.
4. Select the AWOnlineShopping.sln file, and then click Open.
5. To run the web application, on the toolbar, click Internet Explorer.
Figure 2 Locating the Internet Explorer Command
6. In Internet Explorer, on the menu (located at the left), select Catalog by Category.
Figure 3 Selecting the Menu Item
7. On the Catalog by Category page, in the Product list, click the Mountain-200 Black, 38 link.
8. On the Product Details page, click Add To Shopping Cart.
9. On the Shopping Cart page, notice the three suggestions at the bottom of the page.
10. Click the Display Database Command label, and then review the database command.
Figure 4 Reviewing the Database Command
Note: These suggestions were retrieved by a relational database stored procedure. They represent a static collection of suggestions, and as such they do not take into consideration items already added to the shopping cart. Clearly, the suggestion to purchase a Mountain-200 is no longer relevant once one is already in the cart.
11. To finish running the application, close the Internet Explorer window.
12. Leave Visual Studio and the web application project open.

Task 2 Creating an Analysis Services Multidimensional and Data Mining Project

In this task, you will create an Analysis Services Multidimensional and Data Mining project.

1. To open another instance of Visual Studio, on the taskbar, right-click the Visual Studio shortcut, and then select Visual Studio 2014.
2. On the File menu, select New Project.
3. In the New Project window, in the left pane, expand Business Intelligence, and then select Analysis Services.
4. Select the Analysis Services Multidimensional and Data Mining Project template.
Figure 5 Selecting the Analysis Services Multidimensional and Data Mining Project Template
5. In the Name box, replace the text with Basket Analysis.
6. In the Location box, replace the text with D:\SQLServerBI\Lab04.
7. In the Solution Name box, replace the text with AdventureWorksBI, and then click OK.
Task 3 Creating the AdventureWorksDW2014 Data Source

In this task, you will create the AdventureWorksDW2014 data source.

1. In Solution Explorer, in the Basket Analysis project, right-click the Data Sources folder, and then select New Data Source.
2. In the Data Source Wizard, click Next.
3. At the Select How to Define the Connection step, in the Data Connections list, click New.
4. In the Connection Manager window, in the Server Name box, enter localhost.
5. In the Select or Enter Database Name dropdown list, select the AdventureWorksDW2014 database.
6. Click OK.
7. Click Next.
8. At the Impersonation Information step, select the Use the Service Account option.
Note: A preferred practice is to use a dedicated domain account. For simplicity, you will use the service account in this lab.
9. Click Next.
10. At the Completing the Wizard step, accept the default Data Source Name, and then click Finish.
11. To save the solution, on the File menu, select Save All.

Task 4 Creating the Basket Analysis Data Source View

In this task, you will create the Basket Analysis data source view. The data source view will be the foundation upon which the data mining model in this exercise will be developed.

1. In Solution Explorer, right-click the Data Source Views folder, and then select New Data Source View.
2. In the Data Source View Wizard, click Next.
3. At the Select a Data Source step, notice that the Adventure Works DW2014 data source is selected, and then click Next.
4. At the Select Tables and Views step, in the Available Objects list, scroll to the bottom of the list.
5. While pressing the Control key, select the v2013order and v2013ordermodel views.
6. Click the arrow to add the selected views to the Included Objects list.
Figure 6 Adding the Views to the Included Objects List
7. Click Next.
8. At the Completing the Wizard step, in the Name box, replace the text with Basket Analysis, and then click Finish.
9. When the wizard completes, in Solution Explorer, notice the addition of the Basket Analysis data source view, and that the data source view designer opens automatically.
10. To save the solution, on the File menu, select Save All.

Task 5 Configuring the Basket Analysis Data Source View

In this task, you will refine the design of the data source view. This will involve providing friendly names for each of the data source view tables, defining a logical primary key, and establishing a relationship between the tables.

1. To rename the tables, in the data source view designer, in the Tables pane (located at the bottom left corner), select the v2013order table, and then in the Properties window, modify the FriendlyName property to Order.
Note: If the Properties window is not visible, on the View menu, select Properties Window.
2. Repeat the last step for the v2013ordermodel table, and modify the FriendlyName property to Basket.
Note: The purpose of this step is to create a user-friendly data model. It is important to configure friendly names at the data source view level so that they are consistently inherited by the objects (cubes, dimensions, and mining models) built on this view.
3. To define the primary key in the Order table, in the Order table, right-click the OrderNumber column, and then select Set Logical Primary Key.
4. To establish a relationship between the Basket table and the Order table, in the Basket table, drag the OrderNumber column on top of the OrderNumber column in the Order table.
Figure 7 Establishing the Relationship Between the Tables
5. To arrange the tables, right-click in a blank area of the diagram, and then select Arrange Tables.
6. To explore the data in the Basket table, in the Tables pane (or the diagram), right-click the Basket table, and then select Explore Data.
7. In the explorer window, notice that many orders include more than one model.
Note: The data mining model that you will develop in this exercise will describe the relationships between models purchased together (i.e., in the same order).
8. To close the explorer window, on the File menu, select Close.
9. On the File menu, click Save All.
10. To close the data source view designer, on the File menu, select Close.

Task 6 Creating the Basket Analysis Mining Model

In this task, you will use the Data Mining Wizard to create the BasketAnalysis_AR mining model.

1. In Solution Explorer, right-click the Mining Structures folder, and then select New Mining Structure.
2. In the Data Mining Wizard, click Next.
3. At the Select the Definition Method step, notice the default selection, and then click Next.
4. At the Create the Data Mining Structure step, in the dropdown list, select the Microsoft Association Rules data mining algorithm, and then click Next.
5. At the Select Data Source View step, in the Available Data Source Views list, select the Basket Analysis data source view, and then click Next.
6. At the Specify Table Types step, specify the table types as shown.
Figure 8 Specifying the Table Types
7. Click Next.
8. At the Specify the Training Data step, specify the columns to use in the mining model as shown.
Figure 9 Specifying the Training Data
9. Click Next.
10. At the Specify Columns' Content and Data Type step, click Next.
11. At the Create Testing Set step, reduce the Percentage of Data for Testing value to 0, and then click Next.
12. At the Completing the Wizard step, in the Mining Structure Name box, replace the text with BasketAnalysis, and in the Mining Model Name box, replace the text with BasketAnalysis_AR.
Note: It is very important that you follow the lab instructions precisely, particularly when naming objects. This lab includes code that expects objects to have been named correctly.
Figure 10 Naming the Mining Structure and Mining Model
13. Click Finish.
14. When the wizard completes, in Solution Explorer, notice the addition of the BasketAnalysis mining structure, and that the mining structure designer opens automatically.
15. On the File menu, click Save All.

Task 7 Configuring the Basket Analysis Mining Model Algorithm Parameters

In this task, you will configure the Basket Analysis mining model algorithm parameters.

1. In the mining structure designer, select the Mining Models tab.
2. Right-click the BasketAnalysis_AR model, and then select Set Algorithm Parameters.
Figure 11 Opening the Algorithm Parameters Window
3. In the Algorithm Parameters window, configure the Value property for the MINIMUM_PROBABILITY and MINIMUM_SUPPORT parameters as shown.
Figure 12 Configuring the Algorithm Parameters
Note: The two parameters configured here set the thresholds used to analyze the data when the mining model is processed. MINIMUM_SUPPORT sets how many cases must contain an itemset before the algorithm considers it, and MINIMUM_PROBABILITY sets the lowest probability a rule may have and still be kept.
4. Click OK.

Task 8 Processing the Basket Analysis Mining Model

In this task, you will process the Basket Analysis mining structure. Once processed, the BasketAnalysis_AR mining model will contain the patterns and statistics that describe the relationships between frequently purchased models.

1. In Solution Explorer, right-click the BasketAnalysis.dmm mining structure, and then select Process.
2. If prompted to build and deploy the project, click Yes.
3. In the Process Mining Structure window, click Run.
Note: The deployment process creates and processes the mining structure. At this time, the data is retrieved from the data source, and the Microsoft Association Rules algorithm correlates and identifies frequent relationships across attribute values, which in this case are product models.
4. When processing completes, in the Process Progress window, click Close.
5. In the Process Mining Structure window, click Close.

Task 9 Viewing the Basket Analysis Mining Model Content

In this task, you will use three mining model viewers to explore and understand the model content.

1. In the mining structure designer, select the Mining Model Viewer tab.
2. In the Show dropdown list, select Show Attribute Name Only.
Figure 13 Configuring the Attribute Properties to Show
3. To sort the rules in descending order of importance, click the Importance header twice.
Figure 14 Sorting the Importance in Descending Order
4. Review the most important rules (located at the top of the list).
Note: The first rule, Touring Tire Tube -> Touring Tire, can be read as follows: there is a 56.5% probability that the purchase of a Touring Tire Tube will result in the purchase of a Touring Tire.
5. Scroll to the bottom of the list to find rules with negative importance. The purchase of these combinations is highly unlikely; in fact, the purchase of one discourages the purchase of the other.
6. In the Mining Model Viewer tab, select the Itemsets tab.
Figure 15 Selecting the Itemsets Tab
7. In the Show dropdown list, select Show Attribute Name Only.
8. Increase the Minimum Itemset Size value to 3.
9. Review the frequent itemsets that contain three models.
Note: The Support column represents the number of orders that included these three models.
10. Notice that the most frequent itemset that contains three models includes the Mountain-200 model. You will explore this model visually in the following steps of this task.
11. To view the item dependencies, select the Dependency Network tab.
12. In the viewer, on its toolbar, in the Show dropdown list, select Show Attribute Name Only.
13. To locate the Mountain-200 model, in the viewer, click the Find Node toolbar button.
Figure 16 Locating the Find Node Toolbar Button
14. In the Find Node window, select the Basket(Mountain-200) = Existing node, and then click OK.
15. To zoom in, in the viewer, click the Zoom In toolbar button until you can adequately see the selected node and its related nodes.
Figure 17 Locating the Zoom In Toolbar Button
Figure 18 Exploring the Mountain-200 and Related Nodes
Note: Each line (technically, an edge) represents a pairwise association. The slider (located on the left) filters the edges by their importance score.
16. Gradually drag the slider (located at the left of the dependency network diagram) down to filter out the weaker edges and leave only the stronger ones. Stop when you see only the selected node, Mountain-200, and the one node that the selected node predicts.
Figure 19 Exploring the Mountain-200 and Strongest Related Nodes
Note: The legend at the bottom of the viewer describes the node colors. Initially, in this scenario, all nodes predict both ways (i.e., the sale of a Mountain-200 often results in the sale of a Mountain Bottle Cage, and vice versa). As you filter out the weaker edges, notice that ultimately it is the Mountain-200 that more likely results in a purchase of the HL Mountain Tire.
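Note: To make the Support, Probability, and Importance columns seen in the viewers more concrete, the following minimal Python sketch computes comparable figures for a handful of orders. It is an illustration only. The orders are invented (they are not taken from AdventureWorksDW2014), the function names are hypothetical, and the importance score is assumed to be a base-10 log ratio of P(right | left) to P(right | not left). Analysis Services computes its scores internally, and its exact formulas may differ.

```python
import math

# Hypothetical toy orders, invented for illustration only.
ORDERS = [
    {"Mountain-200", "HL Mountain Tire", "Mountain Tire Tube"},
    {"Mountain-200", "HL Mountain Tire"},
    {"Mountain-200", "Mountain Bottle Cage", "Water Bottle"},
    {"Touring Tire Tube", "Touring Tire"},
    {"Touring Tire Tube", "Touring Tire", "Water Bottle"},
    {"Touring Tire Tube"},
    {"Water Bottle", "Mountain Bottle Cage"},
    {"HL Mountain Tire"},
    {"Touring Tire"},
    {"Water Bottle"},
]


def support(items, orders=ORDERS):
    """Count the orders that contain every model in `items`."""
    wanted = set(items)
    return sum(1 for order in orders if wanted <= order)


def probability(lhs, rhs, orders=ORDERS):
    """Estimate P(rhs | lhs): the share of orders with `lhs` that also hold `rhs`."""
    lhs_count = support(lhs, orders)
    if lhs_count == 0:
        raise ValueError("left-hand side never occurs")
    return support(set(lhs) | set(rhs), orders) / lhs_count


def importance(lhs, rhs, orders=ORDERS):
    """Assumed score: log10 of P(rhs | lhs) / P(rhs | not lhs)."""
    lhs, rhs = set(lhs), set(rhs)
    both = support(lhs | rhs, orders)
    lhs_count = support(lhs, orders)
    not_lhs_count = len(orders) - lhs_count
    rhs_without_lhs = support(rhs, orders) - both
    if 0 in (lhs_count, not_lhs_count, both, rhs_without_lhs):
        raise ValueError("score is undefined for this data")
    p_given_lhs = both / lhs_count
    p_given_not_lhs = rhs_without_lhs / not_lhs_count
    return math.log10(p_given_lhs / p_given_not_lhs)


if __name__ == "__main__":
    rules = [({"Mountain-200"}, {"HL Mountain Tire"}),
             ({"Mountain-200"}, {"Water Bottle"})]
    for lhs, rhs in rules:
        print(sorted(lhs), "->", sorted(rhs),
              "support:", support(lhs | rhs),
              "probability:", round(probability(lhs, rhs), 3),
              "importance:", round(importance(lhs, rhs), 3))
```

With these toy orders, the first rule scores a positive importance and the second a negative one, mirroring the positive and negative rules seen at the top and bottom of the Rules list.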
Task 10 Querying the Basket Analysis Mining Model

In this task, you will create two singleton queries to test the model predictions.

1. In the mining structure designer, select the Mining Model Prediction tab.
2. Right-click inside the Select Input Table(s) window, and then select Singleton Query.
Figure 20 Configuring a Singleton Query
Note: A singleton query enables the input of data expressed in the query rather than sourced from an external dataset.
3. Click inside the Value box to reveal an ellipsis, and then click the ellipsis.
Figure 21 Locating and Clicking the Ellipsis
4. In the Nested Table Input window, in the Key Column list, select the Mountain-200 model, and then click Add.
5. Click OK.
6. In the query designer, in the query grid, in the Source column dropdown list, select Prediction Function.
Figure 22 Selecting the Prediction Function Source
7. In the corresponding Field column dropdown list, select PredictAssociation.
8. From the Mining Model window, drag Basket into the corresponding Criteria/Argument column.
Figure 23 Dragging Basket to the Criteria/Argument Column
9. To query the three most likely models associated with the Mountain-200 model, in the Criteria/Argument column, append a comma and the number 3 at the end to create the following argument.

DMX
[BasketAnalysis_AR].[Basket], 3

10. The query should look like the following.
Figure 24 Reviewing the Query
11. On the mining model prediction toolbar, toggle to Query.
Figure 25 Toggling to the Query View
Note: The query designer displays the DMX statement. This statement requests the three most likely models based on a basket consisting of only Mountain-200.
12. To execute the query, on the mining model prediction toolbar, toggle to Result.
13. Expand Expression to reveal the three predicted models.
Figure 26 Reviewing the Query Result
14. On the mining model prediction toolbar, toggle to Design.
15. To add another model to the basket, in the Singleton Query Input window, click inside the Value box, and then click the ellipsis.
16. In the Nested Table Input window, in the Key Column list, select the HL Mountain Tire model, and then click Add.
17. Click OK.
18. On the mining model prediction toolbar, toggle to Query.
19. Notice the addition of the HL Mountain Tire.
20. Modify the SELECT line to read SELECT FLATTENED.
Note: The FLATTENED keyword will produce a flattened result that can be easily consumed by an application. Note, however, that once you modify the query created by the graphical designer, you lose the designer's graphical support.
21. On the mining model prediction toolbar, toggle to Result.
22. Review the query result that now requests the three most likely models based on a basket consisting of the Mountain-200 and the HL Mountain Tire models.

Task 11 Enhancing the Adventure Works Online Shopping Application

In this task, you will modify the AWOnlineShopping web application to deliver relevant model suggestions by querying the BasketAnalysis_AR data mining model.

1. Switch to the Visual Studio instance for the AWOnlineShopping web application.
2. In Solution Explorer, right-click the ShoppingCart.aspx item, and then select View Code.
3. Press Control+G, then in the Go to Line window, enter 74, and then click OK.
4. In the code window, review the GetDataMiningSuggestions function and the associated comments.
Note: It is not necessary to understand the details of this code. This code is responsible for dynamically building a DMX statement similar to the one you created in the previous task.
5. Press Control+G, then in the Go to Line window, enter 51, and then click OK.
6. Replace this line with the following line.

Visual Basic
dr = GetDataMiningSuggestions()

Note: This modification will retrieve suggested models predicted by the BasketAnalysis_AR data mining model.
7. On the File menu, click Save All.

Task 12 Browsing the Enhanced Adventure Works Online Shopping Application

In this task, you will browse the enhanced AWOnlineShopping web application.

1. On the toolbar, click Internet Explorer.
2. On the menu (located at the left), select Shopping Cart.
3. Notice that the three suggestions have changed according to associations with the items added to the shopping cart.
4. Click the Display Database Command label, and then review the database command.
5. From the suggestions list, click the HL Mountain Tire link.
6. On the Catalog by Model page, click the HL Mountain Tire link.
7. On the Product Details page, click Add To Shopping Cart.
8. On the Shopping Cart page, notice that the three suggestions have been revised.
9. Click the Display Database Command label, and then review the database command.
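Note: The database command displayed on the Shopping Cart page is a DMX statement built from the current cart contents. The following minimal Python sketch shows one way such a statement could be assembled. It is not the lab's GetDataMiningSuggestions function, which is written in Visual Basic and is not reproduced here. The function name build_suggestion_query, the nested key column name [Model], and the SQL-style doubling of single quotes are assumptions made for illustration; the model name BasketAnalysis_AR and the nested table name Basket come from the earlier tasks.

```python
def build_suggestion_query(cart_models, count=3,
                           mining_model="BasketAnalysis_AR",
                           nested_table="Basket",
                           key_column="Model"):
    """Return a DMX singleton query for the models currently in the cart.

    `key_column` is an assumed name for the nested table's key column.
    """
    if not cart_models:
        raise ValueError("the cart must contain at least one model")

    rows = []
    for model in cart_models:
        # Assumed SQL-style escaping: double any single quote in the name.
        literal = model.replace("'", "''")
        rows.append(f"SELECT '{literal}' AS [{key_column}]")

    # One SELECT per cart item, combined with UNION into the nested table.
    basket = "\n    UNION ".join(rows)

    return (
        "SELECT FLATTENED\n"
        f"  PredictAssociation([{mining_model}].[{nested_table}], {int(count)})\n"
        f"FROM [{mining_model}]\n"
        "NATURAL PREDICTION JOIN\n"
        f"(SELECT (\n    {basket}\n  ) AS [{nested_table}]) AS t"
    )


if __name__ == "__main__":
    # Same basket as the second singleton query in Task 10.
    print(build_suggestion_query(["Mountain-200", "HL Mountain Tire"]))
```

Each time an item is added to the cart, the statement gains one more SELECT in the nested basket, which is why the suggestions change between steps 3 and 8 of Task 12.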
Task 13 Finishing Up

In this task, you will finish up by closing all applications.

1. Close the Internet Explorer window.
2. In the Visual Studio instance for the AWOnlineShopping web application, on the File menu, select Exit.
3. In the Visual Studio instance for the AdventureWorksBI solution, on the File menu, select Exit.

Summary

In this lab, you created a data mining model that uses the Microsoft Association Rules algorithm to identify patterns about models commonly purchased together. The patterns discovered by the data mining model were used to enhance the customer experience while shopping online.