Examples of Spotfire Recommendations in Action




Easy dashboard setup for business users, dramatically faster creation of full-featured data analysis applications for analysts

TIME IS OF THE ESSENCE

With a dashboard, every unnecessary piece of information results in time wasted trying to filter out what's important, which is intolerable when time is of the essence. [2]

The agile business intelligence market is growing rapidly, and as Gartner points out, the transition is toward platforms that can be rapidly implemented and used by analysts and business users to find insights quickly, as well as by IT staff to quickly build analytics content that meets business requirements and delivers more timely business benefits. [1] This drive for speed is about business value: accuracy and speed of interpretation for decision-making; authoring and development of data discovery applications; and task completion, so that developers can implement their ideas quickly and obtain accurate insights. [2]

This paper describes a recommendation engine for the TIBCO Spotfire interactive graphical analysis system. Spotfire Recommendations makes data discovery fast and easy for both analysts and business users. The system uses metadata typing and a built-in graphics taxonomy to produce a collection of inherently sensible graphics choices applied to the data at hand. The user chooses from the suggestions, and the software builds a dashboard of linked, brushable, configurable graphics with supporting data filters and graphics controls that can be rapidly applied to the canvas.

[1] Rita L. Sallam, Bill Hostmann, Kurt Schlegel, Joao Tapadinhas, Josh Parenteau, Thomas W. Oestreich. Magic Quadrant for Business Intelligence and Analytics Platforms, Gartner. February 23, 2015.
[2] Stephen Few. Information Dashboard Design, Analytics Press, CA. 2015.
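To make the idea of metadata typing plus a graphics taxonomy concrete, here is a minimal Python sketch of a rule-based chart recommender. It is an illustration only and not Spotfire's actual engine: the column kinds, the `Column` class, the `recommend` function, and the rules themselves are all assumptions, loosely modeled on the suggestions reported in the case studies in this paper.

```python
from dataclasses import dataclass

# Hypothetical column kinds; this paper does not list Spotfire's real metadata types.
NUMERIC, CATEGORICAL, TIME, GEO = "numeric", "categorical", "time", "geo"


@dataclass(frozen=True)
class Column:
    name: str
    kind: str


def recommend(columns):
    """Return chart-type suggestions for the selected columns (illustrative rules only)."""
    kinds = [c.kind for c in columns]
    n_num = kinds.count(NUMERIC)
    suggestions = []
    if n_num >= 1 and GEO in kinds:
        suggestions += ["map", "bar chart", "treemap"]
    if n_num >= 1 and TIME in kinds:
        suggestions += ["line plot", "bar chart"]
    if n_num >= 1 and CATEGORICAL in kinds:
        suggestions += ["bar chart", "treemap", "cross table"]
    if n_num >= 1 and not (set(kinds) - {NUMERIC}):
        # Only numeric columns selected: show their distributions.
        suggestions += ["histogram", "density plot", "table"]
    if n_num >= 2:
        suggestions.append("parallel coordinates plot")
    # Remove duplicates while preserving the order in which rules fired.
    return list(dict.fromkeys(suggestions))


if __name__ == "__main__":
    print(recommend([Column("Homeless", NUMERIC), Column("State", GEO)]))
```

A lookup keyed on column kinds keeps the rules easy to read and extend; a production engine would presumably also rank suggestions and inspect the data itself.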

For business users, Recommendations reduces the burden of initial dashboard setup, and for analysts, it dramatically speeds the creation of full-featured data analysis applications. Following are two case studies showing Recommendations applied to datasets for consumer packaged goods manufacturing and for homeless populations in the United States.

CASE STUDY: GEOLOCATION ANALYSIS OF US HOMELESS

The Department of Housing and Urban Development (HUD) collects data on homelessness in the US and releases the Annual Homeless Assessment Report (AHAR) to Congress in two parts, Part 1 [3] and Part 2 [4]. Part 1 contains information from the annual point-in-time (PIT) counts conducted by communities nationwide on a single night in January. Part 2 includes information obtained from homeless shelters throughout the course of a calendar year, the Housing Inventory Count (HIC). In March 2015, HUD released the 2013 AHAR Part 2; Part 1 was released in October 2014. Raw data is available online at data.hud.gov.

We obtained PIT and HIC data for 2007–2013. Estimates of homeless veterans are included beginning in 2011. HUD partners with the Veterans Administration on the Veterans Homelessness Prevention Demonstration Program.

The Housing Inventory Count and point-in-time data are yearly measures across ~400 spatial regions in the US, using HUD's Continuum of Care (CoC) regions. A shape file describing these regions is available at https://www.hudexchange.info/coc/gis-tools/.

Key variables in the beds data (HIC) are Shelter Type (ES [Emergency Shelter], TH [Transitional Housing], RRH [Rapid Re-Housing], SH [Safe Haven], PSH [Permanent Supportive Housing]) and Household Type (with children, without children, with only children). Key variables in the homeless data (PIT) are Shelter Access (Sheltered, Unsheltered) and Family Situation (Individuals, Persons in Families).

We also included US Census data in the analysis for the period 2010–2013, including counts by state, county, and age group of total population, total male population, total female population, and male and female populations broken down by race.

[3] https://www.hudexchange.info/resources/documents/2014-ahar-part1.pdf
[4] https://www.hudexchange.info/onecpd/assets/file/2013-ahar-part-2.pdf

DATA PANEL

The analysis begins by loading the data, which results in data column names being organized in a data panel (Figure 1). To start the analysis, the user clicks the Recommendations icon and selects one or more columns of interest.

Figure 1. Spotfire open with data panel on the left showing homeless data. The user clicks the Recommended visualizations icon in the center of the canvas to start the analysis.

The numerical columns in this data panel are the homeless counts (PIT) and beds (HIC) by year. Other data includes a time variable (YearDate), location (by state and county), and categorical data relating to the Continuum of Care. We select homeless and state data initially. Spotfire Recommendations suggests some maps, a bar chart, and a treemap of homeless by state (Figure 2). We add a map and a treemap by state to the canvas.

Figure 2. Recommendations panel for US homeless data, state selected.

HOMELESS, BEDS, AND YEAR

We next add beds and year. Recommendations responds by suggesting cross tables, a bar chart trellised by year, and a parallel coordinates plot (Figure 3). We choose the cross table (lower right) to add to the canvas.

Figure 3. Recommendations panel for US homeless data, state, beds, and year selected.

HOMELESS BY STATE

We now have a map of homeless by state, a treemap by state, and a cross table of homeless and beds by state. Recommendations has linked and arranged these three graphs on the canvas. With a few more mouse clicks to configure the graphs, we have an accurate, interactive summary of homeless in the US (Figure 4). Creating this initial dashboard took approximately 30 seconds.

Figure 4. Dashboard showing map of homeless and utilization of shelters by state.

HOMELESS SHELTERS

Using this dashboard as a starting point, we are now able to build a comprehensive analysis of homeless across the US. This enables us to assess whether there are enough shelters for the homeless on an ongoing regional basis. Figure 5 shows such a dashboard, including a map of homeless utilization by state, trends of homeless and available beds, beds by shelter type, top states for bed utilization, and tables of homeless and bed utilization by CoC. The dashboard addresses the question: Do we have enough shelters for the homeless? Relevant KPIs are shown across the top, and visualizations are arranged for easy interpretation.

Figure 5. Completed dashboard providing a detailed analysis of homeless in the US during 2007–2013.

The dashboard in Figure 5 is rapidly assembled from that shown in Figure 4. We calculated bed utilization, configured colors on the map and bar chart, and added the trend charts and a slider for years at the top. This dashboard is set up for drill-down into regions and times of interest.

All the data is now in shape for continued analysis and for combining with additional data. We focus on Massachusetts and incorporate some weather data into the analysis. We fit contours to the temperature data (by zip code) and display precipitation by size of circle. Color of contour lines and circles indicates temperature (red is warmer and blue is colder). Figure 6 shows this updated analysis for Massachusetts. Note that the pockets of high homeless utilization to the southeast of Boston coincide with milder temperatures and lower precipitation.
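The paper does not give the formula behind the bed utilization KPI. As a rough sketch, the Python below assumes utilization is the homeless count divided by the number of beds, aggregated per state for one year. The rows, the function names `bed_utilization` and `top_states`, and the numbers are all hypothetical and are not taken from the HUD data.

```python
# Hypothetical rows: (state, year, homeless count, beds). Not real HUD figures.
rows = [
    ("MA", 2012, 900, 1000),
    ("MA", 2013, 950, 1000),
    ("TX", 2013, 400, 800),
]


def bed_utilization(rows, year):
    """Return {state: homeless / beds} for one year, skipping states with no beds."""
    totals = {}
    for state, row_year, homeless, beds in rows:
        if row_year != year:
            continue
        h, b = totals.get(state, (0, 0))
        totals[state] = (h + homeless, b + beds)
    return {s: h / b for s, (h, b) in totals.items() if b > 0}


def top_states(utilization, n=5):
    """Return the n states with the highest utilization, highest first."""
    return sorted(utilization.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    print(top_states(bed_utilization(rows, 2013)))
```

Summing counts before dividing avoids averaging ratios across regions of very different size, which matters when rolling CoC-level rows up to a state.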

Figure 6. Completed dashboard with drill-down to homeless in Massachusetts. Weather data has been incorporated: contours are fit to the temperature data, and precipitation is shown in circles (larger circles indicate more precipitation). Color of contour lines and circles indicates temperature (red is warmer and blue is colder). Note that the pockets of high homeless utilization to the southeast of Boston coincide with milder temperatures and lower precipitation.

CASE STUDY: PAPER TOWEL MANUFACTURING

Paper towel manufacturing involves equipment including dryers, dyes, feeders, cutting and pattern machines, and a series of process steps. Product quality is assessed via measurements of quality characteristics like absorbency, strength, and softness. Large quantities of data are collected on machines and on process times for each batch at every process step. The data under consideration in this simple example includes measurements of product quality and equipment operation and performance. One goal of the analysis is to assess the effects of equipment on product quality.

DATA PANEL AND COLUMN NAMES

The analysis begins by loading the data, which results in data column names being organized in a data panel. To start the analysis, the user clicks the Recommendations icon and selects one or more columns of interest. The numerical columns in this dataset are the measured product quality characteristics. Selecting these columns produces histograms, density plots, and tables in the Recommendations panel. Basic versions of the actual visualizations are displayed (not canned representations of generic chart types), so the user can see the shape of the distribution directly in the panel (Figure 7).

Figure 7. Recommendations panel with numeric columns selected. The absorbency histogram is shown in the lower right.

HISTOGRAMS

Paper towel softness appears to be normally distributed (upper right). Selecting each of the other product quality characteristics indicates that most of them are normally distributed as well. However, when absorbency is selected, the histogram shows a bimodal distribution (lower right). This is an interesting finding that needs an explanation. The absorbency histogram is selected and added to the analysis by clicking on it in the Recommendations display (Figure 7).

PROCESS DATES

To investigate whether the changes in absorbency correlate with any temporal effects, a column with process dates is now selected. In this case, a line plot, bar chart, treemap, table, and other graphics (not shown) are then displayed in the Recommendations panel (Figure 8). The line plot shows a dramatic change in absorbency over time (Figure 8, top left), so it is selected and added to the analysis.

Figure 8. Recommendations panel with absorbency numeric column and a time column selected in the Data Panel. Why does absorbency increase in the later part of July? The line plot is selected (clicked) to add it to the analysis.

Additional line plots are available from Recommendations when the numeric variables are chosen along with a time variable (Figure 9). Absorbency (lower left) stands out as being more affected by time (day of the month).

Figure 9. Recommendations panel with numeric columns and time (day of the month) selected. Absorbency is the red trace in the lower left.

MACHINE USE IN TWO PROCESS STEPS

To investigate whether the machines may be affecting absorbency at one or more process steps, additional categorical columns, each containing the machine used at a process step, are selected. The most relevant recommended graph is an absorbency line plot, colored by machines at one step and trellised by machines at the second step (Figure 10, top right). This is chosen and added to the analysis.

Figure 10. Recommendations panel with absorbency numeric column, a time column, and two categorical process machine columns selected.

DAY OF MONTH VS. LOW ABSORBENCY

The Recommendations panel is then closed to view the useful visualizations that have been added. Marking is set to correlate colors for corresponding points in the line plot and histogram (Figure 11). It is clear that low absorbency in the histogram correlates with batches processed before July 17th, and high absorbency correlates with batches processed on or after July 17th.

Figure 11. Analysis with line plot and histogram added from the Recommendations panel.

ANALYSIS OF VARIANCE

To understand the paper towel absorbency variability, we use the Spotfire analysis of variance (ANOVA) function. This enables an assessment of the effects of the machine process steps on the measured product quality characteristics. In the ANOVA setup dialog (Figure 12), absorbency is selected as a response variable (Y), and the process step equipment (tool) columns are selected as explanatory variables (X).

Figure 12. ANOVA analysis setup dialog.
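For readers who want to see what a one-way ANOVA computes for a single response-predictor pair, the following standard-library Python sketch derives the F statistic from between-group and within-group sums of squares. It is not the Spotfire ANOVA function; the helper name `one_way_f` and the absorbency readings are hypothetical. A p-value would then come from the F distribution with (k - 1, n - k) degrees of freedom, which is left out here to avoid external dependencies.

```python
from statistics import fmean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for {level: [response values]}.

    Assumes at least two levels and some within-group variation;
    otherwise the ratio is undefined.
    """
    values = [v for g in groups.values() for v in g]
    grand = fmean(values)
    k, n = len(groups), len(values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand) ** 2 for g in groups.values())
    # Variation of individual values around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical absorbency readings grouped by dryer tool (response Y, predictor X).
by_dryer = {
    "DY1": [9.0, 10.0, 11.0],
    "DY2": [5.0, 6.0, 7.0],
    "DY3": [5.0, 6.0, 7.0],
}

if __name__ == "__main__":
    print(one_way_f(by_dryer))
```

A large F means the tool means differ by much more than the batch-to-batch scatter within each tool, which is the pattern the drill-down box plot makes visible.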

ANOVA results (Figure 13) are presented as a sorted summary table with one row per response-predictor pair (product quality characteristic and process step equipment). Each row contains a p-value indicating statistical significance. The most significant relationship is the effect of Dryer Tool on absorbency. Marking this row produces a drill-down box plot showing that Dryer Tool DY1 produces paper towel batches with significantly higher absorbency than Dryer Tools DY2 and DY3.

Figure 13. ANOVA results.

DRYER TOOLS

Since the ANOVA results indicate that Dryer Tool is responsible for the variation in absorbency, the linked line plot and histogram produced with Recommendations are then configured to show this clearly (Figure 14). The earlier dashboard graphics are colored to distinguish between the Dryer Tools, and the line plot X-axis is changed to the Dryer process date. This shows that only Dryer Tools DY2 and DY3 were in use prior to July 13th and only DY1 was used on or after July 13th. It is now clear that the increase in paper towel absorbency is related to the switch to Dryer DY1. Additional investigation is warranted to determine how DY2 and DY3 could be modified to match the superior absorbency results obtained with DY1.

Figure 14. Reconfigured discovery analysis for presentation; line plots and histograms colored according to Dryer Tool.
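The observation that each dryer tool was used in a distinct date range can also be checked outside the dashboard by finding the first and last process date per tool. The Python sketch below shows one way to do this; the batch records, the year, and the function name `usage_window` are hypothetical and chosen only to mirror the July 13th switch described above.

```python
from datetime import date

# Hypothetical batches: (dryer process date, dryer tool). The year is assumed.
batches = [
    (date(2015, 7, 10), "DY2"),
    (date(2015, 7, 12), "DY3"),
    (date(2015, 7, 13), "DY1"),
    (date(2015, 7, 15), "DY1"),
]


def usage_window(batches):
    """Return {tool: (first date used, last date used)}."""
    window = {}
    for day, tool in batches:
        first, last = window.get(tool, (day, day))
        window[tool] = (min(first, day), max(last, day))
    return window


if __name__ == "__main__":
    for tool, (first, last) in sorted(usage_window(batches).items()):
        print(tool, first.isoformat(), last.isoformat())
```

Non-overlapping windows, as in this toy data, mean the tool effect and the date effect cannot be separated from this data alone, which is one reason the paper calls for additional investigation.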

CONCLUSIONS

For business users, Recommendations reduces the burden of initial dashboard setup, and for analysts, the engine dramatically speeds the creation of full-featured data analysis applications. Recommendations enables rapid insights, but does not eliminate the need for the analyst and business user to know the data structure and understand the principles of sound graphical data and information display.

This whitepaper shows two examples: a KPI dashboard and analysis for the US homeless population, and an analysis of a paper towel manufacturing process. In both cases, Recommendations enables rapid creation of initial dashboards, while also setting up the Spotfire workbook for continued advanced analysis to assess underlying correlations and root cause. The homeless analysis identifies homeless hotspots and relates these to weather patterns. The manufacturing analysis identifies issues with absorbency and relates these to the introduction of a new dryer tool.

The speed of analysis and presentation is a common theme in these examples. This productive Spotfire environment enables users to move through data quickly, to identify hotspots, and to get to the underlying issues. Insights are generated rapidly and thoroughly across all the available data, and are then available for action on the TIBCO Fast Data Platform.

ACKNOWLEDGEMENTS

This whitepaper features the remarkable engineering work of the Spotfire Recommendations Engineering team: Jörgen Gustavsson, Gustav Karlberg, Erik Brandin, and Anders Fougstedt. The manufacturing case study was prepared by Mike Alperin and Michael O'Connell from the Spotfire Analytics team. The homeless case study was based on analysis contributions from Andrew Berridge, Brad Hopper, Ujval Kamath, Jagrata Minardi, Eric Novik, Anna Nowakowska, Michael O'Connell, and Peter Shaw from the Spotfire Analytics team.

Global Headquarters
3307 Hillview Avenue
Palo Alto, CA 94304
+1 650-846-1000 TEL
+1 800-420-8450
+1 650-846-1005 FAX
www.tibco.com

TIBCO Software Inc. is a global leader in infrastructure and business intelligence software. Whether it's optimizing inventory, cross-selling products, or averting crisis before it happens, TIBCO uniquely delivers the Two-Second Advantage: the ability to capture the right information at the right time and act on it preemptively for a competitive advantage. With a broad mix of innovative products and services, customers around the world trust TIBCO as their strategic technology partner. Learn more about TIBCO at www.tibco.com.

© 2015, TIBCO Software Inc. All rights reserved. TIBCO, the TIBCO logo, TIBCO Software, and Spotfire are trademarks or registered trademarks of TIBCO Software Inc. or its subsidiaries in the United States and/or other countries. All other product and company names and marks in this document are the property of their respective owners and mentioned for identification purposes only. 05/27/15