Create interactive web graphics out of your SAS or R datasets




Paper CS07

Create interactive web graphics out of your SAS or R datasets

Patrick René Warnat, HMS Analytical Software GmbH, Heidelberg, Germany

ABSTRACT

Several commercial software products allow the creation of interactive graphics. For some tasks, open source software solutions can be sufficient to present data in an interactive way. This talk shows, for one example, how an ensemble of plots (bar chart, pie chart, ...) can be created for display in a web browser, visualizing data delivered by a SAS Stored Process or by an HTTP service interface to R. The ensemble of plots is interactive: clicking on, and thereby selecting, a part of one graph updates and filters all other graphs so that they show only the corresponding subset of the data. The solution enables users to drill down interactively into subsets of a data set, providing a flexible graphical examination of the data.

INTRODUCTION

Data visualization is a powerful tool for the analysis of quantifiable information, because it draws on a particular strength of human beings: visual perception. Through the graphic representation of data, relationships can be identified faster and more easily than without visualization. This effect is further intensified by visualization software that supports interactive work with graphics. Several commercial software products provide tools for generating interactive graphics for a wide variety of use cases. For some tasks, open source software solutions can be sufficient to present data in an interactive way.

This paper shows how to present variables of a dataset graphically and interactively, using ubiquitously available web technologies and open source software libraries. An ensemble of plots (bar chart, pie chart, ...) is created for display in a web browser using HTML, CSS and JavaScript.
In the simplest use case, the visualized data is provided as a flat file; as an alternative, it is shown how data delivery could be implemented with an R or SAS based server backend. The ensemble of plots is interactive: clicking on, and thereby selecting, a part of one graph updates and filters all other graphs so that they show only the corresponding subset of the data. Users can thus drill down interactively into subsets of a data set. The solution shown here for one particular example dataset is easily applicable to other datasets. It can serve as a blueprint for a quick and easy way to provide interactive diagrams, with the option to connect to different data serving backend technologies.

The remainder of this paper is structured as follows: first, the specific example scenario is described in detail; second, the technical details of the web based interactive graphics are explained; then two possible variants of data providing backends are outlined; finally, a conclusion is given.

EXAMPLE SCENARIO

As an example, data from the website ClinicalTrials.gov were used: descriptive data for studies found on ClinicalTrials.gov with the search term "Influenza" (search results retrieved on 13 August 2015). For this search, n = 2513 studies were found, and the descriptive data was downloaded as a tsv file (tab-separated values) using the download feature on the search results page of ClinicalTrials.gov. The downloaded file contains 2514 rows: the first row holds the column headers and every following row describes one study.
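A file of this shape (one header row, then one row per study) can be turned into records and per-category counts with a few lines of plain JavaScript. The following is only a minimal sketch under stated assumptions: the functions parseTsv and countBy are hypothetical helpers, the sample rows and column names are made up for illustration, and the simple parser does not handle quoted fields. Empty cells are replaced by the marker "NA", mirroring the function replacemissingwithmarkerna in the appendix.

```javascript
// Parse tab-separated text whose first row holds the column headers.
// Returns one plain object per data row, keyed by column header.
// Assumption: no quoted fields and no tabs inside cell values.
function parseTsv(text) {
  const lines = text.split(/\r?\n/).filter(function (line) {
    return line.length > 0;
  });
  const headers = lines[0].split("\t");
  return lines.slice(1).map(function (line) {
    const cells = line.split("\t");
    const record = {};
    headers.forEach(function (header, i) {
      // Empty or absent cells become the marker "NA"
      record[header] = cells[i] ? cells[i] : "NA";
    });
    return record;
  });
}

// Count how often each value of one column occurs.
function countBy(records, column) {
  const counts = {};
  records.forEach(function (record) {
    counts[record[column]] = (counts[record[column]] || 0) + 1;
  });
  return counts;
}

// Made-up sample in the shape described above (NOT real study data).
const sampleTsv = [
  "Study Types\tPhases",
  "Interventional\tPhase 2",
  "Observational\t",
  "Interventional\tPhase 3"
].join("\n");

const sampleRecords = parseTsv(sampleTsv);
const sampleTypeCounts = countBy(sampleRecords, "Study Types");
console.log(sampleRecords.length);
console.log(sampleTypeCounts);
```

Counts of this kind, one number per category of a column, are what each diagram of the panel displays.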
Out of the available variables (columns) describing the studies found, the following four categorical variables were used:

- Study type: observational or interventional
- Study results availability: results available at ClinicalTrials.gov or not
- Study phase
- Age group: investigated age group

WEB FRONTEND WITH INTERACTIVE GRAPHICS

In order to create interactive diagrams for the example data set, HTML, CSS and JavaScript were used. More specifically, a custom web page was created and placed on an HTTP server. The page uses the open source JavaScript library dc.js [1] to create an interactive panel of diagrams, which is used by opening the page in a web browser (see figure 1). All plots are interactive: clicking on, and thereby selecting, a part of one graph updates and filters all other graphs so that they show only the corresponding subset of the data. Figure 2 shows the panel of charts after selection of a slice in the second pie chart, namely only studies for which results are available at ClinicalTrials.gov. This interaction selects 415 studies, and all other charts are automatically updated to show only the corresponding subset of the data. A line of text reports the number of currently selected records and offers a reset of all selections.

Figure 1: Demo panel of interactive graphics as rendered in a web browser. The diagrams display the number of category occurrences for four different attributes.

Figure 2: Demo panel after interaction. In the second pie chart, the slice "Has Results" was clicked, thereby selecting only studies for which results are available. All other charts are automatically updated to show only the corresponding subset of the data.

The following files are used to create the panel of interactive diagrams. The file index.html contains the HTML for the web page structure and the JavaScript code that reads in the data and creates the diagrams with the library dc.js. The dc.js library is located in the js subdirectory as a minified version, along with the two libraries on which dc.js depends (crossfilter.js [2] and d3.js [3]). The data is contained as a tab-separated file in the subdirectory data. The css subdirectory contains two files: dc.css is provided together with dc.js, and style.css was created to modify the styling of the diagrams, in particular the size and color of the text labels. The full source code of the files style.css and index.html is printed in the appendix.

The contents of the file index.html can be summarized with a list of code blocks as follows:

- HTML head with title definition and links (imports) of CSS files
- HTML body:
  - headline definitions
  - several div blocks for the definition of the different panel elements
  - links (imports) of the JavaScript libraries
  - custom JavaScript code to read in the data and define the diagrams:
    - function replacemissingwithmarkerna: simple replacement of empty strings with the string "NA", used during data import
    - function createcharts: defines the interactive diagrams using dc.js; this function is designed to be used as the callback of a d3 data import function (see below)
    - a call to the function d3.tsv, which reads tab-separated files, allows the definition of preprocessing (here a call to replacemissingwithmarkerna), and calls the function createcharts once the data has been read and is available

The most interesting part of the JavaScript source code is the function createcharts. In this function, the diagrams shown are defined using a declarative syntax. The following example shows the steps necessary to define one pie chart.

    // The variable data contains the tabular input data read by the function
    // d3.tsv as a list of JSON objects.
    // The crossfilter function takes a list of JSON objects and creates a
    // crossfilter object.
    var crf = crossfilter(data);

    // Using the crossfilter object, we define the column "types" of the data
    // as a dimension, which can be used to group or filter data.
    var typesDimension = crf.dimension(function(d) { return d.types; });

    // The group function constructs a new grouping for the given dimension,
    // according to a specified groupValue function. The groupValue function is
    // optional; if it is not specified, as is the case here, the number of
    // records per group is counted.
    var typesGroup = typesDimension.group();

    // Define a pie chart. The referenced HTML div element defines where on the
    // page the diagram is located; width, height and radius define the size of
    // the diagram (only the radius is set here); dimension and group define
    // the information shown.
    var typesPieChart = dc.pieChart('#chart-pie-types');
    typesPieChart.radius(110).dimension(typesDimension).group(typesGroup);

    // Finally, a function call to render the diagrams on the page.
    dc.renderAll();

All other charts are defined in a similar manner. Because the same crossfilter object is used to define the dimensions and groups of the different diagrams, they are all interconnected as described above. Thus, no explicit programming is necessary to create a panel of interactive graphics; this functionality is provided entirely by the dc.js library.

In the example above, the data is provided as a tab-separated text file (tsv), located directly on the HTTP server that provides the files for the front end (HTML, CSS, JavaScript). As described, this file was downloaded manually from ClinicalTrials.gov for this demo. In other scenarios, the data file could be generated or updated automatically by scheduled backend processes, such as the scheduled execution of SAS or R programs. As an alternative, the requested data could be provided on the fly by HTTP based services, because the data reading functions of the d3 library, such as d3.tsv, are based on HTTP GET requests. The following two sections give an overview of how such a data providing service could be implemented using SAS or R based technologies.

SAS STORED PROCESS AS DATA PROVIDING BACKEND

Using the SAS Stored Process Web Application [4], SAS Stored Processes (STPs) can be invoked directly with HTTP GET, and STPs can return data as part of the corresponding HTTP response. For use with the d3.tsv (or similarly d3.csv) data import function of the example above, it would be possible to implement an STP that directly returns a data set in tsv format. In the call of the d3.tsv function in the front end JavaScript code, the relative path of a tsv file on the web server would then be replaced with the URL of the STP that provides the data. Please note that this example scenario assumes that the web server providing the front end code and the Stored Process Web Application providing the STP HTTP interface run on the same host (more precisely, that they share the same origin, that is, the same scheme, host and port). Otherwise, most browsers will deny the call to the STP by default, due to a violation of the same-origin policy [5]. As an alternative, it is possible to provide the front end code itself as the result of an STP, or to make use of JSONP, a method to encapsulate the client-server communication.

HTTP SERVICE INTERFACE TO R AS DATA PROVIDING BACKEND

Several software solutions are available to implement a data providing HTTP service based on R. One of them is OpenCPU [6]. OpenCPU is a system that provides an HTTP API to an R installation, offering ways to call R functions or R scripts and to retrieve data over HTTP. The OpenCPU system is available in two variants: the first is an R package that can be used in a local R installation for development, the second is a Linux server installation package for use in production. In addition, there is a publicly accessible server installation on the domain opencpu.org. For example, using the public OpenCPU server, which has the R package MASS installed, the URL to retrieve the data set Cars93 of the package MASS as a csv file is:

    http://public.opencpu.org/ocpu/library/MASS/data/Cars93/csv

A URL like this can be used directly with the d3 import functions, such as d3.csv, to import data sets as described in the example scenario. Be aware that the same-origin policy [5], as implemented in web browsers, needs to be taken into consideration here as well.
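The URL pattern and the same-origin condition can be sketched in a few lines of JavaScript. This is an illustration under stated assumptions, not part of the demo: the helper names openCpuDataUrl and isSameOrigin and the page URL www.example.org are hypothetical, and the path pattern is simply generalized from the example URL above. The check only compares origins (scheme, host and port) and does not model everything a browser does.

```javascript
// Build the URL of a data set exposed by an OpenCPU server, generalizing the
// example URL above: /ocpu/library/<package>/data/<dataset>/<format>
function openCpuDataUrl(serverBase, pkg, dataset, format) {
  const path =
    "/ocpu/library/" + encodeURIComponent(pkg) +
    "/data/" + encodeURIComponent(dataset) +
    "/" + encodeURIComponent(format);
  return new URL(path, serverBase).href;
}

// Two URLs share an origin if scheme, host and port all match.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Hypothetical location of the front end page (assumption for illustration).
const pageUrl = "http://www.example.org/demo/index.html";

// Data URL on the public OpenCPU server named in the text.
const publicDataUrl = openCpuDataUrl("http://public.opencpu.org", "MASS", "Cars93", "csv");

// The same path requested through the web server that delivers the page.
const proxiedDataUrl = openCpuDataUrl("http://www.example.org", "MASS", "Cars93", "csv");

console.log(publicDataUrl);
console.log(isSameOrigin(pageUrl, publicDataUrl));  // different host
console.log(isSameOrigin(pageUrl, proxiedDataUrl)); // same scheme, host and port
```

A data URL that shares the origin of the page, as in the last call, is not affected by the same-origin restriction.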
A possible solution is to configure the web server that provides the front end code so that it proxies the data requests to a local OpenCPU server instance (see figure 3).

Figure 3: Possible scenario of using an OpenCPU server together with a web server. The diagram shows a web browser, a web server and an OpenCPU server, connected by six numbered steps:
(1) The web browser requests the index.html page.
(2) The web server delivers the front end code (HTML, CSS and JavaScript) to the web browser.
(3) The delivered JavaScript code, executed in the web browser, fetches data (e.g. by a call to the d3.csv function) using a URL on the web server.
(4) The web server acts as a reverse proxy to the OpenCPU server.
(5) The OpenCPU server provides data out of an R installation over HTTP.
(6) The reverse proxy feature of the web server passes the data to the web browser.

CONCLUSION

Graphical presentation of data is very helpful, especially for finding and interpreting relationships. Interactive data visualizations let users choose, to a certain degree, which aspects of the presentation they want to concentrate on and in which order they explore different aspects. Interactive diagrams can be created with several commercial software packages. For some tasks, open source software solutions can be sufficient to present data in an interactive way, and this paper showed one way to accomplish this with minimal effort. The presented example can be enhanced in several ways, combined with different data providing backends, and integrated into existing front end code.

REFERENCES

Web links as accessed on 2 September 2015:
[1] http://dc-js.github.io/dc.js/
[2] http://square.github.io/crossfilter/
[3] http://d3js.org/
[4] http://support.sas.com/documentation/cdl/en/stpug/68399/html/default/viewer.htm#n1gt44n8wc0la0n18s9kajwq0o2q.htm
[5] https://en.wikipedia.org/wiki/Same-origin_policy
[6] https://www.opencpu.org/

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Dr. Patrick R. Warnat
HMS Analytical Software GmbH
Rohrbacher Str. 26
69115 Heidelberg
Germany
http://www.analytical-software.de/en/

Brand and product names are trademarks of their respective companies.

APPENDIX

Full source code of the files index.html and style.css as described in the section "Web Frontend with Interactive Graphics" of this paper.

File index.html

    <!DOCTYPE html>
    <html lang="en">
    <head>
      <meta charset="utf-8">
      <title>Demo for interactive web graphics - data from ClinicalTrials.gov</title>
      <link rel="stylesheet" href="./css/dc.css">
      <link rel="stylesheet" href="./css/style.css">
    </head>
    <body>
      <h1>Demo for interactive web graphics - data from ClinicalTrials.gov</h1>
      <h2>Descriptive data for studies found by search term "Influenza" at 13/August/2015</h2>

      <div id="chart-pie-types"></div>
      <div id="chart-pie-results"></div>
      <div id="chart-row-phases"></div>
      <div id="chart-row-agegroups"></div>

      <div class="dc-data-count">
        <span class="filter-count"></span> selected out of
        <span class="total-count"></span> records
        <a href="javascript:dc.filterAll(); dc.renderAll();">Reset All</a>
      </div>

      <script type="text/javascript" src="./js/d3.min.js"></script>
      <script type="text/javascript" src="./js/crossfilter.min.js"></script>
      <script type="text/javascript" src="./js/dc.min.js"></script>
      <script type="text/javascript">
        // Simple replacement of empty strings with the string "NA",
        // used during data import.
        var replacemissingwithmarkerna = function(value) {
          var res = "NA";
          if (value) {
            res = value;
          }
          return res;
        };

        // Function to define the interactive diagrams using dc.js.
        // This function is designed to be used as the callback of a
        // d3 data import function (see below).
        var createcharts = function(data) {
          // The variable data contains the tabular input data read by the
          // function d3.tsv as a list of JSON objects.
          // The crossfilter function takes a list of JSON objects and
          // creates a crossfilter object.
          var crf = crossfilter(data);
          var all = crf.groupAll();

          // Using the crossfilter object, we define selected columns of the
          // data as dimensions, which can be used to group or filter data.
          // The group function constructs a new grouping for the given
          // dimension, according to a specified groupValue function. The
          // groupValue function is optional; if it is not specified, as is
          // the case here, the number of records per group is counted.
          var resultsDimension = crf.dimension(function(d) { return d.results; });
          var resultsGroup = resultsDimension.group();
          var typesDimension = crf.dimension(function(d) { return d.types; });
          var typesGroup = typesDimension.group();
          var phasesDimension = crf.dimension(function(d) { return d.phases; });
          var phasesGroup = phasesDimension.group();
          var ageGroupsDimension = crf.dimension(function(d) { return d.agegroups; });
          var ageGroupsGroup = ageGroupsDimension.group();

          // Define a pie chart. The referenced HTML div element defines
          // where on the page the diagram is located; width, height and
          // radius define the size of the diagram; dimension and group
          // define the information shown.
          var resultsPieChart = dc.pieChart('#chart-pie-results');
          resultsPieChart.radius(110).dimension(resultsDimension).group(resultsGroup);

          // Define a pie chart.
          var typesPieChart = dc.pieChart('#chart-pie-types');
          typesPieChart.radius(110).dimension(typesDimension).group(typesGroup);

          // Define a row chart (horizontal bar chart).
          var phasesRowChart = dc.rowChart('#chart-row-phases');
          phasesRowChart
            .margins({top: 20, left: 10, right: 10, bottom: 20})
            .dimension(phasesDimension)
            .group(phasesGroup)
            .elasticX(true)
            .xAxis().ticks(4);

          // Define a row chart (horizontal bar chart).
          var ageGroupsRowChart = dc.rowChart('#chart-row-agegroups');
          ageGroupsRowChart
            .margins({top: 20, left: 10, right: 10, bottom: 20})
            .dimension(ageGroupsDimension)
            .group(ageGroupsGroup)
            .elasticX(true)
            .xAxis().ticks(4);

          // Define a data count for display of the number of selected
          // items and the total number of items.
          var selectedDataCount = dc.dataCount('.dc-data-count');
          selectedDataCount.dimension(crf).group(all);

          // Finally, a function call to render the diagrams on the page.
          dc.renderAll();
        };

        // Read in the data.
        d3.tsv(
          // Data source URL; can be a local flat file or a file from a server.
          "./data/study_fields.tsv",
          // Accessor function for data row processing. It defines which
          // columns are read and that they are preprocessed with the
          // function replacemissingwithmarkerna.
          function(d) {
            return {
              types: replacemissingwithmarkerna(d["Study Types"]),
              results: replacemissingwithmarkerna(d["Study Results"]),
              phases: replacemissingwithmarkerna(d["Phases"]),
              agegroups: replacemissingwithmarkerna(d["Age Groups"])
            };
          },
          // Callback function, called when the data is available.
          function(data) {
            createcharts(data);
          }
        );
      </script>
    </body>
    </html>

File style.css

    #chart-pie-results .pie-slice {
      fill: black;
      font-size: 14px;
    }

    #chart-pie-types .pie-slice {
      fill: black;
      font-size: 14px;
    }

    #chart-row-phases .row text {
      fill: black;
      font-size: 14px;
    }

    #chart-row-agegroups .row text {
      fill: black;
      font-size: 14px;
    }
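The linked behaviour of the charts defined in index.html rests on one idea: the counts of each chart are computed over the records that pass the selections made in all other charts, while a chart's own selection does not reduce its own categories. The following library-free sketch illustrates that idea only. It is not how crossfilter is implemented; the function linkedCounts and the sample records are hypothetical and made up for illustration, reusing the property names types and results from index.html.

```javascript
// Library-free sketch of linked filtering (NOT crossfilter's implementation).
// filters maps a column name to the selected value,
// e.g. { results: "Has Results" }.
function linkedCounts(records, column, filters) {
  const counts = {};
  records.forEach(function (record) {
    // A record counts for this chart if it passes the filters
    // of all OTHER columns; the chart's own filter is ignored.
    const passes = Object.keys(filters).every(function (key) {
      return key === column || record[key] === filters[key];
    });
    if (passes) {
      counts[record[column]] = (counts[record[column]] || 0) + 1;
    }
  });
  return counts;
}

// Made-up records (NOT real study data).
const demoRecords = [
  { types: "Interventional", results: "Has Results" },
  { types: "Interventional", results: "No Results Available" },
  { types: "Observational", results: "No Results Available" },
  { types: "Observational", results: "Has Results" },
  { types: "Interventional", results: "Has Results" }
];

// Simulates a click on the slice "Has Results" in the results chart.
const demoSelection = { results: "Has Results" };

// The types chart now counts only records with results.
const demoTypeCounts = linkedCounts(demoRecords, "types", demoSelection);

// The results chart itself still counts all of its categories.
const demoResultCounts = linkedCounts(demoRecords, "results", demoSelection);

console.log(demoTypeCounts);
console.log(demoResultCounts);
```

In the demo itself none of this has to be written by hand, since dc.js and crossfilter maintain the counts incrementally as selections change.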