GGobi : Interactive and dynamic



Similar documents
STAT 503X Case Study 2: Italian Olive Oils

The Value of Visualization 2

RnavGraph: A visualization tool for navigating through high-dimensional data

4/25/2016 C. M. Boyd, Practical Data Visualization with JavaScript Talk Handout

Exploratory Data Analysis with MATLAB

Introduction Predictive Analytics Tools: Weka

Cours de Visualisation d'information InfoVis Lecture. Multivariate Data Sets

Divvy: Fast and Intuitive Exploratory Data Analysis

An Introduction to WEKA. As presented by PACE

By LaBRI INRIA Information Visualization Team

Interactive Data Mining and Visualization

20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns

Use R! Series Editors: Robert Gentleman Kurt Hornik Giovanni Parmigiani

Sisense. Product Highlights.

Exploratory Data Analysis for Ecological Modelling and Decision Support

Visualization Techniques in Data Mining

NaviCell Data Visualization Python API

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Why are we teaching you VisIt?


Visualization and Visual Analytics

VisIVO, a VO-Enabled tool for Scientific Visualization and Data Analysis: Overview and Demo

COM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3

Descriptive Statistics and Exploratory Data Analysis

3D Interactive Information Visualization: Guidelines from experience and analysis of applications

TIES443. Lecture 9: Visualization. Lecture 9. Course webpage: November 17, 2006

The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA

VisIt Visualization Tool

Criteria for Evaluating Visual EDA Tools

Norwegian Satellite Earth Observation Database for Marine and Polar Research USE CASES

Geovisual Analytics Exploring and analyzing large spatial and multivariate data. Prof Mikael Jern & Civ IngTobias Åström.

imc FAMOS 6.3 visualization signal analysis data processing test reporting Comprehensive data analysis and documentation imc productive testing

Graphics - an Ace up a Statistician's Sleeve

Processing Data with rsmap3d Software Services Group Advanced Photon Source Argonne National Laboratory

an introduction to VISUALIZING DATA by joel laumans

A Tutorial on dynamic networks. By Clement Levallois, Erasmus University Rotterdam

RAPIDMINER FREE SOFTWARE FOR DATA MINING, ANALYTICS AND BUSINESS INTELLIGENCE. Luigi Grimaudo Database And Data Mining Research Group

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

imc FAMOS 6.3 visualization signal analysis data processing test reporting Comprehensive data analysis and documentation imc productive testing

Scientific Visualization with ParaView

3D Client Software - Interactive, online and in real-time

Team Members: Christopher Copper Philip Eittreim Jeremiah Jekich Andrew Reisdorph. Client: Brian Krzys

What is Data mining?

MicroStrategy Analytics Express User Guide

Data Exploration Data Visualization

ParaView s Comparative Viewing, XY Plot, Spreadsheet View, Matrix View

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Margarines and Heart Disease. Do they protect?

Data/Result: Table 1: Group 10 s starting and ending volumes of NaOH, total volume of NaOH, and weight of olive oil

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

A Multiple DNA Sequence Translation Tool Incorporating Web Robot and Intelligent Recommendation Techniques

How is EnSight Uniquely Suited to FLOW-3D Data?

Survey Public Visualization Services

Final Project Report

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Create Cool Lumira Visualization Extensions with SAP Web IDE Dong Pan SAP PM and RIG Analytics Henry Kam Senior Product Manager, Developer Ecosystem

Comparative Analysis Report:

Free 15-day trial. Signata Waveform Viewer Datasheet

VISUALIZATION OF GEOSPATIAL METADATA FOR SELECTING GEOGRAPHIC DATASETS

Exploratory Climate Data Visualization and Analysis

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features

Visualization Service Bus

TIBCO Spotfire Business Author Essentials Quick Reference Guide. Table of contents:

Astrophysics with Terabyte Datasets. Alex Szalay, JHU and Jim Gray, Microsoft Research

Visual Mining for Big Data

Trans fat-free production strategies for margarine

Visual Data Mining. Motivation. Why Visual Data Mining. Integration of visualization and data mining : Chidroop Madhavarapu CSE 591:Visual Analytics

Interactive Data Visualization using Mondrian

Take management Task automation Digital Asset Management. Tools and architecture for multimedia pipelines

Table Of Contents: I. MapifyPro: Installation. II. General Overview & License Activation. III. Map Settings. IV. Map Location Settings. V.

Multivariate Data Visualization

The Holman Omega 3 Test Report

Transcription:

GGobi : Interactive and dynamic data visualization system Bioinformatics and Biostatistics Lab., Seoul National Univ. Seoul, Korea Eun-Kyung Lee 1

Outline interactive and dynamic graphics Exploratory data analysis and Data mining What is GGobi? Main features of GGobi Demo with a couple of examples 2

Interactive vs. Dynamic Graphics Interactive graphics - a user can actively manipulate the visual graphics by input devices and make changes based on the visual result. Dynamic graphics - the visible graphics change on the computer screen without further user interaction --> Interactive and dynamic graphics 3

Interactive and dynamic graphical methods Focusing - zooming, slicing, rescaling, reformatting Arranging - rotation, grand tour, guided tour, manual tour Linking - Linked brushing and identification 4

EDA vs. DM Exploratory Data Analysis (EDA) - numerical or graphical detective work - a continuation of Tukey's idea to use graphics to find structure, general concepts, unexpected behavior, etc. in data sets by looking at the data. 5

EDA vs. DM Data Mining (DM) - Data mining is exploratory data analysis with little or no human interaction using computationally feasible techniques - Wegman Visual Data Mining (VDM) = DM + Statistical graphics 6

What is GGobi? A direct descendent of XGobi A data visualization system with interactive and dynamic methods for the manipulation of views of data. provide various plots with multiple plotting windows system use XML file format for data can be easily extended, either by being embedded in other SW or by the addition of plugins able to use in R (rggobi) 7

GGobi s main features 1. Appearance Use GTK+ single session can support multiple plots single process can support multiple independent session support several types of plots scatter plot, parallel coordinate plot, scatter plot matrix, time series plot, barchart include interactive tools to specify and tune color maps able to add variables on the fly panning and zooming 8

GGobi s Main features 2. Portability runs under various platforms, like Linux, Windows or Mac. 3. Data format XGobi : use several files (.dat,.col,.row,.glyphs,.colors, etc) use XML allow complex characteristics and relationships in data to be specified multiple dataset can be entered in a single XML file and specifications can be included for linking them 9

GGobi s Main features 4. Embedding in other SW GGobi can be treated as a C library and directly embedded in other SW, then controlled using an application program interface (API) This allows GGobi functionality to be integrated into one s own stand-alone application and provide as an add-on to existing language and scripting environments 5. Extending with plugins The plugin mechanism allows to provide add-on extensions to GGobi that are not part of the core design data viewer, ggvis (MDS), Variogram Cloud, Save Display Description 10

GGobi : File open XML from files XML from URL CSV New open new session Save as XML : keep all the information including color, glyph, etc. as CSV : keep only numeric data values 11

GGobi : Display open new plot window New Scatterplot Display New Scatterplot Matrix New Parallel coordinates Display New Time Series New Barchart 12

GGobi : View 1D plot XY plot : 2D plot 1D tour : project data into 1D space Rotation : use three variables 2D tour : project data into 2D space 2x1D tour : use 2 different 1D tour 13

GGobi : interaction Scale Brush Identify scale EditEdges : add edges or add points in Display MovePoints : move points in Display 14

GGobi : Tools Variable Manipulation ; Variable Transformation Sphering (PCA) Variable jittering : prevent point mass viewing Color Schemes; Automatic Brushing Color & Glyph groups; Case Subsetting & Sampling Missing Values plugins Data Viewer, ggvis(mds), Variogram Cloud, Save Display Description 15

RGGobi able to use GGobi in R Link, including programming customized GUIs containing linked GGobi plots, writing new linking rules in S, and responding to GGobi events, create GGobi plugins written in R. 16

Example 1 : Restaurant Tipping In early 1990, one waiter recorded information about each tip he received over a period of a few months working in one restaurant. He collected several variables; (n=244) TOTBILL : total bill in dollars TIP : tip in dollars SEX : gender of the bill payer; male(0), female(1) SMOKER : whether the party included smokers or not No(0), Yes(1) DAY : days of week ; Thu(3), Fri(4), Sat(5), Sun(6) TIME : lunch(0), dinner(1) SIZE : size of the party 17

Example 2 : Italian Olive Oils This data consists of the percentage composition of 8 fatty acids found in the lipid fraction of 572 Italian olive oils (n=572) Region : South(1),North(2) or Sardinia(3) Area : North-Apulia(1), Calabria(2), South Apulia(3), Sicily(4), Inland Sardinia(5), Costal Sardinia(6), East Liguria(7), West Liguria(8), and Umbria(9) Palmitic Palmitoleic Stearic Oleic Linoleic Linolenic Arachidic Eicosenoic 18

Italy North Italy 8 7 9 5 1 3 South Italy Sardinia 6 2 4 19

Example 3 : Leukemia data n = 72 : # of observation p = 3571 p=40 : # of genes acute myeloid leukemia(aml) : 25 cases acute lymphoblastic leukemia(all) : 47 cases B-cell ALL : 38 cases T-cell ALL : 9 cases 20

explorase Visual data mining tools for microarray data and metabolic networks Visual data analysis interface for microarray data and metabolic networks Based on R and GGobi Provide EDA tool using direct manipulation analyze the connections between microarray data and metabolic pathway visually and interactively combine statistical analysis results with interactive plots to improve the analysis. 21

explorase 22

Discussion Full marriage between GGobi s direct manipulation graphical environment and R s familiar extensible environment for statistical data analysis powerful tool for visual data mining * all references and documents are on the web http://www.ggobi.org 23