Information Visualization Course, InfoVis Lecture: Multivariate Data Sets




Information Visualization Course, InfoVis Lecture: Multivariate Data Sets
Frédéric Vernier, Maître de conférence / Lecturer, Univ. Paris Sud
Inspired by CS 7450 (John Stasko) and CS 5764 (Chris North)

Data Sets
- Data comes in many different forms
- Typically, not in the form you want
- How is it stored (in the raw)?
- Heterogeneous data is often seen as multiple dimensions of elements extracted by patterns or needs.

Data set!

Schema
- Cars
  - brand
  - model
  - year
  - cost
  - size
  - weight
  - miles per gallon
M2R InfoVis Lecture. 2011. Univ. Paris Sud

Data Tables
- Often, we take raw data and transform it into a more workable form
- Main idea:
  - Individual items are called cases
  - Cases have variables (attributes)

Variable Types
- N-Nominal (equal or not equal to other values)
  - Example: gender, hair color (blond, brown, black, red)
- O-Ordinal (obeys the < relation; ordered set)
  - Example: soccer leagues, rainbow colors
- Q-Quantitative (can do math on them)
  - Example: age, Photoshop colors

Variable Types
- Three main types of variables
  - N-Nominal
    - By class: data belong or do not belong to classes (.org, .com, .fr)
    - Partially ordered: order on classes (engineering students)
  - O-Ordinal
  - Q-Quantitative
    - Quantitative + 0 (a clear zero point)
- Sometimes the type depends on the context
- O-Ordinal is always possible
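The N/O/Q distinction can be read as a statement about which operations make sense on a variable. The sketch below is only an illustration of that reading: the names `VarType`, `ALLOWED`, `supports` and `CARS_SCHEMA` are hypothetical, and the typing of the Cars attributes is an assumption (as the slide notes, the type sometimes depends on the context).

```python
from enum import Enum


class VarType(Enum):
    NOMINAL = "N"
    ORDINAL = "O"
    QUANTITATIVE = "Q"


# Operations that are meaningful for each type, following the slide:
# nominal values can only be compared for equality, ordinal values can
# also be ordered, quantitative values also support arithmetic.
ALLOWED = {
    VarType.NOMINAL: {"=="},
    VarType.ORDINAL: {"==", "<"},
    VarType.QUANTITATIVE: {"==", "<", "-"},
}


def supports(var_type, op):
    """Return True if the operation is meaningful for this variable type."""
    return op in ALLOWED[var_type]


# Assumed typing of the Cars schema (illustrative, not from the lecture).
CARS_SCHEMA = {
    "brand": VarType.NOMINAL,
    "model": VarType.NOMINAL,
    "year": VarType.ORDINAL,
    "cost": VarType.QUANTITATIVE,
    "size": VarType.ORDINAL,
    "weight": VarType.QUANTITATIVE,
    "miles per gallon": VarType.QUANTITATIVE,
}
```

With this table, `supports(CARS_SCHEMA["brand"], "<")` is false (brands have no natural order), while `supports(CARS_SCHEMA["cost"], "-")` is true.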

Example Baseball statistics

Metadata
- Descriptive information about the data
- Might be something as simple as the type of a variable (INT), or could be more complex
- For times when the table itself just isn't enough
  - AtBats Hit HomeRuns
  - if YearInMasterLeague = 1 then AtBats = CareerAtBat
  - if a player is injured for more than half of the season, the average does not take that season into account
  - First-season stats are not backed up by the

How Many Variables?
- Data sets of dimensions 1, 2, 3 are common
- Number of variables per class
  - 1 - Univariate data (e.g. timeline)
  - 2 - Bivariate data (e.g. maps)
  - 3 - Trivariate data (volume)
  - >3 - Hypervariate data (???)
- Example: www.nationmaster.com
  - Cases always the same

Univariate
- Representations
  - Dot plot
  - Bar chart (item vs. attribute)
  - Tukey box plot
  - Histogram

Bivariate
- Scatterplot: common BUT powerful

Density problem

Trivariate
- 3D scatterplot, 2D plot + size, 2D plot + color, 3x bar chart

Hypervariate Data
- What about data sets with MANY variables?
  - Often the interesting ones
- n-D: what does 10-D space look like?

Multiple Projections
Give each variable its own display

Case  A  B  C  D  E
1     4  1  8  3  5
2     6  3  4  2  1
3     5  7  2  4  3
4     2  6  3  1  5

What if more than 4 cases?

Help me Infovis!
- smart layout
- using graphical

Scatterplot Matrix
All pairs of variables, each in its own 2-D scatterplot
Brushing (subset) & Linking (sync.) [Voigt, 2002]
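A minimal sketch of the two ideas on this slide, under stated assumptions: `splom_panels` enumerates the off-diagonal panels of the matrix, and `brush` returns the indices of the cases inside a rectangular selection in one panel. Linking then amounts to highlighting those same indices in every other panel. The function names and the four example cases are illustrative and do not come from a particular tool.

```python
def splom_panels(variables):
    """Return (row_variable, column_variable) for every off-diagonal panel."""
    return [(r, c) for r in variables for c in variables if r != c]


def brush(cases, x_var, y_var, x_range, y_range):
    """Return the indices of the cases inside a rectangular brush.

    The brush is drawn in the panel showing x_var against y_var; the
    returned index set is what linked panels would highlight.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    return {
        i
        for i, case in enumerate(cases)
        if x0 <= case[x_var] <= x1 and y0 <= case[y_var] <= y1
    }


# Illustrative cases with five variables each.
cases = [
    {"A": 4, "B": 1, "C": 8, "D": 3, "E": 5},
    {"A": 6, "B": 3, "C": 4, "D": 2, "E": 1},
    {"A": 5, "B": 7, "C": 2, "D": 4, "E": 3},
    {"A": 2, "B": 6, "C": 3, "D": 1, "E": 5},
]

panels = splom_panels("ABCDE")               # 5 * 4 = 20 panels
selected = brush(cases, "A", "B", (4, 6), (0, 4))
```

Because the selection is stored as case indices rather than as screen coordinates, the same set can be reused unchanged in each of the other panels, which is what keeps the views synchronized.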

label, dot plot, scale
Histogram > dot plot for distribution
Scale row & column

On steroids

Chernoff Faces
Encode the values of different variables in the characteristics of a human face

Simple Example [Turner, 1977] [Spinelli and Zhou, 2004]

On steroids
Look at faces, not colors

Star Plots / Glyphs
Space out the n variables at equal angles around a circle
Each spoke encodes a variable's value
[Figure: star plot with five spokes, Var 1 to Var 5; spoke length = value]
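The construction described on this slide can be sketched as follows. This is an illustration only: the function name `star_plot_vertices`, the choice of starting the first spoke at angle 0, and the scaling of each value by a per-variable maximum are assumptions.

```python
import math


def star_plot_vertices(values, max_values, radius=1.0):
    """Return the (x, y) end point of each spoke of one star glyph.

    Spoke k points at angle 2*pi*k/n, so the n variables are spaced at
    equal angles; its length is the value scaled by that variable's maximum.
    """
    n = len(values)
    vertices = []
    for k, (value, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * k / n
        length = radius * value / vmax
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices


# One case with four variables, each at its maximum: a regular diamond.
glyph = star_plot_vertices([1, 1, 1, 1], [1, 1, 1, 1])
```

Joining the returned vertices in order gives the outline of the glyph; one glyph is drawn per case, and where each glyph is placed on screen is left free.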

Examples: circular parallel coordinates
Star plot or glyph plot => freedom on layout!

On prednisone... just 2 dims [Bertillon]
population x percent foreigners
area = number of foreigners

On steroids (count)

On steroids (dim)

Star Coordinates
E. Kandogan, Star Coordinates: A Multi-dimensional Visualization Technique with Uniform Treatment of Dimensions, InfoVis 2000 Late-Breaking Hot Topics, Oct. 2000
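A minimal sketch of the basic mapping behind star coordinates, assuming the usual formulation in which each dimension is an axis vector from a common origin and a case is placed at the sum of those vectors scaled by its normalized values. The function name and the default layout (unit-length axes at equal angles) are assumptions for illustration.

```python
import math


def star_coordinates(case, mins, maxs, angles=None):
    """Project one n-dimensional case to a 2-D point.

    Each value is normalized to [0, 1] using that dimension's min and max
    (assumed to differ), then multiplied by the unit vector of its axis;
    the point is the sum of these scaled axis vectors.
    """
    n = len(case)
    if angles is None:
        angles = [2 * math.pi * k / n for k in range(n)]
    x = y = 0.0
    for value, lo, hi, angle in zip(case, mins, maxs, angles):
        t = (value - lo) / (hi - lo)
        x += t * math.cos(angle)
        y += t * math.sin(angle)
    return x, y


# With two opposite axes, the cases (0, 0) and (1, 1) land on the same point.
origin = star_coordinates([0, 0], [0, 0], [1, 1])
overlap = star_coordinates([1, 1], [0, 0], [1, 1])
```

The projection is many-to-one, so different cases can coincide, as in the example above. Passing different `angles` (rotating an axis) or rescaling an axis separates such cases, which is why the technique relies on interaction.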

Demo - Interaction
- Activate / deactivate axis
- Color selection or axis
- Glyph coordinates
- Scale axis
- Rotate axis
- Dot size
- Brushing on axis
- Trail
- Inspector
- Panning

Parallel Coordinates
By A. Inselberg
Encode variables along a horizontal row
A vertical line specifies the values of each variable
[Figure: parallel vertical axes V1 to V5]
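The layout can be sketched as below: the axes are spread evenly along the horizontal direction, and each case becomes a polyline whose height on each axis is its normalized value. The function name and the normalization by per-variable minimum and maximum are assumptions made for this example.

```python
def parallel_coords_polyline(case, mins, maxs, width=1.0, height=1.0):
    """Return the polyline vertices for one case.

    Axis k sits at x = width * k / (n - 1); the vertex height on that axis
    is the value normalized by the variable's min and max (assumed to differ).
    """
    n = len(case)
    points = []
    for k, (value, lo, hi) in enumerate(zip(case, mins, maxs)):
        x = width * k / (n - 1) if n > 1 else 0.0
        y = height * (value - lo) / (hi - lo)
        points.append((x, y))
    return points


# A case whose three values rise from the minimum to the maximum.
line = parallel_coords_polyline([0, 5, 10], [0, 0, 0], [10, 10, 10])
# line == [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
```

Only adjacent axes are joined by a line segment, so the order in which the variables are laid out determines which pairwise relations are easy to read.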

Parallel Coords Example
Basic, Grayscale, Color
From: Dean F. Jerding and John T. Stasko, http://www.cc.gatech.edu/gvu/softviz/infoviz/information_mural.html

And more cars

With brushing

and more brushing

On steroids

VisDB
- Database of data items, each of n dimensions
- Issue a query that specifies a target value of the dimensions
- Often get back no exact matches
- Want to find near matches
- Relevance factor
- metadata
Taken from: D. Keim, H-P Kriegel, VisDB Database Exploration Using Multid Vis, IEEE CG&A, 1994.

Technique
- Calculate relevance of all data points
- Sort items based on relevance
- Use spiral technique to order the values
- Color items based on relevance
[Color scale: from High to Low relevance, empirically established]
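The steps above can be sketched as follows. This is not the VisDB implementation: the distance function (a sum of per-dimension absolute differences, each divided by that dimension's range) is an assumption chosen for simplicity, and all function names are hypothetical. Coloring is omitted; it would map the same distance to a color scale.

```python
def distance(item, target, ranges):
    """Assumed distance to the query target; smaller means more relevant."""
    return sum(abs(v - t) / r for v, t, r in zip(item, target, ranges))


def rank_by_relevance(items, target, ranges):
    """Return item indices sorted from most to least relevant."""
    return sorted(range(len(items)),
                  key=lambda i: distance(items[i], target, ranges))


def spiral_positions(count):
    """Return `count` grid cells spiralling outward from the center (0, 0)."""
    positions = [(0, 0)][:count]
    x = y = 0
    dx, dy = 1, 0
    leg = 1
    while len(positions) < count:
        for _ in range(2):              # two legs share each leg length
            for _ in range(leg):
                x, y = x + dx, y + dy
                positions.append((x, y))
                if len(positions) == count:
                    return positions
            dx, dy = -dy, dx            # quarter turn
        leg += 1
    return positions


def spiral_layout(items, target, ranges):
    """Map each item index to its cell: most relevant item at the center."""
    order = rank_by_relevance(items, target, ranges)
    return dict(zip(order, spiral_positions(len(order))))


items = [(10, 10), (1, 1), (5, 5)]
layout = spiral_layout(items, target=(0, 0), ranges=(10, 10))
# layout == {1: (0, 0), 2: (1, 0), 0: (1, 1)}
```

Because each item's cell depends only on its rank, reusing the same `layout` for every per-dimension window keeps an item in the same place in each of them.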

Display Methodology
- Highest relevance value in center, decreasing values grow outward
- Items ordered by total relevance
- Spiral in each window
- Same item appears in same place in each window
[Figure: one window for Total relevance and one each for Dim 1 to Dim 5]

Figure from Paper

Example Display

Alternative
- Grouping arrangement => single window
- Create all relevance dimensional depictions for an item and group them
- Spiral out the different data items

Example: 8 dimensions, 1000 items
[Figure panels: Multi-window, Grouping]

On Steroids?

Overview
- Scatterplot Matrix
- Chernoff Faces
- Star Plots / Glyphs
- Star Coordinates
- Parallel Coordinates
- Spiral plots

More techniques?
- Combinations
- More integrated software
- legacy spreadsheet layout

SeeIt

Highlighted Dynamic Table Viewer Nada Golmie & Bill Kules

InfoZoom

SpotFire

Spotfire

Advizor

IBM ILOG Discovery

Eureka / TableLens [Rao & Card, 1994]

Focus + context

EZChooser: K. Wittenburg

Comparisons
- ParCoord: <1000 items, <20 attrs
  - Relate between adjacent attr pairs
- StarCoord: <1,000,000 items, <20 attrs
  - Interaction intensive
- TableLens: similar to par-coords
  - more items with aggregation
  - Relate 1:m attrs (sorting), short learn time
- VisDB: 100,000 items with 10 attrs
  - Items*attrs = screenspace, long learn time, must query
- Spotfire: <1,000,000 items, <10 attrs (DQ many)
  - Filtering, short learn time

MultiVariate Visu Tools INTERACTION is the key!

Paper presentations
- Hajar Falih
  - Multi-Dimensional Detective
- Thibaut Jacob
  - Rolling the Dice: Multidimensional Visual Exploration using Scatterplot Matrix Navigation

06/12/2011
- 90 min Lecture: Multi-dimensional Data Visualization
- 10 min Break
- 30 min Paper presentations (students)
- 40 min Lab work on Processing: interaction (Dragicevic & Vernier)