Visualization Techniques in Data Mining



Similar documents
An example. Visualization? An example. Scientific Visualization. This talk. Information Visualization & Visual Analytics. 30 items, 30 x 3 values

High Dimensional Data Visualization

Hierarchy and Tree Visualization

HierarchyMap: A Novel Approach to Treemap Visualization of Hierarchical Data

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

Visual Data Mining with Pixel-oriented Visualization Techniques

Enhancing the Visual Clustering of Query-dependent Database Visualization Techniques using Screen-Filling Curves

Pixel-oriented Database Visualizations

Hierarchical Data Visualization

Visualization of Multivariate Data. Dr. Yan Liu Department of Biomedical, Industrial and Human Factors Engineering Wright State University

Interactive Exploration of Decision Tree Results

Space-filling Techniques in Visualizing Output from Computer Based Economic Models

The course: An Introduction to Information Visualization Techniques for Exploring Large Database

Multi-Dimensional Data Visualization. Slides courtesy of Chris North

Interactive Data Mining and Visualization

Clustering & Visualization

Information Visualization Multivariate Data Visualization Krešimir Matković

Evaluating the Effectiveness of Tree Visualization Systems for Knowledge Discovery

Pushing the limit in Visual Data Exploration: Techniques and Applications

Improvements of Space-Optimized Tree for Visualizing and Manipulating Very Large Hierarchies

Integration of Cluster Analysis and Visualization Techniques for Visual Data Analysis

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining

TOP-DOWN DATA ANALYSIS WITH TREEMAPS

Information Visualization and Visual Data Mining

A Taxonomy of Visualization Techniques using the Data State Reference Model

A Survey on Multivariate Data Visualization

Regular TreeMap Layouts for Visual Analysis of Hierarchical Data

The Value of Visualization 2

Visual Mining of E-Customer Behavior Using Pixel Bar Charts

Specific Usage of Visual Data Analysis Techniques

BIG DATA VISUALIZATION. Team Impossible Peter Vilim, Sruthi Mayuram Krithivasan, Matt Burrough, and Ismini Lourentzou

an introduction to VISUALIZING DATA by joel laumans

Data Visualization. or Graphical Data Presentation. Jerzy Stefanowski Instytut Informatyki

Visualization methods for patent data

Cours de Visualisation d'information InfoVis Lecture. Multivariate Data Sets

HDDVis: An Interactive Tool for High Dimensional Data Visualization

Scatterplot Layout for High-dimensional Data Visualization

Squarified Treemaps. Mark Bruls, Kees Huizing, and Jarke J. van Wijk

Independence Diagrams: A Technique for Visual Data Mining

Rendering Hierarchical Data

Hierarchical Data Visualization. Ai Nakatani IAT 814 February 21, 2007

Visualizing Large Graphs with Compound-Fisheye Views and Treemaps

Agenda. TreeMaps. What is a Treemap? Basics

What is Visualization? Information Visualization An Overview. Information Visualization. Definitions

Cascaded Treemaps: Examining the Visibility and Stability of Structure in Treemaps

Big Data: Rethinking Text Visualization

TREE-MAP: A VISUALIZATION TOOL FOR LARGE DATA

Interaction and Visualization Techniques for Programming

Introduction of Information Visualization and Visual Analytics. Chapter 7. Trees and Graphs Visualization

7. Hierarchies & Trees Visualizing topological relations

Web Analysis Visualization Spreadsheet

Reviewing Data Visualization: an Analytical Taxonomical Study

NakeDB: Database Schema Visualization

Treemaps for Search-Tree Visualization

Common Core Unit Summary Grades 6 to 8

The Use of Information Visualization to Support Software Configuration Management *

Visualizing High-density Clusters in Multidimensional Data

Data Exploration and Preprocessing. Data Mining and Text Mining (UIC Politecnico di Milano)

Visualization Techniques for Data Mining in Business Context: A Comparative Analysis

Topic Maps Visualization

Graph/Network Visualization

TIES443. Lecture 9: Visualization. Lecture 9. Course webpage: November 17, 2006

Cabinet Tree: an orthogonal enclosure approach to visualizing and exploring big data

Visualization of Software

A Comparative Study of Visualization Techniques for Data Mining

Hyperbolic Tree for Effective Visualization of Large Extensible Data Standards

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

Pixel bar charts: a visualization technique for very large multi-attributes data sets{

High-Dimensional Visualizations

VISUALIZING HIERARCHICAL DATA. Graham Wills SPSS Inc.,

Tudumi: Information Visualization System for Monitoring and Auditing Computer Logs

3 Information Visualization

COC131 Data Mining - Clustering

Information Visualization WS 2013/14 11 Visual Analytics

Transcription:

Tecniche di Apprendimento Automatico per Applicazioni di Data Mining Visualization Techniques in Data Mining Prof. Pier Luca Lanzi Laurea in Ingegneria Informatica Politecnico di Milano Polo di Milano Leonardo

Outline Goals of visualization Advantages Methodologies Techniques User interaction Problems

Goals of Data Visualization Today there is the need to manage a huge amount of data, and computer systems help us in this task Visual Data Mining help to deal with this flood of information, integrating the human in the data analysis process Visual Data Mining allows the user to gain insight into the data, drawing conclusions and directly interacting with the data

Advantages of visualization techniques The main advantages of the application of Visual data mining techniques are: Visual data exploration can easily deal with very large, highly non homogeneous and noisy amount of data Visual data exploration requires no understanding of complex mathematical or statistical algorithms Visualization techniques provide a qualitative overview useful for further quantitative analysis

Approach methodologies Presentation: starting point: facts to be presented are fixed a priori result: high-quality visualization of the data presenting the facts Confirmative Analysis: starting point: hypotheses about the data result: visualization of the data allowing confirmation or rejection of the hypotheses Explorative Analysis: starting point: data without hypotheses result: visualization of the data, which can provide hypotheses about data distribution

Visualization techniques Geometric techniques: scatterplots matrices, Hyperslice, parallel coordinates Pixel-oriented techniques: simple line-by-line, spiral and circle segments Hierarchical techniques: Treemap, cone trees Graph-based techniques: 2D and 3D graph Distortion techniques: hyperbolic tree, fisheye view, perspective wall User interaction: brushing, linking, dynamic projections and rotations, dynamic queries

Geometric techniques Basic idea: Visualization of geometric transformations and projections of the data Methods: Scatterplot matrices Hyperslice Parallel coordinates

Scatterplot matrices A scatterplot matrix is composed of scatter plots of all possible pairs of variables in a dataset Assuming a N-dimension dataset, there are (N 2 -N)/2 pairs of two dimension plots

Hyperslice HyperSlice is an extension of the scatterplot matrix They represent a multi-dimensional function as a matrix of orthogonal two-dimensional slices

Parallel Coordinates The axes are defined as parallel vertical lines separated A point in Cartesian coordinates correspond to a polyline in parallel coordinates Able to visualize data that may be occluded in Cartesian coordinates

Pixel-oriented techniques Basic idea: The basic idea of pixel-oriented techniques is to map each data value to a colored pixel Each attribute value is represented by a pixel with a color tone proportional to a relevance factor in a separate window Methods: Simple Arrangement Line-by-Line Spiral and Circle Segments Techniques

Pixel-oriented techniques Simple arrangement line-by-line

Pixel-oriented techniques Spiral Circle segments

Hierarchical techniques Basic idea: Visualization of the data using a hierarchical partitioning into two- or three-dimensional subspaces Methods: Treemap Cone trees

Treemap Visualization of hierarchical collections of quantitative data as files on a hard drive, financial analysis, bioinformatics, etc.. Divide a limited screen space display area into a sequence of rectangles whose areas correspond to an attribute of data set http://www.smartmoney.com/marketmap/

Cone trees 3-dimensional extension of the more familiar 2-D hierarchical tree structures, to a more intuitive navigation and display of information

Graph-based visualization Graphs (edges + nodes) with labels and attributes Used where emphasis is on data relationship (databases, telecom) Coordinates not always meaningful Useful for discovering patterns

Graph-based visualization Color and thickness code values Asymmetric relations:

Graph-based visualization E-mail (SeeNet)

Graph-based visualization 3D graphs: more room for objects different points of view Example (hypertexts Narcissus):

Focus vs. context Too much data in too small screens Solutions: dual views (detailed + global) distorted view (e.g. fisheye view)

Distortion Hyperbolic tree Fisheye view Perspective wall

User interaction Brushing: selecting points or regions Linking: more views work together

User interaction Dynamic projections and rotations Interactively and continuously moving through subspaces Dynamic queries Visual interface (button and sliders) Incremental behavior (undo)

Problems Missing attributes Ignore Fill blanks with: a predefined constant a value extracted according to the inferred distribution Assess the effect of interpolated values

Problems Large data sets Typical screens have one million pixels Subsampling Voxel/pixel bins Jittering Large number of attributes Principal component analysis Factor analysis Etc.

Conclusions Human and computer skills can be integrated with visual data mining Visualization may be useful for: understanding what is happening searching novel patterns User interaction is paramount in these

References (I) D. A. Keim. Visual Techniques for Exploring Databases. Int. Conference on Knowledge Discovery in Databases, 1997. D. A. Keim. Information visualization and visual data mining. IEEE Trans. on Visualization and Computer Graphics, jan 2002, vol. 8, no. 1, pp. 1-8 J. Van Wijk, R. Van Liere. HyperSlice - Visualization of scalar functions of many variables. IEEE Visualization, 1993, pp.119-125. P. C. Wong, A. H. Crabb, R. D. Bergeron. Dual multiresolution HyperSlice for multivariate data visualization. InfoVis 1996 D. A. Keim. Pixel-oriented Database Visualizations. SIGMOD RECORD, Special Issue on Information Visualization, 1996. M. Ankerst, D. A. Keim, H.-P. Kriegel. Circle Segments: A Technique for Visually Exploring Large Multidimensional Data Sets. Visualization '96, 1996. B. B. Bederson, B. Shneiderman, M. Wattenberg. Ordered and Quantum Treemaps: Making Effective Use of 2D Space to Display Hierarchies. ACM Transactions on Graphics, 2002, pp. 833-854.

References (II) R. A. Becker, S. G. Eick, A. R. Wilks. Visualizing Network Data. IEEE Trans. on Visualization and Computer Graphics, mar 1995, vol. 1, no. 1, pp. 16-28 R. J. Hendley, N. S. Drew, A. M. Wood, R. Beale. Narcissus: visualising information. InfoVis 1995, p. 90 T. A. Keahey, E. L. Robertson (1996). Techniques for non-linear magnification transformations. InfoVis 1996 J. Lamping, R. Rao, P. Pirolli. A focus+context technique based on hyperbolic geometry for visualizing large hierarchies. CHI '95, pp. 401-408 J. D. Mackinlay, G. G. Robertson, S. K. Card. The perspective wall: detail and context smoothly integrated. CHI '91, pp. 173-176