Data Visualization - A Very Rough Guide
  Ken Brodlie
  University of Leeds
What is This Thing Called Visualization?
  Visualization: use of computer-supported, interactive, visual representations of data to amplify cognition (Card, Mackinlay, Shneiderman)
  Born as a discipline in 1987 with the publication of the NSF report
  Now widely used in computational science and engineering
  [Image: Vis5D]
Visualization - Twin Subjects
  Scientific Visualization: visualization of physical data
    [Image: ozone layer around the Earth]
  Information Visualization: visualization of abstract data
    [Image: automobile web site, visualizing links]
Scientific Visualization - Another Characterisation
  Focus is on visualizing an entity measured in a multi-dimensional space
    1D, 2D, 3D, occasionally nD
  Underlying field is recreated from the sampled data
  Relationship between variables is well understood: some independent, some dependent
  [Image: http://pacific.commerce.ubc.ca/xr/plot.html]
  [Image from D. Bartz and M. Meissner]
Scientific Visualization Model
  Visualization represented as a pipeline: data -> model -> visualize -> render
    Read in data
    Build model of underlying entity
    Construct a visualization in terms of geometry
    Render geometry as image
  Realised as modular visualization environments
    IRIS Explorer
    IBM Open Visualization Data Explorer (DX)
    AVS
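The four pipeline stages can be sketched as plain functions chained together. This is only an illustrative sketch: the function names, the hard-coded samples and the text "image" are assumptions made for the example, and none of it is the API of IRIS Explorer, DX or AVS.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, float]


def read_data() -> List[Sample]:
    """Read in data: a few made-up (x, value) samples for illustration."""
    return [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]


def build_model(samples: List[Sample]) -> Callable[[float], float]:
    """Model: recreate the underlying 1D field by linear interpolation."""
    pts = sorted(samples)

    def field(x: float) -> float:
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        raise ValueError("x lies outside the sampled range")

    return field


def visualize(field: Callable[[float], float],
              x_min: float, x_max: float, n: int) -> List[Sample]:
    """Visualize: express the model as geometry (polyline vertices)."""
    step = (x_max - x_min) / (n - 1)
    xs = [min(x_min + i * step, x_max) for i in range(n)]
    return [(x, field(x)) for x in xs]


def render(geometry: List[Sample]) -> str:
    """Render: turn the geometry into an 'image', here rows of text bars."""
    return "\n".join("#" * round(4 * y) for _, y in geometry)


# Chain the stages in pipeline order.
geometry = visualize(build_model(read_data()), 0.0, 2.0, 5)
image = render(geometry)
```

Because each stage only consumes the output of the previous one, any stage can be swapped for another module without touching the rest, which is the property the modular environments exploit.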
Extending the SciVis Model
  The dataflow model has proved extremely flexible
  Provides basis of collaborative visualization
    Implemented in IRIS Explorer as the COVISA toolkit
    [Diagram labels: data, model, visualize, render; collaborative server; internet; render]
  Extensible
    User code introduced as a module in the pipeline allows computational steering
    [Diagram labels: control, simulate, visualize, render]
An e-science Demonstrator
  Emergency scenario: release of toxic chemical
  Simulation launched on Grid resource, steered from desktop using IRIS Explorer
  Collaborators linked in remotely using COVISA toolkit
  Dispersion of pollutant studied under varying wind directions
  [Image: a collaborator links in over the network]
Other Metaphors
  Other user interface metaphors have been suggested
  Spreadsheet interface becoming popular
    Allows audit trail of visualizations
  [Image: Jankun-Kelly and Ma]
Information Visualization
  Focus is on visualizing a set of observations that are multi-variate
  Example: the iris data set
    150 observations of 4 variables (length and width of petal and sepal)
  Techniques aim to display relationships between variables
Dataflow for Information Visualization
  Again we can express this as a dataflow, but the emphasis now is on the data itself rather than an underlying entity
  First step is to form the data into a table of observations, each observation being a set of values of the variables
  Then we apply a visualization technique as before
  [Diagram: data -> data table -> visualize -> render; the data table lists variables A, B, C against observations 1, 2, ...]
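As a rough illustration of the first step, the sketch below groups a flat stream of values into a table with one row per observation. The helper names `make_table` and `column`, and the variable names A, B, C, are assumptions chosen for the example.

```python
from typing import Dict, List, Sequence


def make_table(variables: Sequence[str],
               values: Sequence[float]) -> List[Dict[str, float]]:
    """Form a table of observations: each row maps variable name -> value."""
    m = len(variables)
    if m == 0 or len(values) % m != 0:
        raise ValueError("values do not divide evenly into observations")
    return [dict(zip(variables, values[i:i + m]))
            for i in range(0, len(values), m)]


def column(table: List[Dict[str, float]], variable: str) -> List[float]:
    """Extract all observed values of one variable."""
    return [row[variable] for row in table]


# Two observations of three variables (made-up values).
table = make_table(["A", "B", "C"], [1, 2, 3, 4, 5, 6])
b_values = column(table, "B")
```

Every technique in the following slides can then be written as a function of this table rather than of a reconstructed field.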
Multivariate Visualization
  Techniques designed for any number of variables
    Glyph techniques
    Parallel co-ordinates
    Scatter plot matrices
    Pixel-based techniques
  Software: XmdvTool (Matthew Ward)
  Acknowledgement: many of the images in the following slides are taken from Ward's work, and also from IRIS Explorer
Glyph Techniques
  Star plots
    Each observation represented as a star
    Each spike represents a variable
    Length of spike indicates the value
  Variety of possible glyphs
    Chernoff faces
  [Image: crime in Detroit]
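The geometry of a star plot can be sketched as follows, assuming each variable has first been scaled to the range 0 to 1 and that spikes are spread evenly around the circle. Both choices, and the function names, are assumptions of this example rather than a fixed definition.

```python
import math
from typing import List, Sequence, Tuple


def normalize(column: Sequence[float]) -> List[float]:
    """Scale one variable to [0, 1] across all observations."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.5] * len(column)  # constant variable: use a mid-length spike
    return [(v - lo) / (hi - lo) for v in column]


def star_glyph(scaled: Sequence[float],
               radius: float = 1.0) -> List[Tuple[float, float]]:
    """Return the spike end points of one star, one spike per variable.

    Spike k points in direction 2*pi*k/M and has length radius * scaled[k].
    """
    m = len(scaled)
    return [(radius * s * math.cos(2 * math.pi * k / m),
             radius * s * math.sin(2 * math.pi * k / m))
            for k, s in enumerate(scaled)]


# One observation of four already-scaled variables.
spikes = star_glyph([1.0, 0.5, 0.0, 1.0])
```

Drawing a line from the origin to each end point gives the star; joining consecutive end points gives the closed outline often used instead.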
Parallel Co-ordinates
  Each variate represented as a vertical axis
  Axes laid out uniformly
  Observation represented as a polyline traversing all M axes, crossing each axis at the observed value of the variate
  [Image: Detroit homicide data (7 variables, 13 observations)]
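A minimal sketch of the polyline construction, assuming axes are placed at x = 0, 1, ..., M-1 and each axis is scaled to the observed minimum and maximum of its variate. The scaling rule and the function names are assumptions of this example.

```python
from typing import List, Sequence, Tuple


def axis_ranges(observations: Sequence[Sequence[float]]
                ) -> List[Tuple[float, float]]:
    """Find the (min, max) of each variate, used to scale its axis."""
    m = len(observations[0])
    return [(min(o[j] for o in observations),
             max(o[j] for o in observations)) for j in range(m)]


def polyline(observation: Sequence[float],
             ranges: Sequence[Tuple[float, float]],
             spacing: float = 1.0) -> List[Tuple[float, float]]:
    """Return the vertices of one observation's polyline.

    Vertex j sits on axis j (x = j * spacing) at the scaled value of variate j.
    """
    vertices = []
    for j, (value, (lo, hi)) in enumerate(zip(observation, ranges)):
        y = 0.5 if hi == lo else (value - lo) / (hi - lo)
        vertices.append((j * spacing, y))
    return vertices


# Three observations of three variates (made-up values).
observations = [(1.0, 10.0, 5.0), (3.0, 20.0, 5.0), (2.0, 15.0, 5.0)]
ranges = axis_ranges(observations)
lines = [polyline(o, ranges) for o in observations]
```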
Scatter Plot Matrices
  Matrix of 2D scatter plots
  Each plot shows projection of data onto a 2D subspace of the variates
  Order M^2 plots
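The M^2 count follows directly from pairing every variate with every variate. The sketch below builds the point list for each cell; the convention that cell (i, j) plots variate j horizontally against variate i vertically is an assumption of this example.

```python
from typing import Dict, List, Sequence, Tuple

Cell = Tuple[int, int]
Point = Tuple[float, float]


def scatter_matrix(observations: Sequence[Sequence[float]]
                   ) -> Dict[Cell, List[Point]]:
    """Project the data onto every pair of variates.

    Cell (i, j) holds the points (variate j, variate i) for all observations,
    so M variates give M * M cells.
    """
    m = len(observations[0])
    return {(i, j): [(o[j], o[i]) for o in observations]
            for i in range(m) for j in range(m)}


# Two observations of three variates (made-up values).
cells = scatter_matrix([(1, 10, 100), (2, 20, 200)])
```

Cells (i, j) and (j, i) are mirror images and the diagonal plots a variate against itself, so only M(M-1)/2 cells carry distinct information, but the number of plots still grows quadratically with M.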
The Screen Space Problem
  All techniques, sooner or later, run out of screen space
  Parallel coordinates
    Usable for up to 150 variates
    Unworkable for more than 250 variates
  [Image: remote sensing data (5 variates, 16,384 observations)]
Brushing as a Solution
  Brushing selects a restricted range of one or more variables
  Selection is then highlighted
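Brushing amounts to a range test per observation. In the sketch below the table layout (one dictionary per observation), the function name `brush` and the sample values are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def brush(table: List[Dict[str, float]],
          ranges: Dict[str, Tuple[float, float]]) -> List[bool]:
    """Flag each observation that lies inside every brushed range.

    `ranges` maps a variable name to its selected (low, high) interval;
    variables that are not brushed are left unconstrained.
    """
    return [all(lo <= row[var] <= hi for var, (lo, hi) in ranges.items())
            for row in table]


# Three observations of two variables (made-up values).
table = [{"A": 1.0, "B": 7.0}, {"A": 4.0, "B": 2.0}, {"A": 5.0, "B": 9.0}]
selected = brush(table, {"A": (3.0, 6.0)})
selected_both = brush(table, {"A": (3.0, 6.0), "B": (5.0, 10.0)})
```

A renderer would then draw the flagged observations in a highlight colour and the rest in a muted one, in every linked view.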
Clustering as a Solution
  Success has been achieved through clustering of observations
  Hierarchical parallel co-ordinates
    Cluster by similarity
    Display using translucency and proximity-based colour
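To illustrate the idea of replacing many polylines with a few cluster summaries, the sketch below uses a simple centroid-linkage agglomerative clustering. This algorithm is swapped in for illustration only and is not claimed to be the one used in hierarchical parallel co-ordinates; all function names are assumptions.

```python
import math
from typing import Dict, List, Sequence

Observation = Sequence[float]


def centroid(cluster: List[Observation]) -> List[float]:
    """Mean of each variate over the cluster."""
    m = len(cluster[0])
    return [sum(o[j] for o in cluster) / len(cluster) for j in range(m)]


def agglomerate(observations: List[Observation], k: int) -> List[List[Observation]]:
    """Repeatedly merge the two clusters with the closest centroids until k remain."""
    clusters = [[o] for o in observations]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def summarise(cluster: List[Observation]) -> Dict[str, List[float]]:
    """Mean line plus per-variate extent, enough to draw one translucent band."""
    m = len(cluster[0])
    return {"mean": centroid(cluster),
            "low": [min(o[j] for o in cluster) for j in range(m)],
            "high": [max(o[j] for o in cluster) for j in range(m)]}


observations = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
clusters = agglomerate(observations, 2)
bands = [summarise(c) for c in clusters]
```

Each summary can be drawn as a mean polyline with a band from `low` to `high`, so the screen cost depends on the number of clusters rather than the number of observations.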
Hierarchical Parallel Coordinates
Reduction of Dimensionality of Variate Space
  Reduce number of variables, preserve information
  Principal Component Analysis
    Transform to new coordinate system
    Hard to interpret
  Hierarchical reduction of variate space
    Cluster variables where distance between observations is typically small
    Choose representative for each cluster
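The "transform to new coordinate system" step of Principal Component Analysis can be sketched in pure Python by finding the first principal axis with power iteration on the covariance matrix. Power iteration is one of several ways to obtain the axis and is chosen here only to keep the example dependency-free; the function names and the start vector are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def covariance(rows: Sequence[Sequence[float]]
               ) -> Tuple[List[List[float]], List[List[float]]]:
    """Centre the data and return (centred rows, sample covariance matrix)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    return centred, cov


def first_component(cov: List[List[float]], iterations: int = 200) -> List[float]:
    """Approximate the dominant eigenvector of cov by power iteration.

    The start vector is arbitrary; the method fails if it happens to be
    orthogonal to the dominant eigenvector.
    """
    m = len(cov)
    v = [1.0 + j for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector has no component along any axis of variation")
        v = [x / norm for x in w]
    return v


def project(centred: List[List[float]], axis: List[float]) -> List[float]:
    """Coordinate of each observation along the new axis."""
    return [sum(c[j] * axis[j] for j in range(len(axis))) for c in centred]


# Three observations of two perfectly correlated variables (made-up values).
centred, cov = covariance([(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)])
axis = first_component(cov)
scores = project(centred, axis)
```

Here the new axis is an equal mix of both original variables, which shows in miniature why the transformed coordinates are hard to interpret: they no longer correspond to any single measured quantity.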
Using a Dataflow System for Information Visualization
  IRIS Explorer used to visualize data from BMW
  Five variables displayed using spatial arrangement for three, colour and object type for the others
  Notice the clusters
  More later...
  [Image: Kraus & Ertl]
Scientific Visualization - Information Visualization
  Scientific Visualization
    Focus is on visualizing an entity measured in a multi-dimensional space
    Underlying field is recreated from the sampled data
    Relationship between variables is well understood
  Information Visualization
    Focus is on visualizing a set of observations that are multi-variate
    There is no underlying field; it is the data itself we want to visualize
    The relationship between variables is not well understood