Chapter 3 - Multidimensional Information Visualization II
|
|
|
- Aubrie Rose
- 10 years ago
- Views:
Transcription
1 Chapter 3 - Multidimensional Information Visualization II Concepts for visualizing univariate to hypervariate data Vorlesung Informationsvisualisierung Prof. Dr. Florian Alt, WS 2013/14 Konzept und Folien (4th revised edition): Thorsten Büring, Andreas Butz, Michael Burch 1
2 Outline Reference model and data terminology Visualizing data with < 4 variables Visualizing multivariable data Geometric transformation Glyphs Pixel-based Dimensional Stacking Downscaling of dimensions Case studies: support for exploring multidimensional data Rank-by-feature Value & relation display Dust & magnet Clutter reduction techniques 2
3 Trivariate Data 3
4 Trivariate Data Tempting: map each variable to each dimension of a 3D scatterplot Occlusion of points with different positions Problem with static representation? Occlusion of points with different positions and in general difficulty to visually map the (x,y,z)-coordinates of the single points 4
5 Trivariate Data Idea 1: Map each variable to a dimension (axis) of a 3D scatterplot Enhancement by using guiding lines Still problematic: Many lines lead to visual clutter and even more occlusions 5
6 Trivariate Data Idea 2: Map each variable to a dimension (axis) of a 3D scatterplot Rotation is used to explore the data but if the dataset is very dense we have to also look inside it More complex interactive features needed to analyze the trivariate data 6
7 Scatterplot Matrix Matrix of all pairwise scatterplot views of the data Easy to understand by using familiar and powerful scatterplot representation Can serve as a good starting point for data exploration Increased demand for display space Increased cognitive load caused by redundant data Cleveland
8 Trivariate Data Idea 3: 2D scatterplot with additional encoding In this case color and shape Shows relationship between three variables For color / shape coding: assumes categorical variable or classing of quantitative variable pot. loss of information 8
9 Multivariate Data 9
10 Multivariate Data Geometric Transformation Glyph-based Visualizations Pixel-based Visualizations Dimensional Stacking Downscaling of Dimensions Clutter Reduction 10
11 Geometric Transformations 11
12 Geometric Transformations Idea: present projections of the multidimensional data to find interesting correlations Most common techniques Scatterplot matrix Prosection matrix Parallel coordinates plot 12
13 Scatterplot Matrix Scatterplot matrix can be scaled to > 3 variables Number of scatterplots increases rapidly n variables means n x n plots Diagonal maps the same variable twice Each pair is plotted twice, once on each side of the diagonal Allows convenient sequential browsing of one variable compared to all other variables 13
14 Scatterplot Matrix 14
15 Prosection Matrix Scatterplot matrix with interactive linking and brushing (Tweedie & Spence 1996) Projection of a section of parameter space User selects multivariable ranges, which are colored differently 15
16 Prosection Matrix 16
17 Parallel Coordinate Plot Create M equidistant vertical axes, each corresponding to one of the M variables Each axis scaled to [min, max] range of the variable Each observation corresponds to a line drawn through point on each axis corresponding to value of the variable 17
18 Parallel Coordinate Plot One vertical axis for each variable Every case is represented by a line Line intersects each of the vertical axis at the point corresponding to the attribute value of the case Popular visualization technique Complexity (number of axes) is directly proportional to the number of attributes (comp. scatterplot matrix) All attributes receive uniform treatment Potential problems of this visualization? Inselberg
19 Parallel Coordinate Plot The Technique Technique lays out coordinate axes in parallel rather than orthogonal to each other (in contrast to scatterplots). But there's a much simpler way of looking at it: as the representation of a data table. Table describes car models released from 1970 to 1982, and contains their mileage (MPG), number of cylinders, horsepower, weight, and year they were introduced (among others). 19
20 Parallel Coordinate Plot Each of the table columns (left) is mapped onto a vertical axis in the plot (right). Each data value placed somewhere along the line, scaled to lie between the minimum at the bottom and the maximum at the top Pure collection of points would not be very useful, so the points belonging to the same case (row) are connected with lines (polylines) By this the characteristic effect occurs where parallel coordinates are famous for. 20
21 Parallel Coordinate Plot Insights in this multivariate dataset Cylinders axis stands out because it only has a few different values. The number of cylinders can only be a whole number, and there aren't more than eight here, so all the lines have to pass through a small number of points. Data like this, and also categorical data, are usually not well suited for parallel coordinates. As long as there is only one or two, it's not a problem, but when the data is largely or completely categorical, parallel coordinates do not show any useful information anymore. 21
22 Parallel Coordinate Plot Insights in this multivariate dataset In the space between MPG and cylinders, you can tell that eight-cylinder cars generally have lower mileage than six- and four-cylinder ones. Just follow the lines and look at how they cross: lots of crossing lines are an indication of an inverse relationship, and that is clearly the case here: the more cylinders, the lower the mileage. 22
23 Parallel Coordinate Plot Insights in this multivariate dataset Between horsepower and weight, the situation is similar: more horsepower means heavier cars in general, but there is some spread in the values of course. There is also one exception of a high-horsepower eight-cylinder car that is very light. Can you spot that outlier? 23
24 Parallel Coordinate Plot Insights in this multivariate dataset Finally, the lines between weight and year cross over a lot, indicating that cars got lighter over the years. You can also easily tell that the year axis only records a small number of different values, similar to the cylinders, from the triangular shapes converging on a few points. While this is a very simple example, it shows the typical structures you would find in most datasets. 24
25 Parallel Coordinate Plot Brushing 25
26 Parallel Coordinate Plot Brushing In addition to some experience in reading parallel coordinates, the best way to get to know a dataset using the technique is clearly interaction. The main one in parallel coordinates is called brushing, for reasons that should be obvious from looking at the image below. For this to make sense, we need to look at all axes. 26
27 Parallel Coordinate Plot Brushing Brushing the years 1980 to 1982 on the (right-most) year axis. Results in a brushed part of the lines in heavy black, with the rest still in the background in gray for context. Looking at the axes from right to left you can see that the car models in this selection were almost all in the lower half of the weight range, and all of them were in the lower half in terms of horsepower The distribution of cylinders is also interesting: there only seems to be a single eightcylinder car in this selection, all others are six cylinders or below. Mileage is also mostly above the mean value for all cars. 27
28 Parallel Coordinate Plot Brushing Brushing the years 1970 to 1972 yields a very different image: weights, power, etc. are all over the place, and mileage is mostly in the lower half. While higher values are to be expected, it is interesting to see that there was quite a spectrum of cars at the beginning of the decade, not just heavy eightcylinder ones. The trend over the years was towards lighter, more efficient cars, though. 28
29 Parallel Coordinate Plot More interactive Features There is more to be said about interaction, of course: you can usually reorder axes to compare different ones side by side, combine brushing on different axes, flip axes (the arrows at the top of the images show the direction of the axis), etc. 29
30 Parallel Coordinate Plot for sets Bendix et al. 2005: Parallel Sets Parallel coordinates for categorical data Substitute individual data points by a frequency-based representation 1st 2nd 3rd crew Female (s) Female (d) Male (s) Male (d)
31 3D Parallel Coordinates Parallel 2D planes instead of vertical axes 31
32 Parallel Coordinate Plot Limitations When the number of data items gets very high, there is a lot of overplotting that can make it impossible to see anything. The number of dimensions on screen also needs to be below about a dozen at the same time to really make sense, anything above that gets very difficult to read. There is a lot of work in visualization on automatic axes reordering, clustering of similar axes, etc. that makes this easier for highdimensional data. But in general, the technique simply works best for datasets with a moderate number of dimensions and no more than a few thousand records. Also, as mentioned above, the data needs to be numerical. The technique does not work well for categorical data or data with few values per axis. 32
33 Scatterplot vs. Parallel Coordinates Correlations also detectable in Parallel Coordinate 33
34 Parallel Coordinate Plot Try it out XmdvTool Macrofocus products/infoscope/ 34
35 Geometric Transformations: discussion Advantages Users familiarity with scatterplots (scatterplot matrix) 2D patterns can easily be identified Disadvantages Rather limited scalability limited number of cases (Parallel Coordinate Plot) limited number of dimensions (scatterplot matrix) Overplotting and overlap Labeling (Parallel Coordinates) 35
36 Glyph-based Visualizations 36
37 Glyph-Based Visualizations An object or symbol for representing data values. Glyphs are generally a way of representing many data values and are sometimes called icons. A common glyph is the arrow, often chosen to represent vector fields. The arrow depicts both speed and direction at a point. [Keller and Keller, 1993] A glyph is a graphical object designed to convey multiple data values. [Ware, 2004] 37
38 Glyph-Based Visualizations A primitive example of a glyph is an arrow whose visual attributes length, width, angle, and color might be used to encode four different data attributes in a single graphical object. The most prominent example for glyphs are Chernoff Faces [Chernoff, 1973], where the different parts of a conceptualized human face (mouth, nose, head, eyes, eyebrows, etc.) encode different dimensions of an n- dimensional data set. Advantages of glyph representations are foremost their easy learnability, long remembering periods, and the possibility to represent value restrictions visually, especially when multiple attributes are involved. 38
39 Glyph-Based Visualizations Glyph-based techniques Star glyph Chernoff faces Stick-figure Shape coding Color icons Glyph: small-sized visual symbol Variables are encoded as properties of glyph Each case is represented by a single glyph 39
40 Star glyphs Coekin 1996 Radial axes with equal angles (spokes of a wheel) Each axis represents a variable Each spoke length encodes a variable s value May also be overlaid for better comparison 40
41 Chernoff Faces Chernoff 1973 Humans are sensitive to a wide range of facial characteristics (e.g., eye size, length of a nose, etc.) 18 characteristics to encode data by stylized faces Positive evaluation results (Spence & Parr 1991) Some facial features seem to be able to carry more information than others (Morris et al. 1999; De Soete 1986) 41
42 Chernoff Faces Example
43 Stick-Figure Icons Pickett & Grinstein 1998 Each case is represented by a stick figure Two attributes are mapped to XY position of the glyph Remaining dimensions are mapped to the angle and / or length of the 4 limbs When icons are densely packed a texture appears Texture pattern reveals characteristics of the data space Different members of stick-figure family for conveying different types of data structures 43
44 Stick-Figure Icons Stick-figure example Census data showing age (y), income (x), education, salary, language, marital status etc. Gender is encoded by two stick-figure families Grinstein et al
45 Shape Coding Glyph encoding 6 attributes Medium Beddow 1990 Each case is drawn as a glyph containing a rectangular grid Each grid cell represents one attribute Low High Attribute value is encoded with gray scales Glyphs are positioned in a line, columns or encoded dimensions Highly compressed visualization without clutter and overlap (compare to stick figures) Identification of promising patterns 45
46 Shape Coding Attribute values encoded by white, grey, black 13 Variables gained from magnetoshere and solar wind data Includes one time variable (hour/day), which has been mapped to x/y 46
47 Color icons Levkowitz 1991, Keim & Kriegel 1994 Keim & Kriegel 1994 Shape coding with a focus on colors Arrangement is query-dependent (e.g., spiral: most relevant glyph is centered) What about compressing the visualization even more by using 1-pixel representations? Problem: users need at least 2x2 pixel per data value + pixels for borders to distinguish between the elements of the visualization This is different to pixel-based techniques, which will be discussed in the next lecture 47
48 8-dimensional result of a database query, cases, Keim&Kriegel
49 Glyph-Based Visualizations Advantages Provide holistic overview of the information space Exploit the human powerful ability of perceiving (texture) patterns and human face characteristics (Chernoff) Direct metaphor of Chernoff-face-like icons (e.g. houses) may prove to be intuitive for novice users Disadvantages Glyphs must be learned Only suitable for small to medium data sets Stick figures give a rather broad overview and may be difficult to interpret Mappings may introduce biases in interpretation (e.g. the head shape of a Chernoff-face may be easier to perceive and compare than length of nose) 49
50 Pixel-based Visualizations 50
51 Pixel-Based Techniques Idea: each data item is represented by one colored pixel Value ranges are mapped to a fixed color sequence of full color (hue) scale but monotonically decreasing brightness Data values belonging to one attribute are displayed in a separate view only one pixel per data value without need for a border But: users need to relate to different portions of the screen to perceive correlations 51
52 Pixel-Based Techniques Optimization Goal (OG) 1: arrangement of pixels in the subwindows should preserve the 1D ordering into 2D plane as best as possible Simple left-right or top-down arrangement do usually not provide useful results on a pixel-level Space-filling curves (Paneo-Hilbert and Morton) provide maximum of locality preservation, but are difficult to follow and thus to relate between the sub-windows Recursive pattern Circle segments Morton Paneo-Hilbert 52
53 Recursive Pattern Keim et al Naturally ordered data set Prices of IBM stock, Dow Jones index, Gold, exchange rate US-Dollar September 1987-February daily measurements for each stock Recursive pattern visualization Lower-level patterns used as building blocks for higher level patterns Keim 2000 LP1 one day, LP2 one week, LP3 one month, LP4 one year 53
54 Recursive Pattern 8 horizontal bars correspond to 8 years Subdivisions between the bars represent 12 months within each year Example analysis results Gold price was very low in the sixth year IBM price fell quickly after the first 1 ½ month US-Dollar exchange rate was highest in the third year Keim et al
55 Keim et al
56 Query Dependent Arrangement Ordering of data objects based on relevance to a given query Most relevant data object is placed in the center of the screen OG 2: for the pixel arrangement in each sub-window the distance to the center should correspond to the ordering of the data objects Simple spiral arrangement fulfills OG 2, but local clustering properties (OG 1) are weak, i.e. low probability that two pixels close on the screen are also close in the 1D ordered sequence of the query result set Generalized spiral technique: enhance the clustering qualities of the spiral technique by using screen-filling curves locally Keim
57 Spiral vs. Generalized Spiral Keim
58 Circle Segment Rethink the shape of sub-window Rectangular shape of sub-windows makes efficient use of the screen For data sets with many dimensions, the pixels of one data object are rather far apart Makes it difficult to find patterns OP 3: minimize the average distance between the pixels (data values) belonging to one case Circle segment Each dimension corresponds to a segment of a circle Values of one dimension are drawn in a back and forth manner from the center of the circle to the outside 58
59 Circle Segment (CS) vs. Line Graphs 10 years of stock data for 7 stocks Line graph granularity is limited by the width of the screen CS: oldest data items in the middle of the circle, most recent ones are at the outside Easier to perceive patterns no overlap of data Keim et al. 1996b 59
60 Pixel-Based Techniques Advantages: Large data sets can be visualized Improved pattern detection due to non-overlap strategy Disadvantages Labeling Intuitiveness Mapping color to quantitative data? Open questions Drill-down to detail information?? 60
61 Downscaling of Dimensions 61
62 Downscaling of Dimensions Projecting n dimensions down to a lower dimensionality while retaining as much of the original information as possible Principal components analysis, Factor analysis, Multidimensional scaling Statistical approaches to reduce the number of dimensions by finding the data s main characteristics / patterns Good tutorial: principal_components.pdf Self-organizing maps (SOM) aka Kohonen map Reduce the dimensions of data by using self-organizing neural networks Produces usually a 2D map which mirrors the similarity of cases (similar cases are grouped together) Good tutorial: Problems: pruning of information; hard to interpret since display coordinates have no semantic meaning; SOM and MDS are iterative approaches (computationally hard, no unique result) 62
63 Downscaling of Dimensions Skupin
64 Exploring Multivariate Data 64
65 Rank-By-Feature Seo & Shneiderman 2004 Part of the Hierarchical Clustering Explorer (HCE) ( Tabs: histogram and scatterplot ordering Implements systematic approach for data exploration (1) study 1D, study 2D, then find features (2) ranking guides insight, statistics confirm Tool provides low-dimensional projections as a histogram (1D) or scatterplot (2D) Users can select a feature detection criterion (e.g. test for normal distribution (1D), correlation coefficient (2D)) to rank projections The ranking facility is particularly helpful when the number of possible projections is too large to investigate: concentrate on the interesting ones 65
66 Rank-By-Feature Users start with 1D projections (histogram ordering) Four coordinated views A: selection of ranking criterion B: overview of scores for all dimensions (color coding: the brighter the color, the higher the score) C: numerical / statistical detail for each dimension (e.g. score, mean, standard deviation) D: display of histogram + boxplot (minimum, first quartile, median, third quartile, maximum) 66
67 Rank-By-Feature Some basic statistical terms Mean: Sum of all values divided by the number of values Median: Middle value of a distribution of values when ranked in order of magnitude Mode: Single most common value Variance: average squared deviation between the mean and the values Standard deviation: square root of the variance (translates the variance into the original units of measurement) Statistical tests supported by HCE for 1D ranking Normality of the distribution: distribution of items forms a symmetric, bell-shaped curve Uniformity of the distribution: all of the values of a random variable occur with equal probability (results in a flat histogram) Number of potential outliers Number of unique values 67
68 Rank-By-Feature Move on to 2D projections (scatterplot ordering) Identify pairwise relationships between dimensions B: prism provides overview of scores for dimension pairs; score is color coded D: scatterplot browser; multiple browsers are possible; Ranking criteria Correlation coefficient: direction and strength of linear relationship Least square error for simple linear / curvilinear regression: how well does the regression model fit Number of items in a user-defined region of interest & uniformity of scatterplot 68
Chapter 3 - Multidimensional Information Visualization II
Chapter 3 - Multidimensional Information Visualization II Concepts for visualizing univariate to hypervariate data Vorlesung Informationsvisualisierung Prof. Dr. Florian Alt, WS 2013/14 Konzept und Folien
Visualization of Multivariate Data. Dr. Yan Liu Department of Biomedical, Industrial and Human Factors Engineering Wright State University
Visualization of Multivariate Data Dr. Yan Liu Department of Biomedical, Industrial and Human Factors Engineering Wright State University Introduction Multivariate (Multidimensional) Visualization Visualization
Information Visualization Multivariate Data Visualization Krešimir Matković
Information Visualization Multivariate Data Visualization Krešimir Matković Vienna University of Technology, VRVis Research Center, Vienna Multivariable >3D Data Tables have so many variables that orthogonal
Multi-Dimensional Data Visualization. Slides courtesy of Chris North
Multi-Dimensional Data Visualization Slides courtesy of Chris North What is the Cleveland s ranking for quantitative data among the visual variables: Angle, area, length, position, color Where are we?!
Visual Data Mining with Pixel-oriented Visualization Techniques
Visual Data Mining with Pixel-oriented Visualization Techniques Mihael Ankerst The Boeing Company P.O. Box 3707 MC 7L-70, Seattle, WA 98124 [email protected] Abstract Pixel-oriented visualization
Diagrams and Graphs of Statistical Data
Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in
Data Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode
Iris Sample Data Set Basic Visualization Techniques: Charts, Graphs and Maps CS598 Information Visualization Spring 2010 Many of the exploratory data techniques are illustrated with the Iris Plant data
Lecture 2: Descriptive Statistics and Exploratory Data Analysis
Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals
The Value of Visualization 2
The Value of Visualization 2 G Janacek -0.69 1.11-3.1 4.0 GJJ () Visualization 1 / 21 Parallel coordinates Parallel coordinates is a common way of visualising high-dimensional geometry and analysing multivariate
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 8/05/2005 1 What is data exploration? A preliminary
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
Cours de Visualisation d'information InfoVis Lecture. Multivariate Data Sets
Cours de Visualisation d'information InfoVis Lecture Multivariate Data Sets Frédéric Vernier Maître de conférence / Lecturer Univ. Paris Sud Inspired from CS 7450 - John Stasko CS 5764 - Chris North Data
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler
Data Mining: Exploring Data Lecture Notes for Chapter 3 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Topics Exploratory Data Analysis Summary Statistics Visualization What is data exploration?
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller Getting to know the data An important first step before performing any kind of statistical analysis is to familiarize
Data exploration with Microsoft Excel: analysing more than one variable
Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical
Graphical Representation of Multivariate Data
Graphical Representation of Multivariate Data One difficulty with multivariate data is their visualization, in particular when p > 3. At the very least, we can construct pairwise scatter plots of variables.
Exploratory data analysis (Chapter 2) Fall 2011
Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,
Exercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
Principles of Data Visualization for Exploratory Data Analysis. Renee M. P. Teate. SYS 6023 Cognitive Systems Engineering April 28, 2015
Principles of Data Visualization for Exploratory Data Analysis Renee M. P. Teate SYS 6023 Cognitive Systems Engineering April 28, 2015 Introduction Exploratory Data Analysis (EDA) is the phase of analysis
COM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3
COMP 5318 Data Exploration and Analysis Chapter 3 What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping
4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"
Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses
Data Visualization - A Very Rough Guide
Data Visualization - A Very Rough Guide Ken Brodlie University of Leeds 1 What is This Thing Called Visualization? Visualization Use of computersupported, interactive, visual representations of data to
T O P I C 1 2 Techniques and tools for data analysis Preview Introduction In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these
Introduction to Data Visualization
Introduction to Data Visualization STAT 133 Gaston Sanchez Department of Statistics, UC Berkeley gastonsanchez.com github.com/gastonstat/stat133 Course web: gastonsanchez.com/teaching/stat133 Graphics
CS171 Visualization. The Visualization Alphabet: Marks and Channels. Alexander Lex [email protected]. [xkcd]
CS171 Visualization Alexander Lex [email protected] The Visualization Alphabet: Marks and Channels [xkcd] This Week Thursday: Task Abstraction, Validation Homework 1 due on Friday! Any more problems
Summarizing and Displaying Categorical Data
Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency
Common Core Unit Summary Grades 6 to 8
Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity- 8G1-8G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations
Visualization Quick Guide
Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive
GeoGebra. 10 lessons. Gerrit Stols
GeoGebra in 10 lessons Gerrit Stols Acknowledgements GeoGebra is dynamic mathematics open source (free) software for learning and teaching mathematics in schools. It was developed by Markus Hohenwarter
9. Text & Documents. Visualizing and Searching Documents. Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08
9. Text & Documents Visualizing and Searching Documents Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08 Slide 1 / 37 Outline Characteristics of text data Detecting patterns SeeSoft
What is Visualization? Information Visualization An Overview. Information Visualization. Definitions
What is Visualization? Information Visualization An Overview Jonathan I. Maletic, Ph.D. Computer Science Kent State University Visualize/Visualization: To form a mental image or vision of [some
Plotting: Customizing the Graph
Plotting: Customizing the Graph Data Plots: General Tips Making a Data Plot Active Within a graph layer, only one data plot can be active. A data plot must be set active before you can use the Data Selector
High Dimensional Data Visualization
High Dimensional Data Visualization Sándor Kromesch, Sándor Juhász Department of Automation and Applied Informatics, Budapest University of Technology and Economics, Budapest, Hungary Tel.: +36-1-463-3969;
Data analysis process
Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis
There are six different windows that can be opened when using SPSS. The following will give a description of each of them.
SPSS Basics Tutorial 1: SPSS Windows There are six different windows that can be opened when using SPSS. The following will give a description of each of them. The Data Editor The Data Editor is a spreadsheet
GeoGebra Statistics and Probability
GeoGebra Statistics and Probability Project Maths Development Team 2013 www.projectmaths.ie Page 1 of 24 Index Activity Topic Page 1 Introduction GeoGebra Statistics 3 2 To calculate the Sum, Mean, Count,
January 26, 2009 The Faculty Center for Teaching and Learning
THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i
Hierarchy and Tree Visualization
Hierarchy and Tree Visualization Definition Hierarchies An ordering of groups in which larger groups encompass sets of smaller groups. Data repository in which cases are related to subcases Hierarchical
Interpreting Data in Normal Distributions
Interpreting Data in Normal Distributions This curve is kind of a big deal. It shows the distribution of a set of test scores, the results of rolling a die a million times, the heights of people on Earth,
TIES443. Lecture 9: Visualization. Lecture 9. Course webpage: http://www.cs.jyu.fi/~mpechen/ties443. November 17, 2006
TIES443 Lecture 9 Visualization Mykola Pechenizkiy Course webpage: http://www.cs.jyu.fi/~mpechen/ties443 Department of Mathematical Information Technology University of Jyväskylä November 17, 2006 1 Topics
an introduction to VISUALIZING DATA by joel laumans
an introduction to VISUALIZING DATA by joel laumans an introduction to VISUALIZING DATA iii AN INTRODUCTION TO VISUALIZING DATA by Joel Laumans Table of Contents 1 Introduction 1 Definition Purpose 2 Data
Pushing the limit in Visual Data Exploration: Techniques and Applications
Pushing the limit in Visual Data Exploration: Techniques and Applications Daniel A. Keim 1, Christian Panse 1, Jörn Schneidewind 1, Mike Sips 1, Ming C. Hao 2, and Umeshwar Dayal 2 1 University of Konstanz,
CSU, Fresno - Institutional Research, Assessment and Planning - Dmitri Rogulkin
My presentation is about data visualization. How to use visual graphs and charts in order to explore data, discover meaning and report findings. The goal is to show that visual displays can be very effective
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering
Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques
Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard
Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Understand the Sketcher workbench of CATIA V5.
Chapter 1 Drawing Sketches in Learning Objectives the Sketcher Workbench-I After completing this chapter you will be able to: Understand the Sketcher workbench of CATIA V5. Start a new file in the Part
Visualization Techniques in Data Mining
Tecniche di Apprendimento Automatico per Applicazioni di Data Mining Visualization Techniques in Data Mining Prof. Pier Luca Lanzi Laurea in Ingegneria Informatica Politecnico di Milano Polo di Milano
Scatter Plots with Error Bars
Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data
Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable
BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I
BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential
3D Interactive Information Visualization: Guidelines from experience and analysis of applications
3D Interactive Information Visualization: Guidelines from experience and analysis of applications Richard Brath Visible Decisions Inc., 200 Front St. W. #2203, Toronto, Canada, [email protected] 1. EXPERT
A Comparative Study of Visualization Techniques for Data Mining
A Comparative Study of Visualization Techniques for Data Mining A Thesis Submitted To The School of Computer Science and Software Engineering Monash University By Robert Redpath In fulfilment of the requirements
Multi-Attribute Glyphs on Venn and Euler Diagrams to Represent Data and Aid Visual Decoding
Multi-Attribute Glyphs on Venn and Euler Diagrams to Represent Data and Aid Visual Decoding Richard Brath Oculus Info Inc., Toronto, ON, Canada [email protected] Abstract. Representing quantities
A Survey on Multivariate Data Visualization
A Survey on Multivariate Data Visualization Winnie Wing-Yi Chan Department of Computer Science and Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong June 2006
SPSS Manual for Introductory Applied Statistics: A Variable Approach
SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All
Introduction to Exploratory Data Analysis
Introduction to Exploratory Data Analysis A SpaceStat Software Tutorial Copyright 2013, BioMedware, Inc. (www.biomedware.com). All rights reserved. SpaceStat and BioMedware are trademarks of BioMedware,
Algebra 2 Chapter 1 Vocabulary. identity - A statement that equates two equivalent expressions.
Chapter 1 Vocabulary identity - A statement that equates two equivalent expressions. verbal model- A word equation that represents a real-life problem. algebraic expression - An expression with variables.
Data Visualization Techniques
Data Visualization Techniques From Basics to Big Data with SAS Visual Analytics WHITE PAPER SAS White Paper Table of Contents Introduction.... 1 Generating the Best Visualizations for Your Data... 2 The
The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA
Paper 156-2010 The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA Abstract JMP has a rich set of visual displays that can help you see the information
Algebra I Vocabulary Cards
Algebra I Vocabulary Cards Table of Contents Expressions and Operations Natural Numbers Whole Numbers Integers Rational Numbers Irrational Numbers Real Numbers Absolute Value Order of Operations Expression
Analyzing The Role Of Dimension Arrangement For Data Visualization in Radviz
Analyzing The Role Of Dimension Arrangement For Data Visualization in Radviz Luigi Di Caro 1, Vanessa Frias-Martinez 2, and Enrique Frias-Martinez 2 1 Department of Computer Science, Universita di Torino,
Big Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs [email protected] Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
Topic Maps Visualization
Topic Maps Visualization Bénédicte Le Grand, Laboratoire d'informatique de Paris 6 Introduction Topic maps provide a bridge between the domains of knowledge representation and information management. Topics
Northumberland Knowledge
Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about
MEASURES OF VARIATION
NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are
DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.
DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,
Name: Date: Use the following to answer questions 2-3:
Name: Date: 1. A study is conducted on students taking a statistics class. Several variables are recorded in the survey. Identify each variable as categorical or quantitative. A) Type of car the student
Lecture 1: Review and Exploratory Data Analysis (EDA)
Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel [email protected] Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics
Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),
Data Visualization. or Graphical Data Presentation. Jerzy Stefanowski Instytut Informatyki
Data Visualization or Graphical Data Presentation Jerzy Stefanowski Instytut Informatyki Data mining for SE -- 2013 Ack. Inspirations are coming from: G.Piatetsky Schapiro lectures on KDD J.Han on Data
Module 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
Solving Simultaneous Equations and Matrices
Solving Simultaneous Equations and Matrices The following represents a systematic investigation for the steps used to solve two simultaneous linear equations in two unknowns. The motivation for considering
Gestation Period as a function of Lifespan
This document will show a number of tricks that can be done in Minitab to make attractive graphs. We work first with the file X:\SOR\24\M\ANIMALS.MTP. This first picture was obtained through Graph Plot.
How To Write A Data Analysis
Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction
3: Summary Statistics
3: Summary Statistics Notation Let s start by introducing some notation. Consider the following small data set: 4 5 30 50 8 7 4 5 The symbol n represents the sample size (n = 0). The capital letter X denotes
STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI
STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members
Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary
Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:
Chapter 4 Creating Charts and Graphs
Calc Guide Chapter 4 OpenOffice.org Copyright This document is Copyright 2006 by its contributors as listed in the section titled Authors. You can distribute it and/or modify it under the terms of either
Enhancing the Visual Clustering of Query-dependent Database Visualization Techniques using Screen-Filling Curves
Enhancing the Visual Clustering of Query-dependent Database Visualization Techniques using Screen-Filling Curves Extended Abstract Daniel A. Keim Institute for Computer Science, University of Munich Leopoldstr.
Principles of Data Visualization
Principles of Data Visualization by James Bernhard Spring 2012 We begin with some basic ideas about data visualization from Edward Tufte (The Visual Display of Quantitative Information (2nd ed.)) He gives
Exploratory Data Analysis
Exploratory Data Analysis Johannes Schauer [email protected] Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
Georgia Standards of Excellence Curriculum Map. Mathematics. GSE 8 th Grade
Georgia Standards of Excellence Curriculum Map Mathematics GSE 8 th Grade These materials are for nonprofit educational purposes only. Any other use may constitute copyright infringement. GSE Eighth Grade
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 [email protected] 1. Descriptive Statistics Statistics
MARS STUDENT IMAGING PROJECT
MARS STUDENT IMAGING PROJECT Data Analysis Practice Guide Mars Education Program Arizona State University Data Analysis Practice Guide This set of activities is designed to help you organize data you collect
Figure 1. An embedded chart on a worksheet.
8. Excel Charts and Analysis ToolPak Charts, also known as graphs, have been an integral part of spreadsheets since the early days of Lotus 1-2-3. Charting features have improved significantly over the
Visualization Software
Visualization Software Maneesh Agrawala CS 294-10: Visualization Fall 2007 Assignment 1b: Deconstruction & Redesign Due before class on Sep 12, 2007 1 Assignment 2: Creating Visualizations Use existing
