VISUALIZATION OF WINTER WHEAT QUANTITATIVE TRAITS WITH PARALLEL COORDINATE PLOTS
Andrijana Eđed (1), Z. Lončarić (1), D. Horvat (1), K. Skala (2)
Original scientific paper
SUMMARY
Visualization of multivariate multidimensional data sets is a challenging task, especially without adequate tools and methods. In the last few years, parallel coordinate plots have become popular and accepted as a very efficient multivariate visualization technique. The aim of this paper was to explore how parallel coordinates can be used in the analysis of winter wheat quantitative traits. The data set was obtained from an experiment set up as a completely randomized design with two treatments and four replicates. Ten variables (plant height, spike length, stem length, plant weight, spike weight, grain weight per spike, 1000 kernel weight, number of fertile and sterile spikelets per spike and total number of spikelets per spike) and fifty-five winter wheat genotypes were analysed. In parallel coordinate plots, observations are shown as series of unbroken lines passing through parallel axes, where each axis represents a different variable. The advantage of parallel coordinates over other visualization techniques is that they can represent multivariate data in two dimensions. From such a representation, outliers and grouping among observations are easily detected, as is correlation among variables. Although parallel coordinates cannot efficiently explore details, they are a good technique for visualization of multivariate data sets and they can be used for exploratory analysis of wheat quantitative traits.
Key-words: parallel coordinate plots, winter wheat, quantitative traits, visualization
INTRODUCTION
It is quite challenging to visualize winter wheat quantitative traits in one plot, especially if a hundred or more genotypes are tested at once.
Such data sets are generally considered multidimensional multivariate (mdmv) data sets, and they are not uncommon. Usually, any data set with more than five dimensions is considered an mdmv data set. Wong and Bergeron (1997) explained that, in the context of visualization, multidimensional data refers to the study of relationships among multiple parameters (variables). Mathematically, these parameters can be classified into two categories: dependent and independent. A variable is said to be dependent if it is a function of another variable, the independent variable. The term multidimensional refers to the dimensionality of the independent variables, while the term multivariate refers to the dimensionality of the dependent variables.
The scatter plot is a basic tool for data modelling and one of the most commonly used techniques for visualizing data sets with a small number of variates. Two-dimensional scatter plots can be enhanced by putting more plots into one display, creating a scatter plot matrix, or by creating a three-dimensional scatter plot, which allows the visualization of multivariate data of up to four dimensions. In the scatter plot matrix, plots are organized in a matrix of all pairwise combinations of the variables. A scatter plot matrix with ten or more correlated variables is not easily readable, and it is not convenient for mdmv data sets. Parallel coordinate plots (PCPs), on the other hand, can represent mdmv data sets and the correlations between their variables very clearly, and PCPs are often used as a graphical alternative to conventional scatter plots (Wegman and Luo, 1997).
(1) Andrijana Eđed, BSc, Prof. DSc Zdenko Lončarić, Prof. DSc Dražen Horvat: J.J. Strossmayer University of Osijek, Faculty of Agriculture in Osijek, Trg Sv. Trojstva 3, HR Osijek, Croatia; (2) Prof. DSc Karolj Skala: Ruđer Bošković Institute, Centre for Informatics and Computing, Bijenička 54, HR Zagreb, Croatia
The goal of PCPs is to visualize an mdmv data set without loss of information. According to Inselberg (1997), PCPs work for any number of variables (N), and all variables are treated uniformly. PCPs reduce representation complexity, and the display readily conveys information on the properties of the N-dimensional object it represents (Inselberg, 1997). In a PCP, observations are represented as a series of unbroken lines passing through parallel axes, each of which represents a different variable. Each line passes through an axis at a location that indicates the observation's value relative to all other values. The ends of the axis represent the maximum and minimum values of the axis variable for all observations under consideration. The result is a visual representation of relationships among many variables and a possibly unique multivariate signature for each observation (Edsall, 2003). The mathematical and geometric background of parallel coordinates is explained elsewhere (Wegman, 1990; Inselberg and Dimsdale, 1990). There is no fundamental limit on data dimensionality in PCPs (Heinrich and Weiskopf, 2009), which is one of the advantages of this technique. However, classical parallel coordinates suffer from over-plotting of lines, especially for very large data sets.
Parallel coordinates were invented in 1885 by Maurice d'Ocagne, but their systematic development and popularization started in 1985, when Al Inselberg rediscovered them. The paper Hyperdimensional Data Analysis Using Parallel Coordinates, published by Wegman (1990), marked the starting point for the general acceptance of parallel coordinates as a useful tool for visualizing mdmv data sets. Collision avoidance algorithms for air traffic control, computer vision and process control are based on parallel coordinates, and they are some of the most important examples of how PCPs can be used.
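The construction described in this section (one line per observation, with the ends of each axis set to the minimum and maximum of that variable) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name pcp_polylines and the three example genotypes with their values are hypothetical and are not taken from the experiment.

```python
from typing import Dict, List, Sequence, Tuple


def pcp_polylines(
    observations: Dict[str, Sequence[float]],
) -> Dict[str, List[Tuple[int, float]]]:
    """Return the vertices of each observation's line in a parallel coordinate plot.

    Axis k is placed at x = k. The y position is the value rescaled so that
    the axis minimum maps to 0.0 and the axis maximum maps to 1.0.
    """
    rows = list(observations.values())
    n_axes = len(rows[0])
    lows = [min(r[k] for r in rows) for k in range(n_axes)]
    highs = [max(r[k] for r in rows) for k in range(n_axes)]

    def scale(value: float, low: float, high: float) -> float:
        # A constant variable has no spread; place it in the middle of the axis.
        return 0.5 if high == low else (value - low) / (high - low)

    return {
        name: [(k, scale(v, lows[k], highs[k])) for k, v in enumerate(values)]
        for name, values in observations.items()
    }


# Hypothetical values: plant height (cm), spike length (cm), 1000 kernel weight (g).
example = {
    "genotype_a": (80.0, 8.0, 40.0),
    "genotype_b": (100.0, 9.0, 35.0),
    "genotype_c": (90.0, 10.0, 45.0),
}
lines = pcp_polylines(example)
print(lines["genotype_a"])  # [(0, 0.0), (1, 0.0), (2, 0.5)]
```

Because every axis is rescaled separately, variables measured in different units share one display, which is what allows all variables to be treated uniformly.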
Lately, parallel coordinates have also become accepted as a visualization tool by researchers from other areas (Eurenius and Heldring, 2008; Edsall, 2003; Few, 2006). Andrienko and Andrienko (2001) gave an extensive list of tasks for which PCPs, as a general-purpose data visualization technique, can be used. After a detailed search of the Internet and the available literature, we found no publications on the application of parallel coordinates in agriculture or biotechnology, or in the analysis of data gathered from experiments on plants. Given the constant increase in sample size and in the complexity of data sets in agricultural research, methods for visualizing such data sets are required. In this paper we present how parallel coordinate plots can be used to visualize wheat quantitative traits. The main intention was to show how mdmv data sets can be presented in one display and how patterns, possible grouping among observations, and correlations among variables can be revealed by PCPs.
MATERIAL AND METHODS
The experiment was conducted in the 2007/2008 growing season. Fifty-five winter wheat genotypes were grown in the field in plastic pots (pot diameter 275 mm). Each pot contained eleven kilograms of soil, fertilized with 42.8 mg N/kg soil and 42.8 mg P2O5/kg soil. This corresponds to soil productivity and standard practice in wheat production (equivalent to 150 kg N/ha and 150 kg P2O5/ha), since the availability of phosphorus in the soil was very low. A sowing rate of 850 grains m-2 was achieved by sowing forty-one seeds of each genotype per pot, at a sowing depth of two centimetres. The experiment was set up as a completely randomized design with two treatments and four replicates. The applied treatments were a control treatment (0 mg Cd/kg soil) and treatment 1 (20 mg Cd/kg soil). Harvest was performed at the maturity stage. Ten plants from each pot were cut near the soil.
Plant height and weight, stem length, spike length and spike weight were measured. Heads were removed from the plants, and sterile and fertile spikelets were counted. After threshing, the number of grains and grain weight per spike were determined, and 1000 kernel weight was calculated. Plant height, stem length, spike length, plant weight, spike weight, grain weight per spike, 1000 kernel weight, number of spikelets per spike, number of fertile spikelets per spike and number of sterile spikelets per spike were analysed in this paper. Statistical analyses were done in Microsoft Office Excel (2007) and SAS for Windows (SAS Institute Inc., Cary, NC, USA). Parallel coordinate plots were made in Excel according to instructions from com/excel/charts/parallelcoord.html. The drop-down menu in Figure 1 was created so that the effects of different treatments can be compared for the same genotype. The order of variables in the plot was set by the data set, and all parallel axes were equally spaced.
RESULTS AND DISCUSSION
Representation of mdmv data sets in one display
Figure 1 shows how ten variables measured on fifty-five genotypes under two treatments (that is, 1100 data points) can be represented in one plot. From the aspect of visualization, 1100 data points can be considered a moderately sized data set. The genotype Katarina was picked from the drop-down menu, and Figure 1 shows no visible difference in plant height and stem length between treatments for this genotype. Spike length, plant weight, spike weight, grain weight per spike, 1000 kernel weight and number of fertile spikelets per spike had higher values for the control treatment than for treatment 1, while treatment 1 had a higher number of spikelets per spike and number of sterile spikelets per spike than the control treatment. All other genotypes in the data set can be compared in the same way, by choosing them from the drop-down menu.
It is not obvious from a PCP whether there are statistically significant differences between treatments for any variable, but the general patterns of the chosen observations are obvious. A display with this many data cannot be used to explore the details, but it can be used to search for predominant patterns and exceptions (Few, 2006). This representation can draw attention to variables and observations that differ from the others and indicate the direction in which further statistical analysis should go.
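The size of the displayed data set and the per-genotype comparison of treatments can be reproduced with a short Python sketch. The data point count follows directly from the text; the function compare_treatments, its tolerance parameter and the example values are hypothetical illustrations, and the comparison is purely descriptive, not a test of statistical significance.

```python
from typing import Dict

N_VARIABLES = 10
N_GENOTYPES = 55
N_TREATMENTS = 2

# 10 variables x 55 genotypes x 2 treatments, as stated in the text.
n_data_points = N_VARIABLES * N_GENOTYPES * N_TREATMENTS


def compare_treatments(
    control: Dict[str, float],
    treated: Dict[str, float],
    tolerance: float = 0.0,
) -> Dict[str, str]:
    """Label, for each variable, which treatment shows the higher value."""
    labels = {}
    for variable, control_value in control.items():
        difference = treated[variable] - control_value
        if abs(difference) <= tolerance:
            labels[variable] = "no visible difference"
        elif difference < 0:
            labels[variable] = "control higher"
        else:
            labels[variable] = "treatment 1 higher"
    return labels


# Hypothetical means for one genotype (not the measured values).
control = {"plant height": 85.0, "spike weight": 2.4, "sterile spikelets": 1.5}
treated = {"plant height": 85.0, "spike weight": 2.0, "sterile spikelets": 2.5}
labels = compare_treatments(control, treated)
print(n_data_points)  # 1100
print(labels)
```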
Figure 1. Representation of 1100 data points in parallel coordinate plot
Determination of grouping patterns among variables
Figure 2 shows how observations can form groups after the exclusion of minimal and maximal values from the data set. Only control treatment values are shown in this figure. Genotypes with the highest and lowest values for plant height were deleted from the data set before the graph was drawn. As a result, two groups of observations emerged (Figure 2). The group of genotypes with lower plant height is at the bottom (Group 1) and has a narrower range than the second group (Group 2). Because of the narrower range of values in Group 1, the observations (lines) are denser and there is no indication of further grouping within this group. The range of data values in Group 2 is wider, and it is possible that secondary grouping would appear in this group after further data manipulation. Given the high correlation (r = 0.99) between plant height and stem length, the grouping pattern that occurred for plant height occurred for stem length too.
Looking at the number of spikelets per spike, number of fertile spikelets per spike and number of sterile spikelets per spike (Figure 2), it is easy to see that almost all observations (gray lines) follow a similar pattern. However, eleven observations differ considerably in their pattern from all other observations, and they also differ from each other. Green lines represent the genotypes Adriana, Superžitarka and Zlatna dolina, which have very similar numbers of spikelets per spike, fertile spikelets per spike and sterile spikelets per spike, forming a group of their own.
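The grouping on plant height described above was found by eye. As a rough numerical counterpart, one could drop the extreme observations and then split the rest at the widest gap between neighbouring values. The sketch below shows that idea only; the largest-gap rule, both function names and the plant heights are assumptions made for illustration and are not the procedure used in this paper.

```python
from typing import Dict, List, Tuple


def drop_extremes(values: Dict[str, float]) -> Dict[str, float]:
    """Remove the observations that hold the lowest and the highest value."""
    low, high = min(values.values()), max(values.values())
    return {name: v for name, v in values.items() if v not in (low, high)}


def split_at_largest_gap(values: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split observations into two groups at the widest gap between sorted values."""
    ordered = sorted(values, key=values.get)
    gaps = [values[b] - values[a] for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


# Hypothetical plant heights in cm (not the measured values).
heights = {
    "g1": 40.0, "g2": 62.0, "g3": 64.0, "g4": 65.0,
    "g5": 88.0, "g6": 93.0, "g7": 99.0, "g8": 130.0,
}
trimmed = drop_extremes(heights)
group_1, group_2 = split_at_largest_gap(trimmed)
print(group_1)  # ['g2', 'g3', 'g4']
print(group_2)  # ['g5', 'g6', 'g7']
```

In this made-up example the lower group spans 3 cm and the upper group 11 cm, so the lines of the lower group would be drawn more densely, as described for Group 1.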
The genotypes Golubica and Sana (orange lines) and the genotypes Njivka and Primadur (blue lines) have similar patterns and slopes for these three variables, but the similarity within each pair (between Golubica and Sana, or between Njivka and Primadur) is higher than the similarity between the two pairs. Coloseo, Divana, Nefer and Osječka crvenka (purple lines) also have similar slopes and patterns, which distinguish them from all other genotypes. After these eleven genotypes were removed from the PCP, no further grouping among the remaining observations was noted (figure not shown).
Figure 2. Representation of grouping among observations in parallel coordinate plot
Detection of correlation between variables
In Figure 3, horizontal lines between plant height and stem length represent the strong correlation (r = 0.99) between these two variables. Strong correlation also exists between plant weight, spike weight and grain weight per spike (Table 1). Lines between these variables (Figure 3) have slight slopes and are not strictly horizontal, but the horizontal pattern clearly prevails. Observations that do not follow the horizontal pattern are evident, and they can be chosen for or excluded from further analysis, depending on the aim of the analysis. Lines between variables that are not strongly correlated, such as 1000 kernel weight and number of spikelets per spike (r = 0.37), cross each other, and there is no regular pattern between them. The direction of the lines between variables cannot be used as a predictor of the direction of the correlation between them.
Figure 3. Representation of correlation between variables in parallel coordinate plot
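The visual contrast between nearly parallel lines and crossed lines can be related to two simple numbers: the correlation coefficient r and the share of line pairs that cross between two adjacent axes. Two lines cross exactly when the observations are ranked in opposite order on the two axes. The following Python sketch computes both; the function names and the example values are hypothetical and do not reproduce the measured data.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def crossing_fraction(x: Sequence[float], y: Sequence[float]) -> float:
    """Share of line pairs that cross between two adjacent parallel axes."""
    pairs = list(combinations(range(len(x)), 2))
    # A pair crosses when its order on the left axis is reversed on the right axis.
    crossed = sum(1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) < 0)
    return crossed / len(pairs)


# Hypothetical values for five observations.
plant_height = [60.0, 70.0, 80.0, 90.0, 100.0]
stem_length = [52.0, 61.0, 70.0, 80.0, 89.0]
unrelated = [3.0, 1.0, 4.0, 2.0, 5.0]

print(round(pearson_r(plant_height, stem_length), 3))
print(crossing_fraction(plant_height, stem_length))  # 0.0
print(crossing_fraction(plant_height, unrelated))    # 0.3
```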
Table 1. Correlation coefficients (r) for ten variables (V1-Plant height, V2-Stem length, V3-Spike length, V4-Plant weight, V5-Spike weight, V6-Grain weight/spike, V7-1000 kernel weight, V8-Number of spikelets/spike, V9-Number of fertile spikelets/spike, V10-Number of sterile spikelets/spike)
[Table 1: matrix of pairwise correlation coefficients among V1 to V10; the coefficient values are not preserved in this transcription.]
In addition, the order in which the axes are lined up clearly influences the amount and quality of the insight gained from the representation (Edsall, 2003). It is difficult to figure out the relationships between variables in nonadjacent positions (Huh and Park, 2007), so the order of the axes is critical for finding features. Thus, many reorderings will be needed in a typical data analysis. In spite of that, the importance of PCPs in the visualization of mdmv data sets is considerable. Providing tools and methods to explore the multitude of possible relationships among variables visually may enable a researcher to discover important characteristics of the data set which would be difficult, if not impossible, to detect with only non-visual statistical methods (Edsall, 2003).
Many visualization software packages that implement the parallel coordinates methodology (Parallax: Multidimensional Graphs, VisDB, GGobi) are available on the market today.
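The axis-ordering problem raised above can be made concrete. With ten axes there are 10!/2 distinct orderings (an ordering and its mirror image show the same adjacent pairs), so trying them all by hand is not practical. One simple heuristic, shown below as a sketch, is to start from a chosen variable and repeatedly append the remaining variable that is most strongly correlated with the last axis placed. The greedy rule and the function name are assumptions for illustration, not a method used in this paper; of the correlation values, only r = 0.99 and r = 0.37 come from the text, and the rest are hypothetical.

```python
from math import factorial
from typing import Dict, FrozenSet, List


def greedy_axis_order(
    variables: List[str],
    abs_r: Dict[FrozenSet[str], float],
    start: str,
) -> List[str]:
    """Order axes so that each new axis is the one most correlated with the previous."""
    order = [start]
    remaining = [v for v in variables if v != start]
    while remaining:
        last = order[-1]
        best = max(remaining, key=lambda v: abs_r[frozenset((last, v))])
        order.append(best)
        remaining.remove(best)
    return order


PH, SL = "plant height", "stem length"
TKW, SP = "1000 kernel weight", "spikelets per spike"

abs_r = {
    frozenset((PH, SL)): 0.99,   # reported in the text
    frozenset((TKW, SP)): 0.37,  # reported in the text
    frozenset((PH, TKW)): 0.20,  # hypothetical
    frozenset((PH, SP)): 0.30,   # hypothetical
    frozenset((SL, TKW)): 0.25,  # hypothetical
    frozenset((SL, SP)): 0.10,   # hypothetical
}

order = greedy_axis_order([PH, SL, TKW, SP], abs_r, start=PH)
n_orderings = factorial(10) // 2  # distinct orderings of ten axes, mirrors merged

print(order)
print(n_orderings)  # 1814400
```

A greedy ordering is not guaranteed to be the best one, but it places strongly related variables side by side, where their relationship is easiest to read.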
PCPs can be made in Microsoft Office Excel, as we did in this paper, but specialized visualization software would make it much easier. Creating PCPs in Microsoft Office Excel and obtaining full insight into a data set requires a lot of time-consuming manual editing. Besides that, PCPs made in Excel are static and non-interactive, and some authors have noted that dynamic PCPs are more advantageous for data visualization. Available visualization software is usually easy to handle; it offers enhanced PCPs compared with the classical PCP, and many editing options that can broaden the analysis and offer a more comprehensive insight into the analysed data set.
CONCLUSION
In this paper, we presented a simple visualization method, parallel coordinate plots, that can be used to explore complex relationships. Parallel coordinate plots are a two-dimensional presentation method for multidimensional data. It is easy to spot patterns and correlations among variables and observations in parallel coordinate plots. They can be used for visualization and analysis of multidimensional data sets. This research showed that parallel coordinate plots can be used for visualization of wheat quantitative traits. Although PCPs have some limitations, such as overplotting with large data sets and the need for many reorderings of axes to gain insight, they are an easy-to-use exploratory tool available in many data analysis software packages.
ACKNOWLEDGEMENTS
The paper is a result of the research project Soil conditioning impact on nutrients and heavy metals in soil-plant continuum ( ) financed by The Ministry of Science, Education and Sports, Croatia.
REFERENCES
1. Andrienko, G., Andrienko, N. (2001): Constructing Parallel Coordinates Plot for Problem Solving. Smart Graphics, Hawthorne, NY.
2. Edsall, M. R. (2003): The parallel coordinate plot in action: design and use for geographic visualization. Computational Statistics & Data Analysis 43:
3. Eurenius, O., Heldring, T. (2008): Coffe Flow Visualization, viewed 19 December, 2009.
4. Few, S. (2006): Multivariate Analysis Using Parallel Coordinates, viewed 14 January, 2010.
5. Heinrich, J., Weiskopf, D. (2009): Continuous Parallel Coordinates. IEEE Transactions on Visualization and Computer Graphics, 15:6:
6. Interactive Parallel Coordinates Chart 2009, Peltier Technical Services, viewed 28 December, 2009, <peltiertech.com/excel/charts/parallelcoord.html>.
7. Huh, M.H., Park D.Y. (2007): Enhancing parallel coordinate plots. Journal of Korean Statistical Society 37:
8. Inselberg, A. (1997): Multidimensional Detective. Proceedings of IEEE Information Visualization 1997, IEEE Computer Society,
9. Inselberg, A., Dimsdale, B. (1990): Parallel coordinates: A tool for visualizing multi-dimensional geometry. Proceedings of the IEEE Visualization 1990, IEEE Computer Society,
10. Klemz, R. B., Dunne, M. P. (2000): Exploratory Analysis using Parallel Coordinate Systems: Data Visualization in N-Dimensions. Marketing Letters, 11:4:
11. SAS/STAT User's Guide. ( ) Version. Cary, NC. SAS Institute Inc.
12. Shannon, R., Holland, T., Quigley, A. (2008): Multivariate Graph Drawing using Parallel Coordinate Visualisations, viewed 16 January, 2010, < ucd-csi pdf>.
13. Siirtola, H., Räihä, K.J. (2006): Interacting with parallel coordinates. Interacting with Computers, 18:
14. Wegman, E.J. (1990): Hyperdimensional Data Analysis Using Parallel Coordinates. Journal of the American Statistical Association, 85: 411:
15. Wegman, E.J., Luo, Q. (1997): High Dimensional Clustering Using Parallel Coordinates and the Grand Tour. Computing Science and Statistics, 28:
16. Wong, P., Bergeron, D.R. (1997): 30 years of multidimensional multivariate visualization. In Gregory Nielson, editor, Scientific Visualization Overviews, Methodologies, and Techniques, 1: IEEE Computer Society Press, Los Alamitos.
VISUALIZATION OF WINTER WHEAT QUANTITATIVE TRAITS WITH A PARALLEL COORDINATE DIAGRAM
SUMMARY
Graphically presenting a multivariate multidimensional data set without appropriate tools is not a simple task. In the last few years, parallel coordinates have grown in popularity and have proved to be a very effective visualization method. The aim of this paper was to explore the possibilities of applying parallel coordinates in the study of quantitative traits of winter wheat. The paper uses data from an experiment set up as a completely randomized design, with two treatments in four replicates. Ten variables (plant height, stem length, spike length, plant weight, spike weight, grain weight per spike, 1000 kernel weight, number of sterile and fertile spikelets, and total number of spikelets per spike) and fifty-five winter wheat genotypes were analysed. In a parallel coordinate plot, observations are shown as lines passing through parallel axes, and each axis represents one variable. The advantage of parallel coordinates over other visualization methods is that they present multidimensional problems in two dimensions, so that outliers and groupings of individual observations, as well as correlation among variables, can easily be seen. Although parallel coordinates cannot reveal and explore the details of the observed set, they are very useful in the visualization and exploration of multivariate data such as the quantitative traits of winter wheat.
Key-words: parallel coordinates, winter wheat, quantitative traits, visualization
(Received on 13 October 2010; accepted on 05 November 2010)
Data Visualization. or Graphical Data Presentation. Jerzy Stefanowski Instytut Informatyki
Data Visualization or Graphical Data Presentation Jerzy Stefanowski Instytut Informatyki Data mining for SE -- 2013 Ack. Inspirations are coming from: G.Piatetsky Schapiro lectures on KDD J.Han on Data
White Paper Combining Attitudinal Data and Behavioral Data for Meaningful Analysis
MAASSMEDIA, LLC WEB ANALYTICS SERVICES White Paper Combining Attitudinal Data and Behavioral Data for Meaningful Analysis By Abigail Lefkowitz, MaassMedia Executive Summary: In the fast-growing digital
An Easy Introduction to Biplots for Multi-Environment Trials
An Easy Introduction to Biplots for Multi-Environment Trials Brigid McDermott ([email protected]), Statistical Services Centre, University of Reading, UK Ric Coe ([email protected]), Statistical Services
GROWTH DYNAMICS AND YIELD OF WINTER WHEAT VARIETIES GROWN AT DIVERSE NITROGEN LEVELS E. SUGÁR and Z. BERZSENYI
GROWTH DYNAMICS AND YIELD OF WINTER WHEAT VARIETIES GROWN AT DIVERSE NITROGEN LEVELS E. SUGÁR and Z. BERZSENYI AGRICULTURAL RESEARCH INSTITUTE OF THE HUNGARIAN ACADEMY OF SCIENCES, MARTONVÁSÁR The growth
Data Visualization Techniques
Data Visualization Techniques From Basics to Big Data with SAS Visual Analytics WHITE PAPER SAS White Paper Table of Contents Introduction.... 1 Generating the Best Visualizations for Your Data... 2 The
Expert Color Choices for Presenting Data
Expert Color Choices for Presenting Data Maureen Stone, StoneSoup Consulting The problem of choosing colors for data visualization is expressed by this quote from information visualization guru Edward
TIBCO Spotfire Business Author Essentials Quick Reference Guide. Table of contents:
Table of contents: Access Data for Analysis Data file types Format assumptions Data from Excel Information links Add multiple data tables Create & Interpret Visualizations Table Pie Chart Cross Table Treemap
Analyzing The Role Of Dimension Arrangement For Data Visualization in Radviz
Analyzing The Role Of Dimension Arrangement For Data Visualization in Radviz Luigi Di Caro 1, Vanessa Frias-Martinez 2, and Enrique Frias-Martinez 2 1 Department of Computer Science, Universita di Torino,
with functions, expressions and equations which follow in units 3 and 4.
Grade 8 Overview View unit yearlong overview here The unit design was created in line with the areas of focus for grade 8 Mathematics as identified by the Common Core State Standards and the PARCC Model
Demographics of Atlanta, Georgia:
Demographics of Atlanta, Georgia: A Visual Analysis of the 2000 and 2010 Census Data 36-315 Final Project Rachel Cohen, Kathryn McKeough, Minnar Xie & David Zimmerman Ethnicities of Atlanta Figure 1: From
Business Process Discovery
Sandeep Jadhav Introduction Well defined, organized, implemented, and managed Business Processes are very critical to the success of any organization that wants to operate efficiently. Business Process
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Educator s Guide to Excel Graphing
Educator s Guide to Excel Graphing Overview: Students will use Excel to enter data into a spreadsheet and make a graph. Grades and Subject Areas: Grade 3-6 Subject Math Objectives: Students will: make
Optimizing an Electromechanical Device with Mulitdimensional Analysis Software
Optimizing an Electromechanical Device with Mulitdimensional Analysis Software White Paper June 2013 Tecplot, Inc. P.O. Box 52708 Bellevue, WA 98015 425.653.1200 direct 800.676.7568 toll free [email protected]
Spreadsheet software for linear regression analysis
Spreadsheet software for linear regression analysis Robert Nau Fuqua School of Business, Duke University Copies of these slides together with individual Excel files that demonstrate each program are available
NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS
NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document
There are six different windows that can be opened when using SPSS. The following will give a description of each of them.
SPSS Basics Tutorial 1: SPSS Windows There are six different windows that can be opened when using SPSS. The following will give a description of each of them. The Data Editor The Data Editor is a spreadsheet
Executive Dashboard Cookbook
Executive Dashboard Cookbook Rev: 2011-08-16 Sitecore CMS 6.5 Executive Dashboard Cookbook A Marketers Guide to the Executive Insight Dashboard Table of Contents Chapter 1 Introduction... 3 1.1 Overview...
Effective Use of Android Sensors Based on Visualization of Sensor Information
, pp.299-308 http://dx.doi.org/10.14257/ijmue.2015.10.9.31 Effective Use of Android Sensors Based on Visualization of Sensor Information Young Jae Lee Faculty of Smartmedia, Jeonju University, 303 Cheonjam-ro,
How To Use Statgraphics Centurion Xvii (Version 17) On A Computer Or A Computer (For Free)
Statgraphics Centurion XVII (currently in beta test) is a major upgrade to Statpoint's flagship data analysis and visualization product. It contains 32 new statistical procedures and significant upgrades
Barcode Based Automated Parking Management System
IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 03, 2014 ISSN (online): 2321-0613 Barcode Based Automated Parking Management System Parth Rajeshbhai Zalawadia 1 Jasmin
Efficient Information Visualization of Multivariate and Time-Varying Data
Linköping Studies in Science and Technology Dissertations, No. 1191 Efficient Information Visualization of Multivariate and Time-Varying Data Jimmy Johansson Department of Science and Technology Linköping
For example, estimate the population of the United States as 3 times 10⁸ and the
CCSS: Mathematics The Number System CCSS: Grade 8 8.NS.A. Know that there are numbers that are not rational, and approximate them by rational numbers. 8.NS.A.1. Understand informally that every number
BiCluster Viewer: A Visualization Tool for Analyzing Gene Expression Data
BiCluster Viewer: A Visualization Tool for Analyzing Gene Expression Data Julian Heinrich, Robert Seifert, Michael Burch, Daniel Weiskopf VISUS, University of Stuttgart Abstract. Exploring data sets by
Visualization Quick Guide
Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive
CAD and Creativity. Contents
CAD and Creativity K C Hui Department of Automation and Computer- Aided Engineering Contents Various aspects of CAD CAD training in the university and the industry Conveying fundamental concepts in CAD
Teaching Statistics with Fathom
Teaching Statistics with Fathom UCB Extension X369.6 (2 semester units in Education) COURSE DESCRIPTION This is a professional-level, moderated online course in the use of Fathom Dynamic Data software
Grade level: secondary Subject: mathematics Time required: 45 to 90 minutes
TI-Nspire Activity: Paint Can Dimensions By: Patsy Fagan and Angela Halsted Activity Overview Problem 1 explores the relationship between height and volume of a right cylinder, the height and surface area,
Statistics Chapter 2
Statistics Chapter 2 Frequency Tables A frequency table organizes quantitative data. partitions data into classes (intervals). shows how many data values are in each class. Test Score Number of Students
Microsoft Business Intelligence Visualization Comparisons by Tool
Microsoft Business Intelligence Visualization Comparisons by Tool Version 3: 10/29/2012 Purpose: Purpose of this document is to provide a quick reference of visualization options available in each tool.
CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler
CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler 1) Operating systems a) Windows b) Unix and Linux c) Macintosh 2) Data manipulation tools a) Text Editors b) Spreadsheets
Data mining as a tool of revealing the hidden connection of the plant
Data mining as a tool of revealing the hidden connection of the plant Honeywell AIDA Advanced Interactive Data Analysis Introduction What is AIDA? AIDA: Advanced Interactive Data Analysis Developped in
Common Core Unit Summary Grades 6 to 8
Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity- 8G1-8G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations
Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension. reordering.
Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension Reordering Wei Peng, Matthew O. Ward and Elke A. Rundensteiner Computer Science Department Worcester Polytechnic Institute Worcester,
Data Visualization - A Very Rough Guide
Data Visualization - A Very Rough Guide Ken Brodlie University of Leeds 1 What is This Thing Called Visualization? Visualization Use of computersupported, interactive, visual representations of data to
business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
Using kernel methods to visualise crime data
Submission for the 2013 IAOS Prize for Young Statisticians Using kernel methods to visualise crime data Dr. Kieran Martin and Dr. Martin Ralphs [email protected] [email protected] Office
Lesson 3.2.1 Using Lines to Make Predictions
STATWAY INSTRUCTOR NOTES i INSTRUCTOR SPECIFIC MATERIAL IS INDENTED AND APPEARS IN GREY ESTIMATED TIME 50 minutes MATERIALS REQUIRED Overhead or electronic display of scatterplots in lesson BRIEF DESCRIPTION
Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA
PROC FACTOR: How to Interpret the Output of a Real-World Example Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA ABSTRACT THE METHOD This paper summarizes a real-world example of a factor
Beating the NCAA Football Point Spread
Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over
Bernice E. Rogowitz and Holly E. Rushmeier IBM TJ Watson Research Center, P.O. Box 704, Yorktown Heights, NY USA
Are Image Quality Metrics Adequate to Evaluate the Quality of Geometric Objects? Bernice E. Rogowitz and Holly E. Rushmeier IBM TJ Watson Research Center, P.O. Box 704, Yorktown Heights, NY USA ABSTRACT
PS1-7 - 6121 Influence of storage condition on seed quality of maize, soybean and sunflower
PS1-7 - 6121 Influence of storage condition on seed quality of maize, soybean and sunflower B. Šimic 1,*, A. Sudaric 1, I. Liovic 1, I. Kalinovic 2, V. Rozman 2, J. Cosic 2 Abstract The study was aimed
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Final Exam Review MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) A researcher for an airline interviews all of the passengers on five randomly
Reporting. Understanding Advanced Reporting Features for Managers
Reporting Understanding Advanced Reporting Features for Managers Performance & Talent Management Performance & Talent Management combines tools and processes that allow employees to focus and integrate
The Comparisons. Grade Levels Comparisons. Focal PSSM K-8. Points PSSM CCSS 9-12 PSSM CCSS. Color Coding Legend. Not Identified in the Grade Band
Comparison of NCTM to Dr. Jim Bohan, Ed.D Intelligent Education, LLC [email protected] The Comparisons Grade Levels Comparisons Focal K-8 Points 9-12 pre-k through 12 Instructional programs from prekindergarten
VisCMD: Visualizing Cloud Modeling Data
CPSC-533C Information Visualization Project Report VisCMD: Visualizing Cloud Modeling Data Quanzhen Geng (#63546014) and Jing Li (#90814013) Email: [email protected] [email protected] (Master of Software
20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns
20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns John Aogon and Patrick J. Ogao Telecommunications operators in developing countries are faced with a problem of knowing
Overview of Factor Analysis
Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,
Exploratory Data Analysis for Ecological Modelling and Decision Support
Exploratory Data Analysis for Ecological Modelling and Decision Support Gennady Andrienko & Natalia Andrienko Fraunhofer Institute AIS Sankt Augustin Germany http://www.ais.fraunhofer.de/and 5th ECEM conference,
Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics
Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This
