CHAPTER TWELVE TABLES, CHARTS, AND GRAPHS




Tables, charts, and graphs are frequently used in statistics to communicate data visually. Such illustrations are also a frequent first step in evaluating raw data for trends, data entry errors, and outlying values which might affect the statistical interpretation of the data. A properly drawn chart or graph can answer a number of questions in a minimal amount of space as well as suggest questions not previously considered.

The goal in creating tables, charts, or graphs is to present data in a clear and accurate format which is easily interpreted. The multitude of computer graphics programs currently available has greatly simplified the creation of such illustrations. Nevertheless, a poorly designed table or figure will only serve to confuse and even mislead the reader. The following are some guidelines for presenting data visually.

DESIGNING TABLES

Tables are useful for summarizing the raw data upon which a study's conclusions are based. They effectively use a minimum of space to communicate a large amount of information. There are, however, a few important points to keep in mind when creating a table:

1. Arrange table rows and columns logically. This will make the table more easily interpreted and aesthetically pleasing. We each have a preconceived impression of how a table should be arranged, and we expect the data to be presented in a logical order. If this is not the case, we must first reorient ourselves to the table before we can begin to interpret the data. Rows and columns should therefore be ordered by size, time, or variable of interest, as the reader would normally expect them to be. The order of presentation can also be used to emphasize important data.

2. Simplify data entries as much as possible. In most situations, presenting data with more than two significant digits only serves to make the table appear cluttered. In fact, it is rare that we are capable of measuring a physiologic parameter to more than two decimal places. As a general rule, a number should never have more significant digits than the raw data which were used to calculate it, or more digits than we are physically able to measure. For example, since we can only measure arterial carbon dioxide tension (PaCO2) in round numbers (e.g., PaCO2 = 35 torr), we should not report a mean PaCO2 of 38.42 torr; rather, we should round the mean to the number of significant digits we are capable of measuring and present the mean PaCO2 as 38 torr.

3. Important data should be emphasized. A table is primarily used to convey important information which supports the study's conclusions. Emphasizing parts of the table with lines or boldface type will draw the reader's attention to these data. Important statistics such as the mean, median, standard deviation, confidence intervals, and p values can be emphasized in this manner, or set apart in a summary row or column. This is especially important in scientific abstract preparation, where space is limited. Statistical significance should be illustrated through the use of symbols which are described in the table legend.

4. Summarize the data to assist in comparison. Include row and column means or medians, where applicable, to summarize the information contained in the table. This allows the reader to quickly interpret the table and find important data.
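The rounding rule in point 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the PaCO2 readings are hypothetical, and counting decimal places in the recorded values is used here as a simple stand-in for "the number of significant digits we are capable of measuring."

```python
from statistics import mean


def decimals_in(value_text):
    """Count the decimal places in a measurement as it was recorded."""
    return len(value_text.split(".")[1]) if "." in value_text else 0


def report_mean(recorded):
    """Format the mean with no more decimal places than the least
    precise raw measurement (an assumed proxy for measurable precision)."""
    places = min(decimals_in(v) for v in recorded)
    average = mean(float(v) for v in recorded)
    return f"{average:.{places}f}"


# Hypothetical PaCO2 readings in torr, recorded as whole numbers.
paco2 = ["35", "41", "38", "40", "37"]
print(report_mean(paco2), "torr")
```

Keeping the readings as text preserves the precision with which they were recorded, which would be lost if they were converted to floating-point numbers first.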

76 / A PRACTICAL GUIDE TO BIOSTATISTICS

To illustrate these points, consider the appearance of the following table:

            RVEF > 40%    RVEF 30-39%    RVEF < 30%
RVEDVI      .802          .823           .586
CVP         .169          .235           .482
PAOP        .221          .050           .304

Without any accompanying text, this table is uninterpretable as well as misleading. From the table, there is no way of knowing that these are the correlation coefficients (r values) from a study comparing cardiac index with right ventricular end-diastolic volume index (RVEDVI), pulmonary artery occlusion pressure (PAOP), and central venous pressure (CVP) at various levels of right ventricular ejection fraction (RVEF). A table should be self-explanatory and self-supporting. It should be easily understood apart from its accompanying text. It is unclear in the above example exactly what is being presented and why. Abbreviations are not defined, data are not arranged logically, and there is no statistical analysis given. Furthermore, the table appears to be comparing RVEF with RVEDVI, CVP, and PAOP and is therefore misleading.

1. No title
2. Abbreviations not defined
3. Data not arranged logically
4. No statistical analysis
5. Too many significant digits
6. Comparison of data is misleading

Consider the revised table after the above problems have been addressed:

Correlation with Cardiac Index at Various Levels of RVEF

            RVEF < 30%    RVEF 30-39%    RVEF > 40%
n           46            48             55
RVEDVI      .59***        .82***         .80***
PAOP        .48**         .24            .22
CVP         .30*          .05            .17

*p < .05   **p < .01   ***p < .001 (Linear correlation compared to cardiac index)
n - number of observations
RVEF - right ventricular ejection fraction
RVEDVI - right ventricular end-diastolic volume index
PAOP - pulmonary artery occlusion pressure
CVP - central venous pressure

This table is easily interpreted apart from its accompanying text. The title clearly defines the information that is being presented. Abbreviations are explained in the accompanying legend.
Data are logically arranged by increasing RVEF. The number of observations in each group is reported. The statistical significance for each group is presented and the statistical method used is described. The important values are set apart through the use of boldface type. The data are also simplified to the appropriate number of significant digits. Careful design of tables in this manner makes the conclusions of an abstract or manuscript more easily recognized and applicable to clinical use.

SUGGESTIONS FOR CHART AND GRAPH DESIGN

1. Charts and graphs should be simple. In the past, charts and graphs were simple by necessity. They were often drawn by hand, requiring hours of work. With the widespread use of computers, creating elaborate and colorful charts is now quick and easy. Complicated, colorful, and intricate charts and graphs are not necessarily better, however, and may only serve to confuse or mislead the reader. Charts should therefore be kept as simple as possible.

2. Maximize the data in each chart. One of the real advantages of using visual representations of data is that a maximum of information can be conveyed in a minimal amount of space. It is possible, however, to include too much data and create a graph that overwhelms the reader by trying to answer several questions at once. Charts and graphs should be designed to be easily understood at first glance. At the same time, using multiple graphs to present data that could just as easily be presented in a single graph is an inefficient use of space.

3. Label rows and columns concisely and accurately. Numbers by themselves are meaningless. Accurate labeling of the data is essential if the reader is to interpret the data correctly.

4. Use legends to explain the data being illustrated. Legends should be used to define which study groups are being illustrated and in what manner.

5. Use an appropriate scale. The type of data variable and the statistical methods used to analyze the data will frequently determine the way in which the data are displayed. Data which are binary or dichotomous will most likely require a nominal scale (e.g., disease vs no disease). When the data are ordered in categories, such as in tumor staging, an ordinal scale consisting of several categories is appropriate. The majority of data encountered in medical research are continuous and require a numerical scale. Manipulating the scale can make any data look better, even if there is no significant difference to demonstrate. At the same time, using an inappropriate scale may minimize the appearance of a significant difference or may hide clinically useful variations in the data. Thus, the scale used can profoundly affect the way in which the data are interpreted. One common mistake in creating charts and graphs is changing the scale in mid-axis. This may distort the graphical representation of the data and can mislead the reader. Another common error is beginning the scale at a point other than zero. This is known as the suppression of zero error and can also distort the data.

6. Arrange data logically. As with tables, the reader will expect to see data presented in a logical order. If the data have a natural order, rearranging that order may result in a confusing graph. Although it might seem natural and helpful, displaying data alphabetically or numerically from smallest to largest may upset the natural order of the data and make the graph less clear.

7. Use the same baseline for multiple graphs of the same type. If multiple graphs are being used to display data, a common baseline will allow the graphs to be compared much more easily and will make them less confusing. It can also allow multiple graphs to be condensed into one graph, thus saving space.

8. Use significant digits appropriately. As described above for tables, use only as many significant digits as are appropriate.

9. Avoid using double Y-axis graphs. These graphs are commonly used to display two or more groups of data which are measured on different scales. Unfortunately, they can easily be misunderstood. It may not be immediately obvious, for example, which Y axis belongs to which data set, especially if multiple data sets are being presented. In such a situation, it may be better to display the data in two separate graphs, each with a different scale on the Y axis.

10. Avoid logarithmic scales (or other transformed scales) unless the data analysis required transformation (e.g., the data were not normally distributed). Logarithmic scales can similarly distort the data and make them confusing.

11. Error bars should be used to convey variability. Most charts and graphs provide information about the mean of the data. We are also interested, however, in the variability associated with the data mean. Error bars, as illustrated in the graph below, provide a way of displaying this variability in the data.

[Figure: Cardiac Index vs RVEDVI During Fluid Resuscitation. X axis: cardiac index (ml/min/m2); Y axis: RVEDVI (ml/m2).]

COMMON USES OF VARIOUS CHARTS AND GRAPHS

Various examples of tables, charts, and graphs have been illustrated in previous chapters as listed below:

frequency histogram - Chapter 2
data tables - Chapters 3, 4, 5, 6, 9, 11
contingency tables - Chapters 4, 5
ROC curves - Chapter 4
X-Y scatter graphs - Chapter 8

The following are examples of other types of charts and graphs which are commonly seen in the medical literature, along with their uses.

Bar charts - used to display nominal or ordinal data

[Figure: Carbohydrate (CHO) Administration Rate. X axis: TPN formula (A7/D21, A4.25/D25, A5/D35); Y axis: CHO (mg/kg/min).]
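As a hedged illustration of the error bars described under point 11, the values behind a bar chart with error bars might be computed as follows. The sketch assumes the bars show group means and the error bars show one standard error of the mean. The chapter does not say which measure of variability to plot, and standard deviations or confidence intervals are common alternatives. The group names and observations are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to estimate variability")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical observations for two study groups.
groups = {
    "Group A": [4.0, 5.0, 6.0],
    "Group B": [2.0, 4.0, 6.0, 8.0],
}

for name, values in groups.items():
    m, sem = mean_and_sem(values)
    # The bar is drawn to the mean; the error bar spans mean - SEM to mean + SEM.
    print(f"{name}: bar height {m:.1f}, error bar from {m - sem:.1f} to {m + sem:.1f}")
```

Whatever measure is chosen, the legend should state it, since an error bar of one standard deviation and one of one standard error look alike on the page but mean different things.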

Histograms - used to present information about ordinal data; the measurement of interest is graphed on the X axis and the number or percentage of observations on the Y axis

[Figure: death rate by country (Mexico, Italy, Sweden, U.S., England, New Zealand). Scale: death rate per population.]

Dot plots - used to display binary or nominal data

[Figure: Arterial Oxygen Tension by Sex. X axis: sex (male, female); Y axis: PaO2.]

Scattergrams or scatterplots - also known as bivariate plots; the risk factor is placed on the X axis and the characteristic or outcome to be explained is placed on the Y axis. Different subgroups can be represented by different symbols, maximizing information exchange without an increase in space.

[Figure: Cardiac Index vs RVEDVI. X axis: cardiac index; Y axis: RVEDVI.]

Survival curves - used to illustrate survival rates at various periods of time.

Kaplan-Meier survival curves plot the proportion of patients still surviving each time a patient dies.

[Figure: Survival Following Tumor Resection (Kaplan-Meier curve). X axis: time in years to death; Y axis: percent survival.]

Actuarial survival curves are curvilinear, as they plot the proportion of patients still surviving at specific time intervals since the study began, averaging the occurrence of death during the previous time period.

[Figure: Survival Following Tumor Resection (actuarial curve). X axis: year of death; Y axis: percent survival.]
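The difference between the two kinds of survival curve described above can be made concrete with a short Python sketch. It is a minimal illustration, not the chapter's own method. The patient data are hypothetical, and the handling of patients who leave follow-up alive (censoring) is an assumption added here, since the chapter does not discuss it. The Kaplan-Meier function uses the usual product-limit calculation, and the actuarial function uses the usual life-table convention of counting withdrawals as at risk for half an interval.

```python
def kaplan_meier(observations):
    """Return [(time, surviving proportion)], one step per distinct death time.

    observations: list of (time, died) pairs; died=False marks a patient
    who left follow-up alive at that time (censored).
    """
    at_risk = len(observations)
    surviving = 1.0
    curve = []
    for t in sorted({time for time, _ in observations}):
        deaths = sum(1 for time, died in observations if time == t and died)
        leaving = sum(1 for time, _ in observations if time == t)
        if deaths:
            # The curve steps down only when a death occurs.
            surviving *= 1 - deaths / at_risk
            curve.append((t, surviving))
        at_risk -= leaving
    return curve


def actuarial(observations, interval=1.0, n_intervals=5):
    """Return [(interval end, surviving proportion)], one point per interval."""
    at_risk = len(observations)
    surviving = 1.0
    curve = []
    for k in range(n_intervals):
        start, end = k * interval, (k + 1) * interval
        in_interval = [(t, d) for t, d in observations if start <= t < end]
        deaths = sum(1 for _, died in in_interval if died)
        withdrawn = len(in_interval) - deaths
        # Withdrawals are assumed to be at risk for half of the interval.
        effective = at_risk - withdrawn / 2
        if effective > 0:
            surviving *= 1 - deaths / effective
        curve.append((end, surviving))
        at_risk -= len(in_interval)
    return curve


# Hypothetical follow-up data: (years after resection, died?)
patients = [(0.5, True), (1.2, True), (1.7, False),
            (2.4, True), (3.1, True), (4.6, False)]

for t, s in kaplan_meier(patients):
    print(f"Kaplan-Meier: {s:.0%} surviving after the death at {t} years")
for end, s in actuarial(patients):
    print(f"Actuarial:    {s:.0%} surviving at the end of year {end:.0f}")
```

The Kaplan-Meier output has one entry per death, which gives the curve its stepped appearance, while the actuarial output has one entry per fixed interval, which is why those curves are drawn as points joined at regular times.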