# Describing, Exploring, and Comparing Data

Size: px
Start display at page:

Transcription

1 24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter will focus on using SPSS to create the tables, graphs, and charts, and calculate the descriptive statistics that are discussed in Elementary Statistics. See Elementary Statistics for a thorough discussion of the uses and misuses of these techniques. You should be familiar with Chapter 2 of Elementary Statistics prior to beginning this chapter. Section 2-1. Frequency Distributions For a first look at data, it is practical to begin by describing the distribution of values in a data file. Many statistical tools have been developed for this purpose. SPSS can simplify the creation of the tables, graphs, and charts used for exploring a data file. A common tool used for describing the distribution of values is the frequency distribution. SPSS makes a frequencies report. A frequencies report is different from a frequency distribution. The frequencies report simply lists distinct data values with their frequencies; it does not list the frequency of data values in a list of categories. SPSS does not provide a way to customize the class limits to enable a frequency table to be created. It is not difficult to make a frequency distribution by hand from the frequencies report. Frequencies Report for a Variable The Health Exam Results data from Data Set 1 in Appendix B of Elementary Statistics (this data appears on disk as Mhealth.sav) has 13 different measurements on 40 men. The variables being measured are age, height, weight, waist circumference, pulse rate, systolic and diastolic blood pressure, cholesterol level, body mass index, upper leg length, elbow breadth, wrist breadth, and arm circumference. These data are from the U.S. Department of Health and Human Services, National Center for Health Statistics, Third National Health and Nutrition Survey. Open the Mhealth.sav data file (see Section 0-3). To obtain a frequencies report for age and ht, choose Analyze > Descriptive Statistics > Frequencies to open the Frequencies dialog box (Figure 2-1). Figure 2-1

2 Chapter 2. Describing, Exploring, and Comparing Data 25 Highlight the variable, age, by clicking on its label Ages (years) [age] and then clicking the Variable Paste button to copy the variable to the Variable(s) list. In a similar manner, copy the variable, ht, to the Variable(s) list. If the variable has many distinct data values the resulting frequencies report can be very long. The frequencies report(s) can be suppressed by unchecking the checkbox for Display frequency tables in the Frequencies dialog box. If you uncheck this box, SPSS may give a warning that says, You have turned off all output. Unless you request Display Frequency Tables, Statistics, or Charts, FREQUENCIS will generate no output. Frequency reports are often the first tool used to describe a data file. SPSS provides buttons on the Frequencies dialog box for calculating some common statistics and charts that are useful for describing data files. In the following sections we will discuss many of these charts and statistics. The checkbox for Display frequency tables should be checked now since the goal here is to produce the frequencies table. Click the OK button and SPSS will produce the frequencies report in a new window called the Output Viewer Window. The frequencies reports for both age and ht are rather long and so only the beginning and end of the frequencies report for age is shown below (Figure 2-2). The Output Viewer window displays the entire frequencies report for both variables. Figure 2-2 The frequencies report shows the frequencies associated with each distinct data value. Be careful when reading a frequencies report, for example, a common oversight is to not notice that there are no data values equal to 19, 21, 23-24, and so on in this data file. The frequencies report indicates that the minimum and maximum data values are 17 and 73, respectively. The report shows that the data value 17 occurs two times, and that the data value 20 occurs four times. The last line of the frequencies report indicates that there are 40 data values in the data file. A frequency distribution can easily be obtained from the frequencies report. The column labeled, Percent, indicates the relative frequency with which each data value occurs. This column can be used to make a relative frequency distribution of the ages. The last column labeled, Cumulative Percent, gives the cumulative relative frequencies. This column can be used to make a cumulative frequency distribution for ages. Output Viewer Window This section provides an overview of the Output Viewer window. The Output Viewer window shows the output that results from a command procedure being run in the dialog window. If there is no Output Viewer window open, SPSS will open a new Output Viewer window. The Output

3 26 Chapter 2. Describing, Exploring, and Comparing Data Viewer window can be used to hide, recenter, or delete the results of procedures. The frequencies procedure results are shown in the Output Viewer window (Figure 2-3). Figure 2-3 The SPSS Output Viewer window is divided into two independent windows. The left hand window (Outline window) shows the output in outline view. The right hand window (Output window) shows the output. The Outline window is useful for choosing output to be displayed in the Output window. Individual items in the Outline window can be displayed or hidden. For example, Notes, which has information about the execution of a procedure, is hidden by default. Click on Notes (or the icon in front of it) to select the item Notes and choose View > Show to cause the Notes to be displayed in the Output window. Choose View > Hide to hide a selected item in the Output window. Clicking the minus sign, in front of an item will hide all the output associated with the item. This is the same as selecting the item and choosing View > Collapse. Click on the minus sign in front of Frequencies; notice that the output window is now blank. Click on Frequencies to select it, and then choose View > Expand to make the output reappear. Alternately, clicking on the plus sign in front of a collapsed item will also expand the output. Clicking on (selecting) an item in the Outline window will cause the Output window to be recentered on the item. For example, click on Ages (years) in the Outline window to display the frequencies report for the variable age in the Output window. Click on Height (inches) in the Outline window and the display in the Output window changes to the frequencies report for the variable ht. If you want to delete an item from the Output window, then click on the item and choose Edit > Delete. For example, to delete the frequencies report for height, click on Height (inches) in the Outline window and choose Edit > Delete (or simply press the delete key). If you mistakenly delete some output you may choose Edit > Undo and SPSS will undo the previous action.

4 Chapter 2. Describing, Exploring, and Comparing Data 27 Frequencies Reports for subgroups of a Variable Often the same variable is measured for several groups. In situations like this the data file will contain two variables. One variable will contain the values being measured and a second variable that indicates into which group the particular case belongs. An example of such a situation is the Passive and Active Smoke data set. Consider the data from the Chapter Problem Are nonsmoking people really affected by others who are smoking cigarettes, or is the effect of secondhand smoke really a myth? in Chapter 2 of Elementary Statistics. Data Set 6 in Appendix B of Elementary Statistics includes some of the latest data from the National Institutes of Health (this data also appears on disk as Cotinine.sav). The data were obtained as part of the National Health and Nutrition Examination Survey. The data values consist of measured levels of serum cotinine (in ng/ml) in people selected as study subjects. (The data were rounded to the nearest whole number, so a value of zero does not necessarily indicate the total absence of cotinine. In fact, all of the original values were greater than zero.) Cotinine is a metabolite of nicotine, meaning that when the body absorbs nicotine, cotinine is produced. Because it is known that nicotine is absorbed through cigarette smoking, we have a way of measuring the effective presence of cigarette smoke indirectly through cotinine. Open the Cotinine data (see Section 0-3). The data file contains two variables, cotinine and group. The variable, cotinine, is the measured cotinine levels for the 120 people. There are 40 people in each of three groups. The variable, group, indicates whether the person was a smoker- SMOKER, a nonsmoker who was exposed to secondhand smoke- ETS, or a nonsmoker who was not exposed to secondhand smoke- NOETS). To obtain the frequencies reports of the cotinine levels for all three groups (Smoker, ETS, and NOETS), choose Analyze > Descriptive Statistics > Crosstabs to open the Crosstabs dialog box (Figure 2-4). The Crosstabs procedure forms two-way (and even three-way) tables and like the Frequencies procedure creates a frequencies report and calculates a variety of statistics and graphical displays. Figure 2-4 Copy the variable, cotinine, to the Row(s) list and the variable, group, to the Column(s) list by selecting the variable and clicking the Variable paste button. Click the OK button and the

5 28 Chapter 2. Describing, Exploring, and Comparing Data frequencies report will appear in the Output Viewer window. The frequencies reports are very long; only the beginning and end of the frequencies reports are shown below (Figure 2-5). Figure 2-5 Notice that the column labeled Smoker gives the same frequencies report obtained in the previous section. The next two columns give the frequencies reports for ETS and NOETS. Differences among the distributions of the three groups can be found by comparing the three frequency distributions. For example, it is easy to see that the cotinine levels tend to be largest for the smokers, less for the ETS group, and smallest for the NOETS group (not apparent from the shortened table above, but clear from looking at the whole table in the SPSS Output Viewer window). Strangely, the two largest cotinine values are in the ETS group. Section 2-2. Visualizing Data It is said that a picture is worth a thousand words. Describing the distribution of data can be accomplished much more succinctly with pictures than with a frequencies report or a frequency table. Histograms, pie charts, bar charts, stem-and-leaf plots, boxplots, and other graphs are pictures of the information in a frequency table. A histogram is a picture of the information in the frequency table that shows the shape, center, and spread of the distribution. When the data have nominal or ordinal levels of measurement pie charts and bar charts can be used to describe the data. Stem-and-leaf plots are useful when the number of data values is not too large (say less than 100). They provide a way to see the shape of the distribution and without losing the original data values. Boxplots show the shape of the distribution and also provide a way to check for outliers in the data file. The Frequencies procedure can be used to create graphical displays and calculate descriptive statistics for a single factor. To create bar charts, pie charts, or histograms open the Frequencies dialog box (Figure 2-1) and click the Charts button. We discuss using the Explore procedure to produce descriptive statistics and graphical displays.

6 Chapter 2. Describing, Exploring, and Comparing Data 29 Histograms To make histograms of the cotinine levels of the Smokers, ETS, and NOETS groups, choose Analyze > Descriptive Statistics > Explore to open the Explore dialog box (Figure 2-6). Figure 2-6 Copy the variable, cotinine, to the Dependent List and the variable, group, to the Factor List by selecting the variable and clicking the Variable Paste button. Next click the Plots button to open the Explore: Plots dialog box (Figure 2-7). Figure 2-7 Notice that Figure 2-7 has both bullets and checkboxes. Checkboxes and bullets serve two different purposes. Within a grouping, only a single bullet may be chosen while multiple checkboxes can be selected. For example, if the checkboxes for Histogram and Stem-and-leaf are both chosen then both a histogram and stem-and-leaf plot will be produced. Choose the checkbox for Histograms to make a Histogram for each combination of the factor levels (each group) of the variable in the Factor List and variable in the Dependent list. For example, in this problem histograms for Smoker, ETS, and NOETS of the cotinine variable will be produced. This same dialog box can be used to make Stem-and-leaf Plots and Boxplots, as well as some other plots that we will not discuss here. For now, choose the bullet for None under Boxplots. Click the Continue button to return to the Explore dialog box and then click the OK button to display the histograms in the Output Viewer window.

7 30 Chapter 2. Describing, Exploring, and Comparing Data Scroll down the Output Viewer window to see the histograms for Smoker, ETS, and NOETS. The histogram for the Smoker group (Figure 2-8) shows the distribution of the data values to be fairly uniform between 0 and 325, except for two extremely large values. The two large values are separated from the rest of the data and are likely to be outliers. The histograms for ETS and NOETS are extremely right skewed with their values equal to zero. Chart Editor Histograms, in fact any chart created in SPSS can be customized. To customize the histogram for smokers, first select the histogram by clicking on Group = Smoker in the Outline window and then choose Edit > SPSS Chart Object > Open to open the chart editor (Figure 2-8). Alternately, you can double-click on the Histogram to open the Chart Editor. The Chart Editor window has its own menu (notice the menu along the top of the Chart Editor window has changed). Figure 2-8 There are two axes, the Interval Axis (the axis from which the bars originate) and the Scale Axis (the axis which displays numerical values to scale). To modify the Interval Axis, choose Chart > Axis... and the Axis Selection dialog box will open. Choose the bullet for Interval, and then click the OK button. This will open the Interval Axis dialog box (Figure 2-9). Figure 2-9

8 Chapter 2. Describing, Exploring, and Comparing Data 31 Replacing Measured Cotinine Levels in the Axis Title box with a new title will change the axis title. Choosing Center from the drop down list of choices under Title Justification will center the axis title under the graph. This dialog box can also be used to show or hide Tick marks and Grid Lines. These modifications are cosmetic and will have no affect on the shape of the histogram. SPSS automatically determines the number of intervals and interval widths when making a histogram. Changing the number of intervals or the interval width will have an affect on the shape of the histogram. To change the number of class intervals and the interval widths in the histogram, choose the bullet for Custom and then click the Define button to open the Interval Axis: Define Custom Interval dialog box (Figure 2-10). Figure 2-10 Click the button for Interval width and type in 100 and then type 0 in the box for Minimum and 500 in the box for Maximum. This will create classes of 0-100, , , , and This notation might be confusing because it is not be clear into which class a data value of 100 would be included. SPSS understands the interval to be 0 x < 100, therefore 100 goes into the interval labeled After making these changes, click the Continue button and then click the OK button in the Interval Axis dialog box. The Histogram has now been updated; you can now make more changes or close the Chart Editor. Choose File > Close from the menu to close the Chart Editor. The shape of the histogram for smokers (Figure 2-11) indicates that the cotinine levels are fairly evenly distributed between 0 and 300. Further analysis will be necessary to determine if the three data values larger than 300 are outliers. Figure 2-11

9 32 Chapter 2. Describing, Exploring, and Comparing Data Stem-and-leaf plots To make stem-and-leaf plots for the measured cotinine levels for the three groups: Smoker, ETS, and NOETS, choose Analyze > Descriptive Statistics > Explore to open the Explore dialog box (Figure 2-6). Click the Plots button to open the Explore: Plots dialog box (Figure 2-7). Choose the checkbox for Stem-and-leaf and click the Continue button to close the dialog box. Click the OK button in the Explore dialog box and the stem-and-leaf plots will appear in the Output Viewer window (Figure 2-12). Figure 2-12 SPSS only shows the leading digit and the next digit in the number in the stem-and-leaf plot; that is, it truncates the data values to only two digits. To know the size of the data values you must read the Stem width. In the plot above, the Stem width is 100, which means that the first two stems in the plot display data values between and 50-99, respectively. From the stem-and-leaf plot it is impossible to know what the exact value of the 1 on the 3-stem represents. We only know it is a value between 310 and 319. Looking back at the frequencies table we see this value is 313. The stem-and-leaf plot for the Smoker group shows the similar information to that of the histogram. The data values are fairly evenly divided between 0 and 300 with 3 values larger than 300. There is more information here though since the actual data values are available. For example, it can be seen that one of the three data values that are larger than 300 is only about 310. Boxplots To make boxplots for the cotinine levels for the three groups: Smoker, ETS, and NOETS, choose Analyze > Descriptive Statistics > Explore to open the Explore dialog box (Figure 2-6). Click the Plots button to open the Explore: Plots dialog box (Figure 2-7). Choose the bullet for Factor levels together to make a separate Boxplot for each of the variables in the Dependent List in the Explore dialog box. If there are several variables in the Dependent List, choose the bullet for Dependents together to obtain side-by-side Boxplots. In this case, it does not matter since there is only one variable, cotinine, in the Dependent List. Click the Continue button, and then click the

10 Chapter 2. Describing, Exploring, and Comparing Data 33 OK button in the Explore dialog box, and the boxplots will appear side-by-side in the Output Viewer window (Figure 2-13). Figure 2-13 The boxplots clearly show that the cotinine levels for smokers are much larger than the cotinine levels of the ETS and NOETS groups. The two (or three) suspiciously large data values identified in the histogram and stem-and-leaf plots for Smokers apparently are not outliers. The ETS group has larger cotinine levels than the NOETS group. There are many outliers in the ETS and NOETS groups and the distributions are very right skewed. Section 2-3. Scatter Diagram A Scatter Diagram (also known as a Scatterplot) is a plot of paired (x, y) data with a horizontal x-axis and a vertical y-axis. The y-axis variable determines the vertical position of the point and the x-axis variable determines the horizontal position of the point. Scatter diagrams are useful for exploring the relationship between two variables. Exploring the relationship between two variables will be discussed in more detail in Chapter 9. The Health Exam Results data from Data Set 1 in Appendix B of Elementary Statistics (this data appears on disk as Mhealth.sav) includes the weight (in pounds) and the waist circumference (in cm) for 40 males. Make a scatter diagram to determine if there is a relationship between the weight and waist circumference measurements. Open the Mhealth (see Section 0-3) data file. To make a Scatter diagram of weight versus waist circumference choose Graphs > Scatter to open the Scatterplot dialog box (Figure 2-14). Figure 2-14

11 34 Chapter 2. Describing, Exploring, and Comparing Data Choose the icon for Simple and then click the Define button to open the Simple Scatterplot dialog box (Figure 2-15). Figure 2-15 Select the Weight (pounds) [wt] for the Y Axis and Waist circumference (cm) [waist] for the X Axis by clicking on the variable label and then clicking the Variable Paste button. Click the Titles button if you want to give your Scatter diagram a title. Click the OK button and the Scatterplot (Figure 2-16) will appear in the Output Viewer window. Figure 2-16 The scatter diagram indicates that as the waist circumference increases the weight of the person tends to increase as well. There appears to be an approximately linear relationship between the weight and the waist measurements.

12 Chapter 2. Describing, Exploring, and Comparing Data 35 Section 2-4. Descriptive Statistics Often descriptive statistics are used to describe characteristics of variables in a data file. The arithmetic mean (or average), median, and midrange are measures of the center of a distribution. The standard deviation, variance, and range are measures of the spread of a distribution. Quartiles and percentiles are measures of position within in a distribution. Descriptive Statistics for a Variable Open the Mhealth data file (see Section 0-3). The Frequencies procedure can be used since there are no subgroups in this data file. Choose Analyze > Descriptive Statistics > Frequencies to open the Frequencies: dialog box (see Figure 2-1). Copy the variables age, weight, and pulse to the Variable(s) list and click the Statistics button to open the Frequencies: Statistics dialog box (Figure 2-17). Figure 2-17 Check the checkboxes for any descriptive statistics to be included in the analysis. Click the Continue button to return to the Frequencies dialog box. The frequencies reports will be rather long and so it is probably best to uncheck the checkbox for Display frequency tables. Click the OK button and the descriptive statistics will appear in the Output Viewer Window (Figure 2-18). Figure 2-18

13 36 Chapter 2. Describing, Exploring, and Comparing Data The descriptive statistics for each variable are placed into one table. This makes it easy to compare the various statistics. SPSS appends a note to the table to indicate that the mode is not unique for waist circumference and pulse rate. Descriptive Statistics for subgroups of a Variable Open the Cotinine data file. Since there are three subgroups in this data file (indicated by the variable group) the Explore procedure will be used. Choose Analyze > Descriptive Statistics > Explore to open the Explore dialog box (Figure 2-6). Copy cotinine into the Dependent List box and group into the Factor List box. To display descriptive statistics only and suppress the creation of plots choose the bullet for Statistics. Click the button for Statistics to open the Explore: Statistics dialog box (Figure 2-19). Figure 2-19 Choose the checkbox for Descriptives and click the Continue button. Click the OK button and the descriptive statistics will appear in the Output Viewer window. Only a portion of the Descriptives table is shown (Figure 2-20) since the output is quite long. Figure 2-20 The Explore procedure calculates the several descriptive statistics (mean, median, variance, standard deviation, minimum, maximum, range and interquartile range) for each variable. There is other information in the Descriptives table, which we will put off describing it until we discuss inferential statistics.

14 Chapter 2. Describing, Exploring, and Comparing Data 37 Section 2-5. Exercises 1. Consider problem 20 in Chapter 2-2 of Elementary Statistics. Regular Coke and Diet Coke. Refer to Data Set 17 in Appendix B (this data appears on disk as Cola.sav). Construct a relative frequency distribution for the weights of regular coke by starting the first class at lb. and use a class width of lb. Then construct another relative frequency distribution for weights of diet Coke by starting the first class at lb and use a class width of lb. Then compare the results and determine whether there appears to be a significant difference. If so, provide a possible explanation for the difference. 2. Consider problem 9 in Chapter 2-3 of Elementary Statistics. Bears. Refer to Data Set 9 in Appendix B (this data appears on disk as Bears.sav). Construct a histogram for the measured weight of the bears with 11 classes with a lower class limit of 0 and a class width of 50 lbs. 3. Regular Coke and Diet Coke. Refer to Data Set 17 in Appendix B (this data appears on disk as Cola.sav). The data set contains the weight and volume measurements on 36 cans of Regular Coke, Diet Coke, Regular Pepsi, and Diet Pepsi. a. Determine the mean and standard deviation of the weights of Regular Coke, Diet Coke, Regular Pepsi, and Diet Pepsi. What can you conclude about the weights of Colas? b. Find the 5-number summary for the weights of Regular Coke, Diet Coke, Regular Pepsi, and Diet Pepsi. c. Make side-by-side boxplots of the weights of Regular Coke, Diet Coke, Regular Pepsi, and Diet Pepsi. What can you conclude about the distributions of the Colas? 4. Age of Presidents. A Senator is considering running for the U.S. presidency, but she is only 35 years of age, which is the minimum required age. While investigating this issue, she finds the ages of the past presidents when they were inaugurated, and those ages are listed in Table Using the listed ages, find the a. Mean, b. Median, c. Mode, d. Midrange, e. Range, f. Standard deviation, g. Variance, h. Q 1 and Q 3, i. and P 10. Table 2-1

### Using SPSS, Chapter 2: Descriptive Statistics

1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### AP * Statistics Review. Descriptive Statistics

AP * Statistics Review Descriptive Statistics Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

### Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

### 2 Describing, Exploring, and

2 Describing, Exploring, and Comparing Data This chapter introduces the graphical plotting and summary statistics capabilities of the TI- 83 Plus. First row keys like \ R (67\$73/276 are used to obtain

### There are six different windows that can be opened when using SPSS. The following will give a description of each of them.

SPSS Basics Tutorial 1: SPSS Windows There are six different windows that can be opened when using SPSS. The following will give a description of each of them. The Data Editor The Data Editor is a spreadsheet

### SECTION 2-1: OVERVIEW SECTION 2-2: FREQUENCY DISTRIBUTIONS

SECTION 2-1: OVERVIEW Chapter 2 Describing, Exploring and Comparing Data 19 In this chapter, we will use the capabilities of Excel to help us look more carefully at sets of data. We can do this by re-organizing

### SPSS Manual for Introductory Applied Statistics: A Variable Approach

SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All

### Descriptive Statistics

Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

### Scatter Plots with Error Bars

Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each

### Directions for Frequency Tables, Histograms, and Frequency Bar Charts

Directions for Frequency Tables, Histograms, and Frequency Bar Charts Frequency Distribution Quantitative Ungrouped Data Dataset: Frequency_Distributions_Graphs-Quantitative.sav 1. Open the dataset containing

### Variables. Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

### Lab 1: The metric system measurement of length and weight

Lab 1: The metric system measurement of length and weight Introduction The scientific community and the majority of nations throughout the world use the metric system to record quantities such as length,

### IBM SPSS Statistics for Beginners for Windows

ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning

### SPSS Explore procedure

SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

### Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

### Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Appendix 2.1 Tabular and Graphical Methods Using Excel

Appendix 2.1 Tabular and Graphical Methods Using Excel 1 Appendix 2.1 Tabular and Graphical Methods Using Excel The instructions in this section begin by describing the entry of data into an Excel spreadsheet.

### CHARTS AND GRAPHS INTRODUCTION USING SPSS TO DRAW GRAPHS SPSS GRAPH OPTIONS CAG08

CHARTS AND GRAPHS INTRODUCTION SPSS and Excel each contain a number of options for producing what are sometimes known as business graphics - i.e. statistical charts and diagrams. This handout explores

### MATH 103/GRACEY PRACTICE EXAM/CHAPTERS 2-3. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MATH 3/GRACEY PRACTICE EXAM/CHAPTERS 2-3 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) The frequency distribution

### DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1

DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1 OVERVIEW STATISTICS PANIK...THE THEORY AND METHODS OF COLLECTING, ORGANIZING, PRESENTING, ANALYZING, AND INTERPRETING DATA SETS SO AS TO DETERMINE THEIR ESSENTIAL

### Mind on Statistics. Chapter 2

Mind on Statistics Chapter 2 Sections 2.1 2.3 1. Tallies and cross-tabulations are used to summarize which of these variable types? A. Quantitative B. Mathematical C. Continuous D. Categorical 2. The table

### The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

### Chapter 2: Frequency Distributions and Graphs

Chapter 2: Frequency Distributions and Graphs Learning Objectives Upon completion of Chapter 2, you will be able to: Organize the data into a table or chart (called a frequency distribution) Construct

### MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

### Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1 Review 1. As part of survey of college students a researcher is interested in the variable class standing. She records a 1 if the student is a freshman, a 2 if the student

### Using Excel for descriptive statistics

FACT SHEET Using Excel for descriptive statistics Introduction Biologists no longer routinely plot graphs by hand or rely on calculators to carry out difficult and tedious statistical calculations. These

### Data exploration with Microsoft Excel: univariate analysis

Data exploration with Microsoft Excel: univariate analysis Contents 1 Introduction... 1 2 Exploring a variable s frequency distribution... 2 3 Calculating measures of central tendency... 16 4 Calculating

### Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

### STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

### Descriptive Statistics

Descriptive Statistics Descriptive statistics consist of methods for organizing and summarizing data. It includes the construction of graphs, charts and tables, as well various descriptive measures such

### WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide

STEPS Epi Info Training Guide Department of Chronic Diseases and Health Promotion World Health Organization 20 Avenue Appia, 1211 Geneva 27, Switzerland For further information: www.who.int/chp/steps WHO

### Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

### 6 3 The Standard Normal Distribution

290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since

### The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175)

Describing Data: Categorical and Quantitative Variables Population The Big Picture Sampling Statistical Inference Sample Exploratory Data Analysis Descriptive Statistics In order to make sense of data,

### SAS Analyst for Windows Tutorial

Updated: August 2012 Table of Contents Section 1: Introduction... 3 1.1 About this Document... 3 1.2 Introduction to Version 8 of SAS... 3 Section 2: An Overview of SAS V.8 for Windows... 3 2.1 Navigating

### Introduction Course in SPSS - Evening 1

ETH Zürich Seminar für Statistik Introduction Course in SPSS - Evening 1 Seminar für Statistik, ETH Zürich All data used during the course can be downloaded from the following ftp server: ftp://stat.ethz.ch/u/sfs/spsskurs/

### Intermediate PowerPoint

Intermediate PowerPoint Charts and Templates By: Jim Waddell Last modified: January 2002 Topics to be covered: Creating Charts 2 Creating the chart. 2 Line Charts and Scatter Plots 4 Making a Line Chart.

### Drawing a histogram using Excel

Drawing a histogram using Excel STEP 1: Examine the data to decide how many class intervals you need and what the class boundaries should be. (In an assignment you may be told what class boundaries to

### Bar Graphs and Dot Plots

CONDENSED L E S S O N 1.1 Bar Graphs and Dot Plots In this lesson you will interpret and create a variety of graphs find some summary values for a data set draw conclusions about a data set based on graphs

### Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

### Data exploration with Microsoft Excel: analysing more than one variable

Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical

### Probability Distributions

CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution

### GETTING YOUR DATA INTO SPSS

GETTING YOUR DATA INTO SPSS UNIVERSITY OF GUELPH LUCIA COSTANZO lcostanz@uoguelph.ca REVISED SEPTEMBER 2011 CONTENTS Getting your Data into SPSS... 0 SPSS availability... 3 Data for SPSS Sessions... 4

### Statistics Revision Sheet Question 6 of Paper 2

Statistics Revision Sheet Question 6 of Paper The Statistics question is concerned mainly with the following terms. The Mean and the Median and are two ways of measuring the average. sumof values no. of

### Gestation Period as a function of Lifespan

This document will show a number of tricks that can be done in Minitab to make attractive graphs. We work first with the file X:\SOR\24\M\ANIMALS.MTP. This first picture was obtained through Graph Plot.

### Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:

### An introduction to IBM SPSS Statistics

An introduction to IBM SPSS Statistics Contents 1 Introduction... 1 2 Entering your data... 2 3 Preparing your data for analysis... 10 4 Exploring your data: univariate analysis... 14 5 Generating descriptive

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### INTRODUCTION TO SPSS FOR WINDOWS Version 19.0

INTRODUCTION TO SPSS FOR WINDOWS Version 19.0 Winter 2012 Contents Purpose of handout & Compatibility between different versions of SPSS.. 1 SPSS window & menus 1 Getting data into SPSS & Editing data..

### Excel -- Creating Charts

Excel -- Creating Charts The saying goes, A picture is worth a thousand words, and so true. Professional looking charts give visual enhancement to your statistics, fiscal reports or presentation. Excel

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

### 4 Other useful features on the course web page. 5 Accessing SAS

1 Using SAS outside of ITCs Statistical Methods and Computing, 22S:30/105 Instructor: Cowles Lab 1 Jan 31, 2014 You can access SAS from off campus by using the ITC Virtual Desktop Go to https://virtualdesktopuiowaedu

### Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

### Formulas, Functions and Charts

Formulas, Functions and Charts :: 167 8 Formulas, Functions and Charts 8.1 INTRODUCTION In this leson you can enter formula and functions and perform mathematical calcualtions. You will also be able to

### How to Use a Data Spreadsheet: Excel

How to Use a Data Spreadsheet: Excel One does not necessarily have special statistical software to perform statistical analyses. Microsoft Office Excel can be used to run statistical procedures. Although

### GeoGebra Statistics and Probability

GeoGebra Statistics and Probability Project Maths Development Team 2013 www.projectmaths.ie Page 1 of 24 Index Activity Topic Page 1 Introduction GeoGebra Statistics 3 2 To calculate the Sum, Mean, Count,

### HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

### 2-7 Exploratory Data Analysis (EDA)

102 C HAPTER 2 Describing, Exploring, and Comparing Data 2-7 Exploratory Data Analysis (EDA) This chapter presents the basic tools for describing, exploring, and comparing data, and the focus of this section

### Appendix III: SPSS Preliminary

Appendix III: SPSS Preliminary SPSS is a statistical software package that provides a number of tools needed for the analytical process planning, data collection, data access and management, analysis,

### Chapter 3. The Normal Distribution

Chapter 3. The Normal Distribution Topics covered in this chapter: Z-scores Normal Probabilities Normal Percentiles Z-scores Example 3.6: The standard normal table The Problem: What proportion of observations

### DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability

DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability RIT Score Range: Below 171 Below 171 Data Analysis and Statistics Solves simple problems based on data from tables* Compares

### Instruction Manual for SPC for MS Excel V3.0

Frequency Business Process Improvement 281-304-9504 20314 Lakeland Falls www.spcforexcel.com Cypress, TX 77433 Instruction Manual for SPC for MS Excel V3.0 35 30 25 LSL=60 Nominal=70 Capability Analysis

### Microsoft Excel 2010 Part 3: Advanced Excel

CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES Microsoft Excel 2010 Part 3: Advanced Excel Winter 2015, Version 1.0 Table of Contents Introduction...2 Sorting Data...2 Sorting

### Data Analysis. Using Excel. Jeffrey L. Rummel. BBA Seminar. Data in Excel. Excel Calculations of Descriptive Statistics. Single Variable Graphs

Using Excel Jeffrey L. Rummel Emory University Goizueta Business School BBA Seminar Jeffrey L. Rummel BBA Seminar 1 / 54 Excel Calculations of Descriptive Statistics Single Variable Graphs Relationships

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### Microsoft Excel 2010 Charts and Graphs

Microsoft Excel 2010 Charts and Graphs Email: training@health.ufl.edu Web Page: http://training.health.ufl.edu Microsoft Excel 2010: Charts and Graphs 2.0 hours Topics include data groupings; creating

### Chapter 2: Descriptive Statistics

Chapter 2: Descriptive Statistics **This chapter corresponds to chapters 2 ( Means to an End ) and 3 ( Vive la Difference ) of your book. What it is: Descriptive statistics are values that describe the

### AMS 7L LAB #2 Spring, 2009. Exploratory Data Analysis

AMS 7L LAB #2 Spring, 2009 Exploratory Data Analysis Name: Lab Section: Instructions: The TAs/lab assistants are available to help you if you have any questions about this lab exercise. If you have any

### A Picture Really Is Worth a Thousand Words

4 A Picture Really Is Worth a Thousand Words Difficulty Scale (pretty easy, but not a cinch) What you ll learn about in this chapter Why a picture is really worth a thousand words How to create a histogram

### DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics

### Descriptive and Inferential Statistics

General Sir John Kotelawala Defence University Workshop on Descriptive and Inferential Statistics Faculty of Research and Development 14 th May 2013 1. Introduction to Statistics 1.1 What is Statistics?

### Visualization Quick Guide

Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive

### 5. Correlation. Open HeightWeight.sav. Take a moment to review the data file.

5. Correlation Objectives Calculate correlations Calculate correlations for subgroups using split file Create scatterplots with lines of best fit for subgroups and multiple correlations Correlation The

### 4. Descriptive Statistics: Measures of Variability and Central Tendency

4. Descriptive Statistics: Measures of Variability and Central Tendency Objectives Calculate descriptive for continuous and categorical data Edit output tables Although measures of central tendency and

### Learning SPSS: Data and EDA

Chapter 5 Learning SPSS: Data and EDA An introduction to SPSS with emphasis on EDA. SPSS (now called PASW Statistics, but still referred to in this document as SPSS) is a perfectly adequate tool for entering

### ITS Training Class Charts and PivotTables Using Excel 2007

When you have a large amount of data and you need to get summary information and graph it, the PivotTable and PivotChart tools in Microsoft Excel will be the answer. The data does not need to be in one

### Create Charts in Excel

Create Charts in Excel Table of Contents OVERVIEW OF CHARTING... 1 AVAILABLE CHART TYPES... 2 PIE CHARTS... 2 BAR CHARTS... 3 CREATING CHARTS IN EXCEL... 3 CREATE A CHART... 3 HOW TO CHANGE THE LOCATION

### Introduction to StatsDirect, 11/05/2012 1

INTRODUCTION TO STATSDIRECT PART 1... 2 INTRODUCTION... 2 Why Use StatsDirect... 2 ACCESSING STATSDIRECT FOR WINDOWS XP... 4 DATA ENTRY... 5 Missing Data... 6 Opening an Excel Workbook... 6 Moving around

### Years after 2000. US Student to Teacher Ratio 0 16.048 1 15.893 2 15.900 3 15.900 4 15.800 5 15.657 6 15.540

To complete this technology assignment, you should already have created a scatter plot for your data on your calculator and/or in Excel. You could do this with any two columns of data, but for demonstration

### Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques

### Classify the data as either discrete or continuous. 2) An athlete runs 100 meters in 10.5 seconds. 2) A) Discrete B) Continuous

Chapter 2 Overview Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Classify as categorical or qualitative data. 1) A survey of autos parked in

### Summary of important mathematical operations and formulas (from first tutorial):

EXCEL Intermediate Tutorial Summary of important mathematical operations and formulas (from first tutorial): Operation Key Addition + Subtraction - Multiplication * Division / Exponential ^ To enter a

### Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

### Statistics Chapter 2

Statistics Chapter 2 Frequency Tables A frequency table organizes quantitative data. partitions data into classes (intervals). shows how many data values are in each class. Test Score Number of Students

### Statistical Analysis Using SPSS for Windows Getting Started (Ver. 2014/11/6) The numbers of figures in the SPSS_screenshot.pptx are shown in red.

Statistical Analysis Using SPSS for Windows Getting Started (Ver. 2014/11/6) The numbers of figures in the SPSS_screenshot.pptx are shown in red. 1. How to display English messages from IBM SPSS Statistics

### INTRODUCTORY LAB: DOING STATISTICS WITH SPSS 21

INTRODUCTORY LAB: DOING STATISTICS WITH SPSS 21 This section covers the basic structure and commands of SPSS for Windows Release 21. It is not designed to be a comprehensive review of the most important

Table of Contents Preface Chapter 1: Introduction 1-1 Opening an SPSS Data File... 2 1-2 Viewing the SPSS Screens... 3 o Data View o Variable View o Output View 1-3 Reading Non-SPSS Files... 6 o Convert

### CHAPTER TWELVE TABLES, CHARTS, AND GRAPHS

TABLES, CHARTS, AND GRAPHS / 75 CHAPTER TWELVE TABLES, CHARTS, AND GRAPHS Tables, charts, and graphs are frequently used in statistics to visually communicate data. Such illustrations are also a frequent

### SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list

### SPSS for Simple Analysis

STC: SPSS for Simple Analysis1 SPSS for Simple Analysis STC: SPSS for Simple Analysis2 Background Information IBM SPSS Statistics is a software package used for statistical analysis, data management, and

### PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD. To explore for a relationship between the categories of two discrete variables

3 Stacked Bar Graph PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD To explore for a relationship between the categories of two discrete variables 3.1 Introduction to the Stacked Bar Graph «As with the simple

### Sta 309 (Statistics And Probability for Engineers)

Instructor: Prof. Mike Nasab Sta 309 (Statistics And Probability for Engineers) Chapter 2 Organizing and Summarizing Data Raw Data: When data are collected in original form, they are called raw data. The