2-7 Exploratory Data Analysis (EDA)


CHAPTER 2 Describing, Exploring, and Comparing Data

This chapter presents the basic tools for describing, exploring, and comparing data, and this section focuses on the exploration of data. We begin by defining exploratory data analysis, then we introduce outliers, 5-number summaries, and boxplots.

Definition
Exploratory data analysis is the process of using statistical tools (such as graphs, measures of center, and measures of variation) to investigate data sets in order to understand their important characteristics.

Recall that in Section 2-1 we listed five important characteristics of data, and we began with (1) center, (2) variation, and (3) the nature of the distribution. These characteristics can be investigated by calculating the values of the mean and standard deviation, and by constructing a histogram. It is generally important to investigate the data set further to identify any notable features, especially those that could strongly affect results and conclusions. One such feature is the presence of outliers.

Outliers
An outlier is a value that is located very far away from almost all of the other values. Relative to the other data, an outlier is an extreme value. When exploring a data set, outliers should be considered because they may reveal important information, and because they may strongly affect the values of the mean and standard deviation and seriously distort a histogram. The following example uses an incorrect entry as an example of an outlier, but not all outliers are errors; some outliers are correct values.

EXAMPLE Cotinine Levels of Smokers
When using computer software or a calculator, it is often easy to make keying errors. Refer to the cotinine levels of smokers listed in Table 2-1 with the Chapter Problem and assume that the first entry of 1 is incorrectly entered as 11111 because you were distracted by a meteorite landing on your porch. The incorrect entry of 11111 is an outlier because it is located very far away from the other values. How does that outlier affect the mean, standard deviation, and histogram?

SOLUTION
When the entry of 1 is replaced by the outlier value of 11111, the mean rises to 450.2, so the effect of the outlier is very substantial. The incorrect entry of 11111 also causes the standard deviation to increase sharply, so the effect of the outlier here is also substantial. Figure 2-1 in Section 2-3 depicts the histogram for the correct values of cotinine levels of smokers in Table 2-1, but the STATDISK display presented here shows the histogram that results from using the same data with the value of 1 replaced by the incorrect value of 11111. Compare this STATDISK histogram to Figure 2-1 and you can easily see that the presence of the outlier dramatically affects the shape of the distribution.

[STATDISK display: histogram of the cotinine levels with the entry of 1 replaced by 11111]

The preceding example illustrates these important principles:
1. An outlier can have a dramatic effect on the mean.
2. An outlier can have a dramatic effect on the standard deviation.
3. An outlier can have a dramatic effect on the scale of the histogram so that the true nature of the distribution is totally obscured.

An easy procedure for finding outliers is to examine a sorted list of the data. In particular, look at the minimum and maximum sample values and determine whether they are very far away from the other typical values. Some outliers are correct values and some are errors, as in the preceding example. If we are sure that an outlier is an error, we should correct it or delete it. If we include an outlier because we know that it is correct, we might study its effects by constructing graphs and calculating statistics with and without the outlier included.
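The with-and-without comparison suggested above can be sketched in a few lines of Python. This is only an illustration: the ten values below are made up (they are not the Table 2-1 cotinine data), and the helper name `summarize` is an assumption introduced for this sketch.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) for a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical sample, NOT the Table 2-1 cotinine data.
correct = [1, 0, 131, 173, 265, 210, 44, 277, 32, 3]
# The same sample with a keying error: the first entry 1 typed as 11111.
with_error = [11111] + correct[1:]

mean_ok, sd_ok = summarize(correct)
mean_bad, sd_bad = summarize(with_error)

# Replacing a single value shifts the mean by exactly (new - old) / n.
shift = (11111 - 1) / len(correct)

print(f"without outlier: mean = {mean_ok:.1f}, s = {sd_ok:.1f}")
print(f"with outlier:    mean = {mean_bad:.1f}, s = {sd_bad:.1f}")
print(f"shift in mean:   {shift:.1f}")
```

The same arithmetic applies to the example in the text: with 40 values, replacing 1 by 11111 raises the mean by (11111 - 1)/40 = 277.75, which is why a single keying error is enough to move the mean so far.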
An Outlier Tip
Outliers are important to consider because, in many cases, one extreme value can have a dramatic effect on statistics and on the conclusions derived from them. In some cases an outlier is a mistake that should be corrected or deleted. In other cases, an outlier is a valid data value that should be investigated for any important information. Students of the author collected data consisting of restaurant bills and tips, and no notable outliers were found among their sample data. However, one such outlier is the tip of $16,000 that was left for a restaurant bill in the $8,000 range. The tip was left by an unidentified London executive to waiter Lenny Lorando at Nello's restaurant in New York City. Lorando said that he had waited on the customer before and "He's always generous, but never anything like that before. I have to tell my sister about him."

Good Advice for Journalists
Columnist Max Frankel wrote in the New York Times that "most schools of journalism give statistics short shrift and some let students graduate without any numbers training at all. How can such reporters write sensibly about trade and welfare and crime, or air fares, health care and nutrition? The media's sloppy use of numbers about the incidence of accidents or disease frightens people and leaves them vulnerable to journalistic hype, political demagoguery, and commercial fraud." He cites several cases, including a full-page article about New York City's deficit with a promise by the mayor to close a budget gap of $2.7 billion; the article never once mentioned the total size of the budget, so the $2.7 billion figure had no context.

Boxplots
In addition to the graphs presented in Section 2-3, a boxplot is another graph that is used often. Boxplots are useful for revealing the center of the data, the spread of the data, the distribution of the data, and the presence of outliers. The construction of a boxplot requires that we first obtain the minimum value, the maximum value, and the quartiles, as defined in the 5-number summary.

Definitions
For a set of data, the 5-number summary consists of the minimum value; the first quartile, Q1; the median (or second quartile, Q2); the third quartile, Q3; and the maximum value.
A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3. (See Figure 2-16.)

Procedure for Constructing a Boxplot
1. Find the 5-number summary consisting of the minimum value, Q1, the median, Q3, and the maximum value.
2. Construct a scale with values that include the minimum and maximum data values.
3. Construct a box (rectangle) extending from Q1 to Q3, and draw a line in the box at the median value.
4. Draw lines extending outward from the box to the minimum and maximum data values.

Boxplots don't show as much detailed information as histograms or stem-and-leaf plots, so they might not be the best choice when dealing with a single data set. They are often great for comparing two or more data sets. When using two or more boxplots to compare different data sets, it is important to use the same scale so that correct comparisons can be made.

EXAMPLE Cotinine Levels of Smokers
Refer to the 40 cotinine levels of smokers in Table 2-1 (without the error of 11111 used in place of 1, as in the preceding example).
a. Find the values constituting the 5-number summary.
b. Construct a boxplot.

SOLUTION
a. The 5-number summary consists of the minimum, Q1, median, Q3, and maximum. To find those values, first sort the data (by arranging them in order from lowest to highest). The minimum of 0 and the maximum of 491 are easy to identify from the sorted list. Now proceed to find the quartiles. Using the flowchart of Figure 2-15, we get $Q_1 = P_{25} = 86.5$, which is located by calculating the locator $L = (25/100) \cdot 40 = 10$ and finding the value midway between the 10th value and the 11th value in the sorted list. The median is 170, which is the value midway between the 20th and 21st values. We also find that $Q_3 = 251.5$ by using Figure 2-15 for the 75th percentile. The 5-number summary is therefore 0, 86.5, 170, 251.5, and 491.
b. In Figure 2-16 we graph the boxplot for the data. We use the minimum (0) and the maximum (491) to determine a scale of values, then we plot the values from the 5-number summary as shown.

FIGURE 2-16 Boxplot of Cotinine Levels of Smokers (marked values: Minimum, Q1, Median, Q3, Maximum)

In Figure 2-17 we show some generic boxplots along with common distribution shapes. It appears that the cotinine levels of smokers have a skewed distribution.

FIGURE 2-17 Boxplots Corresponding to Bell-Shaped, Uniform, and Skewed Distributions

To illustrate the use of boxplots to compare data sets, see the accompanying Minitab display of cholesterol levels for a sample of males and a sample of females, based on the National Health Examination data included in Data Set 1 of Appendix B. Based on the sample data, it appears that males have cholesterol levels that are generally higher than those of females, and the cholesterol levels of males appear to vary more than those of females.
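The locator calculation used for the quartiles in the cotinine example can be sketched in Python. The text shows only the case where the locator L is a whole number (take the value midway between the Lth and (L+1)th sorted values). The handling of a fractional L below (round it up and take that value) is an assumption about what the Figure 2-15 flowchart prescribes, and the function names are hypothetical.

```python
def percentile(sorted_values, k):
    """kth percentile (1 <= k <= 99) of an already sorted list, by the locator rule."""
    n = len(sorted_values)
    # Locator L = (k / 100) * n, kept in integer arithmetic to avoid rounding issues.
    whole, remainder = divmod(k * n, 100)
    if remainder == 0:
        # L is a whole number: midway between the Lth and (L+1)th values (1-based).
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Assumed rule for fractional L: round up to position whole + 1 (1-based).
    return sorted_values[whole]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    s = sorted(values)
    return (s[0], percentile(s, 25), percentile(s, 50), percentile(s, 75), s[-1])


# Hypothetical data: the integers 1 through 40 (NOT the Table 2-1 values).
print(five_number_summary(range(1, 41)))
```

With 40 values the locators are 10, 20, and 30, so each quartile falls midway between two neighboring sorted values, which mirrors the 10th/11th and 20th/21st positions used in the example. Statistical software often uses slightly different quartile rules, so results from other tools may differ a little.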

Best Colleges
Each year, U.S. News and World Report publishes an issue with a list of "America's Best Colleges and Universities." Sales typically jump 40% for that issue. The list has critics who argue against the criteria and the method of collecting data. Common complaints: Too much emphasis is placed on the criteria of a college's wealth, reputation, College Board scores, alumni donations, and the opinions of college presidents; too little emphasis is placed on the satisfaction of students and effective educational practices. The New York Times interviewed Kenneth Auchincloss, who is editor of How to Get Into College (by Kaplan/Newsweek), and he said that "We have never been comfortable trying to quantify in numeric terms the various criteria that go into making a college good or less good, and we don't want to devote the resources to doing an elaborate statistical analysis that frankly we don't think is valid."

EXAMPLE Does It Rain More on Weekends?
Refer to Data Set 11 in Appendix B, which lists rainfall amounts (in inches) in Boston for every day of a recent year. The collection of this data set was inspired by media reports that it rains more on weekends (Saturday and Sunday) than on weekdays. Later in this book we will describe important statistical methods that can be used to formally test that claim, but for now, let's explore the data set to see what can be learned. (Even if we already know how to apply those formal statistical methods, we should first explore the data before proceeding with the formal analysis.)

SOLUTION
Let's begin with an investigation into the key elements of center, variation, distribution, outliers, and characteristics over time (the same "CVDOT" list introduced in Section 2-1). Listed below are measures of center (mean), measures of variation (standard deviation), and the 5-number summary for the rainfall amounts for each day of the week. The accompanying STATDISK display shows boxplots for each of the seven days of the week, starting with Monday at the top. Because the histograms for all seven days are pretty much the same, we show only the histogram for the Monday rainfall amounts.

[Table: Mean, Standard Deviation, Minimum, Q1, Median, Q3, and Maximum of the rainfall amounts for Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, and Sunday]

[STATDISK displays: boxplots for the seven days of the week; histogram of the Monday rainfall amounts]

INTERPRETATION
Examining and comparing the statistics and graphs, we make the following important observations.

Means: The seven means vary by considerable amounts, from the lowest (Wednesday) to the highest (Saturday), and in later chapters of this book we will present methods for determining whether these differences are significant. (Later methods will show that the means do not differ by significant amounts.) If we list the means in order from low to high, we get this sequence of days: Wednesday, Tuesday, Sunday, Thursday, Friday, Monday, Saturday. There does not appear to be a pattern of higher rainfall on weekends (although the highest mean corresponds to Saturday). Also, see the Excel graph of the seven means, with the mean for Monday plotted first. The Excel graph does not support the claim of more rainfall on weekends (although it might be argued that there is more rainfall on Saturdays).

[Excel graph: the seven daily means, with Monday plotted first]

Variation: The seven standard deviations differ from day to day, but those values are not dramatically different. There does not appear to be anything highly unusual about the amounts of variation. The minimums, first quartiles, and medians are all 0.00 for each of the seven days. This is explained by the fact that for each day of the week, there are many days with no rain. The abundance of zeros is also seen in the boxplots and histograms, which show that the data have distributions that are heavy toward the low end (skewed right).

Outliers: There are no outliers or unusual values. At the low end, there are many rainfall amounts of zero. At the high end, the sorted list of all 365 rainfall amounts ends with high values that include 0.92, 0.96, 1.28, and 1.41.

Distributions: The distributions of the rainfall amounts are skewed to the right. They are not bell-shaped, as we might have expected. If the use of a particular method of statistics requires normally distributed (bell-shaped) populations, that requirement is not satisfied for the rainfall amounts.

We now have considerable insight into the nature of the Boston rainfall amounts for different days of the week. Based on our exploration, we can conclude that Boston does not experience more rain on weekends than on the other days of the week (although we might argue that there is more rainfall on Saturdays).
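The day-by-day comparison above amounts to grouping the observations by day and summarizing each group. A minimal Python sketch of that step follows. The records are invented for illustration (they are not Data Set 11), and the function name `summarize_by_group` is hypothetical.

```python
import statistics
from collections import defaultdict

# Hypothetical (day, rainfall in inches) records; NOT the Boston data of Data Set 11.
records = [
    ("Mon", 0.00), ("Mon", 0.00), ("Mon", 0.00), ("Mon", 0.05), ("Mon", 0.41),
    ("Sat", 0.00), ("Sat", 0.00), ("Sat", 0.00), ("Sat", 0.00), ("Sat", 0.92),
]


def summarize_by_group(records):
    """Group (label, value) pairs by label and summarize each group."""
    groups = defaultdict(list)
    for label, amount in records:
        groups[label].append(amount)
    return {
        label: {
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),
            "min": min(values),
            "median": statistics.median(values),
            "max": max(values),
        }
        for label, values in groups.items()
    }


for day, stats in summarize_by_group(records).items():
    print(day, {name: round(value, 3) for name, value in stats.items()})
```

Because more than half of the made-up values for each day are zero, the minimum and median are both 0 while the mean is positive, which is the same right-skewed pattern described for the rainfall amounts.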

Critical Thinking
Armed with a list of tools for investigating center, variation, distribution, outliers, and characteristics of data over time, we might be tempted to develop a rote and mindless procedure, but critical thinking is critically important. In addition to using the tools presented in this chapter, we should consider any other relevant factors that might be crucial to the conclusions we form. We might pose questions such as these: Is the sample likely to be representative of the population, or is the sample somehow biased? What is the source of the data, and might the source be someone with an interest that could affect the quality of the data? Suppose, for example, that we want to estimate the mean income of college students. Also suppose that we mail questionnaires to 500 students and receive 20 responses. We could calculate the mean and standard deviation, construct graphs, identify outliers, and so on, but the results will be what statisticians refer to as hogwash. The sample is a voluntary response sample, and it is not likely to be representative of the population of all college students. In addition to the specific statistical tools presented in this chapter, we should also think!

Using Technology
This section introduced outliers, 5-number summaries, and boxplots. To find outliers, sort the data in order from lowest to highest, then examine the highest and lowest values to determine whether they are far away from the other sample values. STATDISK, Minitab, Excel, and the TI-83 Plus calculator can provide values of quartiles, so the 5-number summary is easy to find. STATDISK, Minitab, Excel, and the TI-83 Plus calculator can also be used to create boxplots, and we now describe the different procedures. (Caution: Remember that quartile values calculated by Minitab and the TI-83 Plus calculator may differ slightly from those calculated by applying Figure 2-15, so the boxplots may differ slightly as well.)

STATDISK: Choose the main menu item of Data and use the Sample Editor to enter the data, then click on COPY. Now select Data, then Boxplot and click on PASTE, then Evaluate.

Minitab: Enter the data in column C1, then select Graph, then Boxplot. Enter C1 in the first cell under the Y column, then click OK.

Excel: Although Excel is not designed to generate boxplots, they can be generated using the Data Desk XL add-in that is a supplement to this book. First enter the data in column A. Click on DDXL and select Charts and Plots. Under Function Type, select the option of Boxplot. In the dialog box, click on the pencil icon and enter the range of data, such as A1:A40 if you have 40 values listed in column A. Click on OK. The result is a modified boxplot as described in Exercise 13. The values of the 5-number summary are also displayed.

TI-83 Plus: Enter the sample data in list L1. Now select STAT PLOT by pressing the 2nd key followed by the key labeled Y=. Press the ENTER key, then select the option of ON, and select the boxplot type that is positioned in the middle of the second row. The Xlist should indicate L1 and the Freq value should be 1. Now press the ZOOM key and select option 9 for ZoomStat. Press the ENTER key and the boxplot should be displayed. You can use the arrow keys to move right or left so that values can be read from the horizontal scale.

2-7 Basic Skills and Concepts

1. Lottery Refer to Data Set 26 and use only the 40 digits in the first column of the Win 4 results from the New York State Lottery (9, 7, 0, and so on). Find the 5-number summary and construct a boxplot. What characteristic of the boxplot suggests that the digits are selected with a random and fair procedure?

2. Movie Budgets Refer to Data Set 21 in Appendix B for the budget amounts of the 15 movies that are R-rated. Find the 5-number summary and construct a boxplot. Determine whether the sample values are likely to be representative of movies made this year.

3. Cereal Calories Refer to Data Set 16 in Appendix B for the 16 values consisting of the calories per gram of cereal. Find the 5-number summary and construct a boxplot. Determine whether the sample values are likely to be representative of the cereals consumed by the general population.

4. Nicotine in Cigarettes Refer to Data Set 5 for the 29 amounts of nicotine (in mg per cigarette). Find the 5-number summary and construct a boxplot. Are the sample values likely to be representative of cigarettes smoked by an individual consumer?

5. Red M&Ms Refer to Data Set 19 for the 21 weights (in grams) of the red M&M candies. Find the 5-number summary and construct a boxplot. Are the red sample values likely to be representative of M&M candies of all colors?

T 6. Bear Lengths Refer to Data Set 9 for the lengths (in inches) of the 54 bears that were anesthetized and measured. Find the 5-number summary and construct a boxplot. Does the distribution of the lengths appear to be symmetric or does it appear to be skewed?

T 7. Alcohol in Children's Movies Refer to Data Set 7 for the 50 times (in seconds) of scenes showing alcohol use in animated children's movies. Find the 5-number summary and construct a boxplot. Based on the boxplot, does the distribution appear to be symmetric or is it skewed?

T 8. Body Temperatures Refer to Data Set 4 in Appendix B for the 106 body temperatures for 12 A.M. on day 2. Find the 5-number summary and construct a boxplot, then determine whether the sample values support the common belief that the mean body temperature is 98.6°F.

In Exercises 9-12, find 5-number summaries, construct boxplots, and compare the data sets.

9. Academy Awards In "Ages of Oscar-Winning Best Actors and Actresses" (Mathematics Teacher magazine) by Richard Brown and Gretchen Davis, the authors compare the ages of actors and actresses at the time they won Oscars. The results for winners from both categories are listed in the following table. Use boxplots to compare the two data sets.
Actors:
Actresses:

T 10. Regular/Diet Coke Refer to Data Set 17 in Appendix B and use the weights of regular Coke and the weights of diet Coke. Does there appear to be a significant difference? If so, can you provide an explanation?

T 11. Cotinine Levels Refer to Table 2-1 located in the Chapter Problem. We have already found that the 5-number summary for the cotinine levels of smokers is 0, 86.5, 170, 251.5, and 491. Find the 5-number summaries for the other two groups, then construct the three boxplots using the same scale. Are there any apparent differences?

T 12. Clancy, Rowling, Tolstoy Refer to Data Set 14 in Appendix B and use the Flesch reading ease scores for the sample pages from Tom Clancy's The Bear and the Dragon, J. K. Rowling's Harry Potter and the Sorcerer's Stone, and Leo Tolstoy's War and Peace. (Higher scores indicate easier reading.) Does there appear to be a difference in ease of reading? Are the results consistent with your expectations?

2-7 Beyond the Basics

13. The boxplots discussed in this section are often called skeletal (or regular) boxplots. Modified boxplots are constructed as follows:
a. Find the IQR, which denotes the interquartile range defined by $\text{IQR} = Q_3 - Q_1$.
b. Draw the box with the median and quartiles as usual, but when drawing the lines to the right and left of the box, draw the lines only as far as the points corresponding to the largest and smallest values that are within 1.5 IQR of the box.
c. Mild outliers, plotted as solid dots, are values below $Q_1$ or above $Q_3$ by an amount that is greater than 1.5 IQR but not greater than 3 IQR. That is, mild outliers are values $x$ such that
$Q_1 - 3\,\text{IQR} \le x < Q_1 - 1.5\,\text{IQR}$ or $Q_3 + 1.5\,\text{IQR} < x \le Q_3 + 3\,\text{IQR}$
d. Extreme outliers, plotted as small hollow circles, are values that are either below $Q_1$ by more than 3 IQR or above $Q_3$ by more than 3 IQR. That is, extreme outliers are values $x$ such that
$x < Q_1 - 3\,\text{IQR}$ or $x > Q_3 + 3\,\text{IQR}$

The accompanying figure is an example of a modified boxplot. Refer to the cotinine levels of smokers in Table 2-1 included with the Chapter Problem. We have found that this data set has a 5-number summary of 0, 86.5, 170, 251.5, and 491. Identify the value of IQR, identify the ranges of values used to identify mild and extreme outliers, then identify any actual mild outliers or extreme outliers.

[Figure: modified boxplot marking Q1, Q2, and Q3, the IQR, the regions for mild outliers (between 1.5 IQR and 3 IQR beyond the box), and the regions for extreme outliers (more than 3 IQR beyond the box)]
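The mild and extreme outlier rules of Exercise 13 translate directly into a small classification function. The sketch below is one possible reading of those rules in Python. The function name `classify_outlier` is hypothetical, the quartiles are the ones quoted in the exercise, and the test values 600 and 11111 are illustrative inputs (11111 being the keying error from the earlier example), not values from Table 2-1.

```python
def classify_outlier(x, q1, q3):
    """Label x as 'extreme', 'mild', or 'none' using the 1.5 IQR and 3 IQR rules."""
    iqr = q3 - q1
    # Extreme: more than 3 IQR below Q1 or above Q3.
    if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
        return "extreme"
    # Mild: more than 1.5 IQR (but not more than 3 IQR) beyond the box.
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "mild"
    return "none"


q1, q3 = 86.5, 251.5  # quartiles from the 5-number summary quoted in the exercise
for value in (0, 491, 600, 11111):
    print(value, classify_outlier(value, q1, q3))
```

Checking the extreme rule first keeps the two conditions simple: anything that survives the first test and still lies more than 1.5 IQR beyond the box must fall in the mild band.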

14. Refer to the accompanying STATDISK display of three boxplots that represent the measured longevity (in months) of samples of three different car batteries. If you are the manager of a fleet of cars and you must select one of the three brands, which boxplot represents the brand you should choose? Why?

[STATDISK display: boxplots of battery longevity for the three brands]

Review
In this chapter we considered methods for describing, exploring, and comparing data sets. When investigating a data set, these characteristics are generally very important:
1. Center: A representative or average value.
2. Variation: A measure of the amount that the values vary.
3. Distribution: The nature or shape of the distribution of the data (such as bell-shaped, uniform, or skewed).
4. Outliers: Sample values that lie very far away from the vast majority of the other sample values.
5. Time: Changing characteristics of the data over time.

After completing this chapter you should be able to do the following:
• Summarize data by constructing a frequency distribution or relative frequency distribution (Section 2-2)
• Visually display the nature of the distribution by constructing a histogram, dotplot, stem-and-leaf plot, pie chart, or Pareto chart (Section 2-3)
• Calculate measures of center by finding the mean, median, mode, and midrange (Section 2-4)
• Calculate measures of variation by finding the standard deviation, variance, and range (Section 2-5)
• Compare individual values by using z scores, quartiles, or percentiles (Section 2-6)
• Investigate and explore the spread of data, the center of the data, and the range of values by constructing a boxplot (Section 2-7)


More information

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

More information

Statistics and Probability

Statistics and Probability Statistics and Probability TABLE OF CONTENTS 1 Posing Questions and Gathering Data. 2 2 Representing Data. 7 3 Interpreting and Evaluating Data 13 4 Exploring Probability..17 5 Games of Chance 20 6 Ideas

More information

Appendix 2.1 Tabular and Graphical Methods Using Excel

Appendix 2.1 Tabular and Graphical Methods Using Excel Appendix 2.1 Tabular and Graphical Methods Using Excel 1 Appendix 2.1 Tabular and Graphical Methods Using Excel The instructions in this section begin by describing the entry of data into an Excel spreadsheet.

More information

Mind on Statistics. Chapter 2

Mind on Statistics. Chapter 2 Mind on Statistics Chapter 2 Sections 2.1 2.3 1. Tallies and cross-tabulations are used to summarize which of these variable types? A. Quantitative B. Mathematical C. Continuous D. Categorical 2. The table

More information

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: What do the data look like? Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses

More information

Data Exploration Data Visualization

Data Exploration Data Visualization Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

More information

Data Analysis, Statistics, and Probability

Data Analysis, Statistics, and Probability Chapter 6 Data Analysis, Statistics, and Probability Content Strand Description Questions in this content strand assessed students skills in collecting, organizing, reading, representing, and interpreting

More information

Module 4: Data Exploration

Module 4: Data Exploration Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive

More information

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

More information
