Learning goals for this chapter:


Chapter 1: Looking at Data--Distributions
Section 1.1: Introduction, Displaying Distributions with Graphs
Section 1.2: Describing Distributions with Numbers

Learning goals for this chapter:
- Identify categorical and quantitative variables.
- Interpret, create (by hand and with SPSS), and know when to use: bar graphs, pie charts, stemplots (standard, back-to-back, split), histograms, and boxplots (regular, modified, side-by-side).
- Describe the shape, center, and spread of data distributions.
- Define, calculate (by hand and with SPSS), and know when to use measures of center (mean vs. median) and spread (range, 5-number summary, IQR, variance, standard deviation).
- Understand what a resistant measure of center and spread is and when this is important.
- Use the 1.5 x IQR rule to look for outliers.
- Draw a Normal curve in correct proportions and identify the mean/median, standard deviation, middle 68%, middle 95%, and middle 99.7%.
- Perform calculations with the empirical rule, both backwards and forwards.
- Understand the need for standardization.

Big picture: what do we learn in this chapter?
- Individuals vs. variables
- Categorical vs. quantitative variables
- Graphs:
  - Bar graphs and pie charts (categorical variables)
  - Histograms and stemplots (quantitative variables; good for checking for symmetry and skewness)
  - Boxplots (quantitative variables; graphical display of the 5-number summary; modified boxplots show outliers)
- Describing distributions:
  - Shape (symmetric/skewed, unimodal/bimodal/multimodal)
  - Center (mean or median)
  - Spread (usually standard deviation/variance, or IQR from the 5-number summary)
  - Outliers
- If you have a symmetric distribution with no outliers, use the mean and standard deviation. If you have a skewed distribution and/or you have outliers, use the 5-number summary instead.

Two components in describing data or information:
- Individuals: objects being described by a set of data (people, households, cars, animals, corn, etc.)
- Variables: characteristics of individuals (height, yield, length, age, eye color, etc.)
  - Categorical: places an individual into one of several groups (gender, eye color, college major, hometown, etc.)
  - Quantitative: attaches a numerical value to an individual so that adding or averaging the values makes sense (height, weight, age, income, yield, etc.)

Distribution of a variable: describes what values a variable takes and how often it takes those values.

If you have more than one variable in your problem, you should look at each variable by itself before you look at relationships between the variables.

Example: Identify whether the following questions would give you categorical or quantitative data.
a) What letter grade did you get in your Calculus class last semester?
b) What was your score on the last exam?
c) Who will you vote for in the next election?
d) How many votes did George W. Bush get?
e) How many red M&Ms are in this bag?
f) Which type of M&Ms has more red ones: peanut or plain?

It's always a good idea to start by displaying variables graphically before you do any other statistical analysis. What kind of graph should you use? That depends on whether you have a categorical or quantitative variable.

Categorical Variables: Bar graphs or pie charts

Messy room example: In a poll of 200 parents of children ages 6 to 12, respondents were asked to name the most disgusting things ever found in their children's rooms. The results are below (J&C 2005).

Most disgusting thing                              # of parents   % of parents
Food-related                                       106            53%
Animal and insect-related nuisances                22             11%
Clothing (dirty socks and underwear especially)    22             11%
Other                                              50             25%

Bar graph (can use either # of parents, as here, or % of parents):
[Figure: bar graph of count by type of disgusting mess (animal, clothing, food, other); cases weighted by # of parents]

Pie chart (needs % of parents):
[Figure: pie chart of type of disgusting mess: animal 11.0%, clothing 11.0%, food 53.0%, other 25.0%; cases weighted by # of parents]
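A pie chart needs percentages, while a bar graph can use either counts or percentages. The short Python sketch below converts the messy room counts into percentages and prints a rough text bar graph. The function names are illustrative only, and the food-related count of 106 is an inference (200 parents minus the other three counts), consistent with the 53.0% shown for food in the pie chart.

```python
# Counts from the messy room poll (200 parents in total).
# Assumption: the food-related count is inferred as 200 - 22 - 22 - 50 = 106.
counts = {
    "food": 106,
    "animal": 22,
    "clothing": 22,
    "other": 50,
}


def percentages(counts):
    """Convert category counts to percentages of the total."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


def text_bar_graph(counts, per_symbol=5):
    """Return one text bar per category; each '#' stands for per_symbol parents."""
    return [f"{category:<10}{'#' * (n // per_symbol)} {n}"
            for category, n in counts.items()]


if __name__ == "__main__":
    for category, pct in percentages(counts).items():
        print(f"{category:<10}{pct:.1f}%")
    print("\n".join(text_bar_graph(counts)))
```

Because the categories have no natural order, the bars could be listed in any order without changing what the graph says.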

Quantitative Variables: Stemplots, histograms, and boxplots (discussed a little later)

Example: You investigate the amount of time students spend online (in minutes). You study 28 students, and their times are listed below. Show the distribution of times with a stemplot.

[data not shown]

To create a stemplot by hand:
1. Put the data in order from smallest to largest.
2. The stem will be all digits for a data point except for the last one. Write the stems in a vertical line. (Think of 7 as being 07 so that all the numbers have a digit in the tens place.)
3. The leaf will be the next digit (in this case, the ones place) from each data point. Write the leaves after the appropriate stem, in increasing order.
4. It is possible to trim any digits that you feel may be unnecessary. For example, if our second data point had been 20.3, we would probably choose to ignore the .3 for the purposes of the stemplot so that we could create a more reasonable stemplot. If we did not ignore this .3, then our stems would have been 07, 08, 09, 10, 11, 12, 13, ..., 88 with decimal numbers as our leaves. This would show a very uniform stemplot with only one leaf for each stem (all leaves would be 0 except for the 3). This would not be helpful to us at all. It makes much more sense to use the tens place for the stem and the ones place as the leaves in this example.

Stemplot:
[Figure: stemplot of the online times]

A split stemplot just has more stems. There are several ways to split the stems. Here they are split by fives:
[Figure: split stemplot of the online times]
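The hand procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stemplot` is made up for this sketch, the sample values are hypothetical (they are not the 28 online times from the example), and the code assumes non-negative data with the tens place as the stem and the ones place as the leaf.

```python
def stemplot(values, split=False):
    """Build stemplot lines: stem = tens (and higher) digits, leaf = ones digit.

    Assumes non-negative values. Decimals are trimmed (20.3 is treated as 20).
    With split=True each stem gets two rows: leaves 0-4, then leaves 5-9.
    """
    trimmed = sorted(int(v) for v in values)
    first_stem, last_stem = trimmed[0] // 10, trimmed[-1] // 10
    lines = []
    for stem in range(first_stem, last_stem + 1):  # keep empty stems too
        leaves = [v % 10 for v in trimmed if v // 10 == stem]
        if split:
            groups = [[d for d in leaves if d < 5], [d for d in leaves if d >= 5]]
        else:
            groups = [leaves]
        for group in groups:
            lines.append(f"{stem:>2} | " + "".join(str(d) for d in group))
    return lines


if __name__ == "__main__":
    # Hypothetical times in minutes, for illustration only.
    sample = [7, 20.3, 24, 25, 25, 28, 31, 36, 42, 43, 47, 51]
    print("\n".join(stemplot(sample)))
    print()
    print("\n".join(stemplot(sample, split=True)))
```

Empty stems are kept on purpose: leaving them out would hide gaps in the distribution.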

Why do we need split stemplots? Sometimes it is easier to see the shape of the data with more stems. Sometimes a regular stemplot is better. If you're not sure, try it both ways and see if a pattern appears.

Try a stemplot and a split stemplot with this data (use the hundreds place for stems): 3, 4, 17, 18, 39, 93, 102, 110, 143, 178, 250, 278, 299,

Histograms

A histogram sorts the quantitative data into bins. How many bins?
- Not too many bins with counts of either 0 or 1.
- Not overly summarized, so that you lose all the information.
- Not so detailed that it is no longer a summary.

[Figure: three histograms of the same data, labeled "Too few bins", "OK", and "Too many bins"]
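To see how the choice of bin width changes a histogram, the following Python sketch counts observations per bin for two different widths. It uses only the values visible in the practice list above, and the helper name `bin_counts` is an assumption made for this illustration. It assumes every value is at least as large as the starting edge.

```python
def bin_counts(values, width, start=0):
    """Count how many values fall in each bin [edge, edge + width).

    Returns a dict mapping each bin's left edge to its count.
    Assumes all values are >= start.
    """
    counts = {}
    edge = start
    while edge <= max(values):  # create every bin, including empty ones
        counts[edge] = 0
        edge += width
    for v in values:
        left_edge = start + ((v - start) // width) * width
        counts[left_edge] += 1
    return counts


if __name__ == "__main__":
    data = [3, 4, 17, 18, 39, 93, 102, 110, 143, 178, 250, 278, 299]
    for width in (100, 10):
        counts = bin_counts(data, width)
        sparse = sum(1 for c in counts.values() if c <= 1)
        print(f"width {width}: {len(counts)} bins, {sparse} with a count of 0 or 1")
```

With a width of 10, nearly every bin holds 0 or 1 observations, which is the "too many bins" situation described above.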

Histograms vs. bar graphs

Histograms:
- The bars for each interval touch each other.
- Histograms have a continuous, quantitative x-axis, with the x-values in order.
- Quantitative variables

Bar graphs:
- The bars for each category do not touch each other. There are spaces between the bars.
- Bar graphs can have the categories on the x-axis listed in any order (alphabetical, biggest-to-smallest, etc.)
- Categorical variables

Histograms vs. stemplots

Histograms:
- Quantitative variables
- Good for big data sets, especially if technology is available.
- Uses a box to represent each data point.

Stemplots:
- Quantitative variables
- Good for small data sets, convenient for back-of-the-envelope calculations.
- Rarely found in scientific or lay publications.
- Uses a digit to represent each data point.

You've drawn your graph (histogram or stemplot). Now what?

Look for the overall pattern and any outliers. The pattern is described by shape, center, and spread.

1. Shape:
   - # of peaks (unimodal = 1, bimodal = 2, multimodal > 2)
   - Where the long tail is:
       Symmetric: Median = Mean
       Right skewed (long tail on the right): Median < Mean
       Left skewed (long tail on the left): Median > Mean
   To describe the shape, use a histogram with a smoothed curve highlighting the overall pattern of the distribution (don't get overly detailed).

2. Center: (If the distribution is symmetric, the mean will equal the median, but otherwise these numbers are not the same.)
   a) Mean: arithmetic average,
      \[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
      where n = the total # of observations and x_i = an individual observation.
   b) Mode: the most common number, biggest peak

   c) Median: M, the midpoint of the distribution such that ½ the observations are smaller and ½ the observations are larger. The median is not as affected by outliers as the mean is; the median is resistant to outliers.
      To find the median:
        i. Order the data from smallest to largest.
        ii. Count the # of observations (n).
        iii. Calculate (n + 1)/2 to find the center of the data set.
        iv. If n is odd, M is the data point at the center of the data set.
        v. If n is even, (n + 1)/2 falls between 2 data points, called the middle pair. M = the average of the middle pair.

Examples of center:
Find the mean and median of the following 7 numbers in Dataset A: [data not shown]
Find the mean and median of the following 8 numbers in Dataset B: [data not shown]

3. Spread:
   a) Range = max - min (simplest, not always the most helpful)
   b) Variance: s^2, average of the squared deviations of the observations from the mean,
      \[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
   c) Standard deviation: s, the square root of the variance, a common way of measuring how far observations are from the mean.
      Example of finding the standard deviation by hand: 0, 2, 4
        1. Calculate the mean.
        2. Calculate the variance.
        3. Take the square root of the variance.
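As a check on the hand calculation for 0, 2, 4, here is a minimal Python sketch that follows the same steps: mean first, then the variance with the n - 1 divisor, then the square root. The function names are illustrative. For these three values the mean is 2, the squared deviations are 4, 0, and 4, so the variance is 8/2 = 4 and the standard deviation is 2.

```python
import math


def mean(xs):
    """Arithmetic average: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def median(xs):
    """Middle value of the sorted data; average of the middle pair if n is even."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2


def variance(xs):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


def standard_deviation(xs):
    """Square root of the sample variance."""
    return math.sqrt(variance(xs))


if __name__ == "__main__":
    data = [0, 2, 4]
    print("mean:", mean(data))
    print("median:", median(data))
    print("variance:", variance(data))
    print("standard deviation:", standard_deviation(data))
```

Python's built-in `statistics` module (`statistics.mean`, `statistics.median`, `statistics.variance`, `statistics.stdev`) should give the same results; the functions are written out here only to mirror the by-hand steps.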

   d) pth percentile: value such that p% of the observations fall at or below it
        Median = M = 50th percentile
        First Quartile = Q1 = 25th percentile
        Third Quartile = Q3 = 75th percentile
      How do you find quartiles? Think of them as mini-medians. Leave the median out, and then find the median of what is left over on the left side (Q1) and what is left over on the right side (Q3).
      Find the 1st and 3rd quartiles of the following 7 numbers in Dataset A: [data not shown]
      Find the 1st and 3rd quartiles of the following 8 numbers in Dataset B: [data not shown; M = 7]
   e) 5-Number Summary: Min, Q1, M, Q3, Max
   f) Interquartile Range (IQR) = Q3 - Q1
      Call an observation a suspected outlier if it is > Q3 + 1.5 x IQR or < Q1 - 1.5 x IQR.
   g) Boxplots: graph of the 5-number summary.
      Modified boxplots have lines extending from the box out to the smallest and largest observations which are NOT outliers. Dots mark any outliers. (We will always ask for the modified boxplot, but if there are no outliers, the modified and regular boxplots look exactly the same.)
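The "mini-median" method and the 1.5 x IQR rule can be sketched in Python as follows. The function names and the sample values are hypothetical (they are not Dataset A or Dataset B). Note that this is the by-hand method described above, which software such as SPSS does not necessarily use, so computed quartiles may differ from package output.

```python
def median(sorted_xs):
    """Median of an already sorted list."""
    n = len(sorted_xs)
    mid = n // 2
    if n % 2 == 1:
        return sorted_xs[mid]
    return (sorted_xs[mid - 1] + sorted_xs[mid]) / 2


def five_number_summary(xs):
    """Return (Min, Q1, M, Q3, Max) using the 'mini-median' method.

    When n is odd the median itself is left out of both halves.
    """
    s = sorted(xs)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 == 1 else s[half:]
    return (s[0], median(lower), median(s), median(upper), s[-1])


def suspected_outliers(xs):
    """Observations below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    _, q1, _, q3, _ = five_number_summary(xs)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in xs if x < low_fence or x > high_fence]


if __name__ == "__main__":
    # Hypothetical data sets, for illustration only.
    seven = [2, 5, 8, 12, 15, 21, 40]
    eight = [2, 5, 8, 12, 15, 21, 40, 90]
    print(five_number_summary(seven), suspected_outliers(seven))
    print(five_number_summary(eight), suspected_outliers(eight))
```

A modified boxplot of the second list would draw its upper whisker only out to 40 and mark 90 with a dot.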

Boxplot for Dataset A with 5-number summary: -20, 1, 25, 33, 67
[Figure: boxplot of Dataset A]

Since there were no outliers in this dataset, a regular boxplot and a modified boxplot look exactly the same for this data.

For the online time example (with 2 additional data points added in), list the 5-number summary, find any outliers present, and show a boxplot and modified boxplot.
[Figure: boxplot and modified boxplot of the online times]

How do you know which method is best for determining center and spread?
- 5-Number Summary: better for skewed distributions or distributions with outliers.
- Mean and Standard Deviation: good for reasonably symmetric distributions free of outliers.
Always start with a graph!

In the internet time example, here is how the mean/standard deviation and 5-number summary are affected by the outlier:

                      With outlier (151)      With outlier removed from dataset
Mean                  [not shown]             [not shown]
Standard Deviation    [not shown]             [not shown]
5-number summary      7, 30, 46.5, 77, 151    7, 29, 46, 76, 135

"The Median vs. the Mean in the Age of Average" by Mike Pesca on NPR's Day-to-Day, 7/19/06.

Do you always have to do all of this by hand? NO! Statistical software packages like SPSS can make life much easier for you, but it's a good idea to know how to do these by hand so you can make sense of your output. Also, on the exam, you won't have access to a computer.

Read over your SPSS manual and get comfortable with using SPSS. You will have a chance to practice on the HW for this week, and you will work on it in lab on Friday. Enter your data, then Analyze --> Descriptive Statistics --> Explore. Follow the instructions on p. 48 of the SPSS manual.

The output from SPSS for the internet time problem looks like:
[SPSS output: Descriptives table for "Time spent on the web" listing Mean (with Std. Error), 95% Confidence Interval for Mean (Lower Bound, Upper Bound), 5% Trimmed Mean, Median, Variance, Std. Deviation, Minimum, Maximum, Range, Interquartile Range, Skewness, and Kurtosis; values not shown]
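The 1.5 x IQR rule can be applied directly to the two 5-number summaries reported above. The Python sketch below does this; the helper name `fences` is an assumption made for the illustration. With the outlier included, Q1 = 30 and Q3 = 77, so IQR = 47 and the fences are 30 - 70.5 = -40.5 and 77 + 70.5 = 147.5; the maximum of 151 lies above the upper fence.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


if __name__ == "__main__":
    # 5-number summaries (Min, Q1, M, Q3, Max) from the internet time example.
    with_outlier = (7, 30, 46.5, 77, 151)
    without_outlier = (7, 29, 46, 76, 135)

    for label, (minimum, q1, _, q3, maximum) in (
        ("with outlier", with_outlier),
        ("outlier removed", without_outlier),
    ):
        low, high = fences(q1, q3)
        print(f"{label}: fences ({low}, {high}), "
              f"min flagged: {minimum < low}, max flagged: {maximum > high}")
```

Once 151 is removed, the new maximum of 135 falls inside the recomputed fences, so no further observations are flagged.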

[SPSS output: stem-and-leaf plot, histogram, and boxplot of time spent on the web; the stem-and-leaf plot lists one case under Extremes (>=151)]

Notice on the boxplot, it is easy to identify the potential outlier. This would be your indication that the 5-number summary would be the best way to describe your data. (You could also try calculating the mean and standard deviation without the outlier for comparison.)

SPSS can also give you the Quartiles (listed under "Percentiles"), but these are not necessarily the same answers as what you would get by hand. The weighted average and Tukey's Hinges are not the same method we use. For this class, whenever we ask you to calculate the Quartiles, we want you to do them by hand.

What if you want to compare the results from two or more different groups? Use side-by-side boxplots or back-to-back stemplots for your graphs.

[Figure: side-by-side boxplots and back-to-back stemplot comparing Female and Male groups]

Preview of Section 1.3

A z-score tells us how many standard deviations away from the mean an observation is:

\[ z = \frac{x - \mu}{\sigma} \]

This is also called getting a standardized value.

Why is standardization useful? For comparing apples to oranges.

Example: (p. 88, Problem 1.99) Jacob scores 16 on the ACT. Emily scores 670 on the SAT. Assuming that both tests measure scholastic aptitude, who has the higher score? The SAT scores for 1.4 million students in a recent graduating class were roughly normal with a mean of 1026 and standard deviation of 209. The ACT scores for more than 1 million students in the same class were roughly normal with mean of 20.8 and standard deviation of [value not shown].
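Standardizing is a one-line calculation, sketched below in Python with an illustrative function name. Only Emily's SAT z-score is computed, because the ACT standard deviation is not given in these notes; Jacob's z-score follows the same pattern once that value is known. For Emily, z = (670 - 1026)/209, which is about -1.70, so her score is roughly 1.7 standard deviations below the SAT mean.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


if __name__ == "__main__":
    emily = z_score(670, mean=1026, sd=209)
    print(f"Emily's SAT z-score: {emily:.2f}")

    # Jacob's ACT z-score would be z_score(16, mean=20.8, sd=act_sd).
    # The ACT standard deviation (act_sd) is not stated in these notes,
    # so it is left out here rather than guessed.
```

Whoever has the larger z-score has the higher score relative to the other test takers, which is what makes the two scales comparable.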

How else can we use standardization? If the distribution of observations has a bell shape, then these standardized values have some special properties. One of these is the 68-95-99.7% Empirical Rule.

- Approximately 68% of the observations fall within 1σ of the mean (between μ - 1σ and μ + 1σ).
- Approximately 95% of the observations fall within 2σ of the mean (between μ - 2σ and μ + 2σ).
- Approximately 99.7% of the observations fall within 3σ of the mean (between μ - 3σ and μ + 3σ).

P(μ - 1σ < X < μ + 1σ) = 0.68
P(μ - 2σ < X < μ + 2σ) = 0.95
P(μ - 3σ < X < μ + 3σ) = 0.997

[Figure: bell-shaped curve with the middle 68%, 95%, and 99.7% marked]

The horizontal axis of the curve is marked in standard deviations away from the mean (z-scores), so a z-score of -2 could also be written as -2σ, for example. The mean and the median of a bell-shaped curve are in the middle. This is shown with a 0 because the mean is 0 standard deviations away from itself.

The most famous bell-shaped distribution is the Normal distribution. We will spend several lectures talking about it for Section 1.3, and it will be important to everything we do for the rest of the semester.

Example: Checking account balances are approximately Normally distributed with a mean of $1325 and a standard deviation of $25.
a) Between what numbers do 68% of the balances fall?
b) Above what number do 2.5% of the balances lie?
c) Approximately what percent of balances are between 1250 and 1400?
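One way to work this example is to list the three empirical-rule intervals for a mean of 1325 and a standard deviation of 25, as in the Python sketch below (the function name is illustrative). For part (b), the reasoning is that the middle 95% leaves 5% outside, split evenly between the two tails, so 2.5% of balances lie above the mean plus 2 standard deviations.

```python
def empirical_intervals(mean, sd):
    """Map each empirical-rule percentage to its (low, high) interval."""
    return {
        percent: (mean - k * sd, mean + k * sd)
        for k, percent in ((1, 68), (2, 95), (3, 99.7))
    }


if __name__ == "__main__":
    intervals = empirical_intervals(mean=1325, sd=25)

    # a) The middle 68% lies within 1 standard deviation of the mean.
    print("a) 68% of balances fall between", intervals[68])

    # b) The upper 2.5% lies above the top of the middle-95% interval.
    print("b) 2.5% of balances lie above", intervals[95][1])

    # c) 1250 and 1400 are 3 standard deviations below and above the mean.
    print("c) Between", intervals[99.7], "lie about 99.7% of balances")
```

Working "backwards" (from a percentage to a balance) and "forwards" (from balances to a percentage) both come down to counting how many standard deviations from the mean the endpoints are.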


More information

Scatter Plots with Error Bars

Scatter Plots with Error Bars Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each

More information

STAB22 section 1.1. total = 88(200/100) + 85(200/100) + 77(300/100) + 90(200/100) + 80(100/100) = 176 + 170 + 231 + 180 + 80 = 837,

STAB22 section 1.1. total = 88(200/100) + 85(200/100) + 77(300/100) + 90(200/100) + 80(100/100) = 176 + 170 + 231 + 180 + 80 = 837, STAB22 section 1.1 1.1 Find the student with ID 104, who is in row 5. For this student, Exam1 is 95, Exam2 is 98, and Final is 96, reading along the row. 1.2 This one involves a careful reading of the

More information

Describing, Exploring, and Comparing Data

Describing, Exploring, and Comparing Data 24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter

More information

THE BINOMIAL DISTRIBUTION & PROBABILITY

THE BINOMIAL DISTRIBUTION & PROBABILITY REVISION SHEET STATISTICS 1 (MEI) THE BINOMIAL DISTRIBUTION & PROBABILITY The main ideas in this chapter are Probabilities based on selecting or arranging objects Probabilities based on the binomial distribution

More information
