# Box plots & t-tests. Example

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Box plots & t-tests Box Plots Box plots are a graphical representation of your sample (easy to visualize descriptive statistics); they are also known as box-and-whisker diagrams. Any data that you can present using a bar graph can, in most cases, also be presented using box plots. A box plot provides more information about the data than does a bar graph. Things to know about box plots Your sample is presented as a box. The spacings between the different parts of the box help indicate the degree of dispersion (spread) and skewness in the data, and identify outliers. A box plot shows a 5-number data summary: minimum, first (lower) quartile, median, third (upper) quartile, maximum. The box is divided at the median. The length of the box is the interquartile range (IQR). The 1st quartile is the bottom line. The 3rd quartile is the top line. Example Quartiles divide frequency distributions Q 1 :1 st or lower quartile: cuts off lowest 25% of the data Q 2 :2 nd quartile or median: 50% point, cuts data set in half Q 3 :3 rd quartile or upper quartile: cuts off lowest 75% of the data (or highest 25%) Q 1 is the median of the first half of your data set. Q 3 is the median of the second half of your data set. The difference between the upper and lower quartiles is called the interquartile range. The interquartile range spans 50% of a data set, and eliminates the influence of outliers because the highest and lowest quarters are removed. Example: A biologist samples 12 red oak trees in a forest plot and counts the number of caterpillars on each tree. The following is a list of the number of caterpillars on each tree: 34, 47, 1, 15, 57, 24, 20, 11, 19, 50, 28, 37 1

2 Calculate the median, 1 st and 3 rd quartile. Step 1: Arrange the values in ascending order 1, 11, 15, 19, 20, 24, 28, 34, 37, 47, 50, 57 Step 2: Calculate the median Median = ( )/ 2 = 26 Step 3: Determine Q1 Lower quartile = value of middle of first half of data Q 1 = the median of 1, 11, 15, 19, 20, 24 = (3 rd + 4 th observations) 2 = ( ) 2 = 17 Step 4: Determine Q3 Upper quartile = value of middle of second half of data Q 3 = the median of 28, 34, 37, 47, 50, 57 = (3 rd + 4 th observations) 2 = ( ) 2 = 42 Outliers Observations that are 1.5 x IQR greater than Q3 or less than Q1 are called outliers and are distinguished by a different mark, e.g., an asterisk. The actual symbols used don t matter as long as you are consistent and you explain your symbols in the figure legend. In the figure below the arrows are pointing to the outliers. Do not remove outliers from the dataset unless there is good reason to do so. A good reason could be that the outlier is a typo, for example a student records a caterpillar mass of 10 grams instead of 0.10 grams. Or if the equipment used to measure the observation failed, for example if the balance measures a caterpillar as 10 grams. Don t remove outliers just because you want to make your dataset look prettier. Outliers can point us to interesting patterns or let us know that we may need to increase are sample size. How do you determine if there are any outliers in your sample? 1. Calculate IQR x Add this value to Q3. Are there any values greater than Q3 + (IQR x 1.5)? If so, then these values are outliers. 3. Subtract this value from Q1. Are there any values smaller than Q1 (IQR x 1.5)? If so, then these values are outliers. Whiskers The two vertical lines (called whiskers) outside the box extend to the smallest and largest observations within 1.5 x IQR (interquartile range) of the quartiles. If there are no outliers, then the whiskers extend to the min and max values. 2

3 Comparing populations using t-tests The t-test is a parametric test for comparing two sets of continuous data. It is used for comparing two sample means. For example, we want to know if the average mass of the caterpillars fed leaves from plants receiving the fertilizer treatment is larger than the average mass of caterpillars fed leaves from plants receiving the control treatment. A t-test allows you to determine if there is a statistically significance difference between the two treatments. When you are comparing two samples, then you use a t-test. A t-test doesn t work if you are comparing more than two samples. For example, if you were comparing caterpillars fed leaves from a high, low, and control fertilizer treatments, then you would not be able to use a t-test. The two-sample t-test is a hypothesis test for answering questions about the mean when the data are collected from two random samples of independent observations, each from an underlying normal distribution. H 1 : The means are the two samples (or treatments) are different. H 0 : (null hypothesis): The samples are from the same populations = the means are equal. A t-test uses 3 pieces of information 1. Sample size = the number of replicates in each sample. 2. The difference between the two means The figures above show frequency histograms of hypothetical caterpillar mass data. The difference between the two means is greater in figure A). These data will produce a more significant t-test. 3. The variance of each sample The figures above show frequency histograms of hypothetical caterpillar mass data. The difference between the two means is equal; however, the two samples in A) are more clearly distinguishable because of their smaller variances. The data in sample A) are more likely to produce a significant t-test. Variance is represented by s 2. Variance is a measure of data scattering around the mean. Variance is a way to measure how variable or dispersed the data are. Variance is equal to the standard deviation (standard deviation = s) squared. 3

5 How to conduct an independent t-test There are two parts to a t-test: the t-statistic and the p-value. First, you calculate the t statistic and then you determine the p-value that goes along with your t-statistic. t statistic: In English: The mean of sample 1 minus the mean of sample 2 divided by the square root of the variance of sample 1 divided by the sample size of sample 1 plus the variance of sample 2 divided by the sample size of sample 2. You also need to calculate the degree of freedom (df) for the t-test: Degree of freedom = n 1 + n 2-2 Example You are researching the impact of urbanization on robin populations. You conduct point-counts in urban and rural settings. From these point-counts you are able to estimate robin density (# of robins per hectare). You conduct 15 point-count surveys in each setting. The first step is to establish the specific hypotheses we wish to examine. H 0 : Null hypothesis is that the difference between the two groups is 0. There is no difference in robin density between the urban and rural settings. H 1 : Alternative or research hypothesis - the difference between the two groups is not 0. There is a difference in robin density between the urban and rural settings. This hypothesis would be tested using a 2-tailed t-test. Or you could make a directional prediction I predict that robin density will be higher in the rural environment. This hypothesis would be tested using a 1-tailed t-test. For this example, we are going to use the 2-tailed hypothesis. The data 5

6 Information you need for a t-test (calculated from the data): Sample size of a = n a = 15 Sample size of b = n b = 15 Mean a = Mean b = Var a = s a 2 = Var b = s b 2 = 7.79 Degree of freedom: = 28 If you plug these numbers into the above t-stat formula, then you will calculate t = Is this a high or low value? Now you need to consult a statistical table to determine the significance of the t-statistic. There is a copy of part of a statistical table included at the end of these notes additionally, if you have taken a statistics class previously and still have your textbook, then there is probably a copy of the table at the end of the book. 1. First, look down the left hand column to find the degrees of freedom that matches the one you calculated. Then, reading across that row, compare your t-value with the number in the second column. The number in the chart is the critical value of t for p = If your calculated t-statistic is smaller than the p = 0.05 t-critical value, then your two samples are NOT significantly different. You must accept the null hypothesis that the two samples came from the same population. There is no difference between the means. 3. If your calculated t-statistic is larger than p = 0.05 t-critical value, then the difference between the means is statistically significant. You can reject the null hypothesis with no more than a 5% error rate and accept the alternative hypothesis. 4. The next column in the table presents the t-critical values for p = If your value is larger than this number, then you can reject your null with a 2.5% error rate. 5. Then compare the computed p-value to alpha Step 5: Comparing the computed p-value to alpha All hypothesis tests are based on the same basic principles and are setup to minimize the probability of drawing an incorrect conclusion. Two possibilities (reality): Null hypothesis is true. Null hypothesis is false. Two outcomes of a hypothesis test: We reject the null hypothesis. We fail to reject the null hypothesis. 6

8 Continuing with the robin example: If you used a table to compare your t-statistics and t-critical: Robin density did not differ between the urban and rural settings (t 28 = 0.988; p>0.05). If you used a computer program to analyze your statistics: Robin density did not differ between the urban and rural settings (t 28 = 0.988; p= 0.331). Report your statistics, don t discuss p-values and t-statistics. This is wrong: My p-value was 0.046, this is less than the accepted value of 0.05, so I can reject my null hypothesis and accept my alternative hypothesis. In the results section you just state the results (including the statistics). In the discussion section, you interpret and explain your results. Don t ignore your statistical results. In this example, average robin density was higher in the urban setting (21.19 robins per count) than in the rural setting (19.49 robins per count). However, these means are NOT significantly different. So you cannot make any statements about differences in robin density between the two environments. It is incorrect to say that robin density was higher in the urban setting. 8

9 Practice problems with answers Problem 1 Brown trout box plots Figure 1: Brown trout eat a variety of freshwater invertebrates and the size of the food that they eat can vary with trout age. This figures shows box plots of the age-related variation in prey size of Salmo trutta, brown trout, in the Furelos River (NW Spain) during summer. From: Ontogenetic Dietary Shifts in a Predatory Freshwater Fish Species: The Brown Trout as an Example of a Dynamic Fish Species By Javier Sánchez-Hernández, María J. Servia, Rufino Vieira-Lanero and Fernando Cobo a) What is the independent variable? What is the dependent variable? b) Describe how prey size changes as trout age class increases. c) Visually estimate the minimum, maximum, median, Q1, Q3 and the IQR for teach treatment. Age class Min. Max. Median Q1 Q3 IQR d) Which of the box plots would most likely have normal? Explain your answer. 2. Creating and interpreting box plots a) Use the data below to construct box plots of the two samples. Round to whole numbers. Sample A Sample B b) Do you think that samples A and B were collected from the same population? Why or why not? c) Which sample would you predict would have a greater standard deviation? Explain your answer. 9

10 3. A researcher hypothesized that phosphorus-fertilization of host plants will impact caterpillar growth. 8 caterpillars were reared on plants fertilized with phosphorus and 8 caterpillars are reared on plants that receive a water only (no phosphorus) treatment. The data are presented below. Pupal Mass - Phosphorus Pupal Mass - No phosphorus Sample Mean Sample Size Variance a) What would be the null hypothesis in this study? b) What would be the alternate (research) hypothesis? c) What type of t-test would you use? d) How many degrees of freedom for your t-test? e) What is your t-critical? f) What is your computed t-statistic? g) Is there a significant difference between the two groups? h) Write these results up as you would in the results section of a lab report (include a statement of the results and the statistics). i) Interpret these results as you would in the discussion section of your lab report (i.e. what do these results mean). 10

11 Answers 1. Brown trout a) What is the independent variable? What is the dependent variable? Independent variable: Brown trout age class Dependent variable: Prey size b) Describe how prey size changes as trout age class increases. Older fish eat larger prey. c) Visually estimate the minimum, maximum, median, Q1, Q3 and the IQR for teach treatment. (Since these are estimates, it s OK if your values are slightly different). Age Min. Max. Median Q1 Q3 IQR class d) Which of the box plots would most likely have normal distributions (you may want to review the pine needle handout)? Explain your answer. The box plots for age classes 0+ and 1+ most likely were produced from data that are normally distributed. When data are normally distributed (bell curve) then the mean = median and there are approximately an equal number of replicates above and below the median (the median looks like it is in the middle of the distribution of data). 2. Use the data below to construct box plots of the two samples. Round to whole numbers. Sample A Sample B a) Do you think that samples A and B were collected from the same population? Why or why not? Answers may vary a correct answer depends on a correct explanation. The two samples have the same median but the range of values is greater in sample A. There is much more variation within the sample A population and the mean is greater. b) Which sample would you predict would have a greater standard deviation? Sample A (greater range of data). If you actually calculated the values: 11

12 Problem 2: A researcher hypothesized that phosphorus-fertilization of host plants will impact caterpillar growth. 8 caterpillars were reared on plants fertilized with phosphorus and 8 caterpillars are reared on plants that receive a water only (no phosphorus) treatment. The data are presented below. Pupal Mass - Phosphorus Pupal Mass - No phosphorus Sample Mean Sample Size 8 8 Variance a) What would be the null hypothesis in this study? There will not be a difference in mass between the caterpillars fed phosphorus-treated leaves and the caterpillars fed control leaves. b) What would be the alternate (aka research) hypothesis? There will be a difference in mass between the caterpillars fed phosphorus-treated leaves and the caterpillars fed control leaves. c) What type of t-test would you use? A 2-tailed t-test. In the description above, the researcher hypothesizes that phosphorus will impact caterpillar mass, so there could be a positive or negative effect of phosphorus on caterpillar mass. d) How many degrees of freedom for your t-test?14 e) What is your t-critical? t critical = f) What is your computed t-statistic? t stat = g) Is there a significant difference between the two groups? No h) Write these results up as you would in the results section of a lab report (include a statement of the results and the statistics). Pupal mass of caterpillars fed phosphorus treated leaves was not significantly different than pupal mass of caterpillars fed control leaves (t 14 = 0.279, p > 0.05). i) Interpret these results as you would in the discussion section of your lab report (i.e. what does these results mean). Phosphorus had no effect on caterpillar growth. Perhaps phosphorus is not a limiting nutrient for caterpillars, so adding more phosphorus to the plants did not increase caterpillar mass. Other explanations are possible. However, it is important that you don t discuss you t-stat or p-value in the discussion. 12

13 Critical values for the t-distribution 13

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### Describing Data. We find the position of the central observation using the formula: position number =

HOSP 1207 (Business Stats) Learning Centre Describing Data This worksheet focuses on describing data through measuring its central tendency and variability. These measurements will give us an idea of what

### Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

### AP Statistics 2001 Solutions and Scoring Guidelines

AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use

### Descriptive Statistics

Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

### Describing Populations Statistically: The Mean, Variance, and Standard Deviation

Describing Populations Statistically: The Mean, Variance, and Standard Deviation BIOLOGICAL VARIATION One aspect of biology that holds true for almost all species is that not every individual is exactly

### STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Statistical Inference and t-tests

1 Statistical Inference and t-tests Objectives Evaluate the difference between a sample mean and a target value using a one-sample t-test. Evaluate the difference between a sample mean and a target value

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### Chapter 7. Section Introduction to Hypothesis Testing

Section 7.1 - Introduction to Hypothesis Testing Chapter 7 Objectives: State a null hypothesis and an alternative hypothesis Identify type I and type II errors and interpret the level of significance Determine

### Brief Review of Median

Session 36 Five-Number Summary and Box Plots Interpret the information given in the following box-and-whisker plot. The results from a pre-test for students for the year 2000 and the year 2010 are illustrated

### Data Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) SS Analysis of Experiments - Introduction

Data Analysis Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) Prof. Dr. Dr. h.c. Dieter Rombach Dr. Andreas Jedlitschka SS 2014 Analysis of Experiments - Introduction

### THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

### Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

### 10-3 Measures of Central Tendency and Variation

10-3 Measures of Central Tendency and Variation So far, we have discussed some graphical methods of data description. Now, we will investigate how statements of central tendency and variation can be used.

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Chapter 6: t test for dependent samples

Chapter 6: t test for dependent samples ****This chapter corresponds to chapter 11 of your book ( t(ea) for Two (Again) ). What it is: The t test for dependent samples is used to determine whether the

### Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

### Descriptive Statistics. Understanding Data: Categorical Variables. Descriptive Statistics. Dataset: Shellfish Contamination

Descriptive Statistics Understanding Data: Dataset: Shellfish Contamination Location Year Species Species2 Method Metals Cadmium (mg kg - ) Chromium (mg kg - ) Copper (mg kg - ) Lead (mg kg - ) Mercury

### Hypothesis Testing. Bluman Chapter 8

CHAPTER 8 Learning Objectives C H A P T E R E I G H T Hypothesis Testing 1 Outline 8-1 Steps in Traditional Method 8-2 z Test for a Mean 8-3 t Test for a Mean 8-4 z Test for a Proportion 8-5 2 Test for

### Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research

### Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

### 13.2 Measures of Central Tendency

13.2 Measures of Central Tendency Measures of Central Tendency For a given set of numbers, it may be desirable to have a single number to serve as a kind of representative value around which all the numbers

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D.

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D. In biological science, investigators often collect biological

### Data analysis. Data analysis in Excel using Windows 7/Office 2010

Data analysis Data analysis in Excel using Windows 7/Office 2010 Open the Data tab in Excel If Data Analysis is not visible along the top toolbar then do the following: o Right click anywhere on the toolbar

### Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

### F. Farrokhyar, MPhil, PhD, PDoc

Learning objectives Descriptive Statistics F. Farrokhyar, MPhil, PhD, PDoc To recognize different types of variables To learn how to appropriately explore your data How to display data using graphs How

### EXCEL Analysis TookPak [Statistical Analysis] 1. First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it:

EXCEL Analysis TookPak [Statistical Analysis] 1 First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it: a. From the Tools menu, choose Add-Ins b. Make sure Analysis

### 4. Introduction to Statistics

Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

### Chapter 3: Data Description Numerical Methods

Chapter 3: Data Description Numerical Methods Learning Objectives Upon successful completion of Chapter 3, you will be able to: Summarize data using measures of central tendency, such as the mean, median,

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Independent samples t-test. Dr. Tom Pierce Radford University

Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

### Hypothesis Testing - Relationships

- Relationships Session 3 AHX43 (28) 1 Lecture Outline Correlational Research. The Correlation Coefficient. An example. Considerations. One and Two-tailed Tests. Errors. Power. for Relationships AHX43

### 1. 2. 3. 4. Find the mean and median. 5. 1, 2, 87 6. 3, 2, 1, 10. Bellwork 3-23-15 Simplify each expression.

Bellwork 3-23-15 Simplify each expression. 1. 2. 3. 4. Find the mean and median. 5. 1, 2, 87 6. 3, 2, 1, 10 1 Objectives Find measures of central tendency and measures of variation for statistical data.

### HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

### t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

### Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

### Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### Comparing Means in Two Populations

Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### How Does My TI-84 Do That

How Does My TI-84 Do That A guide to using the TI-84 for statistics Austin Peay State University Clarksville, Tennessee How Does My TI-84 Do That A guide to using the TI-84 for statistics Table of Contents

### Center: Finding the Median. Median. Spread: Home on the Range. Center: Finding the Median (cont.)

Center: Finding the Median When we think of a typical value, we usually look for the center of the distribution. For a unimodal, symmetric distribution, it s easy to find the center it s just the center

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Module 4: Data Exploration

Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive

### Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### Descriptive Analysis

Research Methods William G. Zikmund Basic Data Analysis: Descriptive Statistics Descriptive Analysis The transformation of raw data into a form that will make them easy to understand and interpret; rearranging,

### HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

### Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

### Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

### 1.5 NUMERICAL REPRESENTATION OF DATA (Sample Statistics)

1.5 NUMERICAL REPRESENTATION OF DATA (Sample Statistics) As well as displaying data graphically we will often wish to summarise it numerically particularly if we wish to compare two or more data sets.

### Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

### Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric

### BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

### Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

### Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

### The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### Nonparametric Statistics

1 14.1 Using the Binomial Table Nonparametric Statistics In this chapter, we will survey several methods of inference from Nonparametric Statistics. These methods will introduce us to several new tables

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS)

SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS) State of the course address: The Final exam is Aug 9, 3:30pm 6:30pm in B9201 in the Burnaby Campus. (One

### Statistics Summary (prepared by Xuan (Tappy) He)

Statistics Summary (prepared by Xuan (Tappy) He) Statistics is the practice of collecting and analyzing data. The analysis of statistics is important for decision making in events where there are uncertainties.

### AP Statistics 2012 Scoring Guidelines

AP Statistics 2012 Scoring Guidelines The College Board The College Board is a mission-driven not-for-profit organization that connects students to college success and opportunity. Founded in 1900, the

### 1 SAMPLE SIGN TEST. Non-Parametric Univariate Tests: 1 Sample Sign Test 1. A non-parametric equivalent of the 1 SAMPLE T-TEST.

Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.

### Hints for Success on the AP Statistics Exam. (Compiled by Zack Bigner)

Hints for Success on the AP Statistics Exam. (Compiled by Zack Bigner) The Exam The AP Stat exam has 2 sections that take 90 minutes each. The first section is 40 multiple choice questions, and the second

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### 2.0 Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table

2.0 Lesson Plan Answer Questions 1 Summary Statistics Histograms The Normal Distribution Using the Standard Normal Table 2. Summary Statistics Given a collection of data, one needs to find representations

### Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

### 1 Measures for location and dispersion of a sample

Statistical Geophysics WS 2008/09 7..2008 Christian Heumann und Helmut Küchenhoff Measures for location and dispersion of a sample Measures for location and dispersion of a sample In the following: Variable

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

### Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

### Linear Models in STATA and ANOVA

Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

### TIPS FOR DOING STATISTICS IN EXCEL

TIPS FOR DOING STATISTICS IN EXCEL Before you begin, make sure that you have the DATA ANALYSIS pack running on your machine. It comes with Excel. Here s how to check if you have it, and what to do if you

### Chapter 7 Part 2. Hypothesis testing Power

Chapter 7 Part 2 Hypothesis testing Power November 6, 2008 All of the normal curves in this handout are sampling distributions Goal: To understand the process of hypothesis testing and the relationship

### Point Biserial Correlation Tests

Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable

### ! x sum of the entries

3.1 Measures of Central Tendency (Page 1 of 16) 3.1 Measures of Central Tendency Mean, Median and Mode! x sum of the entries a. mean, x = = n number of entries Example 1 Find the mean of 26, 18, 12, 31,

### Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

### Investigating the Investigative Task: Testing for Skewness An Investigation of Different Test Statistics and their Power to Detect Skewness

Investigating the Investigative Task: Testing for Skewness An Investigation of Different Test Statistics and their Power to Detect Skewness Josh Tabor Canyon del Oro High School Journal of Statistics Education

### THE BINOMIAL DISTRIBUTION & PROBABILITY

REVISION SHEET STATISTICS 1 (MEI) THE BINOMIAL DISTRIBUTION & PROBABILITY The main ideas in this chapter are Probabilities based on selecting or arranging objects Probabilities based on the binomial distribution

### Means, standard deviations and. and standard errors

CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability