Fifty Fathoms: Statistics Demonstrations for Deeper Understanding
Tim Erickson


© 2014 William Finzer

Contents

What Are These Demos About?
How to Use These Demos
If This Is Your First Time Using Fathom
Tutorial: An Extended Example
A Few Good Skills
Fathom Overview

Measures of Center and Spread
  Demo 1: The Meaning of Mean
    The mean
    How individual values affect the mean
  Demo 2: Mean and Median
    Measures of center: mean, median, and midrange
    Resistance: what happens to the measures when you move one point
  Demo 3: What Do Normal Data Look Like?
    Normally distributed data
    The effect of changing the mean and standard deviation
  Demo 4: Transforming the Mean and Standard Deviation
    What happens to the mean and standard deviation when you add a constant to every value or multiply every value by a constant
  Demo 5: The Mean Is Least Squares, Too
    Defining the mean as the place where the sum of squares of deviations is a minimum (just like the least-squares line)
    The median, and what it minimizes

Regression and Correlation
  Demo 6: Least-Squares Linear Regression
    Exploring the squares in least squares
    Minimizing the areas of the squares built on residuals
  Demo 7: Standard Scores
    Using standard scores to compare unlike scales
    Making a scale in terms of standard deviations
  Demo 8: Devising the Correlation Coefficient
    How the correlation coefficient measures what it does

  Demo 9: Correlation Coefficients of Samples
    How samples from a correlated population yield different values for the correlation
    How sample size affects that sampling distribution
  Demo 10: Regression Toward the Mean
    Regression toward the mean
    The meaning and asymmetry of the least-squares line

Random Walks and the Binomial Distribution
  Demo 11: Flipping Coins: the Law of Large Numbers
    How the proportion of heads approaches 0.5 as sample size increases
    How the number of heads does not approach half the sample size
  Demo 12: How Random Walks Go as Root N
    How the distance from the origin increases with the number of steps
  Demo 13: Building the Binomial Distribution
    Constructing the binomial distribution by resampling
    How the distribution depends on the population proportion
  Demo 14: More Binomial
    How the binomial distribution depends on sample size for small N
    The relationship between the distribution of sample proportions and the distribution of sample counts
  Demo 15: Two-Dimensional Random Walks
    Unexpected behavior in 2D random walks
    How the 2D walk eventually looks like a 2D normal distribution

Standard Deviation, Standard Error, and Student's t
  Demo 16: Standard Error and Standard Deviation
    Getting a feel for the difference between standard deviation and standard error
  Demo 17: What Is Standard Error, Really?
    The connection between standard error and the sampling distribution of the mean
    How the sample size connects standard deviation and standard error
  Demo 18: The Road to Student's t
    Using standard error as the scale for measuring how far a sample mean is from the true mean
    How these quantities are not normally distributed; in fact they follow a t-distribution
  Demo 19: A Close Look at the t-statistic
    How sample mean, standard deviation, t, and P interrelate
    How they depend on the values of individual points in a sample

Sampling Distributions
  Demo 20: The Distribution of Sample Proportions
    How sample size and population proportion affect the distribution

  Demo 21: Adding Uniform Random Variables
    What happens when you add two uniform random variables
    How that corresponds to adding two dice
  Demo 22: How Errors Add
    Basic error analysis
    How to find the error in the sum of two quantities that each have some measurement error
  Demo 23: Sampling Distributions and Sample Size
    How sampling distributions (of the mean) get narrower as you increase sample size
  Demo 24: How the Width of the Sampling Distribution Depends on N
    How the width (as measured by IQR) of a sampling distribution of the mean is inversely proportional to the square root of the sample size
  Demo 25: Does n − 1 Really Work in the SD?
    Unbiased estimators
    How the familiar formula for sample standard deviation is not unbiased
    Why we should care about variance
  Demo 26: German Tanks
    Unbiased estimators
    Evaluating estimators from their sampling distributions
    Even among unbiased estimators, some are better than others
  Demo 27: The Central Limit Theorem
    A demo of the CLT
    How sampling distributions usually look normal
    Cases where they do not look normal

Confidence Intervals
  Demo 28: The Confidence Interval of a Proportion
    Defining the confidence interval
    Looking at sample results in terms of plausibility
  Demo 29: Capturing with Confidence Intervals
    How confidence intervals of a proportion do not always capture the population value
  Demo 30: Where Does That Root (p(1 − p)) Come From?
    The standard deviation of a variable that's only 0 or 1
    Connecting the proportion situation to the mean situation
  Demo 31: Why np > 10 Is a Good Rule of Thumb
    Explaining the np > 10 rule for using the normal approximation in the CI of a proportion
  Demo 32: How the Width of the CI Depends on N
    How the width of a confidence interval is inversely proportional to the square root of the sample size
  Demo 33: Using the Bootstrap to Estimate a Parameter
    The bootstrap
    Using resampling (with replacement) to create an interval for a parameter

  Demo 34: Exploring the Confidence Interval of the Mean
    How the CI depends on individual values
  Demo 35: Capturing the Mean with Confidence Intervals
    How confidence intervals of a mean do not always capture the population value
    What repeated CIs look like

Hypothesis Tests
  Demo 36: Fair and Unfair Dice
    Creating a measure of fairness
    Sampling distributions
    Testing hypotheses empirically
    The chi-square statistic
  Demo 37: Scrambling to Compare Means
    Randomization test
    Using scrambling to simulate the null hypothesis
    Generating a sampling distribution
  Demo 38: Using a t-test to Compare Means
    Comparing means with Student's t
  Demo 39: Another Look at a t-test
    Repeated t-tests on samples from the same distribution
    How t, P, mean, and standard deviation interrelate
  Demo 40: On the Equivalence of Tests and Estimates
    How a hypothesis test and a confidence interval are really the same
  Demo 41: Paired Versus Unpaired
    How a paired test gives a significant result more easily than its unpaired counterpart
  Demo 42: Analysis of Variance
    Assessing whether means are different in different groups
    Introduction to ANOVA

Power in Tests
  Demo 43: The Distribution of P-Values
    How the distribution of P is flat if the null hypothesis is true
    How it changes if the null hypothesis is false
  Demo 44: Power
    How power (the chance that you reject the null hypothesis) changes with the population parameters
  Demo 45: Power and Sample Size
    How power (the chance that you reject the null hypothesis) changes with sample size

  Demo 46: Heteroscedasticity and Its Consequences
    Homoscedasticity: an assumption behind many statistical calculations
    What happens when that assumption is not met

Distributions
  Demo 47: Wait Time and the Geometric Distribution
    The distribution of times until something happens
    How this is the geometric distribution
  Demo 48: The Exponential and Poisson Distributions
    The continuous analog to the geometric distribution
    How many events happen in a given period: a Poisson distribution
  Demo 49: Sampling Without Replacement and the Hypergeometric Distribution
    How distributions change when the sample is large compared to the population
  Demo 50: The Bizarre Cauchy Distribution
    The Cauchy distribution
    The meaning of mean and standard deviation; how it's possible for a distribution to have neither

Appendix A: How Did They Do That?

Appendix B: A Little Mathematical Statistics
  Some Basics
  The Distribution of the Sample Mean
  A Random Walk: Two Proofs That the Mean Square Distance Is N
  How to Generate Correlated Data
  Sample Variance: Why the Denominator Is n − 1
  The Geometric Distribution: Proof That the Mean Is (1/p)
  Generating Normally Distributed Random Numbers

Selected Solutions


More information

THE CERTIFIED SIX SIGMA BLACK BELT HANDBOOK

THE CERTIFIED SIX SIGMA BLACK BELT HANDBOOK THE CERTIFIED SIX SIGMA BLACK BELT HANDBOOK SECOND EDITION T. M. Kubiak Donald W. Benbow ASQ Quality Press Milwaukee, Wisconsin Table of Contents list of Figures and Tables Preface to the Second Edition

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

The alternative hypothesis,, is the statement that the parameter value somehow differs from that claimed by the null hypothesis. : 0.5 :>0.5 :<0.

The alternative hypothesis,, is the statement that the parameter value somehow differs from that claimed by the null hypothesis. : 0.5 :>0.5 :<0. Section 8.2-8.5 Null and Alternative Hypotheses... The null hypothesis,, is a statement that the value of a population parameter is equal to some claimed value. :=0.5 The alternative hypothesis,, is the

More information

AP Statistics 2010 Scoring Guidelines

AP Statistics 2010 Scoring Guidelines AP Statistics 2010 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

More information

Crash Course on Basic Statistics

Crash Course on Basic Statistics Crash Course on Basic Statistics Marina Wahl, marina.w4hl@gmail.com University of New York at Stony Brook November 6, 2013 2 Contents 1 Basic Probability 5 1.1 Basic Definitions...........................................

More information

Sociology 6Z03 Topic 15: Statistical Inference for Means

Sociology 6Z03 Topic 15: Statistical Inference for Means Sociology 6Z03 Topic 15: Statistical Inference for Means John Fox McMaster University Fall 2016 John Fox (McMaster University) Soc 6Z03: Statistical Inference for Means Fall 2016 1 / 41 Outline: Statistical

More information

Chapter 8: Interval Estimates and Hypothesis Testing

Chapter 8: Interval Estimates and Hypothesis Testing Chapter 8: Interval Estimates and Hypothesis Testing Chapter 8 Outline Clint s Assignment: Taking Stock Estimate Reliability: Interval Estimate Question o Normal Distribution versus the Student t-distribution:

More information

Advanced Techniques for Mobile Robotics Statistical Testing

Advanced Techniques for Mobile Robotics Statistical Testing Advanced Techniques for Mobile Robotics Statistical Testing Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Statistical Testing for Evaluating Experiments Deals with the relationship between

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Chapter 8 Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing Chapter problem: Does the MicroSort method of gender selection increase the likelihood that a baby will be girl? MicroSort: a gender-selection method developed by Genetics

More information

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7. THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

More information

Multiple Hypothesis Testing: The F-test

Multiple Hypothesis Testing: The F-test Multiple Hypothesis Testing: The F-test Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost

More information

ELEMENTARY STATISTICS

ELEMENTARY STATISTICS ELEMENTARY STATISTICS Study Guide Dr. Shinemin Lin Table of Contents 1. Introduction to Statistics. Descriptive Statistics 3. Probabilities and Standard Normal Distribution 4. Estimates and Sample Sizes

More information

Data Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) SS Analysis of Experiments - Introduction

Data Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) SS Analysis of Experiments - Introduction Data Analysis Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) Prof. Dr. Dr. h.c. Dieter Rombach Dr. Andreas Jedlitschka SS 2014 Analysis of Experiments - Introduction

More information

RESIDUAL = ACTUAL PREDICTED The following questions refer to this data set:

RESIDUAL = ACTUAL PREDICTED The following questions refer to this data set: REGRESSION DIAGNOSTICS After fitting a regression line it is important to do some diagnostic checks to verify that regression fit was OK One aspect of diagnostic checking is to find the rms error This

More information

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ 1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

Linear Regression with One Regressor

Linear Regression with One Regressor Linear Regression with One Regressor Michael Ash Lecture 10 Analogy to the Mean True parameter µ Y β 0 and β 1 Meaning Central tendency Intercept and slope E(Y ) E(Y X ) = β 0 + β 1 X Data Y i (X i, Y

More information

Hence, multiplying by 12, the 95% interval for the hourly rate is (965, 1435)

Hence, multiplying by 12, the 95% interval for the hourly rate is (965, 1435) Confidence Intervals for Poisson data For an observation from a Poisson distribution, we have σ 2 = λ. If we observe r events, then our estimate ˆλ = r : N(λ, λ) If r is bigger than 20, we can use this

More information

Regression analysis in the Assistant fits a model with one continuous predictor and one continuous response and can fit two types of models:

Regression analysis in the Assistant fits a model with one continuous predictor and one continuous response and can fit two types of models: This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. The simple regression procedure in the

More information

MTH 140 Statistics Videos

MTH 140 Statistics Videos MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative

More information

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY The hypothesis testing statistics detailed thus far in this text have all been designed to allow comparison of the means of two or more samples

More information

9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test. Neyman-Pearson lemma 9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

More information

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

More information

tests whether there is an association between the outcome variable and a predictor variable. In the Assistant, you can perform a Chi-Square Test for

tests whether there is an association between the outcome variable and a predictor variable. In the Assistant, you can perform a Chi-Square Test for This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. In practice, quality professionals sometimes

More information

Bivariate Analysis. Comparisons of proportions: Chi Square Test (X 2 test) Variable 1. Variable 2 2 LEVELS >2 LEVELS CONTINUOUS

Bivariate Analysis. Comparisons of proportions: Chi Square Test (X 2 test) Variable 1. Variable 2 2 LEVELS >2 LEVELS CONTINUOUS Bivariate Analysis Variable 1 2 LEVELS >2 LEVELS CONTINUOUS Variable 2 2 LEVELS X 2 chi square test >2 LEVELS X 2 chi square test CONTINUOUS t-test X 2 chi square test X 2 chi square test ANOVA (F-test)

More information