Inferential statistics


We've seen how an operational definition specifies the measurement operations that define a variable. Last week we considered how carrying out such a measurement operation assigns a number (a score, a value) to a variable. Typically one carries out not a single such operation of measurement but several, and this gives us many scores: a distribution of scores. We've seen how descriptive statistics can be used to describe such a distribution, in particular its central tendency and dispersion. (Notice that we're leaving to one side for a while notions of experimental design and hypothesis testing. We'll return to these soon.)

In addition to descriptive statistics, we need to understand inferential statistics. Inferential statistics provide a way of going from a sample to a population: inferring the parameters of a population from the statistics of a sample; i.e., parameters such as μ and σ from statistics such as M and s.

But before we can see what is involved in the move from sample to population, we need to understand how to move from population to sample. The study of obtaining a sample from a population is probability.

Probability

Population --(probability)--> Sample
Population <--(inferential statistics)-- Sample

For example: the probability of picking a black ball from jar A is one half; the probability of picking a black ball from jar B is one tenth. [This is reasoning about probability.]

Jar A: 50 black, 50 white. Jar B: 10 black, 90 white.

[We would use inferential statistics to answer questions such as: from which jar is one more likely to get a sample of four black balls?]

Definition:
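The jar question can be checked with a short simulation. A minimal sketch in Python, assuming jar A holds 50 black and 50 white balls and jar B holds 10 black and 90 white (compositions chosen to match the stated probabilities); the trial count is illustrative:

```python
import random

random.seed(0)

# Hypothetical jar contents, matching the stated probabilities:
# jar A: P(black) = 1/2, jar B: P(black) = 1/10.
jar_a = ["black"] * 50 + ["white"] * 50
jar_b = ["black"] * 10 + ["white"] * 90

def prob_four_black(jar, trials=100_000):
    """Estimate P(four black in a row), sampling with replacement."""
    hits = 0
    for _ in range(trials):
        draws = [random.choice(jar) for _ in range(4)]
        if all(d == "black" for d in draws):
            hits += 1
    return hits / trials

p_a = prob_four_black(jar_a)   # theory: (1/2)**4  = 0.0625
p_b = prob_four_black(jar_b)   # theory: (1/10)**4 = 0.0001
print(f"jar A: {p_a:.4f}  jar B: {p_b:.4f}")
```

The estimated probabilities land near the theoretical values, and jar A is clearly the more likely source of four black balls.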

The probability of an outcome B is:

P(B) = (number of outcomes classified as B) / (total number of possible outcomes)

This is a proportion, a fraction. A value of P(B) = 0 means the outcome is impossible; the value P(B) = 1 means the outcome is certain.

BUT, for this definition of probability to be accurate, the outcomes must be selected randomly.

Definition: Random sample:
1. Every selection has an equal chance (i.e., there is no bias to the selection).
2. When there is more than one selection, there must be a constant probability each time (i.e., we must be sampling "with replacement")... or else we'll have a situation like this after we've made our first selection: jar A began with 50 black balls out of 100; once a black ball has been removed and not replaced, only 49 black balls out of 99 remain, so the probability of black has changed.

Notice: random does not mean chaotic. Rather, random means there is a pattern that becomes apparent only when we examine a large number of events. It means there is a pattern that doesn't show itself in a single case, or a few cases. Probability is the study of the patterns of random processes.

The Normal Distribution

The kind of pattern that emerges from many types of random process is the normal distribution. See a pattern emerging from a random process in a quincunx at: http://www.rand.org/methodology/stat/applets/clt.html There's another at:
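The "constant probability" requirement can be made concrete with a little arithmetic. A sketch in Python, assuming the jar A composition (50 black, 50 white) from the example above:

```python
from fractions import Fraction

# Assumed jar A composition from the example: 50 black, 50 white.
black, white = 50, 50

# With replacement: P(black) is the same on every draw.
p_first = Fraction(black, black + white)            # 50/100 = 1/2
p_second_with = p_first                             # still 1/2

# Without replacement, after drawing (and keeping) one black ball:
p_second_without = Fraction(black - 1, black + white - 1)   # 49/99

print(p_first, p_second_with, p_second_without)
```

Exact fractions make the point visible: 49/99 is not 1/2, so without replacement the probability is no longer constant from draw to draw.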

http://www.users.on.net/zhcchz/java/quincunx/quincunx.html

Read the history of how Francis Galton invented the quincunx at: http://www.tld.jcu.edu.au/hist/stats/galton/galton16.html

A normal distribution is easily described... [We'll add this in class]

Now we're ready for:

Inferential Statistics

It is usually necessary for a researcher to work with samples rather than a whole population. But one difficulty is that a sample is generally not identical to the population from which it comes; specifically, the sample mean will differ from the population mean: X̄ ≠ μ. And another difficulty is that no two samples are the same. How can we know which best describes the population? We need rules that relate samples to populations.

The Distribution of Sample Means

Definition: the distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population. It is not a distribution of scores, but a distribution of statistics. This distribution tends to be normal. It will be almost perfectly normal if either:
1. the population from which the sample is drawn is normal, or
2. the n of the sample is relatively large (30 or more).

This distribution has a mean that is equal to the population mean, i.e. μ_X̄ = μ, and a standard deviation, σ_X̄, which is called the standard error of X̄. The standard error is a very valuable measure. It specifies how well a sample mean estimates the population mean;
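These claims about the distribution of sample means are easy to check by brute force. A sketch in Python's standard library, using an assumed non-normal population (the digits 1 through 9) purely for illustration: it draws many random samples of size n = 30 with replacement, then compares the mean and standard deviation of the resulting sample means with μ and with σ/√n:

```python
import random
import statistics

random.seed(1)

# A deliberately non-normal (uniform) population, for illustration.
population = list(range(1, 10))
mu = statistics.mean(population)             # population mean μ = 5
sigma = statistics.pstdev(population)        # population σ

n = 30            # sample size ("relatively large")
reps = 20_000     # number of samples drawn

sample_means = [
    statistics.mean(random.choices(population, k=n))   # with replacement
    for _ in range(reps)
]

mean_of_means = statistics.mean(sample_means)   # should approximate μ
sd_of_means = statistics.stdev(sample_means)    # empirical standard error
predicted_se = sigma / n ** 0.5                 # σ / √n

print(mean_of_means, sd_of_means, predicted_se)
```

Even though the population is flat rather than normal, a histogram of `sample_means` would be bell-shaped, its mean sits at μ, and its spread matches σ/√n.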

i.e., how accurate an estimate a sample provides; i.e., the error between X̄ and μ, on average.

IMPORTANT NOTE: It can be shown that:

σ_X̄ = σ / √n

This formula shows how the accuracy of the estimate provided by a sample increases as the sample size increases. (This formula, and a related one we'll introduce shortly, is something APA says one needs to know for the licensing exam.)

Now we can turn to...

Hypothesis Testing

Definition: Remember? Hypothesis testing is an inferential procedure that uses sample data to evaluate the credibility of a hypothesis about a population.

Null hypothesis (H0): the treatment has no effect; i.e., the experimental group and the control group are drawn from the same population. This will seem confusing, since a good experiment assigns subjects randomly to the two groups. The point here is that the null hypothesis asserts that it is still the case AFTER THE TREATMENT that the two groups belong to the same population. If, on the other hand, the treatment did have an effect (which of course means that the null hypothesis is false), then the two groups would now come from different populations. Understood?

i.e., X̄_expt and X̄_control will not differ by more than random error; i.e., they will not differ by more than the standard error.

How do we know the standard error? Remember, it's the standard deviation of the distribution of sample means. Actually, we don't know it directly. But we can estimate it: our estimate replaces the population parameter σ with the sample statistic s. Remember we said that it can be shown that σ_X̄ = σ / √n. In the same way:

s_X̄ = s / √n
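The difference between σ_X̄ and its estimate s_X̄ can be shown in a few lines. A sketch in Python, reusing the assumed toy population from above; in real research we would know only the sample, so only `est_se` would be available:

```python
import random
import statistics

random.seed(2)

# Assumed toy population (known here only so we can compare).
population = list(range(1, 10))
sigma = statistics.pstdev(population)   # population parameter σ
n = 30

sample = random.choices(population, k=n)
s = statistics.stdev(sample)            # sample statistic s (uses n - 1)

true_se = sigma / n ** 0.5              # σ_X̄: requires the parameter σ
est_se = s / n ** 0.5                   # s_X̄: computable from the sample alone
print(true_se, est_se)
```

The estimate s_X̄ wobbles from sample to sample, but it is obtainable from the data we actually have, which is what makes the t-test below possible.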

We use the symbol s_X̄ to indicate that the value is calculated from sample data, rather than from the population parameter. Now we can say: if the null hypothesis is true, the experimental group mean minus the control group mean will be no greater than the estimated standard error.

i.e., if H0 is true, X̄_expt − X̄_control ≤ s_X̄

i.e., (X̄_expt − X̄_control) / s_X̄ ≤ 1 (simply rearranging the formula)

The left-hand side of this equation is called t, the t-statistic. So, if the null hypothesis is true, t ≤ 1. Conversely, if the null hypothesis is false, t > 1. How much greater than 1 will t be? We just calculate that from our data. Then we use this value to look in a table of t. This table tells us how likely it is that we could have obtained our data as a result of chance alone; that's to say, it tells us the probability of obtaining data like ours if the null hypothesis were true. That means it tells us how confident we can feel in rejecting the null hypothesis. In other words, it gives us a p value!

N.B., the above assumes:
1. random sampling;
2. that the experimental treatment doesn't change σ (i.e., that it only changes μ; this is the assumption of "homogeneity of variance");
3. that the distribution of sample means is normal. (We considered this earlier.)
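A worked sketch of the t-statistic in Python's standard library, using hypothetical scores and the standard pooled-variance form of the independent-samples t (a refinement of the simplified t ≤ 1 presentation above, with the standard error computed for the difference between two means):

```python
import statistics

# Hypothetical scores for an experimental and a control group.
expt = [12, 15, 11, 14, 13, 16, 12, 15]
ctrl = [10, 11, 9, 12, 10, 11, 13, 10]

n1, n2 = len(expt), len(ctrl)
m1, m2 = statistics.mean(expt), statistics.mean(ctrl)
v1, v2 = statistics.variance(expt), statistics.variance(ctrl)

# Pooled variance: legitimate under the homogeneity-of-variance assumption.
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)

# Estimated standard error of the difference between the two means.
se_diff = (sp2 / n1 + sp2 / n2) ** 0.5

t = (m1 - m2) / se_diff
df = n1 + n2 - 2
print(f"t({df}) = {t:.2f}")
```

With df = 14, a t around 3.6 is well beyond the tabled two-tailed critical value of about 2.14 at α = .05, so for these hypothetical data we would reject H0.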