Introduction. Statistics Toolbox


 Marcus Fisher
 4 years ago
Introduction

A hypothesis test is a procedure for determining whether an assertion about a characteristic of a population is reasonable.

For example, suppose that someone says that the average price of a gallon of regular unleaded gas in Massachusetts is $1.15. How would you decide whether this statement is true? You could try to find out what every gas station in the state was charging and how many gallons they were selling at that price. That approach might be definitive, but it could end up costing more than the information is worth.

A simpler approach is to find out the price of gas at a small number of randomly chosen stations around the state and compare the average price to $1.15.

Of course, the average price you get will probably not be exactly $1.15, due to variability in price from one station to the next. Suppose your average price was $1.18. Is this three-cent difference a result of chance variability, or is the original assertion incorrect? A hypothesis test can provide an answer.

The following sections provide an overview of hypothesis testing with the Statistics Toolbox:

    Hypothesis Test Terminology
    Hypothesis Test Assumptions
    Example: Hypothesis Testing
    Available Hypothesis Tests
Hypothesis Test Terminology

To get started, there are some terms to define and assumptions to make:

The null hypothesis is the original assertion. In this case the null hypothesis is that the average price of a gallon of gas is $1.15. The notation is H0: µ = 1.15.

There are three possibilities for the alternative hypothesis. You might only be interested in the result if gas prices were actually higher. In this case, the alternative hypothesis is H1: µ > 1.15. The other possibilities are H1: µ < 1.15 and H1: µ ≠ 1.15.

The significance level is related to the degree of certainty you require in order to reject the null hypothesis in favor of the alternative. By taking a small sample you cannot be certain about your conclusion. So you decide in advance to reject the null hypothesis if the probability of observing a result at least as extreme as your sample is less than the significance level. For a typical significance level of 5%, the notation is α = 0.05. For this significance level, the probability of incorrectly rejecting the null hypothesis when it is actually true is 5%. If you need more protection from this error, then choose a lower value of α.

The p-value is the probability of observing a result at least as extreme as the given sample result under the assumption that the null hypothesis is true. If the p-value is less than α, then you reject the null hypothesis. For example, if α = 0.05 and the p-value is 0.03, then you reject the null hypothesis. The converse does not hold: a p-value greater than α does not show that the null hypothesis is true, only that you have insufficient evidence to reject it.

The outputs for many hypothesis test functions also include confidence intervals. Loosely speaking, a confidence interval is a range of values that has a chosen probability of containing the true hypothesized quantity. Suppose, in the example, 1.15 is inside a 95% confidence interval for the mean, µ. That is equivalent to being unable to reject the null hypothesis at a significance level of 0.05. Conversely, if the 100(1 - α)% confidence interval does not contain 1.15, then you reject the null hypothesis at the α level of significance.
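The two decision rules described above (compare the p-value with the significance level, or check whether the hypothesized mean falls inside the confidence interval) can be checked against each other in a few lines. The following is a minimal sketch, not part of the toolbox documentation: the sample x is made-up data, and the variable names rejectByP and rejectByCI are introduced here for illustration only.

```matlab
% Hypothetical sample of gas prices in dollars (made-up values).
x     = [1.19 1.17 1.16 1.21 1.14 1.18 1.20 1.15 1.17 1.19];
mu0   = 1.15;   % hypothesized mean under the null hypothesis
alpha = 0.05;   % significance level chosen in advance

% Two-sided t-test of H0: mu = mu0 against H1: mu ~= mu0.
[h, p, ci] = ttest(x, mu0, alpha);

% Rule 1: reject H0 when the p-value is below alpha.
rejectByP = p < alpha;

% Rule 2: reject H0 when mu0 lies outside the 100*(1-alpha)%
% confidence interval for the mean.
rejectByCI = (mu0 < ci(1)) || (mu0 > ci(2));

fprintf('h = %d, p = %.4f, ci = [%.4f %.4f]\n', h, p, ci(1), ci(2));
fprintf('reject by p-value: %d, reject by interval: %d\n', ...
        rejectByP, rejectByCI);
```

For a two-sided test the two rules should agree with each other and with the Boolean output h.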
Hypothesis Test Assumptions

The difference between hypothesis test procedures often arises from differences in the assumptions that the researcher is willing to make about the data sample.

For example, the Z-test assumes that the data represents independent samples from the same normal distribution and that you know the standard deviation, σ. The t-test has the same assumptions except that you estimate the standard deviation using the data instead of specifying it as a known quantity.

Both tests have an associated signal-to-noise ratio:

\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad \text{or} \qquad T = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

The signal is the difference between the sample average, x̄, and the hypothesized mean, µ. The noise is the standard deviation of that average, based on the posited standard deviation σ or the estimated standard deviation s and the number of data values n.

If the null hypothesis is true, then Z has a standard normal distribution, N(0,1). T has a Student's t distribution with the degrees of freedom, ν, equal to one less than the number of data values.

Given the observed result for Z or T, and knowing the distribution of Z and T assuming the null hypothesis is true, it is possible to compute the probability (p-value) of observing this result. A very small p-value casts doubt on the truth of the null hypothesis. For example, suppose that the p-value was 0.001, meaning that the probability of observing the given Z or T was one in a thousand. That should make you skeptical enough about the null hypothesis that you reject it rather than believe that your result was just a lucky 999 to 1 shot.

There are also nonparametric tests that do not even require the assumption that the data come from a normal distribution. In addition, there are functions for testing whether the normal assumption is reasonable.
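To make the signal-to-noise description concrete, the sketch below computes Z and T by hand and converts each to a two-sided p-value, then compares the results with ztest and ttest. It is an illustration under stated assumptions: the data are made up, sigma = 0.04 is simply posited as the known standard deviation, and the names pZ, pT, pZtoolbox, and pTtoolbox are introduced here.

```matlab
% Hypothetical sample of gas prices in dollars (made-up values).
x     = [1.19 1.17 1.16 1.21 1.14 1.18 1.20 1.15 1.17 1.19];
mu0   = 1.15;        % hypothesized mean
sigma = 0.04;        % posited (known) standard deviation for the Z-test
n     = numel(x);    % number of data values

% Z statistic: signal divided by the known noise level of the average.
z  = (mean(x) - mu0) / (sigma / sqrt(n));
pZ = 2 * normcdf(-abs(z));        % two-sided p-value from N(0,1)

% T statistic: same signal, noise estimated from the data.
t  = (mean(x) - mu0) / (std(x) / sqrt(n));
nu = n - 1;                       % degrees of freedom
pT = 2 * tcdf(-abs(t), nu);       % two-sided p-value from Student's t

% The toolbox functions should report the same p-values.
[~, pZtoolbox] = ztest(x, mu0, sigma);
[~, pTtoolbox] = ttest(x, mu0);

fprintf('Z = %.3f, p = %.4g (ztest: %.4g)\n', z, pZ, pZtoolbox);
fprintf('T = %.3f, p = %.4g (ttest: %.4g)\n', t, pT, pTtoolbox);
```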
Example: Hypothesis Testing

This example uses the gasoline price data in gas.mat. There are two samples of 20 observed gas prices for the months of January and February.

    load gas

As a first step, you may want to test whether the samples from each month follow a normal distribution. As each sample is relatively small, you might choose to perform a Lilliefors test (rather than a Jarque-Bera test).

    lillietest(price1)
    ans = 0

    lillietest(price2)
    ans = 0

The result of the hypothesis test is a Boolean value that is 0 when you do not reject the null hypothesis, and 1 when you do reject that hypothesis. In each case, there is no need to reject the null hypothesis that the samples have a normal distribution.

Suppose it is historically true that the standard deviation of gas prices at gas stations around Massachusetts is four cents a gallon. The Z-test is a procedure for testing the null hypothesis that the average price of a gallon of gas in January (price1) is $1.15. The prices are stored in cents, so divide by 100 to work in dollars.

    [h,pvalue,ci] = ztest(price1/100,1.15,0.04)
    h = 0

The Boolean output is 0, so you do not reject the null hypothesis. The result suggests that $1.15 is reasonable. The 95% confidence interval returned in ci brackets $1.15.

What about February? Try a t-test with price2. Now you are not assuming that you know the standard deviation in price.

    [h,pvalue,ci] = ttest(price2/100,1.15)
    h = 1

With the Boolean result 1, you can reject the null hypothesis at the default significance level, 0.05. The returned p-value is on the order of 1e-04. It looks like $1.15 is not a reasonable estimate of the gasoline price in February. The low end of the 95% confidence interval is greater than 1.15.

The function ttest2 allows you to compare the means of the two data samples.

    [h,sig,ci] = ttest2(price1,price2)

The confidence interval (ci above) indicates that gasoline prices were between one and six cents lower in January than February.

If the two samples were not normally distributed but had similar shape, it would have been more appropriate to use the nonparametric rank sum test in place of the t-test. You can still use the rank sum test with normally distributed data, but it is less powerful than the t-test.

    [p,h,stats] = ranksum(price1, price2)

The returned structure stats contains the fields zval and ranksum. Here the rank sum statistic is 314.

As might be expected, the rank sum test leads to the same conclusion but is less sensitive to the difference between samples (higher p-value).

A notched box plot gives less conclusive results. On a notched box plot, two groups have overlapping notches if their medians are not significantly different. Here the notches just barely overlap, indicating that the difference in medians is of borderline significance. (The results for a box plot are not always the same as for a t-test, which is based on means rather than medians.) Refer to Statistical Plots for more information about box plots.

    prices = [price1 price2];   % one column per month; assumes price1 and price2 are column vectors
    boxplot(prices,1)           % second argument 1 requests notched boxes
    set(gca,'xticklabel',str2mat('january','february'))
    xlabel('month')
    ylabel('prices ($0.01)')
Available Hypothesis Tests

The Statistics Toolbox has functions for performing the following tests.

    Function      What it Tests
    chi2gof       Chi-square test of distribution of one normal sample
    dwtest        Durbin-Watson test
    jbtest        Normal distribution for one sample
    kstest        Any specified distribution for one sample
    kstest2       Equal distributions for two samples
    lillietest    Normal distribution for one sample
    ranksum       Median of two unpaired samples
    runstest      Randomness of the sequence of observations
    signrank      Median of two paired samples
    signtest      Median of two paired samples
    ttest         Mean of one normal sample
    ttest2        Mean of two normal samples
    vartest       Variance of one normal sample
    vartest2      Variance of two normal samples
    vartestn      Variance of N normal samples
    ztest         Mean of normal sample with known standard deviation
ztest

Hypothesis testing for mean of one sample with known variance

Syntax

    ztest(x,m,sigma)
    ztest(x,m,sigma,alpha)
    [h,sig,ci,zval] = ztest(x,m,sigma,alpha,tail)
    ztest(...,alpha,tail,dim)

Description

ztest(x,m,sigma) performs a Z-test at significance level 0.05 to determine whether a sample x from a normal distribution with standard deviation sigma could have mean m.

x can also be a matrix or an n-dimensional array. For matrices, ztest performs separate Z-tests along each column of x and returns a vector of results. For n-dimensional arrays, ztest works along the first nonsingleton dimension of x.

ztest(x,m,sigma,alpha) gives control of the significance level alpha. For example, if alpha = 0.01 and the result is 1, you can reject the null hypothesis at the significance level 0.01. If the result is 0, you cannot reject the null hypothesis at the alpha level of significance.

[h,sig,ci,zval] = ztest(x,m,sigma,alpha,tail) allows specification of one- or two-tailed tests, where tail is a flag that specifies one of three alternative hypotheses:

    tail = 'both' specifies the alternative µ ≠ m (default).
    tail = 'right' specifies the alternative µ > m.
    tail = 'left' specifies the alternative µ < m.

zval is the value of the Z statistic

\[ z = \frac{\bar{x} - m}{\sigma / \sqrt{n}} \]

where n is the number of observations in the sample.

sig is the probability that the observed value of Z could be as large or larger by chance under the null hypothesis that the mean of x is equal to m.

ci is a 100(1 - alpha)% confidence interval for the true mean.

ztest(...,alpha,tail,dim) performs the test along dimension dim of the input x array. For a matrix x, dim=1 computes the Z-test for each column (along the first dimension), and dim=2 computes the Z-test for each row. By default, ztest works along the first nonsingleton dimension, so it treats a single-row input as a row vector.

Example

This example generates 100 normal random numbers with theoretical mean 0 and standard deviation 1. The observed mean and standard deviation are different from their theoretical values, of course. You test the hypothesis that there is no true difference.

    x = normrnd(0,1,100,1);   % 100 draws from N(0,1)
    m = mean(x)
    [h,sig,ci] = ztest(x,0,1)
    h = 0

The result, h = 0, means that you cannot reject the null hypothesis. The output sig is the p-value, about 0.47 here, which means that by chance you would have observed values of Z more extreme than the one in this example in 47 of 100 similar experiments. The 95% confidence interval on the mean, ci, includes the theoretical (and hypothesized) mean of zero.
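The tail and dim arguments are easier to see on a small matrix. The sketch below is illustrative only: the matrix X holds made-up prices, and it assumes the positional argument order shown in the syntax above is accepted by the installed release (newer releases also document name-value pairs such as 'Tail' and 'Dim').

```matlab
% Hypothetical prices in dollars: 6 observations (rows) for 2 regions (columns).
X = [1.19 1.12;
     1.17 1.15;
     1.16 1.13;
     1.21 1.16;
     1.14 1.14;
     1.18 1.17];
m     = 1.15;   % hypothesized mean
sigma = 0.04;   % posited known standard deviation

% Default: one two-tailed Z-test per column.
hCols = ztest(X, m, sigma);

% Right-tailed test per column: alternative is "mean greater than m".
[hRight, sigRight] = ztest(X, m, sigma, 0.05, 'right');

% Test along dimension 2: one two-tailed Z-test per row.
hRows = ztest(X, m, sigma, 0.05, 'both', 2);

disp(size(hCols));   % one result per column
disp(size(hRows));   % one result per row
```

With dim=2 each test uses only the two values in a row, so the example shows the shape of the output rather than a sensible analysis.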
ttest

Hypothesis testing for single sample mean

Syntax

    ttest(x)
    ttest(x,m)
    ttest(x,y)
    ttest(...,alpha)
    ttest(...,alpha,tail)
    [h,p,ci,stats] = ttest(...)
    ttest(...,alpha,tail,dim)

Description

ttest(x) performs a t-test of the hypothesis that the data in the vector x comes from a distribution with mean zero, and returns the result of the test in h. h=0 indicates that the null hypothesis (mean is zero) cannot be rejected at the 5% significance level. h=1 indicates that the null hypothesis can be rejected at the 5% level. The data are assumed to come from a normal distribution with unknown variance.

x can also be a matrix or an n-dimensional array. For matrices, ttest performs separate t-tests along each column of x and returns a vector of results. For n-dimensional arrays, ttest works along the first nonsingleton dimension of x.

ttest(x,m) performs a t-test of the hypothesis that the data in the vector x comes from a distribution with mean m.

ttest(x,y) performs a paired t-test of the hypothesis that two matched (or paired) samples in the vectors x and y come from distributions with equal means. The difference x-y is assumed to come from a normal distribution with unknown variance. x and y must be vectors of the same length, or arrays of the same size.

ttest(...,alpha) performs the test at the significance level (100*alpha)%. For example, if alpha = 0.01, and the result h is 1, you can reject the null hypothesis at the significance level 0.01. If h is 0, you cannot reject the null hypothesis at the alpha level of significance.

ttest(...,alpha,tail) performs the test against the alternative hypothesis specified by tail. There are three options for tail:

    'both'     Mean is not equal to 0 (or m) (two-tailed test). This is the default.
    'right'    Mean is greater than 0 (or m) (right-tailed test).
    'left'     Mean is less than 0 (or m) (left-tailed test).

[h,p,ci,stats] = ttest(...) returns a structure with the following fields:

    'tstat'    Value of the test statistic
    'df'       Degrees of freedom of the test
    'sd'       Estimated population standard deviation. For a paired test, this is the standard deviation of x-y.

Output p is the p-value associated with the t-statistic

\[ T = \frac{\bar{x} - m}{s / \sqrt{n}} \]

where s is the sample standard deviation and n is the number of observations in the sample. p is the probability that the value of the t-statistic is equal to or more extreme than the observed value by chance, under the null hypothesis that the mean of x is equal to m.

ci is a 100(1 - alpha)% confidence interval for the true mean.

ttest(...,alpha,tail,dim) performs the test along dimension dim of the input x array. For a matrix x, dim=1 computes the t-test for each column (along the first dimension), and dim=2 computes the t-test for each row. By default, ttest works along the first nonsingleton dimension, so it treats a single-row input as a row vector.

Example

This example generates 100 normal random numbers with theoretical mean 0 and standard deviation 1. The observed mean and standard deviation are different from their theoretical values, of course, so you test the hypothesis that there is no true difference.

Here is a normal random number generator test:

    x = normrnd(0,1,100,1);   % 100 draws from N(0,1)
    [h,p,ci] = ttest(x,0)
    h = 0

The result h = 0 means that you cannot reject the null hypothesis. The p-value is about 0.45, which means that by chance you would have observed values of T more extreme than the one in this example in 45 of 100 similar experiments. The 95% confidence interval on the mean, ci, includes the theoretical (and hypothesized) mean of zero.
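The paired form ttest(x,y) and the stats output can be illustrated with a short sketch. The before and after vectors are made-up matched measurements, and tManual is a name introduced here; the point is that a paired test should give the same result as a one-sample test on the differences.

```matlab
% Hypothetical matched measurements on the same 8 subjects (made-up values).
before = [72 68 75 80 66 71 77 74];
after  = [70 65 76 75 64 70 72 71];

% Paired t-test of H0: the two samples have equal means.
[h, p, ci, stats] = ttest(before, after);

% Equivalent one-sample t-test on the differences against mean 0.
d = before - after;
[h2, p2] = ttest(d, 0);

% The t-statistic computed directly from the differences.
tManual = mean(d) / (std(d) / sqrt(numel(d)));

fprintf('paired:     h = %d, p = %.4f\n', h, p);
fprintf('one-sample: h = %d, p = %.4f\n', h2, p2);
fprintf('tstat = %.4f (manual %.4f), df = %d, sd = %.4f\n', ...
        stats.tstat, tManual, stats.df, stats.sd);
```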
ttest2

Hypothesis testing for difference in means of two samples

Syntax

    ttest2(x,y)
    [h,significance,ci] = ttest2(x,y,alpha)
    [h,significance,ci,stats] = ttest2(x,y,alpha)
    [...] = ttest2(x,y,alpha,tail)
    ttest2(x,y,alpha,tail,'unequal')
    ttest2(...,dim)

Description

ttest2(x,y) performs a t-test to determine whether two samples from a normal distribution (x and y) could have the same mean when the standard deviations are unknown but assumed equal. The vectors x and y can have different lengths.

x and y can also be matrices or n-dimensional arrays. For matrices, ttest2 performs separate t-tests along each column and returns a vector of results. x and y must have the same number of columns. For n-dimensional arrays, ttest2 works along the first nonsingleton dimension. x and y must have the same size along all the remaining dimensions.

The result, h, is 1 if you can reject the null hypothesis that the means are equal at the 0.05 significance level and 0 otherwise.

significance is the p-value associated with the t-statistic

\[ T = \frac{\bar{x} - \bar{y}}{s \sqrt{\frac{1}{n} + \frac{1}{m}}} \]

where s is the pooled sample standard deviation and n and m are the numbers of observations in the x and y samples. significance is the probability that the observed value of T could be as large or larger by chance under the null hypothesis that the mean of x is equal to the mean of y.

ci is a 95% confidence interval for the true difference in means.

[h,significance,ci] = ttest2(x,y,alpha) gives control of the significance level alpha. For example, if alpha = 0.01, and the result, h, is 1, you can reject the null hypothesis at the significance level 0.01. ci in this case is a 100(1 - alpha)% confidence interval for the true difference in means.

[h,significance,ci,stats] = ttest2(x,y,alpha) returns a structure stats with the following three fields:

    tstat    Value of the test statistic
    df       Degrees of freedom of the test
    sd       Pooled estimate of the population standard deviation in the equal variance case, or a vector containing the unpooled estimates of the population standard deviations in the unequal variance case

[...] = ttest2(x,y,alpha,tail) allows specification of one- or two-tailed tests, where tail is a flag that specifies one of three alternative hypotheses:

    tail = 'both' specifies the alternative µx ≠ µy (default).
    tail = 'right' specifies the alternative µx > µy.
    tail = 'left' specifies the alternative µx < µy.

[...] = ttest2(x,y,alpha,tail,'unequal') performs the test assuming that the two samples come from normal distributions with unknown and possibly unequal variances. This is known as the Behrens-Fisher problem. ttest2 uses Satterthwaite's approximation for the effective degrees of freedom.

[...] = ttest2(...,dim) performs the test along dimension dim of the input x and y arrays. For matrix inputs, dim=1 computes the t-test for each column (along the first dimension), and dim=2 computes the t-test for each row. By default, ttest2 works along the first nonsingleton dimension, so it treats single-row inputs as row vectors.

Examples

This example generates 100 normal random numbers with theoretical mean 0 and standard deviation 1. You then generate 100 more normal random numbers with theoretical mean 1/2 and standard deviation 1. The observed means and standard deviations are different from their theoretical values, of course. You test the hypothesis that there is no true difference between the two means. Notice that the true difference is only one-half of the standard deviation of the individual observations, so you are trying to detect a signal that is only one-half the size of the inherent noise in the process.

    x = normrnd(0,1,100,1);     % 100 draws from N(0,1)
    y = normrnd(0.5,1,100,1);   % 100 draws from N(0.5,1)
    [h,significance,ci] = ttest2(x,y)
    h = 1

The result h = 1 means that you can reject the null hypothesis. The significance is about 0.0017, which means that by chance you would have observed values of t more extreme than the one in this example in only 17 of 10,000 similar experiments. The 95% confidence interval on the difference in means, ci, includes the theoretical difference of -0.5 and excludes the hypothesized difference of zero.
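The difference between the pooled and the 'unequal' variants shows up mainly in the degrees of freedom and in the sd field. The sketch below is an illustration with made-up samples x and y; dfManual is a name introduced here, computed from the usual form of Satterthwaite's approximation.

```matlab
% Hypothetical samples with visibly different spreads (made-up values).
x = [1.19 1.17 1.16 1.21 1.14 1.18 1.20 1.15];
y = [1.25 1.10 1.30 1.22 1.05 1.28 1.19 1.33 1.12 1.27];
nx = numel(x);
ny = numel(y);

% Pooled-variance test (standard deviations assumed equal).
[hEq, pEq, ~, sEq] = ttest2(x, y);

% Test allowing unequal variances.
[hUn, pUn, ~, sUn] = ttest2(x, y, 0.05, 'both', 'unequal');

% Satterthwaite's approximation for the effective degrees of freedom.
vx = var(x) / nx;
vy = var(y) / ny;
dfManual = (vx + vy)^2 / (vx^2 / (nx - 1) + vy^2 / (ny - 1));

fprintf('equal:   h = %d, p = %.4f, df = %g\n', hEq, pEq, sEq.df);
fprintf('unequal: h = %d, p = %.4f, df = %.3f (manual %.3f)\n', ...
        hUn, pUn, sUn.df, dfManual);
```

In the pooled case df is nx + ny - 2 and sd is a scalar; in the unequal case df is generally not an integer and sd holds one estimate per sample.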
Testing Group Differences using Ttests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 354870348 Phone:
More informationNonParametric Tests (I)
Lecture 5: NonParametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of DistributionFree Tests (ii) Median Test for Two Independent
More information12.5: CHISQUARE GOODNESS OF FIT TESTS
125: ChiSquare Goodness of Fit Tests CD121 125: CHISQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability
More informationStatistics 2014 Scoring Guidelines
AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home
More informationClass 19: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
More informationTutorial 5: Hypothesis Testing
Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrclmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................
More informationEstimation of σ 2, the variance of ɛ
Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated
More informationStatistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!
Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!) Part A  Multiple Choice Indicate the best choice
More informationIn the past, the increase in the price of gasoline could be attributed to major national or global
Chapter 7 Testing Hypotheses Chapter Learning Objectives Understanding the assumptions of statistical hypothesis testing Defining and applying the components in hypothesis testing: the research and null
More informationChapter 2 Probability Topics SPSS T tests
Chapter 2 Probability Topics SPSS T tests Data file used: gss.sav In the lecture about chapter 2, only the OneSample T test has been explained. In this handout, we also give the SPSS methods to perform
More informationStatistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl
Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 Oneway ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic
More informationThe Wilcoxon RankSum Test
1 The Wilcoxon RankSum Test The Wilcoxon ranksum test is a nonparametric alternative to the twosample ttest which is based solely on the order in which the observations from the two samples fall. We
More informationPaired 2 Sample ttest
Variations of the ttest: Paired 2 Sample 1 Paired 2 Sample ttest Suppose we are interested in the effect of different sampling strategies on the quality of data we recover from archaeological field surveys.
More informationChapter 8: Hypothesis Testing for One Population Mean, Variance, and Proportion
Chapter 8: Hypothesis Testing for One Population Mean, Variance, and Proportion Learning Objectives Upon successful completion of Chapter 8, you will be able to: Understand terms. State the null and alternative
More informationOneWay Analysis of Variance (ANOVA) Example Problem
OneWay Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesistesting technique used to test the equality of two or more population (or treatment) means
More informationGuide to Microsoft Excel for calculations, statistics, and plotting data
Page 1/47 Guide to Microsoft Excel for calculations, statistics, and plotting data Topic Page A. Writing equations and text 2 1. Writing equations with mathematical operations 2 2. Writing equations with
More informationStatistics. Onetwo sided test, Parametric and nonparametric test statistics: one group, two groups, and more than two groups samples
Statistics Onetwo sided test, Parametric and nonparametric test statistics: one group, two groups, and more than two groups samples February 3, 00 Jobayer Hossain, Ph.D. & Tim Bunnell, Ph.D. Nemours
More informationMath 251, Review Questions for Test 3 Rough Answers
Math 251, Review Questions for Test 3 Rough Answers 1. (Review of some terminology from Section 7.1) In a state with 459,341 voters, a poll of 2300 voters finds that 45 percent support the Republican candidate,
More informationThe Analysis of Variance ANOVA
3σ σ σ +σ +σ +3σ The Analysis of Variance ANOVA Lecture 0909.400.0 / 0909.400.0 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S 3σ σ σ +σ +σ +3σ Analysis
More informationData Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools
Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................
More informationINTERPRETING THE ONEWAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONEWAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the oneway ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
More informationNonparametric Statistics
Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics
More informationA POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 14)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 14) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationIntroduction to Hypothesis Testing OPRE 6301
Introduction to Hypothesis Testing OPRE 6301 Motivation... The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about
More informationBA 275 Review Problems  Week 5 (10/23/0610/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380394
BA 275 Review Problems  Week 5 (10/23/0610/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380394 1. Does vigorous exercise affect concentration? In general, the time needed for people to complete
More informationOutline. Definitions Descriptive vs. Inferential Statistics The ttest  Onesample ttest
The ttest Outline Definitions Descriptive vs. Inferential Statistics The ttest  Onesample ttest  Dependent (related) groups ttest  Independent (unrelated) groups ttest Comparing means Correlation
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationGeneral Method: Difference of Means. 3. Calculate df: either WelchSatterthwaite formula or simpler df = min(n 1, n 2 ) 1.
General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either WelchSatterthwaite formula or simpler df = min(n
More informationPermutation Tests for Comparing Two Populations
Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. JaeWan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of
More informationChapter 7. Oneway ANOVA
Chapter 7 Oneway ANOVA Oneway ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The ttest of Chapter 6 looks
More informationBowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition
Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology StepbyStep  Excel Microsoft Excel is a spreadsheet software application
More informationHYPOTHESIS TESTING WITH SPSS:
HYPOTHESIS TESTING WITH SPSS: A NONSTATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER
More informationChapter 7 Section 7.1: Inference for the Mean of a Population
Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used
More informationIntroduction to Analysis of Variance (ANOVA) Limitations of the ttest
Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One Way ANOVA Limitations of the ttest Although the ttest is commonly used, it has limitations Can only
More informationPremaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
More informationStatCrunch and Nonparametric Statistics
StatCrunch and Nonparametric Statistics You can use StatCrunch to calculate the values of nonparametric statistics. It may not be obvious how to enter the data in StatCrunch for various data sets that
More informationNonInferiority Tests for Two Means using Differences
Chapter 450 oninferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for noninferiority tests in twosample designs in which the outcome is a continuous
More informationHypothesis Testing. Reminder of Inferential Statistics. Hypothesis Testing: Introduction
Hypothesis Testing PSY 360 Introduction to Statistics for the Behavioral Sciences Reminder of Inferential Statistics All inferential statistics have the following in common: Use of some descriptive statistic
More informationIntroduction. Hypothesis Testing. Hypothesis Testing. Significance Testing
Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters
More informationBA 275 Review Problems  Week 6 (10/30/0611/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp. 394398, 404408, 410420
BA 275 Review Problems  Week 6 (10/30/0611/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp. 394398, 404408, 410420 1. Which of the following will increase the value of the power in a statistical test
More informationHow To Check For Differences In The One Way Anova
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationTests for Two Proportions
Chapter 200 Tests for Two Proportions Introduction This module computes power and sample size for hypothesis tests of the difference, ratio, or odds ratio of two independent proportions. The test statistics
More informationNonparametric TwoSample Tests. Nonparametric Tests. Sign Test
Nonparametric TwoSample Tests Sign test MannWhitney Utest (a.k.a. Wilcoxon twosample test) KolmogorovSmirnov Test Wilcoxon SignedRank Test TukeyDuckworth Test 1 Nonparametric Tests Recall, nonparametric
More information