2. DATA AND EXERCISES (Geos2911 students please read page 8)

Size: px
Start display at page:

Download "2. DATA AND EXERCISES (Geos2911 students please read page 8)"

Transcription

1 2. DATA AND EXERCISES (Geos2911 students please read page 8) 2.1 Data set The data set available to you is an Excel spreadsheet file called cyclones.xls. The file consists of 3 sheets. Only the third is relevant to this week s practical. Sheet 3 Column 1 cyclone season. Column 2 cyclone identification number. Column 3 ocean basin the cyclone was generated. Column 4 central pressure of the cyclone in hpa. These data represent the total population of cyclones generated in the South Pacific Ocean (SPO) and South Indian Ocean (SIO). Note also: 1. The important aspect of this analysis is the intensity of each cyclone generated in Australian waters, particularly the numbers of the most intense Category 4 or greater cyclones. While it would also be useful to know their tracks to determine whether they crossed the coastline, such data is only available for cyclones back to 1980 (i.e. the data in Sheets 1 and 2). This is too short a time period for the low frequency large magnitude events that we are interested in today, thus we will investigate a longer record of cyclone intensity that exists back to 1907, and accept the shortcoming that we don t know whether they crossed the coastline or not. 2. Category 1 cyclone central pressures of hpa Category 2 cyclone central pressures of hpa Category 3 cyclone central pressures of hpa Category 4 cyclone central pressures of hpa Category 5 cyclone central pressures of <931 hpa 3. The lower the central pressure the more intense the cyclone. 4. Category 4 and 5 cyclones cause extensive damage and lead to major insured losses. 2.2 Exercises 1. Highlighting all of the columns with information in them (from Row 3 down), sort the data set according to ocean basin and then cut and paste the data so that you have a set of 4 columns for each basin next to each other. 2. Use the Tools Data analysis Histogram facility to produce a frequency histogram of the population of central pressures for cyclones generated in the South Indian Ocean. If you cannot find the histogram facility then use the Help menu and look for the FREQUENCY function. Produce a separate frequency histogram for cyclones generated in the South Pacific Ocean. Use a bin range of 900 to 1000 hpa with bin intervals of 10. Annotate the charts with appropriate axis labels and titles. Look at your plotted distributions does the data appear Normally distributed? 3. Calculate the mean of the central pressures for the cyclones generated in each ocean. This can be achieved using the AVERAGE function. Which ocean basin on average generates the most intense cyclones? 4. This question is intended to assess whether your answer in Step 3 above is

2 statistically significant. Insert a new worksheet into your Excel Workbook (Sheet 4) and copy your data sets for each ocean basin from Sheet 3 into Sheet 4. Now you are going to take a random sample of cyclone pressures from each ocean basin. The sample size will be 30 each from the South Indian and South Pacific Oceans. In a column next to the SIO, data create a column of 30 random numbers between 2 and 363, which is the range of row numbers in the SIO data set. Use the RANDBETWEEN functions to do this. Once you have the random numbers use the copy and paste special values facility to convert the cells from formulas to numbers, otherwise they will keep recalculating. Write down your list of random numbers for the South Indian Ocean on a sheet of paper. Then write next to each number on your sheet of paper the central pressure that corresponds to that row number. In the next column after your column of random numbers in Sheet 4 type in the corresponding central pressures. Repeat the exercise for the SPO data set, but collect 30 random numbers between 2 and 283. These are your random samples for each ocean basin. We want to assess if the average intensity of cyclones from South Indian Ocean is statistically equal to that of cyclones from the South Pacific Ocean. In statistics, an observation is statistically significant if it is unlikely to have occurred by chance. This question can be answered via statistical tools such as the Student s t-test and the Mann-Whitney test. Student s t-test for equivalence of means. Consider two samples x and y with sample size m and n, respectively. We are interested in the question are the means of x and y the same or different (i.e. is x = y or alternatively x > y ). In other words: Ho (null hypothesis): mean of population x = mean of population y H1 (alternate hypothesis): mean of population x > mean of population y The test statistic population m and n. x y t = 1 S. m + 1 n, in which S is the pooled variance of both With S = (m 1) *σ 2 2 x + (n 1) *σ y m + n 2 variance of m and n respectively. With in which σ x 2 and σ y 2 are the sample (x x ) 2 σ 2 x = m and (y y ) 2 σ 2 y = n If test statistic t is lower that the critical t given in the critical t distribution table (cf appendice) for the degree of freedom of the test (ν=m+n-2) then the null hypothesis is correct for the given degree of significance of the test. The principal assumption of the Student s t- test is that the samples are drawn from populations that are normally distributed (ie. characterized by data that cluster around the mean). The standard deviation σ expresses the dispersion of x i about the mean. Test the following hypothesis using a Student s t-test.

3 Null hypothesis: The mean of the central pressures of cyclones in the South Pacific Ocean is equal to the mean for the South Indian Ocean. Alternate hypothesis: The means of the central pressures of cyclones in the South Pacific Ocean is greater than the mean for the South Indian Ocean. You will first need to calculate the t-statistic, and then compare it to the critical t for the appropriate degrees of freedom and level of confidence. For both the South Indian and South Pacific oceans: 1- Calculate the pressure average. 2- Calculate for each cyclone the square of the difference between its pressure and the pressure average: (P-Average[P]) 2 3- Average all (P-Average[P]) 2, this is the variance of the pressure. 4- Calculate the pooled variance (S) of both the South Indian and South Pacific oceans: S = (m 1) *σ 2 2 x + (n 1) *σ y, in which σ 2 x and σ 2 y are the averaged m + n 2 (P-Average[P]) 2 for South Indian and South Pacific ocean. x y 5- Calculate the test statistic t = 1 S. m + 1 in which m is the number of n cyclones in the South Indian and n the number of cyclone in the South Pacific ocean; x and y are the pressure average for the South Indian and South Pacific oceans respectively. 6- Calculate the degree of freedom (ν) of the test: m+n-2. The mean of the central pressures of cyclones in the South Pacific Ocean is statistically equal to the mean for the South Indian Ocean when the calculated test statistic t is less that the critical t value given in the critical t distribution table. If it is not the case then the alternative hypothesis cannot be ruled out. Use the critical t distribution table and the degree of freedom (ν) to determine the probability that the calculated test statistic t is less that the critical t value in the t distribution table. The level of confidence (in %) is given by (100-α). Based on your statistical test complete the following sentence: We can be % confident that the mean of the central pressures of cyclones generated in the South Pacific Ocean (is or is not) significantly greater than the mean for the South Indian Ocean. Are the assumptions of the Student s t-test satisfied (recall your answer to Exercise 2)? How reliable is your test? 5. Insert a new worksheet in your Excel workbook (Sheet 5) and copy your sample of cyclone central pressures for the South Indian Ocean. Place a column of labels, SIO, next to them. Do the same for the South Pacific Ocean central pressures, but place them directly beneath the SIO sample. Use the RANK function to rank the central pressures in ascending order. Perform a Mann-Whitney test to determine at 95% confidence (α=5%) if the central pressures in the South Pacific and South Indian Oceans are significantly different. For this consider two random samples x and y with sample size m (SIO)

4 and n (SPO) respectively. We are interested in the question are the medians of x and y the same or different. In other words: Null hypothesis Ho: median of population x = median of population y Alternate hypothesis H1: median of population x > median of population y Mann-Whitney statistic for equivalence of medians. In statistics, the Mann- Whitney test assesses whether two samples of observations come from the same distribution. The Mann-Whitney test is useful in the same situations as the Student's t-test, and the question arises of which should be preferred. Consider two random samples x and y with sample size m and n respectively. We are interested in the question: Are the medians of x and y the same or different? In other words: Null hypothesis Ho: median of population x = median of population y Alternate hypothesis H1: median of population x > median of population y The test statistic t is calculated using: t = mn + m(m +1) 2 m R(x i ) i=1 where R(xi ) are the ranks of sample x and m is the sample size of x. The sample size of y is n. The test statistic t can be understood as the number of times observations in one sample precede observations in the other sample in the ranking. Critical values for t for the Mann-Whitney test are listed in the appendice. For the hypothesis stated above the appropriate test is a one-tail test (statistical test in which the critical region consists of all values that are less than a given value or greater than a given value, but not both). If the calculated test statistic t is less than the critical t we reject the null hypothesis. If it is greater, we cannot reject the null hypothesis. Note that there are no assumptions concerning the distribution of the samples or populations for the Mann-Whitney test. To perform a Mann-Whitney test one has to calculate the test statistic t: m m(m +1) t = mn + R(x 2 i ), in which R(x i ) are the ranks of sample x (x individual i=1 SIO cyclones), m is the number of SIO cyclones. Based on your statistical test complete the following sentence: We can be % confident that the mean of the central pressures of cyclones generated in the South Pacific Ocean (is or is not) significantly greater than the mean for the South Indian Ocean. Does the result differ from your t-test? Which test is more reliable in this case and why? Have you changed your mind regarding your answer to Exercise 3? 6. Insert a new worksheet in your Excel workbook (Sheet 6) and copy your data sets for each ocean basin from Sheet 3 into Sheet 6. In Sheet 6, highlighting all of the columns with information in them, sort the data set for the South Indian Ocean in ascending order according to central cyclone pressure. In the next column, enter a tag from 5 through to 1 that indicates the cyclone category based on the central pressures (see note 2 Section 2.1). Do the same for the South Pacific Ocean.

5 Copy that part of the list of years that includes Category 5 and 4 cyclones in the South Indian Ocean to a new location in Sheet 6. Sort this sub-list of years into ascending order. Next to this list, create a new list, which contains the number of Category 4 or greater cyclones that occurred in each decade: ; ; ; Do the same for the South Pacific Ocean. Determine the average rate at which Category 4 or greater cyclones occur in a decade for both the South Indian and South Pacific Oceans. Find the probability that the time between two successive Category 4 or greater cyclones is less than 1 year for the South Indian Ocean. Do the same for the South Pacific Ocean. Use the inferences from the exponential distribution, which assumes that the number of Category 4 or greater cyclones occurring in successive decades has a Poisson distribution. Inferences from exponential distribution: If discrete events occur randomly and independently at the mean rate λ per time interval y (so that the number occurring in a time interval has a Poisson distribution with parameter λ), the intervals between events give rise to a relative frequency histogram conforming to an exponential distribution. The probability that the time between two successive events X is less than a given time period x can be evaluated by using the following result: Pr(X x) =1 Exp( λ x y ) where λ is the mean rate of occurrence per interval y. This result is based on several assumptions for a Poisson process: 1. The process is independent. 2. The probability of one occurrence in any time interval is approximately proportional to the size of the interval. 3. The process is stationary; i.e. the number of occurrences in a time interval has the same probability distribution for all time intervals. In other words, the value of λ should not have an increasing or decreasing trend with time. Is the probability of two Category 4 or greater cyclones (which cause major insured losses, see note Section 2.1) occurring in the one year relatively low (ca. <50%) or relatively high (ca. >50%) for the South Indian Ocean; for the South Pacific Ocean. Does the last assumption listed for a Poisson process (see Section 1) appear to be satisfied here? Repeat the calculations to find the probability that the time between two successive Category 4 or greater cyclones is less than 1 year for the South Indian Ocean, based only on the past 3 decades of data. Do the same for the South Pacific Ocean, but based on the last 4 decades of data. How does this change your answer to the previous question? What might be making the record of cyclone activity unsteady (i.e. increasing number of intense cyclones in recent years)? See Science and Nature articles on WebCT.

6 REPORT (Geos-2911 only) In addition to the indicated material from Prac 2, the graphs from Exercise 2 and results from Exercises 3 to 6 in this Prac 3 provide the basis for the following report, so make sure that you understand the concepts clearly and have produced the graphs correctly. You are working as a geoscientist for an insurance company and you have been asked to prepare a report addressing whether households and businesses in Port Hedland and Cairns should be charged the same premium for insurance against losses due to cyclones. Use your knowledge of the components involved in assessing risk (recall the Introduction lecture), as well as the exercises you have completed in Pracs 2 and 3, to write this report. Your report should have the following sections: Introduction, Data and Methods, Results, and Conclusion. The text should be no longer than 4 double spaced pages (excluding figures and tables). The results section of your report should incorporate all of the indicated graphs and answers to questions in Pracs 2 and 3. Your conclusion must make an explicit recommendation one way or the other regarding whether premiums should differ between the two towns and if so which should be higher. Note that there is no absolute right or wrong answer here; it depends on how you view risk. Make sure you justify your conclusion. nb: When you are writing your report, note that the occurrence of two Category 4 or greater cyclones crossing the coast in a year causes serious cash flow problems for insurance companies because of large successive payouts in a short period of time. Don t forget, however, that the analysis in this prac has been for all cyclones generated in the South Indian and South Pacific Oceans and not all of these necessarily cross the coast.

7

8

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

More information

Lecture 13: Kolmogorov Smirnov Test & Power of Tests

Lecture 13: Kolmogorov Smirnov Test & Power of Tests Lecture 13: Kolmogorov Smirnov Test & Power of Tests S. Massa, Department of Statistics, University of Oxford 2 February 2016 An example Suppose you are given the following 100 observations. -0.16-0.68-0.32-0.85

More information

Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

More information

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses Introduction to Hypothesis Testing 1 Hypothesis Testing A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population Hypothesis is stated in terms of the

More information

NPTEL STRUCTURAL RELIABILITY

NPTEL STRUCTURAL RELIABILITY NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

One-Sample t-test. Example 1: Mortgage Process Time. Problem. Data set. Data collection. Tools

One-Sample t-test. Example 1: Mortgage Process Time. Problem. Data set. Data collection. Tools One-Sample t-test Example 1: Mortgage Process Time Problem A faster loan processing time produces higher productivity and greater customer satisfaction. A financial services institution wants to establish

More information

Review of the Topics for Midterm II

Review of the Topics for Midterm II Review of the Topics for Midterm II STA 100 Lecture 18 I. Confidence Interval 1. Point Estimation a. A point estimator of a parameter is a statistic used to estimate that parameter. b. Properties of a

More information

Chi-Square Test. Contingency Tables. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodnessof-Fit

Chi-Square Test. Contingency Tables. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodnessof-Fit Chi-Square Tests 15 Chapter Chi-Square Test for Independence Chi-Square Tests for Goodness Uniform Goodness- Poisson Goodness- Goodness Test ECDF Tests (Optional) McGraw-Hill/Irwin Copyright 2009 by The

More information

Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

Suggested problem set #4. Chapter 7: 4, 9, 10, 11, 17, 20. Chapter 7.

Suggested problem set #4. Chapter 7: 4, 9, 10, 11, 17, 20. Chapter 7. Suggested problem set #4 Chapter 7: 4, 9, 10, 11, 17, 20 Chapter 7. 4. For the following scenarios, say whether the binomial distribution would describe the probability distribution of possible outcomes.

More information

Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.

Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test. The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

LAB 4 ASSIGNMENT CONFIDENCE INTERVALS AND HYPOTHESIS TESTING. Using Data to Make Decisions

LAB 4 ASSIGNMENT CONFIDENCE INTERVALS AND HYPOTHESIS TESTING. Using Data to Make Decisions LAB 4 ASSIGNMENT CONFIDENCE INTERVALS AND HYPOTHESIS TESTING This lab assignment will give you the opportunity to explore the concept of a confidence interval and hypothesis testing in the context of a

More information

GOODNESS OF FIT INTRODUCTION GOODNESS OF FIT TESTS

GOODNESS OF FIT INTRODUCTION GOODNESS OF FIT TESTS GOODNESS OF FIT INTRODUCTION Goodness of fit tests are used to determine how well the shape of a sample of data obtained from an experiment matches a conjectured or hypothesized distribution shape for

More information

Illustrations of using the menus in Minitab 17 for confidence intervals and hypothesis tests on means and proportions.

Illustrations of using the menus in Minitab 17 for confidence intervals and hypothesis tests on means and proportions. Confidence Intervals and Hypothesis tests Minitab 17 Mary Parker page 1 of 8 Illustrations of using the menus in Minitab 17 for confidence intervals and hypothesis tests on means and proportions. The output

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Hypothesis Test Notes Chi-Squared Test Statistic & Goodness of Fit Test

Hypothesis Test Notes Chi-Squared Test Statistic & Goodness of Fit Test Hypothesis Test Notes Chi-Squared Test Statistic & Goodness of Fit Test Remember when comparing a sample percentage to a claimed population percentage we use a 1 proportion hypothesis test and a Z-test

More information

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

More information

A GUIDE TO SIMULATION IN MINITAB

A GUIDE TO SIMULATION IN MINITAB 1 A GUIDE TO SIMULATION IN MINITAB Many simulations can be run easily in MINITAB by using menu commands. For some simulations it is necessary to write a short program, called a macro, to perform the commands.

More information

When one clicks on the right end of the Input Range Box that has a red arrow mark, the following dialog will be shown.

When one clicks on the right end of the Input Range Box that has a red arrow mark, the following dialog will be shown. EXCEL Instruction I, 9/22/01 1 Getting Descriptive Statistics with EXCEL. If one wishes to obtain descriptive statistics for the weight variable, 1) First open or create the data sheet and then click on

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

NCSS Statistical Software. One-Sample T-Test

NCSS Statistical Software. One-Sample T-Test Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Using Excel for inferential statistics

Using Excel for inferential statistics FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

More information

Statistical Analysis

Statistical Analysis by Dr. James E. Parks Department of Physics and Astronomy 401 Nielsen Physics Building The University of Tennessee Knoxville, Tennessee 37996-1200 Copyright August, 2000 by James Edgar Parks* *All rights

More information

Lecture - 32 Regression Modelling Using SPSS

Lecture - 32 Regression Modelling Using SPSS Applied Multivariate Statistical Modelling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Lecture - 32 Regression Modelling Using SPSS (Refer

More information

Statistical Functions in Excel

Statistical Functions in Excel Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

More information

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY The hypothesis testing statistics detailed thus far in this text have all been designed to allow comparison of the means of two or more samples

More information

Minitab Guide. This packet contains: A Friendly Guide to Minitab. Minitab Step-By-Step

Minitab Guide. This packet contains: A Friendly Guide to Minitab. Minitab Step-By-Step Minitab Guide This packet contains: A Friendly Guide to Minitab An introduction to Minitab; including basic Minitab functions, how to create sets of data, and how to create and edit graphs of different

More information

Lecture Topic 6: Chapter 9 Hypothesis Testing

Lecture Topic 6: Chapter 9 Hypothesis Testing Lecture Topic 6: Chapter 9 Hypothesis Testing 9.1 Developing Null and Alternative Hypotheses Hypothesis testing can be used to determine whether a statement about the value of a population parameter should

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

Paired T-Test for Equivalence

Paired T-Test for Equivalence Chapter 202 Paired T-Test for Equivalence Introduction This procedure provides reports for making inference about the equivalence of the means two means based on a paired sample. The question of interest

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

Non-Parametric Tests (I)

Non-Parametric Tests (I) Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

More information

Chapter 7 Part 2. Hypothesis testing Power

Chapter 7 Part 2. Hypothesis testing Power Chapter 7 Part 2 Hypothesis testing Power November 6, 2008 All of the normal curves in this handout are sampling distributions Goal: To understand the process of hypothesis testing and the relationship

More information

Describing the data graphically: Frequency distributions, histograms, and other types of graphs

Describing the data graphically: Frequency distributions, histograms, and other types of graphs Lecture 2 Describing the data graphically: Frequency distributions, histograms, and other types of graphs. 2-1 2.1 Frequency Distributions and Histograms Frequency Distribution A summary of a set of data

More information

Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design October 2009

Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design October 2009 Petter Mostad Matematisk Statistik Chalmers Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design October 2009 1. (a) To use a t-test, one must assume that both groups of

More information

TIPS FOR DOING STATISTICS IN EXCEL

TIPS FOR DOING STATISTICS IN EXCEL TIPS FOR DOING STATISTICS IN EXCEL Before you begin, make sure that you have the DATA ANALYSIS pack running on your machine. It comes with Excel. Here s how to check if you have it, and what to do if you

More information

Six Sigma Black Belt Study guides

Six Sigma Black Belt Study guides Six Sigma Black Belt Study guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Measure Basic Statistics 2 www.pmtutor.org Powered by POeT Solvers Limited. Basic Terms In Descriptive Statistics, the

More information

Comments on Discussion Sheet 21 and Worksheet 21 ( ) Hypothesis Tests for Population Variances and Ratios of Variances

Comments on Discussion Sheet 21 and Worksheet 21 ( ) Hypothesis Tests for Population Variances and Ratios of Variances Comments on Discussion Sheet and Worksheet ( 9.- 9.3) Hypothesis Tests for Population Variances and Ratios of Variances Discussion Sheet Hypothesis Tests for Population Variances and Ratios of Variances

More information

CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U

CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U Previous chapters of this text have explained the procedures used to test hypotheses using interval data (t-tests and ANOVA s) and nominal

More information

Continuous Random Variables and the Normal Distribution

Continuous Random Variables and the Normal Distribution Overview Continuous Random Variables and the Normal Distribution Dr Tom Ilvento Department of Food and Resource Economics Most intro stat class would have a section on probability - we don t But it is

More information

To open a CMA file > Download and Save file Start CMA Open file from within CMA

To open a CMA file > Download and Save file Start CMA Open file from within CMA Example name Cannon High Dose vs. Standard Dose Statins Effect size Analysis type Level Risk ratio Basic Basic Synopsis This analysis includes four studies where patients were randomized to receive either

More information

Describing Populations Statistically: The Mean, Variance, and Standard Deviation

Describing Populations Statistically: The Mean, Variance, and Standard Deviation Describing Populations Statistically: The Mean, Variance, and Standard Deviation BIOLOGICAL VARIATION One aspect of biology that holds true for almost all species is that not every individual is exactly

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Excel 2007 BASICS for Elementary Statistics: Looking at the Big Picture

Excel 2007 BASICS for Elementary Statistics: Looking at the Big Picture Excel 2007 BASICS for Elementary Statistics: Looking at the Big Picture By Nancy Pfenning and Melissa M. Sovak Preview c Brooks/Cole, Cengage Learning The first part of Elementary Statistics: Looking at

More information

Confidence Intervals to Assess Variation in Fat Content at a Fast-Food Restaurant

Confidence Intervals to Assess Variation in Fat Content at a Fast-Food Restaurant 3 Confidence Intervals to Assess Variation in Fat Content at a Fast-Food Restaurant Hamburrgerr, Inc. is a fast-food restaurant serving hamburgers among a few other items. The restaurant claims that the

More information

Chapter 16: Nonparametric Tests

Chapter 16: Nonparametric Tests Chapter 16: Nonparametric Tests In Chapter 1-13 we discussed tests of hypotheses in a parametric statistics framework: which assumes that the functional form of the (population) probability distribution

More information

The data analysis tools are located on the Data Tab in the Analysis Group on the Excel Ribbon, as shown below. Data Tab Data Analysis Tools

The data analysis tools are located on the Data Tab in the Analysis Group on the Excel Ribbon, as shown below. Data Tab Data Analysis Tools Getting to Know Your Data (Activity) Maryann Allen There are two components to this activity: Technical Component using Descriptive Statistics and Histogram Tools in Excel Quantitative Literacy Component

More information

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS 125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

More information

Final Exam Review Questions

Final Exam Review Questions Final Exam Review Questions You should work each of the following on your own, then review the solutions guide. DO NOT look at the solutions guide first. 1. Explain the difference between a population

More information

Minitab Guide for Math 355

Minitab Guide for Math 355 Minitab Guide for Math 355 8 7 6 Heights of Math 355 Students Spring, 2002 Frequency 5 4 3 2 1 0 60 62 64 66 68 70 72 74 76 78 80 Height Heights by Gender 80 Height 70 60 Female Gender Male Descriptive

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7. THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

More information

Notes 5a: One-sample t test

Notes 5a: One-sample t test Notes 5a: One-sample t test 1. Purpose One-sample t-test is designed to test whether one sample of data differs from a standard value or a population mean. The data must be quantitative (ratio, interval,

More information

Chapter Additional: Standard Deviation and Chi- Square

Chapter Additional: Standard Deviation and Chi- Square Chapter Additional: Standard Deviation and Chi- Square Chapter Outline: 6.4 Confidence Intervals for the Standard Deviation 7.5 Hypothesis testing for Standard Deviation Section 6.4 Objectives Interpret

More information

Statistics 641 - EXAM II - 1999 through 2003

Statistics 641 - EXAM II - 1999 through 2003 Statistics 641 - EXAM II - 1999 through 2003 December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the

More information

Non-Inferiority Tests for Two Means using Differences

Non-Inferiority Tests for Two Means using Differences Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous

More information

1/22/2016. What are paired data? Tests of Differences: two related samples. What are paired data? Paired Example. Paired Data.

1/22/2016. What are paired data? Tests of Differences: two related samples. What are paired data? Paired Example. Paired Data. Tests of Differences: two related samples What are paired data? Frequently data from ecological work take the form of paired (matched, related) samples Before and after samples at a specific site (or individual)

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

Module 4 (Effect of Alcohol on Worms): Data Analysis

Module 4 (Effect of Alcohol on Worms): Data Analysis Module 4 (Effect of Alcohol on Worms): Data Analysis Michael Dunn Capuchino High School Introduction In this exercise, you will first process the timelapse data you collected. Then, you will cull (remove)

More information

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test... Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................

More information

Difference of Means and ANOVA Problems

Difference of Means and ANOVA Problems Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

More information

When the conditions are met, the standardized sample difference between the means of two independent groups, t= SE(y - y )

When the conditions are met, the standardized sample difference between the means of two independent groups, t= SE(y - y ) STAT E-50 - Introduction to Statistics Comparing Means; Paired Samples The Sampling Distribution for the Difference between Two Means When the conditions are met, the standardized sample difference between

More information

Lecture 3: Quantitative Variable

Lecture 3: Quantitative Variable Lecture 3: Quantitative Variable 1 Quantitative Variable A quantitative variable takes values that have numerical meanings. For example, educ = 14 means that a person has 14 years of schooling. Do not

More information

9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test. Neyman-Pearson lemma 9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

More information

STA-3123: Statistics for Behavioral and Social Sciences II. Text Book: McClave and Sincich, 12 th edition. Contents and Objectives

STA-3123: Statistics for Behavioral and Social Sciences II. Text Book: McClave and Sincich, 12 th edition. Contents and Objectives STA-3123: Statistics for Behavioral and Social Sciences II Text Book: McClave and Sincich, 12 th edition Contents and Objectives Initial Review and Chapters 8 14 (Revised: Aug. 2014) Initial Review on

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

Statistiek I. t-tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. John Nerbonne 1/35

Statistiek I. t-tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen.  John Nerbonne 1/35 Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://wwwletrugnl/nerbonne/teach/statistiek-i/ John Nerbonne 1/35 t-tests To test an average or pair of averages when σ is known, we

More information

Hypothesis Testing hypothesis testing approach formulation of the test statistic

Hypothesis Testing hypothesis testing approach formulation of the test statistic Hypothesis Testing For the next few lectures, we re going to look at various test statistics that are formulated to allow us to test hypotheses in a variety of contexts: In all cases, the hypothesis testing

More information

MAT140: Applied Statistical Methods Summary of Calculating Confidence Intervals and Sample Sizes for Estimating Parameters

MAT140: Applied Statistical Methods Summary of Calculating Confidence Intervals and Sample Sizes for Estimating Parameters MAT140: Applied Statistical Methods Summary of Calculating Confidence Intervals and Sample Sizes for Estimating Parameters Inferences about a population parameter can be made using sample statistics for

More information

seven Statistical Analysis with Excel chapter OVERVIEW CHAPTER

seven Statistical Analysis with Excel chapter OVERVIEW CHAPTER seven Statistical Analysis with Excel CHAPTER chapter OVERVIEW 7.1 Introduction 7.2 Understanding Data 7.3 Relationships in Data 7.4 Distributions 7.5 Summary 7.6 Exercises 147 148 CHAPTER 7 Statistical

More information

The Math Part of the Course

The Math Part of the Course The Math Part of the Course Measures of Central Tendency Mode: The number with the highest frequency in a dataset Median: The middle number in a dataset Mean: The average of the dataset When to use each:

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Unit 21 Student s t Distribution in Hypotheses Testing

Unit 21 Student s t Distribution in Hypotheses Testing Unit 21 Student s t Distribution in Hypotheses Testing Objectives: To understand the difference between the standard normal distribution and the Student's t distributions To understand the difference between

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 7/26/2004 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

RESULTS: STATISTICAL INFERENCE

RESULTS: STATISTICAL INFERENCE UNDERSTANDING RESEARCH RESULTS: STATISTICAL INFERENCE A FEW TERMS A FEW TERMS SAMPLES AND POPULATIONS Inferential statistics are necessary because The results of a given study are based on data obtained

More information

Seminar paper Statistics

Seminar paper Statistics Seminar paper Statistics The seminar paper must contain: - the title page - the characterization of the data (origin, reason why you have chosen this analysis,...) - the list of the data (in the table)

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Lecture 7: Binomial Test, Chisquare

Lecture 7: Binomial Test, Chisquare Lecture 7: Binomial Test, Chisquare Test, and ANOVA May, 01 GENOME 560, Spring 01 Goals ANOVA Binomial test Chi square test Fisher s exact test Su In Lee, CSE & GS suinlee@uw.edu 1 Whirlwind Tour of One/Two

More information

Activity Duration Method Media References. 9. Evaluation The evaluation is performed based on the student activities in discussion, doing exercise.

Activity Duration Method Media References. 9. Evaluation The evaluation is performed based on the student activities in discussion, doing exercise. 5. Based Competency : To understand basic concepts of statistics a) Student can explain definition of statistics, data, types of data, variable b) Student can explain sample and population, parameter and

More information

How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

More information

Hypothesis Test Notes Analysis of Variance (ANOVA)

Hypothesis Test Notes Analysis of Variance (ANOVA) ypothesis Test Notes Analysis of Variance (ANOVA) Recall that the goodness of fit categorical data test can be used when comparing a percentage in 3 or more groups. What if we have quantitative data from

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

One-Way Repeated-Measures ANOVA

One-Way Repeated-Measures ANOVA One-Way Repeated-Measures ANOVA Analysis of Variance (ANOVA) is a common and robust statistical test that you can use to compare the mean scores collected from different conditions or groups in an experiment.

More information

Using Excel for descriptive statistics

Using Excel for descriptive statistics FACT SHEET Using Excel for descriptive statistics Introduction Biologists no longer routinely plot graphs by hand or rely on calculators to carry out difficult and tedious statistical calculations. These

More information

MATH 10: Elementary Statistics and Probability Chapter 9: Hypothesis Testing with One Sample

MATH 10: Elementary Statistics and Probability Chapter 9: Hypothesis Testing with One Sample MATH 10: Elementary Statistics and Probability Chapter 9: Hypothesis Testing with One Sample Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of

More information

Drawing a histogram using Excel

Drawing a histogram using Excel Drawing a histogram using Excel STEP 1: Examine the data to decide how many class intervals you need and what the class boundaries should be. (In an assignment you may be told what class boundaries to

More information

Using Excel in Research. Hui Bian Office for Faculty Excellence

Using Excel in Research. Hui Bian Office for Faculty Excellence Using Excel in Research Hui Bian Office for Faculty Excellence Data entry in Excel Directly type information into the cells Enter data using Form Command: File > Options 2 Data entry in Excel Tool bar:

More information

Two Categorical Variables: The Chi Square Test

Two Categorical Variables: The Chi Square Test CHAPTER 22 Two Categorical Variables: The Chi Square Test Two Way Tables We can use Excel to create a two way table from our data that we place in columns in the spreadsheet. Our example uses the data

More information

Skewed Data and Non-parametric Methods

Skewed Data and Non-parametric Methods 0 2 4 6 8 10 12 14 Skewed Data and Non-parametric Methods Comparing two groups: t-test assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted

More information

ANSWERS TO TEST NUMBER 6

ANSWERS TO TEST NUMBER 6 Question 1: (15 points) A point estimate of µ x - µ y is ANSWERS TO TEST NUMBER 6 Ȳ X 7.0 3.0 4. Given the small samples, we are required to assume that the populations are normally distributed with the

More information

Null Hypothesis H 0. The null hypothesis (denoted by H 0

Null Hypothesis H 0. The null hypothesis (denoted by H 0 Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

More information