2. DATA AND EXERCISES (Geos2911 students please read page 8)

2.1 Data set

The data set available to you is an Excel spreadsheet file called cyclones.xls. The file consists of 3 sheets. Only the third is relevant to this week's practical.

Sheet 3
Column 1: cyclone season.
Column 2: cyclone identification number.
Column 3: ocean basin in which the cyclone was generated.
Column 4: central pressure of the cyclone in hPa.

These data represent the total population of cyclones generated in the South Pacific Ocean (SPO) and South Indian Ocean (SIO). Note also:

1. The important aspect of this analysis is the intensity of each cyclone generated in Australian waters, particularly the number of the most intense (Category 4 or greater) cyclones. While it would also be useful to know their tracks to determine whether they crossed the coastline, such data are only available for cyclones back to 1980 (i.e. the data in Sheets 1 and 2). This is too short a time period for the low-frequency, large-magnitude events that we are interested in today, so we will investigate a longer record of cyclone intensity that extends back to 1907, and accept the shortcoming that we don't know whether the cyclones crossed the coastline or not.
2. Category 1 cyclone: central pressures of hPa
   Category 2 cyclone: central pressures of hPa
   Category 3 cyclone: central pressures of hPa
   Category 4 cyclone: central pressures of hPa
   Category 5 cyclone: central pressures of <931 hPa
3. The lower the central pressure, the more intense the cyclone.
4. Category 4 and 5 cyclones cause extensive damage and lead to major insured losses.

2.2 Exercises

1. Highlighting all of the columns with information in them (from Row 3 down), sort the data set according to ocean basin and then cut and paste the data so that you have a set of 4 columns for each basin next to each other.

2. Use the Tools > Data Analysis > Histogram facility to produce a frequency histogram of the population of central pressures for cyclones generated in the South Indian Ocean. If you cannot find the histogram facility, use the Help menu and look for the FREQUENCY function. Produce a separate frequency histogram for cyclones generated in the South Pacific Ocean. Use a bin range of 900 to 1000 hPa with bin intervals of 10. Annotate the charts with appropriate axis labels and titles. Look at your plotted distributions: do the data appear Normally distributed?

3. Calculate the mean of the central pressures for the cyclones generated in each ocean. This can be achieved using the AVERAGE function. Which ocean basin on average generates the most intense cyclones?

4. This question is intended to assess whether your answer in Step 3 above is statistically significant. Insert a new worksheet into your Excel workbook (Sheet 4) and copy your data sets for each ocean basin from Sheet 3 into Sheet 4. Now you are going to take a random sample of cyclone pressures from each ocean basin. The sample size will be 30 each from the South Indian and South Pacific Oceans.

In a column next to the SIO data, create a column of 30 random numbers between 2 and 363, which is the range of row numbers in the SIO data set. Use the RANDBETWEEN function to do this. Once you have the random numbers, use the copy and paste special values facility to convert the cells from formulas to numbers, otherwise they will keep recalculating. Write down your list of random numbers for the South Indian Ocean on a sheet of paper. Then write next to each number on your sheet of paper the central pressure that corresponds to that row number. In the next column after your column of random numbers in Sheet 4, type in the corresponding central pressures. Repeat the exercise for the SPO data set, but collect 30 random numbers between 2 and 283. These are your random samples for each ocean basin.

We want to assess if the average intensity of cyclones from the South Indian Ocean is statistically equal to that of cyclones from the South Pacific Ocean. In statistics, an observation is statistically significant if it is unlikely to have occurred by chance. This question can be answered via statistical tools such as the Student's t-test and the Mann-Whitney test.

Student's t-test for equivalence of means. Consider two samples x and y with sample sizes m and n, respectively. We are interested in the question: are the means of x and y the same or different (i.e. is $\bar{x} = \bar{y}$ or alternatively $\bar{x} > \bar{y}$)? In other words:

Ho (null hypothesis): mean of population x = mean of population y
H1 (alternate hypothesis): mean of population x > mean of population y

The test statistic is

$$t = \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{m} + \frac{1}{n}}},$$

in which S is the square root of the pooled variance of both samples (of sizes m and n), with

$$S = \sqrt{\frac{(m-1)\,\sigma_x^2 + (n-1)\,\sigma_y^2}{m+n-2}},$$

in which $\sigma_x^2$ and $\sigma_y^2$ are the sample variances of x and y respectively, with

$$\sigma_x^2 = \frac{\sum (x_i - \bar{x})^2}{m} \quad \text{and} \quad \sigma_y^2 = \frac{\sum (y_i - \bar{y})^2}{n}.$$

If the test statistic t is lower than the critical t given in the critical t distribution table (cf. appendix) for the degrees of freedom of the test ($\nu = m + n - 2$), then the null hypothesis cannot be rejected at the given level of significance of the test. The principal assumption of the Student's t-test is that the samples are drawn from populations that are normally distributed (i.e. characterized by data that cluster around the mean). The standard deviation $\sigma$ expresses the dispersion of $x_i$ about the mean.

Test the following hypothesis using a Student's t-test.

Null hypothesis: The mean of the central pressures of cyclones in the South Pacific Ocean is equal to the mean for the South Indian Ocean.
Alternate hypothesis: The mean of the central pressures of cyclones in the South Pacific Ocean is greater than the mean for the South Indian Ocean.

You will first need to calculate the t-statistic, and then compare it to the critical t for the appropriate degrees of freedom and level of confidence. For both the South Indian and South Pacific Oceans:

1- Calculate the pressure average.
2- Calculate for each cyclone the square of the difference between its pressure and the pressure average: $(P - \text{Average}[P])^2$.
3- Average all $(P - \text{Average}[P])^2$; this is the variance of the pressure.
4- Calculate the pooled variance of the South Indian and South Pacific Oceans and take its square root to obtain S:

$$S = \sqrt{\frac{(m-1)\,\sigma_x^2 + (n-1)\,\sigma_y^2}{m+n-2}},$$

in which $\sigma_x^2$ and $\sigma_y^2$ are the averaged $(P - \text{Average}[P])^2$ for the South Indian and South Pacific Oceans.
5- Calculate the test statistic

$$t = \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{m} + \frac{1}{n}}},$$

in which m is the number of cyclones in the South Indian Ocean sample and n the number of cyclones in the South Pacific Ocean sample; $\bar{x}$ and $\bar{y}$ are the pressure averages for the South Indian and South Pacific Oceans respectively.
6- Calculate the degrees of freedom ($\nu$) of the test: m + n - 2.

The mean of the central pressures of cyclones in the South Pacific Ocean is statistically equal to the mean for the South Indian Ocean when the calculated test statistic t is less than the critical t value given in the critical t distribution table. If this is not the case, then the alternative hypothesis cannot be ruled out. Use the critical t distribution table and the degrees of freedom ($\nu$) to determine the probability that the calculated test statistic t is less than the critical t value in the t distribution table. The level of confidence (in %) is given by (100 - α).
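For readers who want to check their spreadsheet arithmetic, the six steps above can be sketched in Python. This is only an illustrative sketch, not part of the practical: the function names are hypothetical, the variance follows the handout's step 3 (it divides by the sample size), and `random_sample` is an assumed stand-in for 30 RANDBETWEEN draws, which can repeat a row.

```python
import math
import random


def handout_variance(values):
    """Steps 1-3: average of the squared deviations from the mean.

    Follows the handout and divides by the sample size.
    """
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pooled_t(x, y):
    """Steps 4-6: return (t, degrees of freedom) for samples x and y."""
    m, n = len(x), len(y)
    var_x, var_y = handout_variance(x), handout_variance(y)
    # Step 4: square root of the pooled variance.
    s = math.sqrt(((m - 1) * var_x + (n - 1) * var_y) / (m + n - 2))
    # Step 5: difference of the means scaled by S * sqrt(1/m + 1/n).
    t = (sum(x) / m - sum(y) / n) / (s * math.sqrt(1 / m + 1 / n))
    # Step 6: degrees of freedom.
    return t, m + n - 2


def random_sample(pressures, size=30, seed=None):
    """Pick `size` entries at random, repeats allowed (like RANDBETWEEN rows)."""
    rng = random.Random(seed)
    return [pressures[rng.randrange(len(pressures))] for _ in range(size)]
```

The sign of t depends on which sample is passed first, so for a one-sided test the sample expected to have the larger mean should be passed as `x`. The critical value still has to be read from the t distribution table for the returned degrees of freedom.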
Based on your statistical test, complete the following sentence: We can be ____ % confident that the mean of the central pressures of cyclones generated in the South Pacific Ocean (is or is not) significantly greater than the mean for the South Indian Ocean. Are the assumptions of the Student's t-test satisfied (recall your answer to Exercise 2)? How reliable is your test?

5. Insert a new worksheet in your Excel workbook (Sheet 5) and copy your sample of cyclone central pressures for the South Indian Ocean. Place a column of labels, SIO, next to them. Do the same for the South Pacific Ocean central pressures, but place them directly beneath the SIO sample. Use the RANK function to rank the central pressures in ascending order. Perform a Mann-Whitney test to determine at 95% confidence (α = 5%) if the central pressures in the South Pacific and South Indian Oceans are significantly different. For this, consider two random samples x and y with sample sizes m (SIO) and n (SPO) respectively. We are interested in the question: are the medians of x and y the same or different? In other words:

Null hypothesis Ho: median of population x = median of population y
Alternate hypothesis H1: median of population x > median of population y

Mann-Whitney statistic for equivalence of medians. In statistics, the Mann-Whitney test assesses whether two samples of observations come from the same distribution. The Mann-Whitney test is useful in the same situations as the Student's t-test, and the question arises of which should be preferred. For the two samples and hypotheses stated above, the test statistic t is calculated using:

$$t = mn + \frac{m(m+1)}{2} - \sum_{i=1}^{m} R(x_i)$$

where $R(x_i)$ are the ranks of sample x and m is the sample size of x. The sample size of y is n. The test statistic t can be understood as the number of times observations in one sample precede observations in the other sample in the ranking. Critical values for t for the Mann-Whitney test are listed in the appendix. For the hypothesis stated above the appropriate test is a one-tail test (a statistical test in which the critical region consists of all values that are less than a given value or greater than a given value, but not both). If the calculated test statistic t is less than the critical t, we reject the null hypothesis. If it is greater, we cannot reject the null hypothesis. Note that there are no assumptions concerning the distribution of the samples or populations for the Mann-Whitney test.

To perform the Mann-Whitney test on your samples, calculate the test statistic t with the formula above, in which $R(x_i)$ are the ranks of sample x (the individual SIO cyclones), m is the number of SIO cyclones and n is the number of SPO cyclones.
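The ranking and the statistic can be cross-checked with a short Python sketch. The function name is hypothetical, and one assumption goes beyond the handout: tied pressures receive the average of the ranks they span, whereas Excel's RANK gives tied values the same rank, so results can differ slightly when many pressures are tied.

```python
def mann_whitney_t(x, y):
    """Return t = m*n + m*(m+1)/2 - sum of the ranks of x.

    Ranks are taken in the pooled sample sorted in ascending order;
    tied values share the average of their ranks (an assumption).
    """
    x, y = list(x), list(y)
    m, n = len(x), len(y)
    pooled = sorted(x + y)

    # Map each distinct value to its (average) 1-based rank.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values at positions i..j-1 hold ranks i+1..j; use their mean.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    return m * n + m * (m + 1) / 2 - rank_sum_x
```

With x as the SIO sample and y as the SPO sample, the returned value is then compared with the critical value from the Mann-Whitney table for sample sizes m and n.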
Based on your statistical test, complete the following sentence: We can be ____ % confident that the mean of the central pressures of cyclones generated in the South Pacific Ocean (is or is not) significantly greater than the mean for the South Indian Ocean. Does the result differ from your t-test? Which test is more reliable in this case and why? Have you changed your mind regarding your answer to Exercise 3?

6. Insert a new worksheet in your Excel workbook (Sheet 6) and copy your data sets for each ocean basin from Sheet 3 into Sheet 6. In Sheet 6, highlighting all of the columns with information in them, sort the data set for the South Indian Ocean in ascending order according to central cyclone pressure. In the next column, enter a tag from 5 through to 1 that indicates the cyclone category based on the central pressures (see note 2, Section 2.1). Do the same for the South Pacific Ocean.

Copy that part of the list of years that includes Category 5 and 4 cyclones in the South Indian Ocean to a new location in Sheet 6. Sort this sub-list of years into ascending order. Next to this list, create a new list, which contains the number of Category 4 or greater cyclones that occurred in each decade: ; ; ; Do the same for the South Pacific Ocean.

Determine the average rate at which Category 4 or greater cyclones occur in a decade for both the South Indian and South Pacific Oceans. Find the probability that the time between two successive Category 4 or greater cyclones is less than 1 year for the South Indian Ocean. Do the same for the South Pacific Ocean. Use the inferences from the exponential distribution, which assumes that the number of Category 4 or greater cyclones occurring in successive decades has a Poisson distribution.

Inferences from the exponential distribution: If discrete events occur randomly and independently at the mean rate λ per time interval y (so that the number occurring in a time interval has a Poisson distribution with parameter λ), the intervals between events give rise to a relative frequency histogram conforming to an exponential distribution. The probability that the time X between two successive events is less than a given time period x can be evaluated by using the following result:

$$\Pr(X \le x) = 1 - \exp\left(-\frac{\lambda x}{y}\right)$$

where λ is the mean rate of occurrence per interval y. This result is based on several assumptions for a Poisson process:

1. The process is independent.
2. The probability of one occurrence in any time interval is approximately proportional to the size of the interval.
3. The process is stationary; i.e. the number of occurrences in a time interval has the same probability distribution for all time intervals. In other words, the value of λ should not have an increasing or decreasing trend with time.
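The decade counts, the mean rate and the exponential result can be sketched in Python as a check on the spreadsheet. The function names, and the `first_year` and `n_decades` parameters, are hypothetical; the decade boundaries to use are those given in the exercise.

```python
import math


def count_by_decade(years, first_year, n_decades):
    """Count events in consecutive 10-year bins starting at first_year.

    Decades with no events stay in the list as zeros, so they still
    count towards the mean rate.
    """
    counts = [0] * n_decades
    for year in years:
        index = (year - first_year) // 10
        if 0 <= index < n_decades:
            counts[index] += 1
    return counts


def mean_rate_per_decade(counts_per_decade):
    """Average number of events per decade (lambda, with y = 10 years)."""
    return sum(counts_per_decade) / len(counts_per_decade)


def prob_gap_less_than(x_years, rate, interval_years=10.0):
    """Pr(X <= x) = 1 - exp(-lambda * x / y)."""
    return 1.0 - math.exp(-rate * x_years / interval_years)
```

For example, a mean rate of 10 events per decade gives `prob_gap_less_than(1, 10)`, i.e. 1 - exp(-1). Restricting `counts_per_decade` to the most recent decades gives the rate needed when the calculation is repeated on part of the record.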
Is the probability of two Category 4 or greater cyclones (which cause major insured losses, see note 4, Section 2.1) occurring in the one year relatively low (ca. <50%) or relatively high (ca. >50%) for the South Indian Ocean? For the South Pacific Ocean? Does the last assumption listed for a Poisson process (see Section 1) appear to be satisfied here?

Repeat the calculations to find the probability that the time between two successive Category 4 or greater cyclones is less than 1 year for the South Indian Ocean, based only on the past 3 decades of data. Do the same for the South Pacific Ocean, but based on the last 4 decades of data. How does this change your answer to the previous question? What might be making the record of cyclone activity unsteady (i.e. an increasing number of intense cyclones in recent years)? See the Science and Nature articles on WebCT.

REPORT (Geos-2911 only)

In addition to the indicated material from Prac 2, the graphs from Exercise 2 and results from Exercises 3 to 6 in this Prac 3 provide the basis for the following report, so make sure that you understand the concepts clearly and have produced the graphs correctly.

You are working as a geoscientist for an insurance company and you have been asked to prepare a report addressing whether households and businesses in Port Hedland and Cairns should be charged the same premium for insurance against losses due to cyclones. Use your knowledge of the components involved in assessing risk (recall the Introduction lecture), as well as the exercises you have completed in Pracs 2 and 3, to write this report.

Your report should have the following sections: Introduction, Data and Methods, Results, and Conclusion. The text should be no longer than 4 double-spaced pages (excluding figures and tables). The results section of your report should incorporate all of the indicated graphs and answers to questions in Pracs 2 and 3. Your conclusion must make an explicit recommendation one way or the other regarding whether premiums should differ between the two towns and, if so, which should be higher. Note that there is no absolute right or wrong answer here; it depends on how you view risk. Make sure you justify your conclusion.

NB: When you are writing your report, note that the occurrence of two Category 4 or greater cyclones crossing the coast in a year causes serious cash flow problems for insurance companies because of large successive payouts in a short period of time. Don't forget, however, that the analysis in this prac has been for all cyclones generated in the South Indian and South Pacific Oceans, and not all of these necessarily cross the coast.

7

8

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses Introduction to Hypothesis Testing 1 Hypothesis Testing A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population Hypothesis is stated in terms of the

More information

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

More information

CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U

CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U Previous chapters of this text have explained the procedures used to test hypotheses using interval data (t-tests and ANOVA s) and nominal

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

More information

Non-Parametric Tests (I)

Non-Parametric Tests (I) Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

More information

Using Excel for inferential statistics

Using Excel for inferential statistics FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

More information

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS 125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

More information

Describing Populations Statistically: The Mean, Variance, and Standard Deviation

Describing Populations Statistically: The Mean, Variance, and Standard Deviation Describing Populations Statistically: The Mean, Variance, and Standard Deviation BIOLOGICAL VARIATION One aspect of biology that holds true for almost all species is that not every individual is exactly

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Using Excel for descriptive statistics

Using Excel for descriptive statistics FACT SHEET Using Excel for descriptive statistics Introduction Biologists no longer routinely plot graphs by hand or rely on calculators to carry out difficult and tedious statistical calculations. These

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate 1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

More information

Statistical Functions in Excel

Statistical Functions in Excel Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

More information

TIPS FOR DOING STATISTICS IN EXCEL

TIPS FOR DOING STATISTICS IN EXCEL TIPS FOR DOING STATISTICS IN EXCEL Before you begin, make sure that you have the DATA ANALYSIS pack running on your machine. It comes with Excel. Here s how to check if you have it, and what to do if you

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

CHI-SQUARE: TESTING FOR GOODNESS OF FIT CHI-SQUARE: TESTING FOR GOODNESS OF FIT In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

More information

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

Module 4 (Effect of Alcohol on Worms): Data Analysis

Module 4 (Effect of Alcohol on Worms): Data Analysis Module 4 (Effect of Alcohol on Worms): Data Analysis Michael Dunn Capuchino High School Introduction In this exercise, you will first process the timelapse data you collected. Then, you will cull (remove)

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

StatCrunch and Nonparametric Statistics

StatCrunch and Nonparametric Statistics StatCrunch and Nonparametric Statistics You can use StatCrunch to calculate the values of nonparametric statistics. It may not be obvious how to enter the data in StatCrunch for various data sets that

More information

Lecture Notes Module 1

Lecture Notes Module 1 Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific

More information

Using Excel in Research. Hui Bian Office for Faculty Excellence
