SIMULATION STUDIES IN STATISTICS

What is a (Monte Carlo) simulation study, and why do one?




Outline:
- Simulations for properties of estimators
- Simulations for properties of hypothesis tests
- Simulation study principles
- Presenting results

Simulation: A numerical technique for conducting experiments on the computer.

Monte Carlo simulation: A computer experiment involving random sampling from probability distributions. Invaluable in statistics. Usually, when statisticians talk about simulations, they mean Monte Carlo simulations.

Rationale: In statistics,
- Properties of statistical methods must be established so that the methods may be used with confidence
- Exact analytical derivations of properties are rarely possible
- Large sample approximations to properties are often possible; however, evaluation of the relevance of the approximation to the (finite) sample sizes likely to be encountered in practice is needed
- Moreover, analytical results may require assumptions (e.g., normality). But what happens when these assumptions are violated?
- Analytical results, even large sample ones, may not be possible

Usual issues: Under various conditions,
- Is an estimator biased in finite samples? Is it still consistent under departures from assumptions? What is its sampling variance?
- How does it compare to competing estimators on the basis of bias, precision, etc.?
- Does a procedure for constructing a confidence interval for a parameter achieve the advertised nominal level of coverage?
- Does a hypothesis testing procedure attain the advertised level or size? If it does, what power is possible against different alternatives to the null hypothesis? Do different test procedures deliver different power?

How can these questions be answered in the absence of analytical results?

Monte Carlo simulation to the rescue:
- An estimator or test statistic has a true sampling distribution under a particular set of conditions (finite sample size, true distribution of the data, etc.)
- Ideally, we would want to know this true sampling distribution in order to address the issues above
- But derivation of the true sampling distribution is often not tractable
- Solution: approximate the sampling distribution of the estimator or test statistic under the particular set of conditions of interest
How to approximate: A typical Monte Carlo simulation involves the following:
- Generate S independent data sets under the conditions of interest
- Compute the numerical value of the estimator/test statistic T(data) for each data set: T_1, ..., T_S
- If S is large enough, summary statistics across T_1, ..., T_S should be good approximations to the true sampling properties of the estimator/test statistic under the conditions of interest

E.g., for an estimator of a parameter θ: T_s is the value of T from the s-th data set, s = 1, ..., S, and the sample mean over the S data sets is an estimate of the true mean of the sampling distribution of the estimator.
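To make the recipe concrete, here is a minimal sketch in Python (the course programs themselves are in R); it approximates the sampling distribution of the sample mean under normal data, using the same illustrative values µ = 1, σ = √(5/3), n = 15, S = 1000 as the examples later on:

```python
import random
import statistics

def monte_carlo(S, generate_data, statistic):
    """Generate S independent data sets and compute T on each,
    returning the draws T_1, ..., T_S from the sampling distribution."""
    return [statistic(generate_data()) for _ in range(S)]

random.seed(3)
S, n, mu, sigma = 1000, 15, 1.0, (5 / 3) ** 0.5

# T_s = sample mean of the s-th simulated data set
T = monte_carlo(
    S,
    lambda: [random.gauss(mu, sigma) for _ in range(n)],
    statistics.mean,
)

# Summary statistics across T_1, ..., T_S approximate the true
# sampling properties: the mean and SD of the sampling distribution
mc_mean = statistics.mean(T)
mc_sd = statistics.stdev(T)
print(round(mc_mean, 3), round(mc_sd, 3))
```

Here the true values are µ = 1 and σ/√n ≈ 0.333, so with S = 1000 the printed Monte Carlo approximations should land close to those.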

SIMULATIONS FOR PROPERTIES OF ESTIMATORS

Simple example: Compare three estimators for the mean µ of a distribution based on i.i.d. draws Y_1, ..., Y_n:
- Sample mean T^(1)
- Sample 20% trimmed mean T^(2)
- Sample median T^(3)

Remarks: If the distribution of the data is symmetric, all three estimators indeed estimate the mean; if the distribution is skewed, they do not.

Simulation procedure: For a particular choice of µ, n, and true underlying distribution:
- Generate independent draws Y_1, ..., Y_n from the distribution
- Compute T^(1), T^(2), T^(3)
- Repeat S times, yielding T^(1)_1, ..., T^(1)_S; T^(2)_1, ..., T^(2)_S; T^(3)_1, ..., T^(3)_S
- Compute, for k = 1, 2, 3:

    mean = S^(-1) sum_{s=1}^{S} T^(k)_s = Tbar^(k),    bias = Tbar^(k) - µ

    SDhat = sqrt[ (S-1)^(-1) sum_{s=1}^{S} (T^(k)_s - Tbar^(k))^2 ]

    MSE = S^(-1) sum_{s=1}^{S} (T^(k)_s - µ)^2 ≈ SDhat^2 + bias^2

Relative efficiency: For any estimators for which E(T^(1)) = E(T^(2)) = µ,

    RE = var(T^(1)) / var(T^(2))

is the relative efficiency of estimator 2 to estimator 1. When the estimators are not unbiased, it is standard to compute

    RE = MSE(T^(1)) / MSE(T^(2))

In either case, RE < 1 means estimator 1 is preferred (estimator 2 is inefficient relative to estimator 1 in this sense).
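As a cross-check of the definitions above, this Python sketch (the course programs are in R; the function names here are made up for illustration) computes the Monte Carlo mean, bias, SD, MSE, and relative efficiency for the three estimators under normal data:

```python
import random
import statistics

def trimmed_mean(y, trim=0.2):
    """20% trimmed mean: drop the lowest and highest 20%, average the rest."""
    y = sorted(y)
    k = int(len(y) * trim)
    return statistics.mean(y[k:len(y) - k])

def summarize(T, mu):
    """Monte Carlo mean, bias, SD, and MSE across the draws T_1, ..., T_S."""
    Tbar = statistics.mean(T)
    return {
        "mean": Tbar,
        "bias": Tbar - mu,
        "SD": statistics.stdev(T),                        # (S-1) denominator
        "MSE": statistics.fmean((t - mu) ** 2 for t in T),
    }

random.seed(3)
S, n, mu, sigma = 1000, 15, 1.0, (5 / 3) ** 0.5
data = [[random.gauss(mu, sigma) for _ in range(n)] for _ in range(S)]

results = {
    "sample mean": summarize([statistics.mean(d) for d in data], mu),
    "trimmed mean": summarize([trimmed_mean(d) for d in data], mu),
    "median": summarize([statistics.median(d) for d in data], mu),
}

# Relative efficiency of each estimator to the sample mean (MSE version,
# since none of these estimators is exactly unbiased in finite samples)
mse1 = results["sample mean"]["MSE"]
for r in results.values():
    r["RE"] = mse1 / r["MSE"]
```

For normal data the sample mean should come out most efficient (RE < 1 for the trimmed mean and median), mirroring the pattern in the course's R output.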

In R: See the class website for the program.

> set.seed(3)
> S <- 1000
> n <- 15
> trimmean <- function(y){mean(y,0.2)}
> mu <- 1
> sigma <- sqrt(5/3)

Normal data:

> out <- generate.normal(S,n,mu,sigma)
> outsampmean <- apply(out$dat,1,mean)
> outtrimmean <- apply(out$dat,1,trimmean)
> outmedian <- apply(out$dat,1,median)
> summary.sim <- data.frame(mean=outsampmean,trim=outtrimmean,
+   median=outmedian)
> results <- simsum(summary.sim,mu)

(generate.normal, simsum, and view below are helper functions from the class website program.)

> view(round(summary.sim,4),5)

First 5 rows:
    mean   trim median
1 0.7539 0.7132 1.0389
2 0.6439 0.4580 0.3746
3 1.5553 1.6710 1.9395
4 0.5171 0.4827 0.4119
5 1.3603 1.4621 1.3452

> results
                       Sample mean Trimmed mean   Median
true value                   1.000        1.000    1.000
# sims                    1000.000     1000.000 1000.000
MC mean                      0.985        0.987    0.992
MC bias                     -0.015       -0.013   -0.008
MC relative bias            -0.015       -0.013   -0.008
MC standard deviation        0.331        0.348    0.398
MC MSE                       0.110        0.121    0.158
MC relative efficiency       1.000        0.905    0.694

Performance of estimates of uncertainty: How well do estimated standard errors represent the true sampling variation?

E.g., for the sample mean T^(1)(Y_1, ..., Y_n) = Ybar:

    SE(Ybar) = s/sqrt(n),    s^2 = (n-1)^(-1) sum_{j=1}^{n} (Y_j - Ybar)^2

- The MC standard deviation approximates the true sampling variation
- Compare the average of the estimated standard errors to the MC standard deviation

For the sample mean: MC standard deviation 0.331

> outsampmean <- apply(out$dat,1,mean)
> sampmean.ses <- sqrt(apply(out$dat,1,var)/n)
> ave.sampmeanses <- mean(sampmean.ses)
> round(ave.sampmeanses,3)
[1] 0.329

Usual 100(1-α)% confidence interval for µ: Based on the sample mean,

    [ Ybar - t_{1-α/2,n-1} s/sqrt(n) ,  Ybar + t_{1-α/2,n-1} s/sqrt(n) ]

Does the interval achieve the nominal level of coverage 1-α? E.g., for α = 0.05:

> t05 <- qt(0.975,n-1)
> coverage <- sum((outsampmean-t05*sampmean.ses <= mu) &
+   (outsampmean+t05*sampmean.ses >= mu))/S
> coverage
[1] 0.949

SIMULATIONS FOR PROPERTIES OF HYPOTHESIS TESTS

Real simple example: Size and power of the usual t-test for the mean,

    H_0: µ = µ_0  vs.  H_1: µ ≠ µ_0

- To evaluate whether the size/level of the test achieves the advertised α, generate data under µ = µ_0 and calculate the proportion of rejections of H_0. This approximates the true probability of rejecting H_0 when it is true; the proportion should be ≈ α.
- To evaluate power, generate data under some alternative µ ≠ µ_0 and calculate the proportion of rejections of H_0. This approximates the true probability of rejecting H_0 when the alternative is true (the power).
- If the actual size is > α, then the evaluation of power is flawed.

Size/level of test:

> set.seed(3); S <- 1000; n <- 15; sigma <- sqrt(5/3)
> mu0 <- 1; mu <- 1
> out <- generate.normal(S,n,mu,sigma)
> ttests <-
+   (apply(out$dat,1,mean)-mu0)/sqrt(apply(out$dat,1,var)/n)
> t05 <- qt(0.975,n-1)
> power <- sum(abs(ttests)>t05)/S
> power
[1] 0.051

Power of test:

> set.seed(3); S <- 1000; n <- 15; sigma <- sqrt(5/3)
> mu0 <- 1; mu <- 1.75
> out <- generate.normal(S,n,mu,sigma)
> ttests <-
+   (apply(out$dat,1,mean)-mu0)/sqrt(apply(out$dat,1,var)/n)
> t05 <- qt(0.975,n-1)
> power <- sum(abs(ttests)>t05)/S
> power
[1] 0.534

SIMULATION STUDY PRINCIPLES

Issue: How well do the Monte Carlo quantities approximate properties of the true sampling distribution of the estimator/test statistic? Is S = 1000 large enough to get a feel for the true sampling properties? How believable are the results?
- A simulation is just an experiment like any other, so use statistical principles!
- Each data set yields a draw from the true sampling distribution, so S is the sample size on which estimates of the mean, bias, SD, etc. of this distribution are based
- Select a sample size (number of data sets S) that will achieve acceptable precision of the approximation in the usual way!

Principle 1: A Monte Carlo simulation is just like any other experiment
- Careful planning is required
- Factors of interest to vary in the experiment: sample size n, distribution of the data, magnitude of variation, ...
- Each combination of factors is a separate simulation, so many factors can lead to a very large number of combinations, and thus a very large number of simulations; this is time consuming
- Experimental design principles can be used
- Results must be recorded and saved in a systematic, sensible way
- Don't choose only factors favorable to a method you have developed!
- The sample size S (number of data sets) must deliver acceptable precision...

Choosing S: Estimator for θ (true value θ_0)

Estimation of the mean of the sampling distribution/bias:

    var(Tbar - θ_0) = var(Tbar) = var( S^(-1) sum_{s=1}^{S} T_s ) = {SD(T_s)}^2 / S

so setting SD(T_s)/sqrt(S) = d, where d is the acceptable error, gives

    S = {SD(T_s)}^2 / d^2

SD(T_s) can be guessed from asymptotic theory or from preliminary runs.

Choosing S: Coverage probabilities, size, power

Estimating a proportion p (= coverage probability, size, power) is binomial sampling; e.g., for a hypothesis test, Z = # rejections ~ binomial(S, p), so

    var(Z/S) = p(1-p)/S

The worst case is at p = 1/2: requiring sqrt(1/(4S)) ≤ d (the acceptable error) gives S = 1/(4d^2); e.g., d = 0.01 yields S = 2500. For coverage and size, p ≈ 0.05.

Principle 2: Save everything!
- Save the individual estimates in a file and analyze them (mean, bias, SD, etc.) later, as opposed to computing these summaries and saving only them
- Critical if the simulation takes a long time to run!
- Advantage: summary statistics can then be computed with standard software (e.g., SAS, R, etc.)

Principle 3: Keep S small at first
- Test and refine code until you are sure everything is working correctly before carrying out final production runs
- Get an idea of how long it takes to process one data set

Principle 4: Set a different seed for each run and keep records!!!
- Ensures simulation runs are independent
- Runs may be replicated if necessary

Principle 5: Document your code!!!
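The two planning formulas above are easy to put to work; here is a small illustrative Python helper (not from the course code) for choosing S:

```python
import math

def S_for_mean(sd_T, d):
    """S so that SD(T_s)/sqrt(S) <= d when estimating a mean or bias;
    sd_T is a guess of SD(T_s) from asymptotic theory or a pilot run."""
    return math.ceil((sd_T / d) ** 2)

def S_for_proportion(d, p=0.5):
    """S so that sqrt(p(1-p)/S) <= d when estimating a proportion
    (coverage, size, power); p = 1/2 is the conservative worst case."""
    return math.ceil(p * (1 - p) / d ** 2)

print(S_for_proportion(0.01))    # worst case, d = 0.01: 1/(4 d^2) = 2500
print(S_for_proportion(0.005))   # d = 0.005: S = 10000
print(S_for_mean(0.33, 0.01))    # e.g., a pilot run suggests SD(T_s) = 0.33
```

With p = 0.05, as is typical for coverage or size, p(1-p) = 0.0475, and the required S for d = 0.01 drops to 475; the worst-case p = 1/2 figure is the conservative choice.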

PRESENTING THE RESULTS

Key principle: Your simulation is useless unless other people can clearly and unambiguously understand what you did, why you did it, and what it means!

What did you do and why? Before giving results, you must first give the reader enough information to appreciate them:
- State the objectives: Why do this simulation? What specific questions are you trying to answer?
- State the rationale for the choice of factors studied and the assumptions made
- Review all methods under study; be precise and detailed
- Describe exactly how you generated data for each choice of factors; enough detail should be given that a reader could write his/her own program to reproduce your results!

Results: Must be presented in a form that
- Clearly answers the questions
- Makes it easy to appreciate the main conclusions

Some basic principles:
- Only present a subset of results ("Results were qualitatively similar for all other scenarios we tried.")
- Only present information that is interesting ("Relative biases for all estimators were less than 2% under all scenarios and hence are not shown in the table.")
- The mode of presentation should be friendly...

Tables: An obvious way to present results; however, some caveats:
- Avoid zillions of numbers jam-packed into a table!
- Place things to be compared adjacent to one another so that comparison is easy
- Rounding...

Rounding: Three reasons (Wainer, 1993)
- Humans cannot understand more than two digits very easily
- More than two digits can almost never be statistically justified
- We almost never care about accuracy of more than two digits

Wainer, H. (1993). Visual Revelations. Chance Magazine.

Understanding/who cares?
- "This year's school budget is $27,329,681.32" vs. "This year's school budget is about 27 million dollars"
- "Mean life expectancy of Australian males is 67.14 years" vs. "Mean life expectancy of Australian males is 67 years"

Statistical justification: We are statisticians! Design the study to achieve the desired accuracy and only report what we can justify as accurate. For example, reporting Monte Carlo power: how many digits?
- The program yields 0.56273
- If we wish to report 0.56 (two digits), we need the standard error of this estimated proportion to be about 0.005, so we can tell the difference between 0.56 and 0.57 or 0.58 (1.96 × 0.005 ≈ 0.01)
- d = 0.005 = sqrt(1/(4S)) gives S = 10,000!

Always report the standard error of the entries in the table so a reader can gauge their accuracy!

Bad table: Digits, apples and oranges

            Sample mean           Trimmed mean          Median
            Normal      t_5       Normal      t_5       Normal      t_5
Mean        0.98515     0.98304   0.98690     0.98499   0.99173     0.98474
Bias       -0.01485    -0.01696  -0.01310    -0.01501  -0.00827    -0.01526
SD          0.33088     0.33067   0.34800     0.31198   0.39763     0.35016
MSE         0.10959     0.10952   0.12116     0.09746   0.15802     0.12273
Rel. Eff.   1.00000     1.00000   0.90456     1.12370   0.69356     0.89238

Good table: Digits, apples with apples

            Normal                           t_5
            Sample mean  Trim mean  Median   Sample mean  Trim mean  Median
Mean        0.99         0.99       0.99     0.98         0.98       0.98
Bias       -0.01        -0.01      -0.01    -0.02        -0.02      -0.02
SD          0.33         0.35       0.40     0.33         0.31       0.35
MSE         0.11         0.12       0.16     0.11         0.10       0.12
Rel. Eff.   1.00         0.90       0.69     1.00         1.12       0.89

Graphs: Often a more effective strategy than tables!

Example: Power of the t-test for H_0: µ = 1.0 vs. H_1: µ ≠ 1.0 for normal data (S = 10000, n = 15)

    µ        1.0    1.25   1.50   1.75   2.00   2.50   3.00
    power    0.05   0.11   0.29   0.55   0.79   0.99   0.99

[Figure: power of the t-test plotted against µ, rising from 0.05 at µ = 1.0 toward 1.0 as µ increases to 3.0]

Must reading: Available on the class web page

Gelman, A., Pasarica, C., and Dodhia, R. (2002). Let's practice what we preach: Turning tables into graphs. The American Statistician.