Bayesian hypothesis testing for proportions

Paper SP08

Antonio Nieto, PharmaMar, Madrid, Spain
Sonia Extremera, PharmaMar, Madrid, Spain
Javier Gómez, PharmaMar, Madrid, Spain

ABSTRACT

Most clinical trials include tests on proportions. These are usually carried out with a Frequentist approach, but a Bayesian approach is another valid option. The Bayesian approach has the advantage that it is not restricted to a single alternative hypothesis. Moreover, the hypotheses under test need not be complementary: they may leave part of the parameter range uncovered (for example, p ≤ 0.40 versus p > 0.60). In this paper we show a SAS macro that performs Bayesian hypothesis testing for proportions and that can also be extended to other kinds of endpoints and distributions. For simplicity, only the null and one alternative hypothesis are shown. The macro assumes a non-informative prior distribution, the Uniform(0,1), which leads to a Beta posterior distribution. From the number of successes observed in the trial, it reports the posterior probability of the null hypothesis, the posterior probability of the alternative hypothesis, and a text label indicating which of the two is higher.

INTRODUCTION

This paper does not aim to set the Frequentist approach against the Bayesian approach. In fact, both approaches can coexist, and either can be used as the statistical question requires. Accordingly, we have implemented simple SAS macros to calculate the probabilities of different hypotheses using a Bayesian approach.

TESTS ON PROPORTIONS

Almost all, if not all, clinical trials contain tests on proportions. A proportion arises from n Bernoulli experiments: it is the number of successes (or failures) counted over n independent samples, divided by n. Proportion tests are usually solved by means of a Frequentist approach, but this is not the only way. In a Frequentist analysis, if the p-value of the comparison is lower than the selected significance level, the null hypothesis is rejected.
In a Bayesian approach, the probability of each hypothesis is estimated, and these probabilities are then compared to decide which hypothesis is the most plausible.

THE BAYES THEOREM

The Bayesian approach is based on Bayes' theorem (1763), which expresses the conditional probability of a random event A given that an event B has occurred in terms of the conditional probability of B given that A has occurred and the marginal probabilities of A and B. In other words, prior experience or knowledge (the a priori distribution) is combined with the data from the trial to obtain a posterior distribution, which is used to calculate probabilities once the clinical trial has been completed.

BAYESIAN TESTS

The sum of the Bernoulli experiments follows a Binomial distribution which, combined with the a priori information, should lead to a known posterior distribution that allows probabilities to be calculated easily. In practice, then, we need to model the prior information with a probability distribution that fits the a priori knowledge well and that also leads to a posterior distribution that is easy to handle.

The Bayesian approach has the advantage that it is not restricted to a single alternative hypothesis. In addition, the hypotheses under test need not be complementary, so the probability of any hypothesis can be calculated for whatever cutoffs are selected, as long as the posterior distribution is known.
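For reference, the theorem described in words above can be written out as follows. The first line uses the events A and B from the text. The second line applies it to a proportion p and trial data x; the symbols \pi for a density and H for the set of values of p covered by a hypothesis are notation assumed here for illustration and do not come from the paper.

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
\]
\[
\pi(p \mid x) = \frac{P(x \mid p)\,\pi(p)}{\int_0^1 P(x \mid u)\,\pi(u)\,du},
\qquad
P(H \mid x) = \int_{H} \pi(p \mid x)\,dp
\]
```

Under this reading, "the probability of a hypothesis" is the area of the posterior density over the range of p that the hypothesis covers.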

When the endpoint in a clinical trial follows a binomial distribution, the natural choice to model the a priori information is the Beta distribution, which is conjugate to the binomial. If the prior probability of response is modeled by a Beta distribution with parameters $\alpha$ and $\beta$, then the posterior distribution is again a Beta, with parameters $a = \sum x_i + \alpha$ and $b = n - \sum x_i + \beta$.

For the Bayesian test on proportions, the initial assumption is that the proportion could take any value between zero and one (i.e., no a priori information is available). In this case a non-informative prior distribution, the Uniform(0,1), can be assumed. With the Uniform(0,1), or equivalently the Beta(1,1), as prior, the posterior is a Beta distribution with parameters $a = \sum x_i + 1$ and $b = n - \sum x_i + 1$, where $\sum x_i$ is the number of successes in the experiment and n is the number of independent samples (i.e., patients in a clinical trial). The Bayes estimate of the proportion under quadratic loss is the mean of this posterior, $a/(a+b)$.

The utility of Bayesian tests is enhanced when some information about the parameter to be estimated is available before the clinical trial starts. The a priori assumption can be modified if the range of values likely to contain the proportion is known in advance; in that case, we have to find the prior Beta distribution that fits this initial assumption and derive the Beta posterior as explained above. For example, if a panel of experts concludes that the probability of response to a treatment will fall between 0.3 and 0.7 with a high probability (i.e., 95%), and we want to perform a clinical trial to test hypotheses about the probability of response, then we need to find a Beta distribution that fits this a priori information.
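Before turning to that fit, the conjugate update stated at the start of this section can be checked in a few lines. The sketch below multiplies the Beta prior density by the binomial likelihood and drops every factor that does not depend on p; it is a standard derivation added for illustration, not part of the original paper.

```latex
\begin{align*}
\pi(p) &\propto p^{\alpha-1}(1-p)^{\beta-1} \\
P(x_1,\dots,x_n \mid p) &\propto p^{\sum x_i}(1-p)^{\,n-\sum x_i} \\
\pi(p \mid x_1,\dots,x_n) &\propto p^{\sum x_i+\alpha-1}(1-p)^{\,n-\sum x_i+\beta-1}
\end{align*}
```

The last line is the kernel of a Beta distribution with parameters $\sum x_i + \alpha$ and $n - \sum x_i + \beta$. With $\alpha = \beta = 1$, 24 successes and n = 40, this gives Beta(25, 17).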
Since Beta distributions with relatively high values of $\alpha$ and $\beta$ have an approximately normal shape, we can start from a normal distribution and use its mean (m) and standard deviation (s) to characterize the a priori Beta distribution. In a normal distribution, slightly more than 95% of the probability lies between m-2s and m+2s. Therefore, we can set m-2s=0.3 and m+2s=0.7 and, taking into account the symmetric shape of the normal distribution, obtain m=0.5 and s=0.1. We then just have to find a Beta distribution with this mean and standard deviation, and one way to approximate it is the method of moments. For a Beta distribution we know that:

$$m = \frac{\alpha}{\alpha + \beta}, \qquad s^2 = \frac{m(1-m)}{\alpha + \beta + 1}$$

and, given the values of m and s, it can be derived that:

$$\alpha = \frac{m^2(1-m)}{s^2} - m, \qquad \beta = \frac{\alpha(1-m)}{m} = \frac{m(1-m)^2}{s^2} + m - 1$$

In our example m=0.5 and s=0.1, and we can calculate:

$$\alpha = \frac{0.5^2(1-0.5)}{0.1^2} - 0.5 = 12, \qquad \beta = \frac{0.5(1-0.5)^2}{0.1^2} + 0.5 - 1 = 12$$

The prior distribution is therefore approximated by a Beta(12,12), and the posterior distribution, once the results of our clinical trial are known, is $\mathrm{Beta}(\sum x_i + 12,\; n - \sum x_i + 12)$.

SAS MACRO (1)

Macro assuming a Uniform(0,1) prior. The macro takes the number of successes and the sample size, and prints the posterior probability of the null hypothesis, the posterior probability of the alternative hypothesis, and a text label indicating which of the two is higher.

    /* Bayesian macro to test two hypotheses with a non-informative
       prior distribution (Uniform(0,1) = Beta(1,1))
       - Variables needed
         * x:  number of successes in the sample
         * n:  sample size
         * h0: cutoff of the null hypothesis (H0: p <= h0)
         * h1: cutoff of the alternative hypothesis (H1: p > h1)
       - The conjugate distribution is a Beta(x+1, n-x+1) */

    %MACRO Bayes_test (x=, n=, h0=, h1=);
       DATA bayes1;
          length test $255;
          alfa = &x + 1;                       /* posterior Beta parameters */
          beta = &n - &x + 1;
          h0 = "p<=&h0";                       /* hypothesis labels for the report */
          h1 = "p>&h1";
          x = &x;
          n = &n;
          x1 = probbeta(&h0, alfa, beta);      /* P(p <= h0 | data) */
          x2 = 1 - probbeta(&h1, alfa, beta);  /* P(p > h1 | data) */
          if x1 > x2 then test = 'H0 is more probable than H1';
          else if x1 < x2 then test = 'H1 is more probable than H0';
          else test = 'Equally probable hypotheses';
       RUN;

       PROC PRINT data=bayes1 noobs label;
          var h0 h1 x n test x1 x2;
          label h0='H0' h1='H1' x='X' n='N' test='Test'
                x1='Prob. under H0' x2='Prob. under H1';
          title "Bayes test of &x successes in &n samples";
          footnote "Prior distribution: Uniform (0,1)";
       RUN;

       /* Clear the title and footnote only after the report has run */
       title;
       footnote;
    %MEND Bayes_test;

EXAMPLE 1

We plan a clinical trial with a sample size of n=40 and no prior information on the proportion of responders, and we would like to test the hypotheses:

H0: Proportion of responders is ≤40%.
H1: Proportion of responders is >60%.

If we obtain 24 successes in our trial (i.e., 24 patients responding to a given experimental therapy), then we can obtain the posterior probabilities of the null and alternative hypotheses given the results in our sample.

    %Bayes_test (x=24, n=40, h0=0.40, h1=0.60);

Bayes test of 24 successes in 40 samples

    H0       H1      X   N   Test                         Prob. under H0  Prob. under H1
    p<=0.40  p>0.60  24  40  H1 is more probable than H0  0.005347226     0.48303

Prior distribution: Uniform (0,1)
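The paper notes that the Bayesian approach is not limited to a single alternative. As a minimal sketch of what that could look like for the data of Example 1, the step below splits the range of p into three non-overlapping hypotheses and computes the posterior probability of each under the same Uniform(0,1) prior. The dataset name bayes3, the variable names p_low, p_mid and p_high, and the choice of a middle hypothesis are illustrative assumptions, not part of the authors' macro.

```sas
/* Sketch only: three non-overlapping hypotheses for Example 1.
   Names and cutoffs below are illustrative assumptions. */
data bayes3;
   x = 24;                              /* successes observed in Example 1 */
   n = 40;                              /* sample size in Example 1        */
   a = x + 1;                           /* posterior Beta parameters under */
   b = n - x + 1;                       /* the Uniform(0,1) prior          */
   p_low  = probbeta(0.40, a, b);       /* P(p <= 0.40 | data)             */
   p_high = 1 - probbeta(0.60, a, b);   /* P(p > 0.60 | data)              */
   p_mid  = 1 - p_low - p_high;         /* P(0.40 < p <= 0.60 | data)      */
run;

proc print data=bayes3 noobs;
   var x n p_low p_mid p_high;
run;
```

Here p_low and p_high correspond to the two probabilities reported in Example 1, so p_mid is simply what remains, roughly 0.51 given those reported values.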

[Figure: Beta distribution plot (PhUSE 2011). Prior and posterior probability density plotted against X, from 0.0 to 1.0.]

SAS MACRO (2)

Macro assuming a Beta($\alpha$, $\beta$) prior. The macro takes the number of successes, the sample size and the prior parameters, and prints the posterior probability of the null hypothesis, the posterior probability of the alternative hypothesis, and a text label indicating which of the two is higher.

    /* Bayesian macro to test two hypotheses with a Beta prior
       distribution (Beta(alpha,beta))
       - Variables and parameters needed
         * x:     number of successes in the sample
         * n:     sample size
         * alpha: alpha parameter of the prior beta distribution
         * beta:  beta parameter of the prior beta distribution
         * h0:    cutoff of the null hypothesis (H0: p <= h0)
         * h1:    cutoff of the alternative hypothesis (H1: p > h1)
       - The conjugate distribution is a Beta(x+alpha, n-x+beta) */
    %MACRO Bayes_test (x=, n=, h0=, h1=, alpha=, beta=);
       DATA bayes1;
          length test $255;
          a = &x + &alpha;                 /* posterior Beta parameters */
          b = &n - &x + &beta;
          h0 = "p<=&h0";                   /* hypothesis labels for the report */
          h1 = "p>&h1";
          x = &x;
          n = &n;
          x1 = probbeta(&h0, a, b);        /* P(p <= h0 | data) */
          x2 = 1 - probbeta(&h1, a, b);    /* P(p > h1 | data) */
          if x1 > x2 then test = 'H0 is more probable than H1';
          else if x1 < x2 then test = 'H1 is more probable than H0';
          else test = 'Equally probable hypotheses';
       RUN;

       PROC PRINT data=bayes1 noobs label;
          var h0 h1 x n test x1 x2;
          label h0='H0' h1='H1' x='X' n='N' test='Test'
                x1='Prob. under H0' x2='Prob. under H1';
          title "Bayes test of &x successes in &n samples";
          footnote "Prior distribution: Beta (&alpha,&beta)";
       RUN;

       /* Clear the title and footnote only after the report has run */
       title;
       footnote;
    %MEND Bayes_test;

EXAMPLE 2

In the same trial as in Example 1, we know that the proportion of responders will fall within [0.3, 0.7] with 95% probability, and we would like to test the same hypotheses. As calculated above, the prior distribution is Beta(12,12); after obtaining 24 successes in our trial, the posterior distribution is Beta(24+12, 40-24+12) = Beta(36, 28). The macro call will be:

    %Bayes_test (x=24, n=40, h0=0.40, h1=0.60, alpha=12, beta=12);

Bayes test of 24 successes in 40 samples

    H0       H1      X   N   Test                         Prob. under H0  Prob. under H1
    p<=0.40  p>0.60  24  40  H1 is more probable than H0  0.004406341     0.27539

Prior distribution: Beta (12,12)

As we can see, in this second example the probabilities under H0 and H1 changed because of the informative prior distribution.

[Figure: Beta distribution plot. Prior and posterior probability density plotted against X, from 0.0 to 1.0.]

NOTE: If the prior distribution selected in SAS Macro (2) is Beta(1,1), the results are the same as those obtained with the first macro; therefore, the second macro generalizes the first and could be used alone.
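The Beta(12,12) prior used in Example 2 came from the method-of-moments fit described earlier in the paper. The step below is a small sketch of that calculation for an arbitrary expert interval, followed by a check of how much prior probability the fitted Beta actually places inside the interval. The dataset name prior_fit and the variables lower, upper and coverage are hypothetical names chosen for illustration.

```sas
/* Sketch only: method-of-moments Beta prior from an expert interval.
   Dataset and variable names are illustrative assumptions. */
data prior_fit;
   lower = 0.3;                          /* expert interval, taken as m - 2s */
   upper = 0.7;                          /* expert interval, taken as m + 2s */
   m = (lower + upper) / 2;              /* prior mean                       */
   s = (upper - lower) / 4;              /* prior standard deviation         */
   alpha = (m**2 * (1 - m) / s**2) - m;  /* moment-matching formulas         */
   beta  = alpha * (1 - m) / m;
   /* Prior probability that the fitted Beta puts inside the interval */
   coverage = probbeta(upper, alpha, beta) - probbeta(lower, alpha, beta);
run;

proc print data=prior_fit noobs;
   var lower upper m s alpha beta coverage;
run;
```

For the interval [0.3, 0.7] this should give alpha and beta of approximately 12, matching the values derived in the text, and the coverage column indicates how close the fitted prior comes to the experts' stated 95%.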

CONCLUSION

Frequentist approaches are usually employed in clinical investigation, as they are a good method for conducting proportion tests, but they are not the only method available. Bayesian tests are increasingly being used, especially in the context of adaptive designs. We have presented a Bayesian approach that can be added to the statistical armamentarium for testing hypotheses on proportions. In this paper we show SAS macros that perform Bayesian hypothesis testing for proportions, but their use can also be extended to other endpoints and distributions. The most important aspects to take into account in Bayesian tests are a good selection of the distributions and a clear definition of the a priori information collected.

CONTACT INFORMATION

Your comments and questions are valued and encouraged.

Antonio Nieto Archilla
Clinical Development. PharmaMar S.A.
Avda. de los Reyes, 1
Polígono Industrial La Mina
28770 Colmenar Viejo. Madrid (SPAIN)
anieto@pharmamar.com

Sonia Extremera Tenaguillo
Clinical Development. PharmaMar S.A.
Avda. de los Reyes, 1
Polígono Industrial La Mina
28770 Colmenar Viejo. Madrid (SPAIN)
sextremera@pharmamar.com

Javier Gómez García
Clinical Development. PharmaMar S.A.
Avda. de los Reyes, 1
Polígono Industrial de la Mina
28770 Colmenar Viejo. Madrid (SPAIN)
jgomez@pharmamar.com