Interim Analysis in Clinical Trials




Professor Bikas K Sinha [ISI, Kolkata]
Courtesy: Dr Gajendra Viswakarma, Visiting Scientist, Indian Statistical Institute, Tezpur Centre
E-mail: sinhabikas@yahoo.com

What is a clinical trial?
A clinical trial is a prospective study comparing the effect and value of one or more interventions against a control in human beings. It is a test of a new intervention or treatment on people to assess:
- Tolerability
- Safety
- Efficacy

Types of clinical trials
- Superiority
- Non-inferiority
- Equivalence
A trial of any of these types can be a Phase I, Phase II or Phase III trial.

Diagrammatic Presentation of Clinical Trials
[Figure: equivalence, non-inferiority and superiority regions on a treatment-difference axis running from "Control better" through 0 to "Test better".]

Clinical Trial Stages
Phase I: Clinical Pharmacology and Toxicity
- Objective: To determine a safe drug dose for further studies of the therapeutic efficacy of the drug
- Design: Dose escalation to establish a maximum tolerated dose (MTD) for a new drug
- Subjects: 1-10 normal volunteers or patients with disease

Clinical Trial Stages
Phase II: Initial Clinical Investigation for Treatment Effect
A fairly small-scale investigation.
- Objective: To get preliminary information on the effectiveness and safety of the drug
- Design: Often single arm (no control group)
- Subjects: 100-500 patients with disease (depending on the Therapeutic Area [TA])

Clinical Trial Stages
Phase III: Full-Scale Evaluation of the Treatment (comparative clinical trial)
A planned experiment on human subjects. To some people the term "clinical trial" is synonymous with such a full-scale Phase III trial. A Phase III trial is the most rigorous and extensive type of scientific clinical investigation of a new treatment.
- Objective: To compare the efficacy of the new treatment with the standard regimen
- Design: Randomized, controlled
- Subjects: Patients with disease; the number depends on the Phase II results

Clinical Trial Stages
Phase IV: Post-Marketing
After the research program leading to a drug being approved for marketing, substantial inquiries remain to be undertaken: monitoring for adverse effects, and additional large-scale, long-term studies of morbidity and mortality.
- Objective: To get more information (long-term side effects)
- Design: No control group
- Subjects: Patients with disease using the treatment

The Big Picture
[Figure: Drug A and Drug B groups compared through a test statistic.]

So What is Different?
- Ethics: An experiment involving human subjects brings up new ethical issues
- Bias: An experiment on intelligent subjects requires new measures of control
We will also study the additional considerations in clinical trials that address these requirements.

Interim Analysis
- An analysis comparing intervention groups at any time before the formal completion of the trial, usually before recruitment is complete.
- Often used with "stopping rules" so that a trial can be stopped if participants are being put at risk unnecessarily.
- The timing and frequency of interim analyses should be specified in the protocol.

Interim Analyses
Interim analyses are a tool to protect the welfare of subjects:
- By stopping enrollment/treatment as soon as a drug is determined to be harmful
- By stopping enrollment as soon as a drug is determined to be highly beneficial
- By stopping trials which will yield little additional useful information (or which have a negligible chance of demonstrating efficacy if fully enrolled, given the results to date)
The associated statistical methods are generally referred to as group sequential methods.

Flowchart of the Study
[Figure: study flowchart. Screening (15 days to 4 weeks) leads to enrolment at Visit 1; three arms (Control, T1, and T2, the test drug at the safe dose determined); Visits 2 to 7 at 4-week intervals, covering the treatment period, the end of treatment, and a treatment-free follow-up.]
The required sample size of the study is 330 (110 subjects in each arm).

Disposition Table (ongoing study)

| | Drug C | Drug T1 | Drug T2 | Total |
|---|---|---|---|---|
| Patients screened | | | | 129 |
| Screening failures | | | | 23 |
| Patients randomized | 36 | 36 | 34 | 106 |
| Study incomplete + ongoing | 9+5 | 8+5 | 10+3 | 28+12 |
| Completed visits 5+ | 22 | 23 | 21 | 66 |

Mean PASI Change at Visits in Different Treatment Groups
[Figure: mean PASI (vertical axis, 0 to 16) by visit (V1 to V5) for Drug A, Drug B and Drug C.]

Some Examples of Why a Trial May Be Terminated
- Treatments found to be convincingly different
- Treatments found to be convincingly not different
- Side effects or toxicities are too severe
- Data quality is poor
- Accrual is slow
- Definitive information becomes available from an outside source, making the trial unnecessary or unethical
- Scientific question is no longer important
- Adherence to treatment is unacceptably low
- Resources to perform the study are lost or diminished
- Study integrity has been undermined by fraud or misconduct

Opposing Pressures in Interim Analyses
To terminate:
- minimize the size of the trial
- minimize the number of patients on the inferior arm
- costs and economics
- timeliness of results
To continue:
- increase precision
- reduce errors
- increase power
- increase the ability to look at subgroups
- gather information on secondary endpoints

The pitfalls of interim analyses
RCTs [Randomized Clinical Trials] with interim analysis:
1. Calculate the sample size
2. Carry out the clinical trial
3. Employ a statistical test of efficacy at pre-planned stages in the interim until the sample size has been reached*
*One treatment is declared significantly better than the other if we get a p-value less than 5%...

Statistical Considerations in Interim Analyses
Consider a safety/efficacy study (Phase II). At this point in time, is there statistical evidence that:
- The treatment will not be as efficacious as we would hope/need it to be?
- The treatment is clearly dangerous/unsafe?
- The treatment is very efficacious and we should proceed to a comparative trial?

Statistical Considerations in Interim Analyses
Consider a comparative study (Phase III). At this point in time, is there statistical evidence that:
- One arm is clearly more effective than the other?
- One arm is clearly dangerous/unsafe?
- The two treatments have such similar responses that there is no possibility that we will see a significant difference by the end of the trial?

Statistical Considerations in Interim Analyses
We use interim statistical analyses to determine the answers to these questions. It is a tricky business:
- interim analyses involve relatively few data points
- inferences can be inexact
- we increase the chance of errors
- if interim results are conveyed to investigators, a bias may be introduced
- in general, we look for strong evidence in one direction or another

Example: ECMO trial
Extra-corporeal membrane oxygenation (ECMO) versus standard treatment for newborn infants with persistent pulmonary hypertension.
- N = 39 infants enrolled in the study
- Trial terminated after interim analysis
- 4/10 deaths in the standard therapy arm
- 0/9 deaths in the ECMO arm
- p = 0.054 (one-sided)
Questions:
- Is this result sufficient evidence on which to change routine practice?
- Is the evidence in favor of ECMO very strong?
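The slide does not name the test behind p = 0.054. As a hedged illustration, assuming it is Fisher's exact test (one-sided) on the 2x2 table of deaths by arm, the figure can be reproduced with the sketch below; the function name `fisher_one_sided` is hypothetical.

```python
from math import comb

def fisher_one_sided(deaths_a, n_a, deaths_b, n_b):
    """One-sided Fisher exact p-value: probability that arm A has at least
    `deaths_a` deaths when all table margins are held fixed."""
    total_deaths = deaths_a + deaths_b
    n_total = n_a + n_b
    denom = comb(n_total, total_deaths)
    upper = min(n_a, total_deaths)
    # Sum hypergeometric probabilities for outcomes as extreme or more extreme.
    return sum(
        comb(n_a, k) * comb(n_b, total_deaths - k)
        for k in range(deaths_a, upper + 1)
    ) / denom

# Standard therapy: 4 deaths out of 10; ECMO: 0 deaths out of 9.
p = fisher_one_sided(4, 10, 0, 9)
print(round(p, 3))  # 0.054
```

With only 19 infants in the comparison, a single additional death in the ECMO arm would change this p-value considerably, which is part of why the slide asks how strong the evidence really is.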

Example: ISIS trial
The Second International Study of Infarct Survival (ISIS-2): a five-week study of streptokinase versus placebo based on 17,187 patients with myocardial infarction.
The trial continued until:
- 12% death rate in the placebo group
- 9.2% death rate in the streptokinase group
- p < 0.000001
Issues:
- strong evidence in favor of streptokinase was available early on
- the impact would be greater with better precision on the death rate, which would not be possible if the trial stopped early
- earlier trials of streptokinase had similar results, yet little impact

Statistical Approaches for Interim Analysis
Three main philosophic approaches:
- Frequentist approach (multiple looks)
  - Group sequential designs
  - Stopping boundaries
  - Alpha spending functions
  - Two-stage designs
- Likelihood approach
- Bayesian approach
The frequentist (multiple looks) approach is the most commonly seen (but not necessarily the best!).

An Example of Multiple Looks:
RCT (Randomized Clinical Trial) with Trt A vs Trt B. Required sample size: 200 (100 on Trt A, 100 on Trt B).

An Example of Multiple Looks:
Four looks (at 50, 100, 150, and 200 patients). 1st interim look: P = 0.028.

An Example of Multiple Looks:
Four looks (at 50, 100, 150, and 200 patients). 2nd interim look: P = 0.38.

An Example of Multiple Looks:
Four looks (at 50, 100, 150, and 200 patients). P-values at the successive looks: 0.028, 0.38, 0.62, 1.00.

An Example of Multiple Looks:
Consider planning a comparative trial in which two treatments are being compared for efficacy (response rate).
$H_0: p_2 = p_1$
$H_1: p_2 > p_1$
A standard design says that for 80% power and with alpha of 0.05, you need about 100 patients per arm, based on the assumption $p_2 = 0.50$, $p_1 = 0.30$, which gives a difference of 0.20.
So what happens if we find p < 0.05 before all patients are enrolled? Why can't we look at the data a few times in the middle of the trial and conclude that one treatment is better if we see p < 0.05?

The plots show simulated data where $p_1 = 0.40$ and $p_2 = 0.50$. In our trial, designed to detect a difference between 0.30 and 0.50, we would not expect to conclude that there is evidence for a difference.
However, if we look after every 4 patients, we get the scenario where we would stop at 96 patients and conclude that there is a significant difference.
[Figure: two panels, risk ratio and p-value, each plotted against number of patients (0 to 200), with looks after every 4 patients.]

If we look after every 10 patients, we get the scenario where we would not stop until all 200 patients were observed, and we would conclude that there is not a significant difference (p = 0.40).
[Figure: two panels, risk ratio and p-value, each plotted against number of patients (50 to 200), with looks after every 10 patients.]

If we look after every 40 patients, we get the scenario where we would not stop either. If we wait until the END of the trial (N = 200), then we estimate $p_1$ to be 0.45 and $p_2$ to be 0.52. The p-value for testing whether there is a difference is 0.40.
[Figure: two panels, risk ratio and p-value, each plotted against number of patients (50 to 200), with looks after every 40 patients.]

Would we have messed up if we looked early on?
Every time we look at the data and consider stopping, we introduce the chance of falsely rejecting the null hypothesis. In other words, every time we look at the data, we have the chance of a type 1 error.
If we look at the data multiple times, and we use alpha of 0.05 as our criterion for significance, then we have a 5% chance of stopping each time. Under a true null hypothesis and just 2 looks at the data, we approximate the error rates as:
- Probability of stopping at the first look: 0.05
- Probability of stopping at the second look: 0.95 × 0.05 = 0.0475
- Total probability of stopping: 0.0975
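The two-look approximation above generalizes to k looks as 1 − (1 − α)^k. The sketch below is only an illustration of that arithmetic: it treats the looks as independent, which is a simplifying assumption (successive looks share data, so the true inflation is somewhat smaller). The function name `naive_overall_alpha` is hypothetical.

```python
def naive_overall_alpha(alpha, looks):
    """Chance of at least one false rejection in `looks` tests at level
    `alpha`, treating the looks as independent (an approximation)."""
    return 1 - (1 - alpha) ** looks

for k in (1, 2, 4, 5, 10):
    print(k, round(naive_overall_alpha(0.05, k), 4))
# For k = 2 this reproduces the 0.0975 computed above.
```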

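Effect of Sample Size on a True Proportion
The limits $\hat{p} \pm 2\sqrt{\hat{p}(1-\hat{p})/n}$ serve as two-sided limits for the TRUE p. Each cell gives (lower, upper).

| n | $\hat{p}$ = 0.20 | 0.30 | 0.40 | 0.50 | 0.60 |
|---|---|---|---|---|---|
| 10 | 0, 0.45 | 0, 0.60 | 0.1, 0.7 | 0.18, 0.82 | 0.3, 0.9 |
| 20 | 0.02, 0.38 | 0.1, 0.5 | 0.18, 0.62 | 0.28, 0.72 | 0.38, 0.82 |
| 30 | 0.05, 0.35 | | | | 0.42, 0.78 |
| 40 | 0.07, 0.33 | | | | 0.45, 0.75 |
| 50 | 0.09, 0.31 | | | | 0.46, 0.74 |
| 100 | 0.12, 0.28 | | | | 0.50, 0.70 |
| 200 | 0.15, 0.25 | | | | 0.53, 0.67 |
| 300 | 0.16, 0.24 | | | | 0.54, 0.66 |
| 400 | 0.16, 0.24 | | | | |
| 500 | 0.17, 0.23 | | | | |
| 1000 | 0.175, 0.225 | | | | |
| 1500 | 0.18, 0.22 | | | | |
| 2000 | 0.182, 0.218 | | | | |
| 3000 | 0.185, 0.215 | | | | |
| 4000 | 0.19, 0.21 | | | | |
| 5000 | 0.19, 0.21 | | | | |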
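A minimal sketch of the rule $\hat{p} \pm 2\sqrt{\hat{p}(1-\hat{p})/n}$ behind these tables. The function name `approx_limits` is hypothetical, and truncating the limits to [0, 1] is an assumption suggested by the zero lower limits shown for n = 10.

```python
from math import sqrt

def approx_limits(p_hat, n):
    """Approximate two-sided limits for the true proportion,
    p_hat +/- 2 * sqrt(p_hat * (1 - p_hat) / n), truncated to [0, 1]."""
    half_width = 2 * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

for n in (10, 100, 1000):
    lo, hi = approx_limits(0.20, n)
    print(n, round(lo, 3), round(hi, 3))
```

The half-width shrinks like 1/sqrt(n), so a hundredfold increase in sample size narrows the interval only tenfold. This is why estimates at an early interim look are so imprecise.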

Illustrative Examples: Interim Analysis
Example 1. It is desired to carry out an experiment to examine the superiority, or otherwise, of a therapeutic drug over a standard drug, with 5% level and 90% power for detection of a 10% difference in the proportions cured.
C: Standard Drug
T: Therapeutic Drug
$H_0: P_C - P_T = 0$
$H_1: P_C \neq P_T$
Size = 0.05, Power = 0.90 for $\delta = P_T - P_C = 0.10$. It is a two-sided test.

Determination of Sample Size for Full Analysis
Two-sided test: $\alpha = 0.05$, $Z_{\alpha/2} = 1.96$
Power = 0.90: $\beta = 0.10$, $Z_{\beta} = 1.282$, $\delta = 0.10$
$$N = \frac{2\,(Z_{\alpha/2} + Z_{\beta})^2\,\bar{p}(1-\bar{p})}{\delta^2}$$
Assume $\bar{p} = 0.35$ [suggestive cure rate]:
$$N = \frac{2\,(1.96 + 1.282)^2\,(0.35)(0.65)}{(0.10)^2} = 21.021128 \times 22.75 = 478.23 \approx 480$$
Conclusion: Each arm involves 480 subjects.
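The sample-size arithmetic on this slide can be checked with a short sketch. The function name `n_per_arm_proportions` is hypothetical; the formula and inputs are those given on the slide.

```python
def n_per_arm_proportions(z_alpha2, z_beta, p_bar, delta):
    """Per-arm sample size N = 2 (z_alpha2 + z_beta)^2 p_bar (1 - p_bar) / delta^2."""
    return 2 * (z_alpha2 + z_beta) ** 2 * p_bar * (1 - p_bar) / delta ** 2

n = n_per_arm_proportions(z_alpha2=1.96, z_beta=1.282, p_bar=0.35, delta=0.10)
print(round(n, 2))  # 478.23, taken as 480 per arm on the slide
```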

Full Experiment vs. Interim Analysis
For the full experiment, 480 subjects are needed in each arm. At the end of the entire experiment, suppose we observe:
- C: # cured = 156 out of 480, i.e., 32.5%
- T: # cured = 190 out of 480, i.e., 39.6%
Therefore, $\hat{p}_C = 0.325$ and $\hat{p}_T = 0.396$. Hence, $\bar{p} = [\hat{p}_C + \hat{p}_T]/2 = 0.3605$. Finally, we compute the value of z as follows.

Full Analysis
$$z_{obs} = \frac{\hat{p}_C - \hat{p}_T}{\sqrt{\bar{p}(1-\bar{p})\,2/n}} = \frac{0.325 - 0.396}{\sqrt{0.36 \times 0.64 \times 2/480}} = \frac{-0.071}{\sqrt{0.00096}} = -2.29$$
In absolute value, $z_{obs}$ is 2.29, which is more than the critical value 1.96 [for a two-sided test with size 5%]. Hence, we conclude that the null hypothesis is not tenable, given the experimental outputs.
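As a check on the full-analysis statistic, here is a small sketch that computes z directly from the cure counts (156/480 and 190/480). The function name `two_proportion_z` is hypothetical; the formula is the one on the slide, with $\bar{p}$ taken as the average of the two observed proportions.

```python
from math import sqrt

def two_proportion_z(cured_c, cured_t, n):
    """z = (p_c - p_t) / sqrt(p_bar (1 - p_bar) 2 / n), with n subjects per arm."""
    p_c, p_t = cured_c / n, cured_t / n
    p_bar = (p_c + p_t) / 2
    return (p_c - p_t) / sqrt(p_bar * (1 - p_bar) * 2 / n)

z = two_proportion_z(cured_c=156, cured_t=190, n=480)
print(round(z, 2))  # -2.29
```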

Interim Analysis: 2 Looks
- First look: use 50% of the data
- Second look: at the end, if continued after the first
Q. What is the size of the test at the 1st look? Also, what is the size at the 2nd look so that on the whole the size is 5%?
Ans. If we use 5% for the size at each of the 1st and 2nd looks, then the overall size becomes about 8%. Hence both cannot be taken at 5%. Both looks must use nominal levels below 5%; a common choice is a very small level at the first look and a level close to 5% at the second.

Interim Analysis: 2 Looks
Defining equation:
$$\alpha = P\big[\,|Z_I| > z^{*}\,\big] + P\big[\,|Z_I| < z^{*},\ |Z_{I,II}| > z^{**}\,\big]$$
where $Z_I$ and $Z_{II}$ are each based on 50% of the data, in two identical and independent segments, so that their distributions are identical. Further, $Z_{I,II} = (Z_I + Z_{II})/\sqrt{2}$ is based on the combined evidence of segments I and II, and hence $Z_I$ and $Z_{I,II}$ are dependent. The choices of $z^{*}$ and $z^{**}$ involve intricate formulae.
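The defining equation can be evaluated numerically for any pair of cut-offs. The sketch below assumes a two-sided test with $Z_I$ and $Z_{II}$ independent standard normal under $H_0$, and integrates over the first-look statistic with a midpoint rule. The function names and the number of integration steps are illustrative choices, not part of the original slides.

```python
from math import erf, exp, pi, sqrt

def norm_pdf(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def two_look_size(z1, z2, steps=20000):
    """Overall two-sided size: P(|Z_I| > z1) + P(|Z_I| < z1, |Z_I,II| > z2),
    where Z_I,II = (Z_I + Z_II) / sqrt(2)."""
    stop_at_first = 2 * (1 - norm_cdf(z1))
    h = 2 * z1 / steps
    continue_then_reject = 0.0
    for i in range(steps):
        x = -z1 + (i + 0.5) * h  # midpoint of the i-th slice of (-z1, z1)
        # Given Z_I = x, |Z_I,II| <= z2 iff Z_II lies within sqrt(2)*z2 of -x.
        inside = norm_cdf(sqrt(2) * z2 - x) - norm_cdf(-sqrt(2) * z2 - x)
        continue_then_reject += norm_pdf(x) * (1 - inside) * h
    return stop_at_first + continue_then_reject

# Nominal 5% at both looks (cut-off 1.96 each time): overall size near 8%.
print(round(two_look_size(1.96, 1.96), 3))
```

The same function can be used to explore how raising the first cut-off pulls the overall size back towards 5%.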

Interim Analysis: 2 Looks
z-computation. $z_{I,obs}$ is based on the 50% of the data available up to the 1st look for each of C and T.
Data: C (90/240), T (120/240), n = 240.
$\hat{p}_C = 90/240 = 0.375$; $\hat{p}_T = 120/240 = 0.50$; $\bar{p} = (0.375 + 0.50)/2 = 0.4375$.
$$z_{I,obs} = \frac{\hat{p}_C - \hat{p}_T}{\sqrt{\bar{p}(1-\bar{p})\,2/n}} = \frac{-0.125}{\sqrt{0.4375 \times 0.5625 \times 2/240}} = \frac{-0.125}{\sqrt{0.002050}} = -2.76$$
What does this imply?

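Interim Analysis: 2 Looks
Suggested cut-off points adopted for 2 looks:

| Look | Cut-off | Haybittle-Peto | Pocock | O'Brien-Fleming |
|---|---|---|---|---|
| 1st | z* | 3.0 | 2.46 | 3.5 |
| 2nd | z** | 2.0 | 2.46 | 2.0 |

Observed $|z_I|$ = 2.76 at the first look. Conclusion? Pocock's rule suggests rejecting $H_0$; the other two rules suggest continuing.
At the final look, z = -2.29 exceeds the cut-off 2.0 in absolute value, so the Haybittle-Peto and O'Brien-Fleming rules reject $H_0$; only Pocock's rule (cut-off 2.46) suggests accepting $H_0$.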

Interim Analysis: 4 Looks
Cut-off points under the suggested rules:

| Look | Cut-off | Haybittle-Peto | Pocock | O'Brien-Fleming |
|---|---|---|---|---|
| 1st | z* | 3.0 | 2.42 | 4.00 |
| 2nd | z** | 3.0 | 2.42 | 2.83 |
| 3rd | z*** | 3.0 | 2.42 | 2.32 |
| 4th (last) | z**** | 2.0 | 2.42 | 2.00 |

Interim Analysis: 4 Looks
Details of data sets (cured / stage size):
- C: 48/120; 42/120; 30/120; 36/120. Total 156/480
- T: 54/120; 66/120; 32/120; 38/120. Total 190/480
Progressive proportions for C: 48/120 = 0.40; (48+42)/240 = 0.375; (48+42+30)/360 = 0.333; 156/480 = 0.325
Progressive proportions for T: 54/120 = 0.45; (54+66)/240 = 0.50; (54+66+32)/360 = 0.422; 190/480 = 0.396

Interim Analysis: 4 Looks
Progressive computations of $\bar{p}$:
- 1st look: $\bar{p}$ = (0.40 + 0.45)/2 = 0.425
- 2nd look: $\bar{p}$ = (0.375 + 0.50)/2 = 0.4375
- 3rd look: $\bar{p}$ = (0.333 + 0.422)/2 = 0.3775
- 4th look: $\bar{p}$ = (0.325 + 0.396)/2 = 0.3605

Interim Analysis: 4 Looks
Progressive computations of the z-statistic. Generic formula for look i:
$$z_{obs}(i) = \frac{\hat{p}_C(i) - \hat{p}_T(i)}{\sqrt{\bar{p}(i)\,(1-\bar{p}(i))\,2/n(i)}}$$
where $\bar{p}(i)$ corresponds to look i and $n(i)$ is the cumulative size of each arm at look i, for i = 1, 2, 3, 4.
Note: n(1) = 120; n(2) = 240; n(3) = 360; n(4) = 480.

Interim Analysis: 1st Look
$$z_{obs}(1) = \frac{\hat{p}_C - \hat{p}_T}{\sqrt{\bar{p}(1-\bar{p})\,2/n(1)}} = \frac{0.40 - 0.45}{\sqrt{0.425 \times 0.575 \times 2/120}} = \frac{-0.05}{\sqrt{0.004073}} = -0.7835$$
Conclusion: All rules suggest continuing to the 2nd look.

Interim Analysis: 2nd Look
$$z_{obs}(2) = \frac{\hat{p}_C - \hat{p}_T}{\sqrt{\bar{p}(1-\bar{p})\,2/n(2)}} = \frac{0.375 - 0.50}{\sqrt{0.4375 \times 0.5625 \times 2/240}} = \frac{-0.125}{\sqrt{0.002050}} = -2.76$$
Conclusion: Reject $H_0$ by Pocock's rule. However, continue to the 3rd look according to the other two rules.

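Interim Analysis: 3rd Look and Last Look
With $\bar{p}$ = (0.333 + 0.422)/2 = 0.3775 at the 3rd look:
$$z_{obs}(3) = \frac{\hat{p}_C - \hat{p}_T}{\sqrt{\bar{p}(1-\bar{p})\,2/n(3)}} = \frac{0.333 - 0.422}{\sqrt{0.3775 \times 0.6225 \times 2/360}} = \frac{-0.089}{\sqrt{0.001306}} = -2.46$$
Conclusion: Reject $H_0$ by the Pocock and O'Brien-Fleming rules, but continue by the Haybittle-Peto rule.
Last look: $z_{obs}$ = -2.29. Reject $H_0$ by the Haybittle-Peto and O'Brien-Fleming rules (cut-off 2.0); accept $H_0$ by Pocock's rule only (cut-off 2.42).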
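The four-look walk-through can be reproduced end to end with a short sketch. It uses the stage-wise cure counts and the four-look cut-off table from the slides. The names `progressive_z` and `cutoffs` are hypothetical, and $\bar{p}$ is computed directly from the cumulative counts rather than from rounded proportions.

```python
from math import sqrt

# Cured per stage (120 subjects per arm per stage), as listed in the slides.
cured_c = [48, 42, 30, 36]
cured_t = [54, 66, 32, 38]
n_stage = 120

# Cut-offs for the four looks, copied from the slides' table.
cutoffs = {
    "Haybittle-Peto": [3.0, 3.0, 3.0, 2.0],
    "Pocock": [2.42, 2.42, 2.42, 2.42],
    "O'Brien-Fleming": [4.00, 2.83, 2.32, 2.00],
}

def progressive_z(cured_c, cured_t, n_stage):
    """z-statistic at each look, computed on the cumulative data."""
    zs, cum_c, cum_t = [], 0, 0
    for look, (x_c, x_t) in enumerate(zip(cured_c, cured_t), start=1):
        cum_c += x_c
        cum_t += x_t
        n = look * n_stage  # cumulative size of each arm
        p_c, p_t = cum_c / n, cum_t / n
        p_bar = (p_c + p_t) / 2
        zs.append((p_c - p_t) / sqrt(p_bar * (1 - p_bar) * 2 / n))
    return zs

zs = progressive_z(cured_c, cured_t, n_stage)
for look, z in enumerate(zs, start=1):
    crossed = [name for name, c in cutoffs.items() if abs(z) > c[look - 1]]
    print(look, round(z, 2), crossed or "continue under all three rules")
```

The loop prints every look for comparison across rules. In an actual trial, a rule that crosses its boundary at an interim look would stop there, so later looks under that rule would never take place.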

Data Analysis: Interpretations
Relative merits of decision rules:
- Pocock's rule keeps the critical value the same at every look. Compared with the other rules, it is therefore relatively liberal at the early looks (lower cut-off) and relatively conservative at the final look (higher cut-off).
- Other rules: conservative at the start (high cut-offs) and liberal at the end (cut-offs close to the fixed-sample value).
- All rules have to maintain the averaging principle to meet alpha at the end. No rule can be strict, or liberal, all through the looks.

Interim Analysis: Example 2
Continuous data: testing for equality of the mean effects of two treatments, C and T. As before, we have null and alternative hypotheses, a specified value of $\delta$ = (mean of T) − (mean of C), and a specified power, say 90%, to detect this. Taking the size equal to 5%, we solve for the sample size in each arm. This is a routine computation, and we take sample size N = 525 in each arm.

Full Analysis: Sample Size Computation
Assume a normal distribution with $\sigma = 5$.
Two-sided test: $\alpha = 0.05$, $Z_{\alpha/2} = 1.96$
Power = 0.90: $\beta = 0.10$, $Z_{\beta} = 1.282$
$\delta = 0.20\,\sigma$ (20% of $\sigma$) = 1.0
$$N = \frac{2\,(Z_{\alpha/2} + Z_{\beta})^2\,\sigma^2}{\delta^2} = \frac{2\,(1.96 + 1.282)^2}{0.04} \approx 525$$
We can think of 5 looks altogether at equal steps, each adding approximately 105 observations per arm.
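A quick sketch of the continuous-outcome sample-size formula, using the slide's inputs ($\sigma = 5$, $\delta = 1.0$). The function name `n_per_arm_means` is hypothetical.

```python
def n_per_arm_means(z_alpha2, z_beta, sigma, delta):
    """Per-arm sample size N = 2 (z_alpha2 + z_beta)^2 sigma^2 / delta^2."""
    return 2 * (z_alpha2 + z_beta) ** 2 * sigma ** 2 / delta ** 2

n = n_per_arm_means(z_alpha2=1.96, z_beta=1.282, sigma=5.0, delta=1.0)
print(round(n, 1))  # 525.5, taken as approximately 525 (5 stages of 105)
```

Because only the ratio $\sigma/\delta$ enters the formula, any design with $\delta = 0.20\,\sigma$ gives the same N.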

Interim Analysis: Example contd.
Details of data sets (stage mean, stage sample size):
- C: (30.5, 105); (31.8, 105); (29.7, 105); (30.2, 105); (31.3, 105)
- T: (31.7, 105); (32.0, 105); (28.8, 105); (37.7, 105); (32.8, 105)
Progressive sample means for C: 30.5, 31.15, 30.67, 30.55, 30.70
Progressive sample means for T: 31.7, 31.85, 30.83, 32.55, 32.60

Interim Analysis: Example contd.
Progressive computations of the z-statistic. Generic formula for look i:
$$z_{obs}(i) = \frac{\bar{x}_C(i) - \bar{x}_T(i)}{\sigma\,\sqrt{2/n(i)}}$$
where $\bar{x}_C(i)$ and $\bar{x}_T(i)$ are the progressive sample means at look i and $n(i)$ is the cumulative size of each arm at look i, for i = 1, 2, 3, 4, 5.
Note: n(1) = 105; n(2) = 210; n(3) = 315; n(4) = 420; n(5) = 525.

Interim Analysis: Example contd.
Cut-off points under the suggested rules:

| Look | Cut-off | Haybittle-Peto | Pocock | O'Brien-Fleming |
|---|---|---|---|---|
| 1st | z* | 3.0 | 2.60 | 4.56 |
| 2nd | z** | 3.0 | 2.60 | 3.23 |
| 3rd | z*** | 3.0 | 2.60 | 2.63 |
| 4th | z**** | 3.0 | 2.60 | 2.28 |
| 5th (last) | z***** | 2.0 | 2.60 | 2.00 |

Interim Analysis: Example contd.
$$z_{obs}(1) = \frac{\bar{x}_C - \bar{x}_T}{\sigma\,\sqrt{2/n(1)}} = \frac{-1.2}{5\,\sqrt{2/105}} = -1.74$$
Conclusion: Continue to the 2nd look.

Interim Analysis: Example contd.
$$z_{obs}(2) = \frac{\bar{x}_C - \bar{x}_T}{\sigma\,\sqrt{2/n(2)}} = \frac{-0.7}{5\,\sqrt{2/210}} = -1.43$$
Conclusion: Continue to the 3rd look.

Interim Analysis: Example contd.
$$z_{obs}(3) = \frac{\bar{x}_C - \bar{x}_T}{\sigma\,\sqrt{2/n(3)}} = \frac{-0.16}{5\,\sqrt{2/315}} = -0.40$$
Conclusion: Continue to the 4th look.

Interim Analysis: Example contd.
$$z_{obs}(4) = \frac{\bar{x}_C - \bar{x}_T}{\sigma\,\sqrt{2/n(4)}} = \frac{-2.0}{5\,\sqrt{2/420}} = -5.80$$
Conclusion: Stop and reject $H_0$. There is strong evidence against $H_0$, and yet 105 observations per arm are left to be studied. What if the experiment were continued till the end anyway?

Interim Analysis: Example contd.
$$z_{obs}(5) = \frac{\bar{x}_C - \bar{x}_T}{\sigma\,\sqrt{2/n(5)}} = \frac{-1.90}{5\,\sqrt{2/525}} = -6.16$$
Conclusion: Reject $H_0$. Quite strong evidence against $H_0$.
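The five looks of Example 2 can be traced with the sketch below. It takes the progressive sample means, $\sigma = 5$ and the five-look cut-offs as listed in the slides. The names `z_at_look`, `cutoffs` and `first_stop` are hypothetical, introduced only for illustration.

```python
from math import sqrt

sigma = 5.0
n_stage = 105  # observations added per arm at each look

# Progressive (cumulative) sample means at looks 1..5, from the slides.
mean_c = [30.5, 31.15, 30.67, 30.55, 30.70]
mean_t = [31.7, 31.85, 30.83, 32.55, 32.60]

# Cut-offs for the five looks, copied from the slides' table.
cutoffs = {
    "Haybittle-Peto": [3.0, 3.0, 3.0, 3.0, 2.0],
    "Pocock": [2.60] * 5,
    "O'Brien-Fleming": [4.56, 3.23, 2.63, 2.28, 2.00],
}

def z_at_look(m_c, m_t, sigma, n):
    """z = (mean_c - mean_t) / (sigma * sqrt(2 / n)), n per arm (cumulative)."""
    return (m_c - m_t) / (sigma * sqrt(2 / n))

zs = [
    z_at_look(m_c, m_t, sigma, look * n_stage)
    for look, (m_c, m_t) in enumerate(zip(mean_c, mean_t), start=1)
]

# First look at which each rule's boundary is crossed (None if never).
first_stop = {
    name: next((i + 1 for i, z in enumerate(zs) if abs(z) > c[i]), None)
    for name, c in cutoffs.items()
}

print([round(z, 2) for z in zs])  # [-1.74, -1.43, -0.4, -5.8, -6.16]
print(first_stop)                 # every rule first crosses at look 4
```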