# Statistics for Biology and Health

Series Editors: M. Gail, K. Krickeberg, J.M. Samet, A. Tsiatis, W. Wong

For further volumes:


David G. Kleinbaum, Mitchel Klein

Survival Analysis: A Self-Learning Text, Third Edition

To Rachel Robinson, Morris Dees, Aung San Suu Kyi, John Lewis, and countless other persons, well-known or unknown, who have had the courage to stand up for their beliefs for the benefit of humanity.


## Preface

This is the third edition of this text on survival analysis, originally published in [...]. As in the first and second editions, each chapter contains a presentation of its topic in lecture-book format together with objectives, an outline, key formulae, practice exercises, and a test. The lecture-book format has a sequence of illustrations and formulae in the left column of each page and a script in the right column. This format allows you to read the script in conjunction with the illustrations and formulae that highlight the main points, formulae, or examples being presented.

This third edition expands the second edition by adding one new chapter, additional sections and clarifications to several chapters, and a revised computer appendix. The new chapter is Chapter 10, Design Issues for Randomized Trials, which considers how to compute sample size when designing a randomized trial involving time-to-event data.

We have expanded Chapter 1 to clarify the distinction between the random, independent, and noninformative censoring assumptions often made about survival data. We also added a section in Chapter 1 that introduces the Counting Process data layout that is discussed in later chapters (3, 6, and 8).

We added sections in Chapter 2 to describe how to obtain confidence intervals for the Kaplan-Meier (KM) curve and for the median survival time obtained from a KM curve.

We have expanded Chapter 3 on the Cox Proportional Hazards (PH) Model by describing the use of age as the time scale instead of time-on-follow-up as the outcome variable. We also added a section that clarifies how to obtain confidence intervals for PH models that contain product terms that reflect effect modification of exposure variables of interest.

We have added sections that describe the derivation of the (partial) likelihood functions for the stratified Cox (SC) model in Chapter 5 and the extended Cox model in Chapter 6.

We have expanded Chapter 9 on competing risks to describe the Fine and Gray model for a subdistribution hazard that allows for a multivariable analysis involving a cumulative incidence curve (CIC). We also added a numerical example to illustrate the calculation of a conditional probability curve (CPC) defined from a CIC.

The Computer Appendix in the second edition of this text provided step-by-step instructions for using the computer packages STATA, SAS, and SPSS to carry out the survival analyses presented in the main text. We expanded this Appendix to include the free internet-based computer software package called R. We have also updated our description of STATA (version 10.0), SAS (version 9.2), and SPSS (version PASW 18). The application of these computer packages to survival data is described in separate self-contained sections of the Computer Appendix, with the analysis of the same datasets illustrated in each section.

In addition to the above new material, the original nine chapters have been modified slightly to correct for errata in the second edition and to add or modify exercises provided at the end of some chapters.

The authors' Web site for this textbook has the following Web-link: [...] This Web site includes information on how to order this second edition from the publisher and a freely downloadable zip-file containing data-files for examples used in the textbook.

### Suggestions for Use

This text was originally intended for self-study, but in the 15 years since the first edition was published, it has also been effectively used as a text in a standard lecture-type classroom format. The text may also be used to supplement material covered in a course or to review previously learned material in a self-instructional course or self-planned learning activity. A more individualized learning program may be particularly suitable to a working professional who does not have the time to participate in a regularly scheduled course.

In working with any chapter, the learner is encouraged first to read the abbreviated outline and the objectives and then work through the presentation. The reader is then encouraged to read the detailed outline for a summary of the presentation, work through the practice exercises, and, finally, complete the test to check what has been learned.

### Recommended Preparation

The ideal preparation for this text on survival analysis is a course on quantitative methods in epidemiology and a course in applied multiple regression. Also, knowledge of logistic regression, modeling strategies, and maximum-likelihood techniques is crucial for the material on the Cox and parametric models described in Chapters 3-9.

Recommended references on these subjects, with suggested chapter readings, are:

Kleinbaum D, Kupper L, Nizam A, and Muller K, Applied Regression Analysis and Other Multivariable Methods, Fourth Edition, Cengage Publishers, 2007, Chapters 1-16, [...]

Kleinbaum D, Kupper L, and Morgenstern H, Epidemiologic Research: Principles and Quantitative Methods, John Wiley and Sons, Publishers, New York, 1982, Chapters [...]

Kleinbaum D and Klein M, Logistic Regression: A Self-Learning Text, Third Edition, Springer Publishers, New York, 2010, Chapters 4-7, 11.

Kleinbaum D, ActivEpi: A CD-ROM Electronic Textbook on Fundamentals of Epidemiology, Springer Publishers, New York, 2002, Chapters [...]

A first course on the principles of epidemiologic research would be helpful, since all chapters in this text are written from the perspective of epidemiologic research. In particular, the reader should be familiar with the basic characteristics of epidemiologic study designs and should have some idea of the frequently encountered problem of controlling for confounding and assessing interaction/effect modification. The above reference, ActivEpi, provides a convenient and hopefully enjoyable way to review epidemiology.


## Acknowledgments

We thank Dr. Val Gebski of the NHMRC Clinical Trials Centre, Sydney, Australia, for providing continued insight on current methods of survival analysis and review of new additions to the manuscript for this edition.

Finally, David Kleinbaum and Mitch Klein thank Edna Kleinbaum and Becky Klein for their love, support, companionship, and sense of humor during the writing of this third edition.


## Contents

- Preface
- Acknowledgments
- Chapter 1: Introduction to Survival Analysis
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 2: Kaplan-Meier Survival Curves and the Log-Rank Test
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 3: The Cox Proportional Hazards Model and Its Characteristics
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 4: Evaluating the Proportional Hazards Assumption
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 5: The Stratified Cox Procedure
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 6: Extension of the Cox Proportional Hazards Model for Time-Dependent Variables
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 7: Parametric Survival Models
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 8: Recurrent Event Survival Analysis
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 9: Competing Risks Survival Analysis
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Chapter 10: Design Issues for Randomized Trials
  - Introduction, Abbreviated Outline, Objectives, Presentation, Detailed Outline, Practice Exercises, Test, Answers to Practice Exercises
- Computer Appendix: Survival Analysis on the Computer
  - A. Stata
  - B. SAS
  - C. SPSS
  - D. R Software
- Test Answers
- References
- Index

