Arjas plots with Stata
1 Fourth Italian Stata Users Group Meeting
September 24-25, 2007, Rome

Arjas plots with Stata
Enzo Coviello

Outline of the talk
- Graphical checks of the proportional hazards assumption
- Brief digression on the hazard plot
- Arjas plots by examples
- -starjas-
2 Graphical checks of the proportional hazards assumption

The validity of Cox's regression analysis relies on the assumption that the hazard rates of individuals with distinct covariate values are proportional. There are several graphical techniques for checking this assumption. Some of them are readily available through official Stata commands, and others can be produced with just a few instructions.

The effects of two categorical covariates are used as examples:
- Distant metastases at diagnosis in the Finnish colon cancer data (available from the Paul Dickman web site with -net get-), where the hazard rates are not proportional
- Detection method in a sample from an Italian breast cancer screening study, where the proportionality assumption holds
3 Distant Metastases in Colon Cancer

    . stcox distant, sch(sch) sca(sca) nolog
    . estat phtest

    Test of proportional-hazards assumption
    Time: Time
    [Test table with columns chi2, df, Prob>chi2 for the global test; values lost in transcription]

[Figure: scaled Schoenfeld residuals for distant versus time from diagnosis, bandwidth = .8. Distant Metastases in Colon Cancer]

Invited to breast cancer screening test

    . stcox invited, sca(sca) sch(sch)
    . estat phtest

    Test of proportional-hazards assumption
    Time: Time
    [Test table with columns chi2, df, Prob>chi2 for the global test; values lost in transcription]

[Figure: scaled Schoenfeld residuals for invited versus time from diagnosis. Invited to breast cancer screening test]
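The two cancer data sets are not needed to try this check. The sketch below is an illustration on public data, not part of the talk: it assumes a Stata release (11 or later, to my knowledge) in which -estat phtest- computes the Schoenfeld residuals itself, so the sch() and sca() options on -stcox- are not required, and it uses the drugtr example data with its covariate drug as stand-ins for the covariates discussed here.

```stata
* Illustrative sketch on public data (assumed stand-in, not the talk's data)
webuse drugtr, clear          // example survival data, already stset
stcox drug, nolog             // Cox model with one binary covariate
estat phtest, detail          // covariate-specific and global tests
estat phtest, plot(drug)      // smoothed scaled Schoenfeld residuals vs time
```

A flat smooth in the last plot is what proportional hazards would predict. A clear trend, as reported for distant, points to a time-varying effect.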
4
    . stphplot, by(distant) noneg noln

[Figure: ln cumulative hazard versus analysis time for distant = 0 and distant = 1. Distant Metastases in Colon Cancer]

    . stphplot, by(invited) noneg nosh

[Figure: ln cumulative hazard versus analysis time for Not Invited and All Invited. Invited to breast cancer screening test]

These curves should be approximately parallel if the hazard rates are proportional.

An alternative to the previous approach is to plot the difference between the log cumulative hazards. If proportionality holds, we should see a roughly horizontal line, which is easier to assess than whether two curves are parallel.

[Figure: differences in log cumulative hazard versus time from diagnosis. Distant Metastases in Colon Cancer]

[Figure: differences in log cumulative hazard versus time from diagnosis. Invited to breast cancer screening test]
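The slides do not show how the difference plot was produced. One possible way, sketched here under stated assumptions, is to take Nelson-Aalen estimates by group, carry each group's last value forward along the time axis, and plot the difference of the logs. The data set (drugtr), the covariate drug and all new variable names are illustrative choices; tied times across groups are handled only approximately.

```stata
* Illustrative sketch: difference in log cumulative hazard between two groups
webuse drugtr, clear                      // public example data, already stset
sts generate H = na, by(drug)             // Nelson-Aalen cumulative hazard per group
generate double H1 = H if drug == 1
generate double H0 = H if drug == 0
sort _t
replace H1 = H1[_n-1] if missing(H1)      // carry last known value forward
replace H0 = H0[_n-1] if missing(H0)
generate double dlogH = ln(H1) - ln(H0)   // missing while either hazard is still 0
twoway line dlogH _t, sort connect(stairstep) ///
    ytitle("Difference in log cumulative hazard")
```

Under proportional hazards the plotted step function should fluctuate around a constant, the log hazard ratio.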
5 Andersen Plot

Another graphical check based on the cumulative hazard is the so-called Andersen plot. For a binary covariate such as distant (or invited) we simply plot H(distant=1) versus H(distant=0).

There is no specific command to obtain this plot, but the code is straightforward:

    stcox, estimate strata(distant) basecha(Hdist)
    g double Hdist1 = Hdist if distant==1
    g double Hdist0 = Hdist if distant==0
    g Hhat = Hdist0 * [multiplier lost in transcription]
    sort _t
    replace Hdist1 = Hdist1[_n-1] if Hdist1==.
    replace Hdist0 = Hdist0[_n-1] if Hdist0==.
    twoway line Hdist1 Hhat Hdist0

Note that the variable being checked is specified in strata() in -stcox-.

If the proportionality of hazards holds, we should see a straight line passing through the origin.

[Figure: Andersen plot, cumulative hazard for All Invited versus cumulative hazard for Not Invited. Invited to breast cancer screening test]
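The recipe on this slide was written for the Stata release current at the time of the talk. The sketch below is a hedged adaptation, not the author's code: it assumes a later release in which the baseline cumulative hazard is obtained with -predict, basechazard- after -stcox-, it takes the multiplier for the reference line from a separately fitted Cox model, and it uses the public drugtr data with the covariate drug as a stand-in. The scalar hr and the variable names are hypothetical.

```stata
* Illustrative sketch of an Andersen plot on public data
webuse drugtr, clear                      // example data, already stset
stcox drug, nolog
scalar hr = exp(_b[drug])                 // estimated hazard ratio for the reference line
stcox, estimate strata(drug)              // null model stratified on the checked variable
predict double H, basechazard             // stratum-specific cumulative hazard
generate double H1 = H if drug == 1
generate double H0 = H if drug == 0
sort _t
replace H1 = H1[_n-1] if missing(H1)      // carry last known value forward
replace H0 = H0[_n-1] if missing(H0)
generate double Hhat = H0 * hr            // line implied by a constant hazard ratio
twoway line H1 Hhat H0, sort
```

If the hazards are proportional, the H1 curve should track the Hhat line. Systematic curvature away from it indicates a hazard ratio that changes over follow-up.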
6 Andersen plot

[Figure: Andersen plot, cumulative hazard for distant = 1 versus cumulative hazard for distant = 0. Distant Metastases in Colon Cancer]

Otherwise we see a concave or convex curve. A dashed line corresponding to the hazard ratio estimated in the Cox model has been drawn. For example, the hazard ratio of distant metastases at diagnosis is 7.2. The Andersen plot tells us that the actual effect is greater than the estimated effect in the first part of the follow-up and smaller than the estimated effect in the rest.

Brief digression on the hazard plot
7 Plotting the hazard function for each level of a categorical covariate is a direct way to check the proportionality of hazards. To obtain this plot we have to use -sts graph, hazard- and not -stcurve, hazard-, because we must compare nonparametric hazard estimates rather than model-based hazard estimates. If the hazards are proportional, the hazard curves are approximately parallel on a log scale.

Hazard estimates are obtained by applying a kernel-smoothing technique to the cumulative hazard estimates. In Stata 10, smoothed hazards are bias-adjusted to take into account the restricted range of data available near the boundaries of the analysis time.

Let me recall -stkerhaz-, a version 7 command available from the SSC Archive, which computes the same bias-adjusted estimates by applying the so-called asymmetric kernel near the boundaries.

Is -stkerhaz- still useful now that version 10 has been released?
8 In some cases, yes: -stkerhaz- allows nonparametric hazard estimates to be saved in a file (-stcurve- saves estimates, but -sts graph- does not). This can be useful for non-standard graphs. For example:

- The bandwidth is a key parameter in the kernel-smoothing method. It may be sensible to use a short bandwidth at the start of the follow-up and a larger one at the end, when the number of subjects still under observation can be seriously reduced. With -stkerhaz- we can save smoothed hazard estimates obtained with two (or more) bandwidths, combine the files, and then produce a hazard plot with different degrees of smoothing in different parts of the same curve.
- An excess risk model compares the mortality of a study population with the mortality of a reference population. By subtracting the cumulative expected hazard of the reference population (-stexpect-) from the cumulative hazard (-sts gen-) we can estimate the cumulative excess hazard. From the cumulative excess hazard, via -stkerhaz-, we can obtain and plot smoothed excess hazard estimates.
9
    sts graph, hazard by(distant) width(3 3) k(epan2) ysca(log) ///
        yla( , angle(0))

[Figure: smoothed hazard estimates on a log scale versus analysis time for distant = 0 and distant = 1, bandwidth = 3 years. Distant Metastases in Colon Cancer]

Here we see a decreasing distance between the hazard curves, i.e. a decreasing hazard ratio.

We should be cautious in evaluating the distance in the right tail of the curves, since confidence intervals may be very wide there.
10
    sts graph, hazard by(invited) width(3 3) k(epan2) ti("") ysca(log) ///
        yla( , angle(0))

[Figure: smoothed hazard estimates on a log scale versus analysis time for Not Invited and All Invited, bandwidth = 3 years. Invited to breast cancer screening test]

Here we see a constant distance between the hazard curves, i.e. a constant hazard ratio.
11 Arjas plot by examples

The Arjas plot compares the total number of expected events with the total number of observed events occurring up to each event time. The cumulative hazard, calculated by the Nelson-Aalen estimator or by fitting a Cox model, is the basis for working out the expected number of events.

The Arjas plot allows us to check:
1. whether a covariate should be included in a proportional hazards model
2. whether a covariate has a proportional hazards effect.
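The comparison described above can be written out as follows. This is a sketch of the usual formulation of the Arjas plot; the notation (level g, follow-up time T_i, covariates x_i, estimated cumulative hazard \hat{H}) is introduced here for illustration and does not appear on the slides.

```latex
% Ordered event times within level g of the covariate being checked:
%   t_{(1)} \le t_{(2)} \le \dots \le t_{(d_g)}
% Observed number of events in level g up to the j-th event time:
\[
  N_g\bigl(t_{(j)}\bigr) = j, \qquad j = 1, \dots, d_g .
\]
% Expected number of events in level g up to the same time, using the
% cumulative hazard estimated WITHOUT the covariate being checked:
\[
  \hat{E}_g\bigl(t_{(j)}\bigr)
    = \sum_{i \in g} \hat{H}\bigl(\min(T_i,\, t_{(j)}) \mid x_i\bigr),
  \qquad
  \hat{H}(t \mid x_i) = \hat{H}_0(t)\, \exp\bigl(x_i^{\top}\hat{\beta}\bigr).
\]
% The Arjas plot draws \hat{E}_g(t_{(j)}) against j for each level g.
```

If the fitted model is adequate for every level g, the points lie close to the 45° line. When no other covariates are fitted, \hat{H} reduces to the Nelson-Aalen estimate for the whole cohort.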
12 1. Should a covariate be included in the model?

Assume that a Cox model has been fitted with covariates x1 and x2, and we wish to check whether a categorical covariate x3 should be included in the model.

To this end, we first fit a Cox model with covariates x1 and x2, ignoring x3. The resulting cumulative hazard estimate can be used to compute the total expected number of events at each event time and for each level of x3.

If plotting this estimate against the total number of observed events for each level of x3 gives linear curves with slopes differing from 1, then x3 should be included in the model.

Let us consider the effect of the method of detection of cases from the breast cancer screening study.

    . stcox invited, nolog nosh

    Cox regression -- Breslow method for ties
    No. of subjects = 2962
    [Remaining output (number of failures, hazard ratio for invited) lost in transcription]
13
    starjas invited, twoway options

In the plot we see a curve for each level of the invited covariate, each differing linearly from the 45° line.

[Figure: Arjas Plots - Invited. Estimated cumulative number of events versus cumulative number of events, for Not Invited and All Invited]

In the breast cancer screening study we investigated whether the improved survival of cases invited to the screening test depended on the better tumour characteristics of cases diagnosed at screening, or was biased by spurious factors (selection bias, lead time bias). To this end we compared the survival experience of invited and not invited cases after adjusting for tumour characteristics: T, N and grading. If spurious factors play a role, we should see some residual independent effect of the method of detection.

With the Arjas plot we can investigate whether a covariate should be included in the model after adjustment for other covariates. The next slide shows the Arjas plot for the method of detection after adjusting for T, N and grading.
14
    starjas invited, adjust(T2 T3 T4 T5 N2 N3 Gr2 Gr3 Gr9 yydx_m)

The plot tells us that the method of detection has no residual effect after adjusting for T, N and grading.

[Figure: Arjas Plots. Estimated cumulative number of events versus cumulative number of events, for Not Invited and All Invited]

    . xi: stcox invited T2 T3 T4 T5 N2 N3 Gr2 Gr3 Gr9 yydx_m

    Cox regression -- Breslow method for ties
    No. of subjects = 2962          Number of obs = 2962
    [Coefficient table with columns Haz. Ratio, Std. Err., z, P>|z|, 95% Conf. Interval; values for invited lost in transcription, remaining output omitted]

Thus, the Arjas plot gives graphical expression to the effect of confounders on the main factor.
15 2. Does a covariate have a proportional hazards effect?

Let us refer to the Bone Marrow Transplants data set (Klein and Moeschberger, p. 10 and p. 373), where the hazards of autologous and allogeneic bone marrow transplants are clearly non-proportional.

[Figure: smoothed hazard estimates versus analysis time for allo transplant and auto transplant]

The hazard ratio of the allotransplant variable is greater than 1 at the beginning of the follow-up, but it declines sharply and becomes less than 1 after 6 months.

    starjas allotransplant

[Figure: Arjas Plots - alloauto. Estimated cumulative number of events versus cumulative number of events, for allo transplant and auto transplant]

The curves differ nonlinearly from the 45° line: they show a decreasing divergence until they become parallel to the 45° line, when the hazard ratio approaches 1, and finally converge toward the 45° line when the hazard ratio drops below 1.
16
    starjas distant

[Figure: Arjas Plots - Distant Metastases. Estimated cumulative number of events versus cumulative number of events, for distant = 0 and distant = 1]

Cases without distant metastases produce a curve, as expected. Cases with distant metastases produce an approximately straight line. Why does this happen?

[Figure: smoothed hazard estimates with 95% CI for distant = 0 and distant = 1. Numbers at risk, with events in parentheses, as transcribed (some values lost): (824) 3291 (333) 1998 (132) 1155 (39) 570 (10) (2273) 462 (172) 194 (42) 96 (9) 44 (2) 6]

In the group with distant metastases only a few events (numbers in parentheses) occur after 2 years of follow-up. Comparing observed and expected deaths after 2 years therefore contributes only a very short segment to the Arjas plot.
17
    starjas distant

[Figures repeated from the previous slide: Arjas Plots - Distant Metastases, and smoothed hazard estimates with 95% CI for distant = 0 and distant = 1]

The Arjas plot suggests that the proportional hazards assumption for this variable may not be as seriously questionable as the other checks indicate.

starjas
18 help for starjas

Title
    starjas    Arjas plot to check proportional hazards assumption

Syntax
    starjas varname [if exp] [in range] [, adjust(varlist) levelplot(# [#] ...) atobs(#) rrglance(#) twoway_options]

    options           description
    Options
      adjust          specify the variables fitted in the Cox model before checking if varname is proportional
      levelplot       specify the levels of varname to be displayed in the plot
      atobs           restrict the plot to the first # events
      rrglance        draw a line approximately corresponding, in the case of a binary covariate, to a # relative risk
    Y-Axis, X-Axis, Caption, Legend
      twoway_options  some of the options documented in [G] twoway_options
    Add plot
      plot(plot)      add other plots to generated graph

    You must stset your data before using starjas; see stset.

levelplot(# [#] ...) can be useful when the variable we are checking has more than two levels. As an example, consider the effect of age at diagnosis in the Finnish colon cancer data set:

    egen agecat, cut( ) icodes label
    [command as transcribed; the cut points and part of the syntax were lost in transcription]

    starjas agecat

[Figure: Arjas Plots - agecat. Estimated cumulative number of events versus cumulative number of events, all age categories]

    starjas agecat, levelplot(2 3)

[Figure: Arjas Plots - agecat. Estimated cumulative number of events versus cumulative number of events, levels 2 and 3 only]
19 In interpreting plots for categorical covariates with more than two levels, we must consider that the 45° line corresponds to the plot of expected and observed counts in the overall cohort of patients. The plots for the age category 0-49 and for two further categories (labels lost in transcription) are above this line because the hazards of these groups are lower than the hazard of the overall cohort; the opposite holds for cases in the age category > 70.

[Figure: Arjas Plots - agecat. Estimated cumulative number of events versus cumulative number of events]

In the case of a binary covariate, the dashed line in the Arjas plot indicates absence of effect, i.e. hazard ratio = 1. rrglance(#) allows us to move this line so that it corresponds to a hazard ratio of #.

    . stcox invited, nolog nosh

    Cox regression -- Breslow method for ties
    [Output (number of subjects, hazard ratio for invited) lost in transcription]

    starjas invited, rrgla( )
    [value passed to rrgla() lost in transcription]

[Figure: Arjas plot for invited with the reference line moved by rrglance(). Estimated cumulative number of events versus cumulative number of events, for Not Invited and All Invited]
20 Conclusions

The Arjas plot is useful for addressing two critical questions in modeling survival data with Cox regression:
- Does the variable have to be included in the model?
- Is its effect proportional?

With this approach, we can try to answer both questions using the same graph.

-starjas- makes it easy to obtain the Arjas plot and may complement the other Cox diagnostic plots available in the official package. It is freely downloadable from the SSC Archive by typing from within Stata:

    ssc install starjas

I hope users enjoy it.

Thanks
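As a closing sketch of the workflow described in the talk, the lines below install the command and draw an unadjusted and an adjusted Arjas plot. The calls follow the syntax shown on the help slide. The drugtr data and the variables drug and age are illustrative stand-ins for the cancer data sets, and whether -starjas- runs unchanged on current Stata releases is an assumption that has not been checked here.

```stata
* Illustrative sketch on public data (assumed stand-in, not the talk's data)
ssc install starjas           // one-time download from the SSC Archive
webuse drugtr, clear          // example survival data, already stset
starjas drug                  // should drug be in the model? is its effect proportional?
starjas drug, adjust(age)     // same check after fitting age in the Cox model
```

Curves that depart linearly from the 45° line suggest including the covariate. Nonlinear departures suggest a non-proportional effect.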
Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
Formulas, Functions and Charts
Formulas, Functions and Charts :: 167 8 Formulas, Functions and Charts 8.1 INTRODUCTION In this leson you can enter formula and functions and perform mathematical calcualtions. You will also be able to
Module 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
Module 2: Introduction to Quantitative Data Analysis
Module 2: Introduction to Quantitative Data Analysis Contents Antony Fielding 1 University of Birmingham & Centre for Multilevel Modelling Rebecca Pillinger Centre for Multilevel Modelling Introduction...
SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES
SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR
Confidence Intervals for Cp
Chapter 296 Confidence Intervals for Cp Introduction This routine calculates the sample size needed to obtain a specified width of a Cp confidence interval at a stated confidence level. Cp is a process
Rockefeller College University at Albany
Rockefeller College University at Albany PAD 705 Handout: Hypothesis Testing on Multiple Parameters In many cases we may wish to know whether two or more variables are jointly significant in a regression.
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009
HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. A General Formulation 3. Truncated Normal Hurdle Model 4. Lognormal
A Guide to Imputing Missing Data with Stata Revision: 1.4
A Guide to Imputing Missing Data with Stata Revision: 1.4 Mark Lunt December 6, 2011 Contents 1 Introduction 3 2 Installing Packages 4 3 How big is the problem? 5 4 First steps in imputation 5 5 Imputation
The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces
The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces Or: How I Learned to Stop Worrying and Love the Ball Comment [DP1]: Titles, headings, and figure/table captions
Guide to Biostatistics
MedPage Tools Guide to Biostatistics Study Designs Here is a compilation of important epidemiologic and common biostatistical terms used in medical research. You can use it as a reference guide when reading
Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.
Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of
Advanced Quantitative Methods for Health Care Professionals PUBH 742 Spring 2015
1 Advanced Quantitative Methods for Health Care Professionals PUBH 742 Spring 2015 Instructor: Joanne M. Garrett, PhD e-mail: [email protected] Class Notes: Copies of the class lecture slides
Geometric Optics Converging Lenses and Mirrors Physics Lab IV
Objective Geometric Optics Converging Lenses and Mirrors Physics Lab IV In this set of lab exercises, the basic properties geometric optics concerning converging lenses and mirrors will be explored. The
Logit and Probit. Brad Jones 1. April 21, 2009. University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science
Logit and Probit Brad 1 1 Department of Political Science University of California, Davis April 21, 2009 Logit, redux Logit resolves the functional form problem (in terms of the response function in the
Using Stata for Categorical Data Analysis
Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,
Introduction to Quantitative Methods
Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................
DATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)
Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and
Standards for Mathematical Practice: Commentary and Elaborations for 6 8
Standards for Mathematical Practice: Commentary and Elaborations for 6 8 c Illustrative Mathematics 6 May 2014 Suggested citation: Illustrative Mathematics. (2014, May 6). Standards for Mathematical Practice:
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
What Does the Normal Distribution Sound Like?
What Does the Normal Distribution Sound Like? Ananda Jayawardhana Pittsburg State University [email protected] Published: June 2013 Overview of Lesson In this activity, students conduct an investigation
Analysing Questionnaires using Minitab (for SPSS queries contact -) [email protected]
Analysing Questionnaires using Minitab (for SPSS queries contact -) [email protected] Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:
