Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.)

1 Unit 12: Logistic Regression
Supplementary Chapter 14 in IPS, on CD (Chap. 16, 5th ed.)
Logistic regression generalizes the methods for 2-way tables.
It adds the capability of studying several predictors, but it is limited to binary response variables.
It is similar in intent to linear regression, but the details are different.
It is a method for estimating the joint association between several predictors and a response variable.
It is typically useful in some class projects.

3 Betting in a fair game
An American roulette wheel has 38 slots: 1, 2, 3, ..., 36, 0, 00.
If you place a $1 bet on 00 for a single spin of the wheel, you have a 1/38 chance of winning: 1 way to win, 37 ways to lose.
Equivalently, the casino has 37 ways to win and 1 way to lose.
The odds of winning are 37 to 1 for the house and 1 to 37 for you.

4 Betting in roulette...
For the game to be fair:
  The casino keeps your $1 if 00 does not come up.
  The casino pays $37 if 00 comes up, and you keep your bet.
If X is your winnings from a $1 bet, E(X) = -1 (37/38) + 37 (1/38) = 0.
Casinos stay in business: by paying out 35 to 1, they ensure that roulette is not a fair game. In this case,
E(X) = -1 (37/38) + 35 (1/38) = -(2/38) = -0.0526.
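
The two expected values above can be checked with exact fractions. This is a minimal sketch; the function name `expected_winnings` is made up for illustration and is not part of the original slides.

```python
from fractions import Fraction

def expected_winnings(payout, p_win):
    """Expected net winnings of a $1 bet: win `payout` dollars with
    probability p_win, otherwise lose the $1 stake."""
    return payout * p_win - 1 * (1 - p_win)

p_win = Fraction(1, 38)                   # one winning slot out of 38

fair = expected_winnings(37, p_win)       # fair payout of 37 to 1
actual = expected_winnings(35, p_win)     # casino payout of 35 to 1

print("fair game:  ", fair)
print("casino game:", actual, "=", round(float(actual), 4))
```

Using `Fraction` keeps the arithmetic exact, so the fair game comes out as exactly zero rather than a tiny floating-point residue.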

5 Converting probabilities to odds and log(odds)
In a game of chance, the odds of winning are the same as the ratio of the money that should be bet by the two players.
In roulette, the odds of your winning are the ratio of the probability of winning to the probability of losing: p/(1-p) = (1/38) / (37/38) = 1/37.
Typically, odds are quoted to show the ratio of the payout: 37 to 1 in this case.
The values of odds range from 0 to ∞. Think of probabilities 0.01, 0.001, ..., and 0.99, 0.999, ...
We will use a transformation of odds to log(odds). The values of log(odds) range from -∞ to +∞.

6 Odds vs log(odds): transformation of p
Why consider such a transformation?
Answer: it transforms a variable with 0 < p < 1 into a quantitative variable that ranges from -∞ to +∞.
It is a simple algebraic operation to go back and forth between probabilities and log(odds).
Table columns: p, odds = p/(1-p), log odds = log(p/(1-p)).
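
A short sketch of the back-and-forth between p, odds, and log(odds) follows. The helper names and the list of probabilities are illustrative choices, not values taken from the slides; natural logarithms are assumed throughout.

```python
import math

def odds(p):
    """Odds corresponding to a probability 0 < p < 1."""
    return p / (1 - p)

def log_odds(p):
    """Natural log of the odds (the logit of p)."""
    return math.log(odds(p))

def prob_from_log_odds(z):
    """Inverse transformation: recover p from a log odds value."""
    return 1 / (1 + math.exp(-z))

# Illustrative probabilities running from near 0 to near 1
for p in [0.001, 0.01, 0.1, 0.5, 0.9, 0.99, 0.999]:
    print(f"p = {p:<6} odds = {odds(p):10.4f}  log odds = {log_odds(p):8.4f}")
```

Probabilities below 0.5 give negative log odds and probabilities above 0.5 give positive ones, which is why the transformed scale is unbounded in both directions.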

7 Computing odds in data, an example
The example on the next slide is very similar to IPS Example 8.1 (5th and 6th editions), but the numbers are from the 5th ed. Be careful when reading the example in the 6th ed.

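8 Example: binge drinking survey
Binger   Men             Women           Total
Yes      1,630 (22.7%)   1,684 (17.0%)    3,314 (19.4%)
No       5,550 (77.3%)   8,232 (83.0%)   13,782 (80.6%)
Total    7,180 (100%)    9,916 (100%)    17,096 (100%)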

9 Logistic Regression: idea behind logistic regression
Let $\hat{p}_M$ be the proportion of men who are binge drinkers; $\log(\hat{p}_M/(1-\hat{p}_M))$ is the log odds.
Let $\hat{p}_F$ be the proportion of women who are binge drinkers; $\log(\hat{p}_F/(1-\hat{p}_F))$ is the log odds.
The ratio of the odds (called the odds ratio) of men to women being binge drinkers is
$$\frac{\hat{p}_M/(1-\hat{p}_M)}{\hat{p}_F/(1-\hat{p}_F)} = \left(\frac{\hat{p}_M}{1-\hat{p}_M}\right)\left(\frac{1-\hat{p}_F}{\hat{p}_F}\right) = \frac{0.294}{0.205} \approx 1.43$$
Now recall $\log(x/y) = \log(x) - \log(y)$.

10 Idea behind logistic regression
In the binge drinking table,
$$\log\left[\left(\frac{\hat{p}_M}{1-\hat{p}_M}\right)\left(\frac{1-\hat{p}_F}{\hat{p}_F}\right)\right] = \log(0.294) - \log(0.205) = -1.225 - (-1.587) = 0.362$$
The log odds for males differ from the log odds for females by a constant.
Logistic regression is a model in which predictors induce changes in log(odds), similar to linear regression, where predictors induce changes in the mean of the response variable.
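
The odds and log odds quoted for the binge drinking example can be reproduced from cell counts. The counts used here (1,630 binge drinkers among 7,180 men and 1,684 among 9,916 women) are an assumption: they are chosen to agree with the totals of 3,314 and 17,096 and the odds of 0.294 and 0.205 that appear in the slides. Treat this as a sketch, not the original computation.

```python
import math

# Assumed counts, consistent with the totals and odds quoted in the slides
men_yes, men_total = 1630, 7180
women_yes, women_total = 1684, 9916

p_men = men_yes / men_total          # sample proportion for men
p_women = women_yes / women_total    # sample proportion for women

odds_men = p_men / (1 - p_men)
odds_women = p_women / (1 - p_women)

odds_ratio = odds_men / odds_women
log_or = math.log(odds_men) - math.log(odds_women)   # log(x/y) = log x - log y

print(f"odds (men)   = {odds_men:.3f}, log odds = {math.log(odds_men):.3f}")
print(f"odds (women) = {odds_women:.3f}, log odds = {math.log(odds_women):.3f}")
print(f"odds ratio   = {odds_ratio:.3f}, log odds ratio = {log_or:.3f}")
```

The difference of the two log odds is the same number as the log of the odds ratio, which is the constant shift that the logistic model attaches to the gender indicator.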

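11 Model for Logistic Regression
Set the log odds to be a linear combination of the predictor variables:
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$$
This is the Logistic Regression Model. It is sometimes equivalently written as:
$$p = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$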
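
To make the two equivalent forms of the model concrete, here is a small sketch that computes a probability from a linear combination of predictors and then recovers the log odds from that probability. The coefficients and predictor values are hypothetical and chosen only for illustration.

```python
import math

def linear_predictor(intercept, coefs, xs):
    """b0 + b1*x1 + ... + bk*xk, i.e. the modeled log odds."""
    return intercept + sum(b * x for b, x in zip(coefs, xs))

def probability(intercept, coefs, xs):
    """Probability form of the model: e^eta / (1 + e^eta)."""
    eta = linear_predictor(intercept, coefs, xs)
    return math.exp(eta) / (1 + math.exp(eta))

# Hypothetical coefficients and predictor values (not from the slides)
b0 = -2.0
coefs = [0.5, 1.2]
xs = [1.0, 0.0]

p = probability(b0, coefs, xs)
log_odds_back = math.log(p / (1 - p))   # should equal the linear predictor

print(f"linear predictor = {linear_predictor(b0, coefs, xs):.4f}")
print(f"probability      = {p:.4f}")
print(f"log(p/(1-p))     = {log_odds_back:.4f}")
```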

12 The Logistic Regression Model
Predictor variables (x's) can be quantitative or binary.
Formulas for the estimates are more complex than least squares.
The omnibus test of the model is now a χ² test, not an F test.
We can test each predictor variable (x_i) for its contribution, but now this is a z test, not a t test.
Assumptions of this model are quite complex and are not often checked.
The logistic regression model is widely used.
Coefficients can be derived directly in some 2-way tables.
Back to the binge drinking example.

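13 Example: binge drinking survey
Binger          Men             Women           Total
Yes             1,630 (22.7%)   1,684 (17.0%)    3,314 (19.4%)
No              5,550 (77.3%)   8,232 (83.0%)   13,782 (80.6%)
Total           7,180 (100%)    9,916 (100%)    17,096 (100%)
Odds (Yes/No)   0.294           0.205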

14 The logistic model - binge drinking
From the previous slide (odds of being a binge drinker):
  For men: Log odds = log(0.294) = -1.225
  For women: Log odds = log(0.205) = -1.587
The logistic model for one predictor (gender) is
  Log(p/(1-p)) = Log odds = b0 + b1 X1
where Y = 1 if a binge drinker, 0 otherwise, and X1 = 1 if male, 0 if female.
So the logistic model is
  For men: Log odds = b0 + b1 = -1.225
  For women: Log odds = b0 = -1.587
Solving, b0 = -1.587 and b1 = -1.225 - b0 = 0.362.
Thus the fitted logistic model for this example is
  Log odds = -1.587 + 0.362 X1

15 The logistic model - binge drinking
Working backwards to confirm this fitted model:
  Log odds = log(p/(1-p)) = -1.587 + 0.362 X1, where X1 = 1 if male and X1 = 0 if female
For men:
  Log odds = log(p/(1-p)) = -1.587 + 0.362 (1) = -1.225 and odds = e^(-1.225) = 0.294
  Thus the proportion of binge drinkers is odds / (odds + 1) = 0.294 / 1.294 = 0.227
For women:
  Log odds = log(p/(1-p)) = -1.587 and odds = e^(-1.587) = 0.205
  Thus the proportion of binge drinkers is odds / (odds + 1) = 0.205 / 1.205 = 0.170
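
The solve-then-check steps for the one-predictor model can be sketched as follows. The two log odds values (-1.225 for men, -1.587 for women) are assumptions: -1.587 is quoted in the slides, while -1.225 is the log of the quoted odds of about 0.294. The variable names are illustrative.

```python
import math

# Assumed group log odds (see the note above about where they come from)
log_odds_men = -1.225
log_odds_women = -1.587

# With X1 = 1 for men and 0 for women:
#   women: b0      = log_odds_women
#   men:   b0 + b1 = log_odds_men
b0 = log_odds_women
b1 = log_odds_men - b0

def fitted_proportion(x1):
    """Work backwards from the fitted log odds to a proportion."""
    fitted_odds = math.exp(b0 + b1 * x1)
    return fitted_odds / (fitted_odds + 1)

p_men = fitted_proportion(1)
p_women = fitted_proportion(0)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"fitted proportion, men:   {p_men:.3f}")
print(f"fitted proportion, women: {p_women:.3f}")
```

Because the model has as many coefficients as there are groups, the fitted proportions simply reproduce the two observed group proportions.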

16 Comparing two proportions: relative risk and odds ratio
           S       F       Total
Group 1    a       b       a + b
Group 2    c       d       c + d
Total      a + c   b + d

17 Odds ratio
As with RR, an odds ratio of 1 indicates that the proportion of successes (events) is the same in both groups.
RR is easier to interpret (it is the ratio of sample proportions).
When successes are rare, RR and OR are very similar.
When successes are common, RR and OR are similar only if they are close to 1; OR tends to overstate differences.
Example: binge drinking
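
The comparison between RR and OR can be illustrated with the a, b, c, d layout of the 2x2 table. This sketch uses the usual definitions RR = [a/(a+b)] / [c/(c+d)] and OR = (a/b) / (c/d). Both sets of counts are assumptions: the "rare" table is invented, and the "common" table uses binge drinking counts chosen to match the totals quoted in the slides.

```python
def rr_and_or(a, b, c, d):
    """Relative risk and odds ratio for a 2x2 table with rows
    (a successes, b failures) and (c successes, d failures)."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a / b) / (c / d)
    return rr, odds_ratio

# Rare successes (invented counts): RR and OR nearly coincide
rr_rare, or_rare = rr_and_or(10, 990, 5, 995)

# Common successes (assumed binge drinking counts, men vs women)
rr_binge, or_binge = rr_and_or(1630, 5550, 1684, 8232)

print(f"rare:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"common: RR = {rr_binge:.3f}, OR = {or_binge:.3f}")
```

In the second table the OR sits further from 1 than the RR does, which is the sense in which the OR overstates the difference when successes are common.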

18 Odds ratio and logistic regression
The odds ratio is the key output from a logistic regression.
An OR is calculated for each predictor variable.
OR measures the strength of the effect on p (the probability of 'success').
Example: binge drinking
  Log odds = -1.587 + 0.362 X1, where X1 = 1 if M, 0 if F
  For men: Log odds = b0 + b1 = -1.225
  For women: Log odds = b0 = -1.587
Let OR be the odds ratio of men to women:
  Log(OR) = log(odds for men) - log(odds for women) = (b0 + b1) - (b0) = b1
So OR = e^(b1). For the binge drinking example, OR = e^(0.362) = 1.44.

19 Inference for logistic regression parameters
A 95% confidence interval for the coefficient $\beta_1$ is given by $b_1 \pm 1.96\,\mathrm{s.e.}(b_1)$.
A 95% confidence interval for the odds ratio $e^{\beta_1}$ is given by $e^{\,b_1 \pm 1.96\,\mathrm{s.e.}(b_1)}$.
To test the null hypothesis $H_0: \beta_1 = 0$ (i.e., no association between the response variable and the predictor variable $X_1$), use
$$Z = \frac{b_1}{\mathrm{s.e.}(b_1)}$$
Z has (approximately) a N(0, 1) distribution when $H_0$ is true.
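
For a single binary predictor these formulas can be evaluated by hand from a 2x2 table. The sketch below relies on two assumptions that are not stated in the slides: the cell counts (chosen to match the quoted totals of 3,314 and 17,096), and the standard large-sample formula s.e.(b1) = sqrt(1/a + 1/b + 1/c + 1/d) for the log odds ratio of a 2x2 table.

```python
import math

# Assumed cell counts: (yes, no) for men, then (yes, no) for women
a, b = 1630, 5550
c, d = 1684, 8232

b1 = math.log((a * d) / (b * c))               # log odds ratio, men vs women
se_b1 = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # assumed s.e. formula

z = b1 / se_b1                                 # Wald statistic for H0: beta1 = 0
ci_b1 = (b1 - 1.96 * se_b1, b1 + 1.96 * se_b1) # 95% CI for beta1
ci_or = (math.exp(ci_b1[0]), math.exp(ci_b1[1]))  # 95% CI for the odds ratio

print(f"b1 = {b1:.3f}, s.e. = {se_b1:.4f}, z = {z:.2f}")
print(f"95% CI for beta1: ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
print(f"95% CI for OR:    ({ci_or[0]:.2f}, {ci_or[1]:.2f})")
```

Exponentiating the endpoints of the interval for the coefficient gives the interval for the odds ratio, so the odds-ratio interval is not symmetric around the estimate.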

20 Binge drinking: expanding data from a 2x2 table to a rectangular data file
The 2x2 table:
  Binger   Men     Women   Total
  Yes      1,630   1,684    3,314
  No       5,550   8,232   13,782
  Total    7,180   9,916   17,096
Let Binge = 1 if binger, 0 otherwise. Let Sex = 1 if male, 0 otherwise.
Stata commands:
  input Binge Sex Count
  1 1 1630
  1 0 1684
  0 1 5550
  0 0 8232
  end
  expand Count
This creates a rectangular data file with 17,096 rows, one per student.
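
For readers without Stata, the same expansion from cell counts to one row per student can be sketched in Python. The four cell counts are an assumption chosen to match the totals of 3,314 binge drinkers and 17,096 students quoted in the slides; the variable names mirror the Stata variables but are otherwise illustrative.

```python
# Each tuple is (Binge, Sex, Count); Sex = 1 for men, 0 for women.
# Counts are assumed, chosen to match the totals quoted in the slides.
cells = [
    (1, 1, 1630),
    (1, 0, 1684),
    (0, 1, 5550),
    (0, 0, 8232),
]

# Repeat each (Binge, Sex) pair Count times, like Stata's `expand Count`
rows = [(binge, sex) for binge, sex, count in cells for _ in range(count)]

n_rows = len(rows)
n_bingers = sum(binge for binge, _ in rows)
n_men = sum(sex for _, sex in rows)

print("rows:", n_rows)
print("binge drinkers:", n_bingers)
print("men:", n_men)
```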

21 Binge drinking logistic regression
Stata has 2 commands: logistic and logit.
logistic displays ORs; logit displays model coefficients.
Note: b0 = -1.587 and b1 = 0.362, as earlier.
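
As a rough illustration of what a maximum likelihood fit does, here is a minimal Newton-Raphson sketch for a logistic model with one binary predictor, working directly on grouped counts. It is not Stata's implementation; the function name `fit_logit`, the iteration count, and the cell counts (chosen to match the totals quoted in the slides) are all assumptions.

```python
import math

# (Binge, Sex, Count) cells; counts are assumed, as noted above
cells = [(1, 1, 1630), (1, 0, 1684), (0, 1, 5550), (0, 0, 8232)]

def fit_logit(cells, iterations=25):
    """Newton-Raphson for log odds = b0 + b1*x on grouped binary data.
    Returns (b0, b1, standard error of b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0              # gradient of the log likelihood
        h00 = h01 = h11 = 0.0      # information matrix entries
        for y, x, n in cells:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            g0 += n * (y - p)
            g1 += n * (y - p) * x
            w = n * p * (1 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)   # from the inverse information matrix
    return b0, b1, se_b1

b0_hat, b1_hat, se_b1_hat = fit_logit(cells)
print(f"b0 = {b0_hat:.3f}, b1 = {b1_hat:.3f}")
print(f"OR = {math.exp(b1_hat):.3f}, z = {b1_hat / se_b1_hat:.2f}")
```

With a single binary predictor the iterations converge to the same coefficients that can be read off the 2x2 table directly, which makes this a convenient check on the hand calculation.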

22 Binge drinking logistic regression
The logistic command displays the odds ratio.
Notes:
  OR = 1.44, as earlier
  The 95% CI for OR (1.33, 1.55) excludes OR = 1
  z for the Wald test = 9.31 (P < 0.001)

23 Multiple logistic regression
Example: Intensive Care Unit (ICU)
Study of 200 patients admitted to the adult ICU at Baystate Medical Center in Springfield, MA
Response variable: survival until hospital discharge (Surv)
  Surv = 1 if died, 0 if survived
Predictor variables:
  Age, in years (Age)
  Sex = 1 if female, 0 if male (Sex)
  Race = 1 if white, 0 otherwise (Race)
  Heart rate at ICU admission, beats/min. (HRate)
  Level of consciousness at ICU admission (LOC)
    LOC = 1 if deep stupor or coma, 0 otherwise
Source: Hosmer & Lemeshow (2000), Wiley & Sons

24 [Figure: density histograms of Age and of Heart Rate at ICU Admission]
. table LOC
  Level of Consciousness at ICU Admission   Freq.
  No Coma or Deep Stupor                      185
  Deep Stupor                                   5
  Coma                                         10
. table surv
  surv   Freq.
  0        160
  1         40

25 Example ICU, summary of the data
Response variable: survival to hospital discharge
  Surv: N = 200, 20% died
Predictor variables:
  Sex: 38% female
  Age: average is 57.5 yrs., range from 16 to 92 yrs.
  Race: 87.5% white
  HRate: average is 99, range from 39 to 192 beats/min.
  LOC: 7.5% in deep stupor or coma

26 ICU - predictors of death before discharge
A logistic regression with all 5 predictors:
  Age and level of consciousness (LOC) are both significant.
  Re-estimate the model keeping only the significant terms.

27 ICU - predictors of death before discharge
The final logistic model (using logistic command):
  Age and LOC are significant.
Odds ratio:
  For Age (95% CI is to 1.064) P =
  For LOC (95% CI is 7.63 to ) P <

28 ICU - predictors of death before discharge
The final logistic model (using logit command) shows the logistic model coefficients:
  Log odds = log(p/(1-p)) = b0 + b1 Age + 3.59 LOC
Note: e^(b1) = 1.028 (OR for Age) and e^(3.59) ≈ 36.2 (OR for LOC)

29 Interpretation of coefficients of the logistic regression model
The sign of the $\beta_i$ term indicates whether p increases or decreases as $x_i$ increases.
  ICU example: both $\beta_i$ terms were positive, so the risk of death increases with age and with the presence of deep stupor or coma.
The magnitude of the $\beta_i$ term gives the additive change in log odds for a +1 unit change in the predictor variable, holding other predictors fixed.

30 Interpretation of coefficients
The magnitude of the odds ratio term ($= e^{\beta_i}$) gives the multiplicative change in odds for a +1 change in the predictor.
  ICU example: the odds of death increase multiplicatively by 2.8% (OR = 1.028) for each one-year increase in age.
To see this, exponentiate both sides of the logistic model and note
$$\frac{p}{1-p} = e^{\beta_0 + \beta_1 [x+1]} = (e^{\beta_0})(e^{\beta_1 x})(e^{\beta_1}), \quad \text{where } e^{\beta_1} = \mathrm{OR}$$
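
The multiplicative interpretation can be checked numerically. The sketch below uses the age odds ratio of 1.028 from the ICU example; the intercept and the two ages are hypothetical values picked only to show that the ratio of odds does not depend on them.

```python
import math

or_age = 1.028                 # OR per one-year increase in age (from the slides)
beta_age = math.log(or_age)    # corresponding coefficient on the log odds scale

def odds_at(age, intercept):
    """Odds implied by log odds = intercept + beta_age * age."""
    return math.exp(intercept + beta_age * age)

intercept = -3.0               # hypothetical intercept, for illustration only

# A +1 change in age multiplies the odds by e^beta, whatever the starting age
ratio_at_50 = odds_at(51, intercept) / odds_at(50, intercept)
ratio_at_80 = odds_at(81, intercept) / odds_at(80, intercept)

# A 10-year difference multiplies the odds by OR ** 10
or_10_years = or_age ** 10

print(f"odds ratio for +1 year at age 50: {ratio_at_50:.3f}")
print(f"odds ratio for +1 year at age 80: {ratio_at_80:.3f}")
print(f"odds ratio for +10 years:         {or_10_years:.3f}")
```

Note that a 2.8% increase in odds per year compounds, so ten years corresponds to about a 32% increase in odds rather than 28%.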

31 Final Thoughts on Logistic Regression
Some of you will find logistic regression useful in a project, so the last p-set has a logistic regression problem.
It is not covered on the final exam, because we have not had time to digest it.
Logistic regression extends the analysis of two-way tables:
  The response variable must still be binary.
  Predictors can now be categorical or quantitative.
Logistic regression is an example of a class of regression models much more general than linear regression. These models are covered in detail in Stat 138 and Stat

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

Expected Value. 24 February 2014. Expected Value 24 February 2014 1/19

Expected Value. 24 February 2014. Expected Value 24 February 2014 1/19 Expected Value 24 February 2014 Expected Value 24 February 2014 1/19 This week we discuss the notion of expected value and how it applies to probability situations, including the various New Mexico Lottery

More information

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

More information

VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

More information

Improved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC

Improved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC Paper AA08-2013 Improved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC Robert G. Downer, Grand Valley State University, Allendale, MI ABSTRACT

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

More information

5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits

5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits Lecture 23 1. Logistic regression with binary response 2. Proc Logistic and its surprises 3. quadratic model 4. Hosmer-Lemeshow test for lack of fit 5. Ordinal regression: cumulative categories proportional

More information

LOGISTIC REGRESSION ANALYSIS

LOGISTIC REGRESSION ANALYSIS LOGISTIC REGRESSION ANALYSIS C. Mitchell Dayton Department of Measurement, Statistics & Evaluation Room 1230D Benjamin Building University of Maryland September 1992 1. Introduction and Model Logistic

More information

THE WINNING ROULETTE SYSTEM by http://www.webgoldminer.com/

THE WINNING ROULETTE SYSTEM by http://www.webgoldminer.com/ THE WINNING ROULETTE SYSTEM by http://www.webgoldminer.com/ Is it possible to earn money from online gambling? Are there any 100% sure winning roulette systems? Are there actually people who make a living

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Statistics and Data Analysis

Statistics and Data Analysis NESUG 27 PRO LOGISTI: The Logistics ehind Interpreting ategorical Variable Effects Taylor Lewis, U.S. Office of Personnel Management, Washington, D STRT The goal of this paper is to demystify how SS models

More information

Math 58. Rumbos Fall 2008 1. Solutions to Review Problems for Exam 2

Math 58. Rumbos Fall 2008 1. Solutions to Review Problems for Exam 2 Math 58. Rumbos Fall 2008 1 Solutions to Review Problems for Exam 2 1. For each of the following scenarios, determine whether the binomial distribution is the appropriate distribution for the random variable

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com

The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com 2. Why do I offer this webinar for free? I offer free statistics webinars

More information

Finding Supporters. Political Predictive Analytics Using Logistic Regression. Multivariate Solutions

Finding Supporters. Political Predictive Analytics Using Logistic Regression. Multivariate Solutions Finding Supporters Political Predictive Analytics Using Logistic Regression Multivariate Solutions What is Logistic Regression? In a political application, logistic regression can describe the outcome

More information

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1. General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

More information

Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

More information

If, under a given assumption, the of a particular observed is extremely. , we conclude that the is probably not

If, under a given assumption, the of a particular observed is extremely. , we conclude that the is probably not 4.1 REVIEW AND PREVIEW RARE EVENT RULE FOR INFERENTIAL STATISTICS If, under a given assumption, the of a particular observed is extremely, we conclude that the is probably not. 4.2 BASIC CONCEPTS OF PROBABILITY

More information

Solutions: Problems for Chapter 3. Solutions: Problems for Chapter 3

Solutions: Problems for Chapter 3. Solutions: Problems for Chapter 3 Problem A: You are dealt five cards from a standard deck. Are you more likely to be dealt two pairs or three of a kind? experiment: choose 5 cards at random from a standard deck Ω = {5-combinations of

More information

III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis

III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis III. INTRODUCTION TO LOGISTIC REGRESSION 1. Simple Logistic Regression a) Example: APACHE II Score and Mortality in Sepsis The following figure shows 30 day mortality in a sample of septic patients as

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

Statistics 151 Practice Midterm 1 Mike Kowalski

Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Multiple Choice (50 minutes) Instructions: 1. This is a closed book exam. 2. You may use the STAT 151 formula sheets and

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

Yiming Peng, Department of Statistics. February 12, 2013

Yiming Peng, Department of Statistics. February 12, 2013 Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)

Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Cairo University Faculty of Economics and Political Science Statistics Department English Section Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Prepared

More information

Survival Analysis Using SPSS. By Hui Bian Office for Faculty Excellence

Survival Analysis Using SPSS. By Hui Bian Office for Faculty Excellence Survival Analysis Using SPSS By Hui Bian Office for Faculty Excellence Survival analysis What is survival analysis Event history analysis Time series analysis When use survival analysis Research interest

More information

PROC LOGISTIC: Traps for the unwary Peter L. Flom, Independent statistical consultant, New York, NY

PROC LOGISTIC: Traps for the unwary Peter L. Flom, Independent statistical consultant, New York, NY PROC LOGISTIC: Traps for the unwary Peter L. Flom, Independent statistical consultant, New York, NY ABSTRACT Keywords: Logistic. INTRODUCTION This paper covers some gotchas in SAS R PROC LOGISTIC. A gotcha

More information

Is it statistically significant? The chi-square test

Is it statistically significant? The chi-square test UAS Conference Series 2013/14 Is it statistically significant? The chi-square test Dr Gosia Turner Student Data Management and Analysis 14 September 2010 Page 1 Why chi-square? Tests whether two categorical

More information

Logistic Regression. BUS 735: Business Decision Making and Research

Logistic Regression. BUS 735: Business Decision Making and Research Goals of this section 2/ 8 Specific goals: Learn how to conduct regression analysis with a dummy independent variable. Learning objectives: LO2: Be able to construct and use multiple regression models

More information

Statistics 305: Introduction to Biostatistical Methods for Health Sciences

Statistics 305: Introduction to Biostatistical Methods for Health Sciences Statistics 305: Introduction to Biostatistical Methods for Health Sciences Modelling the Log Odds Logistic Regression (Chap 20) Instructor: Liangliang Wang Statistics and Actuarial Science, Simon Fraser

More information

Testing on proportions

Testing on proportions Testing on proportions Textbook Section 5.4 April 7, 2011 Example 1. X 1,, X n Bernolli(p). Wish to test H 0 : p p 0 H 1 : p > p 0 (1) Consider a related problem The likelihood ratio test is where c is

More information

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions BMS 617 Statistical Techniques for the Biomedical Sciences Lecture 11: Chi-Squared and Fisher's Exact Tests Chi Squared and Fisher's Exact Tests This lecture presents two similarly structured tests, Chi-squared

More information

Elementary Statistics and Inference. Elementary Statistics and Inference. 17 Expected Value and Standard Error. 22S:025 or 7P:025.

Elementary Statistics and Inference. Elementary Statistics and Inference. 17 Expected Value and Standard Error. 22S:025 or 7P:025. Elementary Statistics and Inference S:05 or 7P:05 Lecture Elementary Statistics and Inference S:05 or 7P:05 Chapter 7 A. The Expected Value In a chance process (probability experiment) the outcomes of

More information

You can place bets on the Roulette table until the dealer announces, No more bets.

You can place bets on the Roulette table until the dealer announces, No more bets. Roulette Roulette is one of the oldest and most famous casino games. Every Roulette table has its own set of distinctive chips that can only be used at that particular table. These chips are purchased

More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information

Multiple logistic regression analysis of cigarette use among high school students

Multiple logistic regression analysis of cigarette use among high school students Multiple logistic regression analysis of cigarette use among high school students ABSTRACT Joseph Adwere-Boamah Alliant International University A binary logistic regression analysis was performed to predict

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

13. Poisson Regression Analysis

13. Poisson Regression Analysis 136 Poisson Regression Analysis 13. Poisson Regression Analysis We have so far considered situations where the outcome variable is numeric and Normally distributed, or binary. In clinical work one often

More information

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

More information

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

More information

Lesson 14 14 Outline Outline

Lesson 14 14 Outline Outline Lesson 14 Confidence Intervals of Odds Ratio and Relative Risk Lesson 14 Outline Lesson 14 covers Confidence Interval of an Odds Ratio Review of Odds Ratio Sampling distribution of OR on natural log scale

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini NEW YORK UNIVERSITY ROBERT F. WAGNER GRADUATE SCHOOL OF PUBLIC SERVICE Course Syllabus Spring 2016 Statistical Methods for Public, Nonprofit, and Health Management Section Format Day Begin End Building

More information

How to set the main menu of STATA to default factory settings standards

How to set the main menu of STATA to default factory settings standards University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be

More information

Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@

Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Raul Cruz-Cano, HLTH653 Spring 2013

Raul Cruz-Cano, HLTH653 Spring 2013 Multilevel Modeling-Logistic Schedule 3/18/2013 = Spring Break 3/25/2013 = Longitudinal Analysis 4/1/2013 = Midterm (Exercises 1-5, not Longitudinal) Introduction Just as with linear regression, logistic

More information

Statistics in Retail Finance. Chapter 2: Statistical models of default

Statistics in Retail Finance. Chapter 2: Statistical models of default Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,

More information

Advanced Quantitative Methods for Health Care Professionals PUBH 742 Spring 2015

Advanced Quantitative Methods for Health Care Professionals PUBH 742 Spring 2015 1 Advanced Quantitative Methods for Health Care Professionals PUBH 742 Spring 2015 Instructor: Joanne M. Garrett, PhD e-mail: joanne_garrett@med.unc.edu Class Notes: Copies of the class lecture slides

More information

ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R.

ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R. ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R. 1. Motivation. Likert items are used to measure respondents attitudes to a particular question or statement. One must recall

More information

Elementary Statistics

Elementary Statistics lementary Statistics Chap10 Dr. Ghamsary Page 1 lementary Statistics M. Ghamsary, Ph.D. Chapter 10 Chi-square Test for Goodness of fit and Contingency tables lementary Statistics Chap10 Dr. Ghamsary Page

More information

Interaction between quantitative predictors

Interaction between quantitative predictors Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors

More information

Introduction to Stata

Introduction to Stata Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

Simple Linear Regression

Simple Linear Regression STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze

The point estimate you choose depends on the nature of the outcome of interest odds ratio hazard ratio

The point estimate you choose depends on the nature of the outcome of interest odds ratio hazard ratio Point Estimation Definition: A point estimate is a one-number summary of data. If you had just one number to summarize the inference from your study. Examples: Dose finding trials: MTD (maximum tolerable

Mind on Statistics. Chapter 15

Mind on Statistics. Chapter 15 Mind on Statistics Chapter 15 Section 15.1 1. A student survey was done to study the relationship between class standing (freshman, sophomore, junior, or senior) and major subject (English, Biology, French,

Introduction to Analysis Methods for Longitudinal/Clustered Data, Part 3: Generalized Estimating Equations

Introduction to Analysis Methods for Longitudinal/Clustered Data, Part 3: Generalized Estimating Equations Introduction to Analysis Methods for Longitudinal/Clustered Data, Part 3: Generalized Estimating Equations Mark A. Weaver, PhD Family Health International Office of AIDS Research, NIH ICSSC, FHI Goa, India,

In the situations that we will encounter, we may generally calculate the probability of an event

In the situations that we will encounter, we may generally calculate the probability of an event What does it mean for something to be random? An event is called random if the process which produces the outcome is sufficiently complicated that we are unable to predict the precise result and are instead

Use of the Chi-Square Statistic. Marie Diener-West, PhD Johns Hopkins University

Use of the Chi-Square Statistic. Marie Diener-West, PhD Johns Hopkins University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

MA 1125 Lecture 14 - Expected Values. Friday, February 28, 2014. Objectives: Introduce expected values.

MA 1125 Lecture 14 - Expected Values. Friday, February 28, 2014. Objectives: Introduce expected values. Means, Variances, and Standard Deviations of Probability Distributions Two classes ago, we computed the

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

Advanced Statistical Analysis of Mortality. Rhodes, Thomas E. and Freitas, Stephen A. MIB, Inc. 160 University Avenue. Westwood, MA 02090

Advanced Statistical Analysis of Mortality. Rhodes, Thomas E. and Freitas, Stephen A. MIB, Inc. 160 University Avenue. Westwood, MA 02090 Advanced Statistical Analysis of Mortality Rhodes, Thomas E. and Freitas, Stephen A. MIB, Inc 160 University Avenue Westwood, MA 02090 001-(781)-751-6356 fax 001-(781)-329-3379 trhodes@mib.com Abstract

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

Categorical Data Analysis

Categorical Data Analysis Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods

If You Think Investing is Gambling, You're Doing it Wrong!

If You Think Investing is Gambling, You're Doing it Wrong! If You Think Investing is Gambling, You're Doing it Wrong! Warren Buffet Jennifer Arthur, M.Sc. PhD Candidate, University of Adelaide Supervisor: Dr. Paul Delfabbro 10th European Conference on Gambling

BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS

BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS An Addendum to Negative Binomial Regression Cambridge University Press (2007) Joseph M. Hilbe 2008, All Rights Reserved This short monograph is intended

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe Introduction to the Practice of Statistics Fifth Edition Moore, McCabe Section 5.1 Homework Answers 5.7 In the proofreading setting if Exercise 5.3, what is the smallest number of misses m with P(X m)

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

SUGI 29 Statistics and Data Analysis

SUGI 29 Statistics and Data Analysis Paper 194-29 Head of the CLASS: Impress your colleagues with a superior understanding of the CLASS statement in PROC LOGISTIC Michelle L. Pritchard and David J. Pasta Ovation Research Group, San Francisco,

The Math. P (x) = 5! = 1 2 3 4 5 = 120.

The Math. P (x) = 5! = 1 2 3 4 5 = 120. The Math Suppose there are n experiments, and the probability that someone gets the right answer on any given experiment is p. So in the first example above, n = 5 and p = 0.2. Let X be the number of correct

Math 108 Exam 3 Solutions Spring 00

Math 108 Exam 3 Solutions Spring 00 Math 108 Exam 3 Solutions Spring 00 1. An ecologist studying acid rain takes measurements of the pH in 12 randomly selected Adirondack lakes. The results are as follows: 3.0 6.5 5.0 4.2 5.5 4.7 3.4 6.8

Poisson Regression or Regression of Counts (& Rates)

Poisson Regression or Regression of Counts (& Rates) Poisson Regression or Regression of (& Rates) Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Generalized Linear Models Slide 1 of 51 Outline Outline

Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

Nominal and ordinal logistic regression

Nominal and ordinal logistic regression Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

Chapter 7: Proportional Play and the Kelly Betting System

Chapter 7: Proportional Play and the Kelly Betting System Chapter 7: Proportional Play and the Kelly Betting System Proportional Play and Kelly's criterion: Investing in the stock market is, in effect, making a series of bets. Contrary to bets in a casino though,

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

Section 7C: The Law of Large Numbers

Section 7C: The Law of Large Numbers Section 7C: The Law of Large Numbers Example. You flip a coin 00 times. Suppose the coin is fair. How many times would you expect to get heads? tails? One would expect a fair coin to come up heads half

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

Statistics in Medicine Research Lecture Series CSMC Fall 2014

Statistics in Medicine Research Lecture Series CSMC Fall 2014 Catherine Bresee, MS Senior Biostatistician Biostatistics & Bioinformatics Research Institute Statistics in Medicine Research Lecture Series CSMC Fall 2014 Overview Review concept of statistical power

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 libname in1 >c:\=; Data first; Set in1.extract; A=1; PROC LOGIST OUTEST=DD MAXITER=100 ORDER=DATA; OUTPUT OUT=CC XBETA=XB P=PROB; MODEL

Module 4 - Multiple Logistic Regression

Module 4 - Multiple Logistic Regression Module 4 - Multiple Logistic Regression Objectives Understand the principles and theory underlying logistic regression Understand proportions, probabilities, odds, odds ratios, logits and exponents Be

Irfan Syed, M.B.B.S., M.P.H Sandra Minor Bulmer, Ph.D. Christine Unson, Ph.D.

Irfan Syed, M.B.B.S., M.P.H Sandra Minor Bulmer, Ph.D. Christine Unson, Ph.D. Determinants and Correlations of Excessive Alcohol Use and Depression among College Students in a North East University Irfan Syed, M.B.B.S., M.P.H Sandra Minor Bulmer, Ph.D. Christine Unson, Ph.D. SOUTHERN

Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer

Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer Patricia A. Berglund, Institute for Social Research - University of Michigan Wisconsin and Illinois SAS User's Group June 25, 2014 1 Overview

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away)

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away) : Three bets Math 45 Introduction to Probability Lecture 5 Kenneth Harris aharri@umich.edu Department of Mathematics University of Michigan February, 009. A casino offers the following bets (the fairest

Calculating the Probability of Returning a Loan with Binary Probability Models

Calculating the Probability of Returning a Loan with Binary Probability Models Calculating the Probability of Returning a Loan with Binary Probability Models Associate Professor PhD Julian VASILEV (e-mail: vasilev@ue-varna.bg) Varna University of Economics, Bulgaria ABSTRACT The

School of Nursing Faculty Salary Equity Report and Action Plan

School of Nursing Faculty Salary Equity Report and Action Plan July 1, 2015 School of Nursing Faculty Salary Equity Report and Action Plan Shari L. Dworkin, Ph.D., M.S. Associate Dean for Academic Affairs Overview: In 2012, then UC President Mark Yudof charged each

Outline. Dispersion Bush lupine survival Quasi-Binomial family

Outline. Dispersion Bush lupine survival Quasi-Binomial family Outline 1 Three-way interactions 2 Overdispersion in logistic regression Dispersion Bush lupine survival Quasi-Binomial family 3 Simulation for inference Why simulations Testing model fit: simulating the

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

Chapter 7: Dummy variable regression

Chapter 7: Dummy variable regression Chapter 7: Dummy variable regression Why include a qualitative independent variable?........................................ 2 Simplest model 3 Simplest case.............................................................

Examining a Fitted Logistic Model

Examining a Fitted Logistic Model STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic

SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, Last revised March 28, 2015

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame,  Last revised March 28, 2015 Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised March 28, 2015 NOTE: The routines spost13, lrdrop1, and extremes are
