5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits


Lecture

1. Logistic regression with binary response
2. Proc Logistic and its surprises
3. Quadratic model
4. Hosmer-Lemeshow test for lack of fit
5. Ordinal regression: cumulative categories, proportional odds
6. Ordinal regression: comparison to single-reference generalized logits

Logistic Regression Using the SAS System: Theory and Application, by Paul Allison, 2001, Wiley-SAS
Categorical Data Analysis Using the SAS System, 2nd ed., by Stokes, Davis, and Koch, 2009, SAS Institute
Proc Logistic: Traps for the Unwary, by P.L. Flom (on course website)

Example: Obesity in NHANES 2004

NHANES 2004 data for children and adults under age 50 (n = 6116).
Event = obesity, defined as BMI ≥ 30, or ≥ 95th percentile for children.
Association between age and rate of obesity? $P[\text{obese} \mid \text{age}] = \pi(\text{age})$

Proc Logistic

    Proc Logistic data=under50;
      model obese = age;    /* model format like Proc GLM */
    run;

[Output table: Analysis of Maximum Likelihood Estimates (Parameter, DF, Estimate, Standard Error, Wald Chi-Square, Pr > ChiSq). Intercept: Pr > ChiSq <.0001; age: Pr > ChiSq <.0001. Other values not preserved.]

[Output table: Odds Ratio Estimates (Effect, Point Estimate, 95% Wald Confidence Limits) for age. The point estimate is exp(age coefficient); values not preserved.]

Correct value, but does this odds ratio make sense?

First part of the output:

    The LOGISTIC Procedure

    Data Set                      WORK.ADULT
    Response Variable             obese
    Number of Response Levels     2
    Model                         binary logit
    Optimization Technique        Fisher's scoring
    Number of Observations Read   6116
    Number of Observations Used   6116

    Response Profile (Ordered Value, obese, Total Frequency) [values not preserved]

    Probability modeled is obese=0.

There is also a warning in the log:

    NOTE: PROC LOGISTIC is modeling the probability that obese=0. One way to change this to model the probability that obese=1 is to specify the response variable option EVENT='1'.
    NOTE: Convergence criterion (GCONV=1E-8) satisfied.
    NOTE: There were 6116 observations read from the data set WORK.UNDER50.

Proc Logistic models the probability of no event by default. This is the opposite of what everyone expects.

Two fixes:

1. Use the descending option to make SAS fit the probability that y = 1. Works in both Proc Logistic and Proc Genmod (which also fits logistic regression).

    Proc Logistic descending data=under50;
      model obese = age;
    run;

2. Define the event in the MODEL statement. Only works in Proc Logistic.

    Proc Logistic data=under50;
      model obese (event='1') = age;
    run;

   Clearer code, but unfortunately it doesn't work in Proc Genmod.

With the second fix, the log reports:

    NOTE: PROC LOGISTIC is modeling the probability that obese=1.
    NOTE: Convergence criterion (GCONV=1E-8) satisfied.
    NOTE: There were 6116 observations read from the data set WORK.UNDER50.

[Output table: Analysis of Maximum Likelihood Estimates. Intercept: Pr > ChiSq <.0001; age: Pr > ChiSq <.0001. Estimates and standard errors not preserved.]

[Output table: Odds Ratio Estimates for age (Point Estimate, 95% Wald Confidence Limits); values not preserved.]

Odds of obesity increase by a factor of 1.03 per year for people under 50 (95% confidence interval not preserved).
Equivalently, odds of obesity increase by 3% per year for people under 50.

Two graphs of the fitted (linear) logistic model:

Log odds of obesity as a function of age (fitted coefficient values not preserved):

$$\log\frac{\pi(\text{age})}{1-\pi(\text{age})} = \hat\beta_0 + \hat\beta_1\,\text{age}$$

[Figure: log odds of obesity versus age (years)]

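Probability of obesity as a function of age (fitted coefficient values not preserved):

$$\pi(\text{age}) = \frac{\exp(\hat\beta_0 + \hat\beta_1\,\text{age})}{1+\exp(\hat\beta_0 + \hat\beta_1\,\text{age})}$$

[Figure: fitted probability of obesity versus age (years)]

Fitted logistic curve extrapolated to show the full S-shaped curve:

[Figure: rate of obesity versus age (years), with the range of the data marked]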
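The link between the two scales can be checked by hand. The following is a minimal Python sketch (Python is used only for the arithmetic; it is not part of the SAS analysis). The coefficients `b0` and `b1` are hypothetical placeholders, not the fitted values from the lecture.

```python
import math

def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Probability corresponding to the log odds eta."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, NOT the lecture's fitted values.
b0, b1 = -2.0, 0.03

for age in (10, 30, 50):
    eta = b0 + b1 * age   # straight line on the log-odds scale
    p = inv_logit(eta)    # S-shaped curve on the probability scale
    print(f"age={age:2d}  log odds={eta:6.3f}  probability={p:.3f}")
```

Because `logit` and `inv_logit` are inverses, a model that is linear in age on the log-odds scale always produces probabilities strictly between 0 and 1.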

Changing units for odds ratios

To get odds ratios for a 10-year change in age:

    Proc Logistic descending data=under50;
      model obese = age / rsquare CLodds = PL;
      units age = 10.0;
    run;

[Output table: Analysis of Maximum Likelihood Estimates. Intercept: Pr > ChiSq <.0001; age: Pr > ChiSq <.0001. Other values not preserved.]

[Output table: Odds Ratio Estimates for age (Point Estimate, 95% Wald Confidence Limits); values not preserved.]

[Output table: Profile Likelihood Confidence Interval for Odds Ratios (Effect, Unit, Estimate, 95% Confidence Limits) for age; values not preserved.]

Two useful options to the model statement in Proc Logistic:

    Proc Logistic descending;
      model obese = age / CLodds = PL Rsquare;
    run;

Default (Wald) confidence intervals for the odds ratio are $\exp\{\hat\beta \pm 1.96\,\mathrm{SE}(\hat\beta)\}$.

CLodds = PL requests "profile likelihood" confidence intervals, which are more accurate than Wald intervals for small samples and give the same answer for large samples.

Rsquare gives a generalized version of the R-square from linear regression. The maximum possible value of the generalized $R^2$ is not 1.0, as it is for linear regression. Max-rescaled R-Square divides by this maximum value to fix this.
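What the UNITS statement and the Wald interval do numerically can be sketched in a few lines of Python (used here only as a calculator). The coefficient 0.0302 is the age estimate quoted elsewhere in the lecture for the age + gender model; the standard error `se` is a hypothetical value chosen for illustration.

```python
import math

beta = 0.0302   # age coefficient quoted in the lecture (age + gender model)
se = 0.002      # HYPOTHETICAL standard error, for illustration only

def odds_ratio(beta, units=1.0):
    """Odds ratio for a change of `units` in the predictor."""
    return math.exp(units * beta)

def wald_ci(beta, se, units=1.0, z=1.96):
    """95% Wald interval: exponentiate beta +/- z*SE, scaled by units."""
    lo = math.exp(units * (beta - z * se))
    hi = math.exp(units * (beta + z * se))
    return lo, hi

or_1 = odds_ratio(beta)        # per 1 year, about 1.031
or_10 = odds_ratio(beta, 10)   # per 10 years, equals or_1 ** 10
print(f"OR per year     = {or_1:.3f}, CI = {wald_ci(beta, se)}")
print(f"OR per 10 years = {or_10:.3f}, CI = {wald_ci(beta, se, 10)}")
```

The 10-year odds ratio is the 1-year odds ratio raised to the 10th power, not 10 times it, because the model is additive on the log-odds scale.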

[Output: R-Square and Max-rescaled R-Square; values not preserved.]

[Output tables: Odds Ratio Estimates (95% Wald) and Profile Likelihood Confidence Interval for Odds Ratios, for age; values not preserved.]

The profile likelihood CI is identical to the Wald CI here because the sample size is large (n = 6116).

CLASS variables in Proc Logistic

    proc logistic descending data=under50;
      class gender;
      model obese = age gender;
    run;

[Output table: Analysis of Maximum Likelihood Estimates with rows Intercept (Pr > ChiSq <.0001), age (Pr > ChiSq <.0001), and gender F; other values not preserved.]

[Output table: Odds Ratio Estimates with rows age and gender F vs M; values not preserved.]

For age, exp(0.0302) = 1.031 matches the reported odds ratio. For gender, however, exp(0.0779) ≈ 1.081 is not the reported F vs M odds ratio.

    The LOGISTIC Procedure

    Model Information
    Data Set                    WORK.ADULT
    Response Variable           obese
    Number of Response Levels   2
    Model                       binary logit
    Optimization Technique      Fisher's scoring

    Class Level Information
    Class     Value   Design Variables
    gender    F        1
              M       -1

The default coding for CLASS variables (+1/-1) is not the same as in Proc GLM (0/1).

If you want to work with regression coefficients, then request 0/1 indicator variables:

    proc logistic descending data=under50;
      class gender / param = GLM;
      model obese = age gender;
    run;

[Output table: Analysis of Maximum Likelihood Estimates with rows Intercept (Pr > ChiSq <.0001), age (Pr > ChiSq <.0001), gender F, and gender M; other values not preserved.]

[Output table: Odds Ratio Estimates with rows age and gender F vs M; values not preserved.]

With 0/1 coding the gender F coefficient is 0.1558, and exp(0.1558) ≈ 1.169 is the F vs M odds ratio.
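The relationship between the two codings can be verified with the two coefficients quoted in the lecture (0.0779 and 0.1558). This is a small Python arithmetic sketch, not SAS output.

```python
import math

b_effect = 0.0779   # gender F coefficient under the default coding (F = +1, M = -1)
b_glm = 0.1558      # gender F coefficient under param=GLM (F = 1, M = 0)

# Under +1/-1 coding the log odds for F and M differ by b*(+1) - b*(-1) = 2b.
or_effect = math.exp(b_effect * (1 - (-1)))

# Under 0/1 coding the log odds differ by b*(1 - 0) = b.
or_glm = math.exp(b_glm * (1 - 0))

print(f"exp(0.0779)      = {math.exp(b_effect):.3f}  (not the F vs M odds ratio)")
print(f"exp(2 * 0.0779)  = {or_effect:.3f}")
print(f"exp(0.1558)      = {or_glm:.3f}")
```

Both codings describe the same fitted model and the same odds ratio; only the meaning of the printed coefficient changes.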

Logistic regression fits a linear model on the log(odds) scale. With age and gender as main effects (obese = age gender), this is a parallel-lines model. Distance between the lines = log(odds ratio).

[Figure: fitted log(odds of obesity) versus age (years), parallel lines for females and males]

Parallel-lines model on the probability scale:

[Figure: fitted probability of obesity versus age (years), for females and males]

Check for interaction (separate-lines model):

    Proc Logistic descending data=under50;
      class gender;
      model obese = age gender age*gender;
    run;

Surprise: we get regression coefficients but no odds ratios.

[Output table: Type 3 Analysis of Effects (Effect, DF, Wald Chi-Square, Pr > ChiSq) with rows age (Pr > ChiSq <.0001), gender, and age*gender; other values not preserved.]

[Output table: Analysis of Maximum Likelihood Estimates (regression coefficients) with rows Intercept (Pr > ChiSq <.0001), age (Pr > ChiSq <.0001), gender female, and age*gender female; other values not preserved.]

Proc Logistic does not calculate odds ratios for effects included in interaction terms, so there are no odds ratios for age or gender.

Why no odds ratio for terms in an interaction?

    Proc Logistic descending data=young;
      class gender;
      model obese = age gender age*gender / rsquare CLodds = PL;
    run;

[Output table: Analysis of Maximum Likelihood Estimates (regression coefficients) with rows Intercept (Pr > ChiSq <.0001), age (Pr > ChiSq <.0001), gender female, and age*gender female; other values not preserved.]

The model fits a separate line for each gender: 2 intercepts, 2 slopes.

Separate line for each gender on the log(odds) scale. The distance between the gender lines is different at every age, so there is no single gender effect.

[Figure: fitted log(odds of obesity) versus age (years), separate lines for females and males]
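Why a single odds ratio no longer exists can be seen from a short Python sketch. All coefficients below are hypothetical, and the sketch assumes 0/1 (param=GLM) coding with males as the reference level; under the default +1/-1 coding the formulas would differ.

```python
import math

# HYPOTHETICAL 0/1-coded coefficients, NOT the lecture's estimates.
b_age = 0.025          # age slope for males (reference gender)
b_female = 0.60        # female vs male difference in log odds at age 0
b_age_female = -0.010  # additional age slope for females

def or_female_vs_male(age):
    """Gender odds ratio at a given age: depends on age through the interaction."""
    return math.exp(b_female + b_age_female * age)

def or_age(gender, years=1):
    """Age odds ratio within one gender: each gender has its own slope."""
    slope = b_age + (b_age_female if gender == "F" else 0.0)
    return math.exp(years * slope)

for age in (10, 20, 30, 40):
    print(f"F vs M at age {age}: OR = {or_female_vs_male(age):.3f}")
print(f"age OR per 10 years: F = {or_age('F', 10):.3f}, M = {or_age('M', 10):.3f}")
```

With an interaction in the model, an odds ratio for gender is only defined at a stated age, and an odds ratio for age is only defined within a stated gender.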

Separate line for each gender on the probability scale:

[Figure: fitted probability of obesity versus age (years), separate curves for females and males]

ODDSRATIO statement

This is like LSmeans in Proc GLM: it gives odds ratios for combinations of factor levels in the interaction.

    proc logistic descending data=under50;
      class gender;
      model obese = age gender age*gender / rsquare CLodds = PL;
      oddsratio age;
      oddsratio gender / at (age= );   /* list of ages not preserved */
    run;

[Output table: Wald Confidence Interval for Odds Ratios (Label, Estimate, 95% Confidence Limits); values not preserved. Rows:
  age at gender=F, age at gender=M (separate age odds ratios for each gender);
  gender F vs M at four specified ages (comparisons of genders adjusted for specific ages).]

Summary: surprises in Proc Logistic

1. Fits the probability of no event by default. Opposite of what everyone expects.
2. Codes class variables using +1/-1 rather than 0/1. Makes regression coefficients difficult to interpret.
3. Odds ratios are reported for main effects only, so there are no odds ratios for terms involved in interactions. Use the oddsratio statement.
4. More: see Proc Logistic: Traps for the Unwary, by P.L. Flom (on course website).

Fitting a quadratic model

The linear logistic regression model obese = age

[Figure: fitted probability of obesity versus age (years)]

cannot capture the initial dip we see in the smooth:

[Figure: smoothed rate of obesity versus age]

To test for this initial decrease, fit a quadratic term in age:

    /* add this variable in a data step */
    age2 = (age-15.)*(age-15.)/100.;

    proc logistic descending data=under50;
      model obese = age age2 / rsquare CLodds = PL;
    run;

[Output table: Analysis of Maximum Likelihood Estimates with rows Intercept (Pr > ChiSq <.0001), age, and age2; other values not preserved.]

Conclusion?

Lack of fit test: Hosmer-Lemeshow

Hosmer and Lemeshow (2000) proposed a test for lack of fit:

1. Divide the fitted probabilities into deciles (rank and divide into tenths).
2. Find the mean probability $\bar p_i$ in each decile.
3. For each decile, calculate expected events as $\bar p_i N_i$, for $i = 1, \ldots, 10$.
4. Calculate a chi-square test using observed and expected events:

$$X^2 = \sum_{i=1}^{10} \frac{(\text{observed count}_i - \text{expected count}_i)^2}{\text{expected count}_i}$$

Null hypothesis: no lack of fit, so expected counts are close to observed counts.
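The four steps can be sketched in plain Python. This is an illustrative version under stated assumptions, not a reproduction of the SAS LACKFIT algorithm (SAS handles ties and group sizes in its own way). It sums the chi-square contributions over both outcome columns (events and non-events), as in the SAS partition table, and uses groups - 2 degrees of freedom. The simulated data and the coefficients -2.0 and 0.03 are hypothetical.

```python
import math
import random

def hosmer_lemeshow(y, p, groups=10):
    """Return (chi-square, df) from 0/1 outcomes y and fitted probabilities p."""
    order = sorted(range(len(p)), key=lambda i: p[i])   # rank by fitted probability
    n = len(order)
    chi2 = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]   # one "decile" of risk
        n_g = len(idx)
        obs1 = sum(y[i] for i in idx)     # observed events
        exp1 = sum(p[i] for i in idx)     # expected events = mean probability * n_g
        obs0, exp0 = n_g - obs1, n_g - exp1
        chi2 += (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
    return chi2, groups - 2

# Simulated data from a HYPOTHETICAL model, scored with the true probabilities.
random.seed(1)
ages = [random.uniform(2, 50) for _ in range(2000)]
probs = [1.0 / (1.0 + math.exp(-(-2.0 + 0.03 * a))) for a in ages]
ys = [1 if random.random() < q else 0 for q in probs]

chi2, df = hosmer_lemeshow(ys, probs)
print(f"Hosmer-Lemeshow chi-square = {chi2:.2f} on {df} df")
```

A large chi-square relative to a chi-square distribution with 8 degrees of freedom would indicate lack of fit; here the probabilities come from the true model, so a small value is expected.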

    proc logistic descending data=under50;
      model obese = age / rsquare CLodds = PL lackfit;   /* lackfit requests the Hosmer-Lemeshow test */
    run;

[Output table: Partition for the Hosmer and Lemeshow Test (Group, Total, Observed and Expected counts for obese = 1 and obese = 0); values not preserved.]

[Output table: Hosmer and Lemeshow Goodness-of-Fit Test (Chi-Square, DF, Pr > ChiSq); values not preserved.]

Hosmer-Lemeshow results when the quadratic age term is included:

[Output table: Partition for the Hosmer and Lemeshow Test (Group, Total, Observed and Expected counts for obese = 1 and obese = 0); values not preserved.]

[Output table: Hosmer and Lemeshow Goodness-of-Fit Test (Chi-Square, DF, Pr > ChiSq); values not preserved.]

Ordinal logistic regression

In the logistic regression example, we looked at how the rate of obesity related to age and gender in NHANES. Obesity is a binary response, defined by BMI ≥ 30. However, there is an intermediate category. In adults the categories are defined as:

Obese: BMI ≥ 30
Overweight: 25 ≤ BMI < 30
Normal weight: 18 ≤ BMI < 25

We now examine how rates of obesity and overweight relate to age and gender, using three ordered categories: Normal weight < Overweight < Obese.

Ordinal response

The response variable has three or more ordered categories.

Ordinal response categories may be defined by a continuous measurement scale, as obesity and overweight are defined with reference to the BMI scale.

Or they may just be ordered, as in Worse < No Change < Recovered, where it does not make sense to ask about the distance between categories.

Ordinal models use only the ranks of the categories.
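The adult cut-points above can be written as a small function. This Python sketch is only an illustration of the category definition; the labels `1_normal`, `2_overwt`, and `3_obese` follow the level names that appear in the later SAS output, and returning `None` below 18 is an assumption, since the slide does not say how lower values are handled.

```python
def bmi_category(bmi):
    """Map an adult BMI to an ordered category label (None below 18)."""
    if bmi >= 30:
        return "3_obese"
    if bmi >= 25:
        return "2_overwt"
    if bmi >= 18:
        return "1_normal"
    return None   # ASSUMPTION: values under 18 fall outside the three categories

labels = [bmi_category(b) for b in (22.0, 25.0, 29.9, 30.0, 17.0)]
print(labels)
```

The numeric prefixes keep the alphabetical order of the labels aligned with the intended ranking, which matters because the ordinal model uses only that ranking.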

    Proc SGplot data=ph6470.age_bmi;
      loess y = obese x = age / LINEATTRS=(pattern=1 color="red");
      loess y = overweight x = age / LINEATTRS=(pattern=1 color="blue");
    run;

Divide age into two categories, young and old (the age ranges were not preserved; the old group is described elsewhere in the lecture as 40+).

[Table: frequency and row percent of BMI category (1_normal, 2_overwt, 3_obese, Total) by age group (old, young, Total); cell values not preserved.]

Percent in each weight category for old: $\pi_{11}, \pi_{12}, \pi_{13}$.
Percent in each weight category for young: $\pi_{21}, \pi_{22}, \pi_{23}$.

Multinomial distribution (generalizes the binomial distribution): the parameters are the probabilities ($\pi_{ij}$) of being in each category.

Logistic regression estimates differences in odds on the log scale. Ordinal regression will fit two logistic regressions simultaneously:

Odds of being in the top category vs the rest: obese vs (overweight + normal)
Odds of being in the top two categories vs the rest: (overweight + obese) vs normal

The difference in odds between young and old is assumed to be the same for both.

Proportional odds model for ordinal responses

The proportional odds model forces > 2 ordinal categories into binary comparisons by combining categories in sequence from the top. This gives cumulative odds:

1. Odds of the top category vs the rest: obese vs normal + overweight
2. Odds of the top two categories vs the rest: obese + overweight vs normal
3. Odds of being in the top three categories vs the rest, etc.

To define these odds we define cumulative probabilities:

$\theta_3$ = chance of obesity = $\pi_3$
$\theta_2$ = chance of obesity or overweight = $\pi_2 + \pi_3$

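[Table: frequency and row percent of BMI category (1_normal, 2_overwt, 3_obese, Total) by age group (old, young, Total); cell values not preserved.]

Find odds ratios for old to young of:
  obesity
  overweight + obesity

[Table: counts of Normal + Overweight vs Obese by age group, with odds and odds ratio; values not preserved.]

[Table: counts of Normal vs Obese + Overweight by age group, with odds and odds ratio; values not preserved.]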
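The two cumulative odds ratios can be computed directly from a 2 x 3 table. The Python sketch below uses hypothetical counts, since the lecture's table values were not preserved; only the method is meant to carry over.

```python
# HYPOTHETICAL counts, NOT the lecture's table.
counts = {
    "old":   {"normal": 100, "overwt": 150, "obese": 150},
    "young": {"normal": 200, "overwt": 120, "obese": 80},
}

def cumulative_odds(row):
    """Odds for the two cut-points, combining categories from the top."""
    n, ow, ob = row["normal"], row["overwt"], row["obese"]
    return {
        "obese vs rest": ob / (n + ow),
        "overwt+obese vs normal": (ow + ob) / n,
    }

odds = {group: cumulative_odds(row) for group, row in counts.items()}
ratios = {cut: odds["old"][cut] / odds["young"][cut] for cut in odds["old"]}

for cut, ratio in ratios.items():
    print(f"{cut}: odds ratio old vs young = {ratio:.2f}")
```

The proportional odds model replaces these two separate odds ratios with a single common value, so seeing how far apart they are in the raw table is a quick informal check of the assumption.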

The proportional odds model combines two logistic regression models:

$\text{logit}(\theta_{h3})$ = log odds of being in the top category vs the rest, for group h
$\text{logit}(\theta_{h2})$ = log odds of being in the top two categories vs the rest, for group h

$$\text{logit}(\theta_{h3}) = \log\frac{\theta_{h3}}{1-\theta_{h3}} = \alpha_3 + x_h\beta$$
$$\text{logit}(\theta_{h2}) = \log\frac{\theta_{h2}}{1-\theta_{h2}} = \alpha_2 + x_h\beta$$

$\beta$ estimates the same covariate effect in both models: an average effect (odds ratio) of age for both BMI cut-points, the ratio of the odds for someone 40+ of being in a heavier category to the odds for someone in the young group. Proc Logistic tests this assumption.

Fitting the proportional odds model in Proc Logistic

    Proc Logistic descending data=mayod327.age_bmi_sample;
      class age_category / param=glm;
      model bmi_cat = age_category;
    run;

bmi_cat has 3 levels. The default when the response has > 2 levels is the proportional odds model.
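How the two equations translate into category probabilities can be sketched in Python. The intercepts `alpha2` and `alpha3` are hypothetical; `beta` is set to log(1.7) to match the old vs young odds ratio of 1.7 quoted in the lecture, and x is coded 1 for old and 0 for young (an assumption matching param=glm with young as reference).

```python
import math

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def category_probs(alpha2, alpha3, beta, x):
    """Category probabilities implied by the two cumulative logits."""
    theta3 = inv_logit(alpha3 + beta * x)   # P(obese)
    theta2 = inv_logit(alpha2 + beta * x)   # P(overweight or obese)
    return {"normal": 1.0 - theta2, "overwt": theta2 - theta3, "obese": theta3}

def odds(p):
    return p / (1.0 - p)

# HYPOTHETICAL intercepts; alpha2 > alpha3 so that theta2 > theta3.
alpha2, alpha3, beta = 0.5, -1.0, math.log(1.7)

young = category_probs(alpha2, alpha3, beta, x=0)
old = category_probs(alpha2, alpha3, beta, x=1)

# The same odds ratio appears at both cut-points.
or_top = odds(old["obese"]) / odds(young["obese"])
or_top_two = odds(1.0 - old["normal"]) / odds(1.0 - young["normal"])
print(young, old)
print(f"OR obese vs rest = {or_top:.3f}, OR overwt+obese vs normal = {or_top_two:.3f}")
```

Both printed odds ratios equal exp(beta), which is exactly the proportional odds assumption: one shared slope, with only the intercepts differing between cut-points.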

It is critical to check that SAS is combining categories in the right direction: use the descending option to reverse the order.

    Response Profile
    Ordered Value   bmi_cat     Total Frequency
    1               3_obese     [not preserved]
    2               2_overwt    [not preserved]
    3               1_normal    308

    Probabilities modeled are cumulated over the lower Ordered Values.

From the log file:

    NOTE: PROC LOGISTIC is fitting the cumulative logit model. The probabilities modeled are summed over the responses having the lower Ordered Values in the Response Profile table.

Test of proportional odds assumption

SAS tests $H_0$: odds are proportional, against a larger model with separate effects of age for each category comparison. In our example, this means two $\beta$ values instead of one, so this larger model has 1 extra parameter and the test has 1 degree of freedom.

[Output table: Score Test for the Proportional Odds Assumption (Chi-Square, DF, Pr > ChiSq); values not preserved.]

[Output table: Analysis of Maximum Likelihood Estimates with rows Intercept 3_obese (Pr > ChiSq <.0001), Intercept 2_overwt (Pr > ChiSq <.0001), age_category old (Pr > ChiSq <.0001), and age_category young (reference level); other values not preserved.]

[Output table: Odds Ratio Estimates for age_category old vs young (Point Estimate, 95% Wald Confidence Limits); values not preserved.]

Age effect: the old have 1.7 times the odds of being obese compared to the young, and 1.7 times the odds of being overweight or obese.

Better: those in the old group have 1.7 times the odds of being in a heavier category than those in the young group.

The intercepts are $\alpha_3$ and $\alpha_2$: not informative.

Why not just do separate logistic regressions for each cut-point?

Ordinal logistic regression usually gives greater precision (more power): the standard error for the regression coefficient is smaller.

[Output table: age_category old coefficient from three models: obese vs rest; obese + overweight vs rest (Pr > ChiSq <.0001); proportional odds (Pr > ChiSq <.0001). Estimates and standard errors not preserved.]

Fitting the separate models is often a good model check. The regression coefficient from proportional odds is essentially an average of the regression coefficients in the 2 logistic regression models; here they are quite close.

Proportional odds model with age group and gender

    Proc Logistic descending data=mayod327.age_bmi_sample;
      class age_category gender / param=glm;
      model bmi_cat = age_category gender;
    run;

[Output table: Analysis of Maximum Likelihood Estimates with rows Intercept 3_obese (Pr > ChiSq <.0001), Intercept 2_overwt (Pr > ChiSq <.0001), age_category old (Pr > ChiSq <.0001), age_category young, gender female, and gender male; other values not preserved.]

[Output table: Odds Ratio Estimates with rows age_category old vs young and gender female vs male; values not preserved.]

With 2 age categories and 2 genders, we have 4 subgroups. We assume that the odds ratios between any two subgroups are the same for all cumulative comparisons of categories.

[Output table: Score Test for the Proportional Odds Assumption (Chi-Square, DF, Pr > ChiSq); values not preserved.]

It appears that this assumption fails for these data. What now?

Generalized logits model: unranked categories

Proportional odds makes odds from adjoining ordered categories. If the categories are not ordered, then proportional odds cannot be applied.

Generalized logits makes odds between one reference category and each of the other categories. It handles categories without order, e.g. vanilla, strawberry, chocolate.

    Proc Logistic descending data=mayod327.age_bmi_sample;
      class age_category gender / param=glm;
      model bmi_cat = age_category gender / link=glogit;
    run;

The default is to use the last category in the Response Profile as the reference. With the descending option that is 1_normal:

    Ordered Value   bmi_cat     Total Frequency
    1               3_obese     [not preserved]
    2               2_overwt    [not preserved]
    3               1_normal    308

    Logits modeled use bmi_cat='1_normal' as the reference category.

Notice that the degrees of freedom are twice as large as in a single binary model:

[Output table: Type 3 Analysis of Effects (Effect, DF, Wald Chi-Square, Pr > ChiSq) with rows age_category (Pr > ChiSq <.0001) and gender; other values not preserved.]

We are essentially fitting two separate models (normal vs overweight, normal vs obese). $H_0$: the regression coefficients for both models = 0.

Here are the regression coefficients: note the doubling.

[Output table: Analysis of Maximum Likelihood Estimates (Parameter, bmi_cat, DF, Estimate, Standard Error, Wald Chi-Square, Pr > ChiSq). Each parameter appears once for 3_obese and once for 2_overwt: Intercept, age_category old (Pr > ChiSq <.0001 for 3_obese), age_category young, gender female, gender male. Other values not preserved.]

If there are c response categories, then the usual degrees of freedom are multiplied by (c - 1).

[Output table: Odds Ratio Estimates (Effect, bmi_cat, Point Estimate, 95% Wald Confidence Limits) with rows age_category old vs young and gender female vs male, each for 3_obese and 2_overwt; values not preserved.]

Averaging across genders, those over 40 have about 2 times the odds of being obese and 1.6 times the odds of being overweight, each relative to normal weight.

Averaging across ages, women have about half the men's odds of being overweight, but about the same odds of obesity.
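The structure of the generalized logits model can be illustrated with a short Python sketch. It simplifies to age group only (gender is left out), uses hypothetical intercepts, and sets the two age effects to log(2.0) and log(1.6) to match the odds ratios quoted above; `1_normal` is the reference category, as in the SAS output.

```python
import math

# HYPOTHETICAL intercepts; age effects taken from the quoted odds ratios (2.0 and 1.6).
coef = {
    "3_obese":  {"intercept": -0.4, "old": math.log(2.0)},
    "2_overwt": {"intercept": 0.1,  "old": math.log(1.6)},
}

def glogit_probs(old):
    """Category probabilities; the reference category has linear predictor 0."""
    eta = {cat: c["intercept"] + c["old"] * old for cat, c in coef.items()}
    denom = 1.0 + sum(math.exp(v) for v in eta.values())
    probs = {cat: math.exp(v) / denom for cat, v in eta.items()}
    probs["1_normal"] = 1.0 / denom
    return probs

def odds_vs_normal(probs, cat):
    return probs[cat] / probs["1_normal"]

p_old, p_young = glogit_probs(1), glogit_probs(0)
for cat in coef:
    ratio = odds_vs_normal(p_old, cat) / odds_vs_normal(p_young, cat)
    print(f"{cat} vs 1_normal: odds ratio old vs young = {ratio:.2f}")
```

Each non-reference category carries its own intercept and its own slope, which is why every effect uses (c - 1) degrees of freedom instead of one.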


More information

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc.

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Paper 264-26 Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Abstract: There are several procedures in the SAS System for statistical modeling. Most statisticians who use the SAS

More information

Improved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC

Improved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC Paper AA08-2013 Improved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC Robert G. Downer, Grand Valley State University, Allendale, MI ABSTRACT

More information

Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)

Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Cairo University Faculty of Economics and Political Science Statistics Department English Section Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Prepared

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén

More information

How to set the main menu of STATA to default factory settings standards

How to set the main menu of STATA to default factory settings standards University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be

More information

Segmentation For Insurance Payments Michael Sherlock, Transcontinental Direct, Warminster, PA

Segmentation For Insurance Payments Michael Sherlock, Transcontinental Direct, Warminster, PA Segmentation For Insurance Payments Michael Sherlock, Transcontinental Direct, Warminster, PA ABSTRACT An online insurance agency has built a base of names that responded to different offers from various

More information

A LOGISTIC REGRESSION MODEL TO PREDICT FRESHMEN ENROLLMENTS Vijayalakshmi Sampath, Andrew Flagel, Carolina Figueroa

A LOGISTIC REGRESSION MODEL TO PREDICT FRESHMEN ENROLLMENTS Vijayalakshmi Sampath, Andrew Flagel, Carolina Figueroa A LOGISTIC REGRESSION MODEL TO PREDICT FRESHMEN ENROLLMENTS Vijayalakshmi Sampath, Andrew Flagel, Carolina Figueroa ABSTRACT Predictive modeling is the technique of using historical information on a certain

More information

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

We extended the additive model in two variables to the interaction model by adding a third term to the equation. Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

More information

ABSTRACT INTRODUCTION

ABSTRACT INTRODUCTION Paper SP03-2009 Illustrative Logistic Regression Examples using PROC LOGISTIC: New Features in SAS/STAT 9.2 Robert G. Downer, Grand Valley State University, Allendale, MI Patrick J. Richardson, Van Andel

More information

An Introduction to Statistical Tests for the SAS Programmer Sara Beck, Fred Hutchinson Cancer Research Center, Seattle, WA

An Introduction to Statistical Tests for the SAS Programmer Sara Beck, Fred Hutchinson Cancer Research Center, Seattle, WA ABSTRACT An Introduction to Statistical Tests for the SAS Programmer Sara Beck, Fred Hutchinson Cancer Research Center, Seattle, WA Often SAS Programmers find themselves in situations where performing

More information

Discussion Section 4 ECON 139/239 2010 Summer Term II

Discussion Section 4 ECON 139/239 2010 Summer Term II Discussion Section 4 ECON 139/239 2010 Summer Term II 1. Let s use the CollegeDistance.csv data again. (a) An education advocacy group argues that, on average, a person s educational attainment would increase

More information

USING LOGISTIC REGRESSION TO PREDICT CUSTOMER RETENTION. Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA

USING LOGISTIC REGRESSION TO PREDICT CUSTOMER RETENTION. Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA USING LOGISTIC REGRESSION TO PREDICT CUSTOMER RETENTION Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA Logistic regression is an increasingly popular statistical technique

More information

Analyzing Ranking and Rating Data from Participatory On- Farm Trials

Analyzing Ranking and Rating Data from Participatory On- Farm Trials 44 Analyzing Ranking and Rating Data from Participatory On- Farm Trials RICHARD COE Abstract Responses in participatory on-farm trials are often measured as ratings (scores on an ordered but arbitrary

More information

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

More information

Beginning Tutorials. PROC FREQ: It s More Than Counts Richard Severino, The Queen s Medical Center, Honolulu, HI OVERVIEW.

Beginning Tutorials. PROC FREQ: It s More Than Counts Richard Severino, The Queen s Medical Center, Honolulu, HI OVERVIEW. Paper 69-25 PROC FREQ: It s More Than Counts Richard Severino, The Queen s Medical Center, Honolulu, HI ABSTRACT The FREQ procedure can be used for more than just obtaining a simple frequency distribution

More information

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through

More information

Module 4 - Multiple Logistic Regression

Module 4 - Multiple Logistic Regression Module 4 - Multiple Logistic Regression Objectives Understand the principles and theory underlying logistic regression Understand proportions, probabilities, odds, odds ratios, logits and exponents Be

More information

Weight of Evidence Module

Weight of Evidence Module Formula Guide The purpose of the Weight of Evidence (WoE) module is to provide flexible tools to recode the values in continuous and categorical predictor variables into discrete categories automatically,

More information

SPSS Introduction. Yi Li

SPSS Introduction. Yi Li SPSS Introduction Yi Li Note: The report is based on the websites below http://glimo.vub.ac.be/downloads/eng_spss_basic.pdf http://academic.udayton.edu/gregelvers/psy216/spss http://www.nursing.ucdenver.edu/pdf/factoranalysishowto.pdf

More information

Statistics, Data Analysis & Econometrics

Statistics, Data Analysis & Econometrics Using the LOGISTIC Procedure to Model Responses to Financial Services Direct Marketing David Marsh, Senior Credit Risk Modeler, Canadian Tire Financial Services, Welland, Ontario ABSTRACT It is more important

More information

Family economics data: total family income, expenditures, debt status for 50 families in two cohorts (A and B), annual records from 1990 1995.

Family economics data: total family income, expenditures, debt status for 50 families in two cohorts (A and B), annual records from 1990 1995. Lecture 18 1. Random intercepts and slopes 2. Notation for mixed effects models 3. Comparing nested models 4. Multilevel/Hierarchical models 5. SAS versions of R models in Gelman and Hill, chapter 12 1

More information

Paper D10 2009. Ranking Predictors in Logistic Regression. Doug Thompson, Assurant Health, Milwaukee, WI

Paper D10 2009. Ranking Predictors in Logistic Regression. Doug Thompson, Assurant Health, Milwaukee, WI Paper D10 2009 Ranking Predictors in Logistic Regression Doug Thompson, Assurant Health, Milwaukee, WI ABSTRACT There is little consensus on how best to rank predictors in logistic regression. This paper

More information

HLM software has been one of the leading statistical packages for hierarchical

HLM software has been one of the leading statistical packages for hierarchical Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

More information

Local classification and local likelihoods

Local classification and local likelihoods Local classification and local likelihoods November 18 k-nearest neighbors The idea of local regression can be extended to classification as well The simplest way of doing so is called nearest neighbor

More information

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way

More information

Logit Models for Binary Data

Logit Models for Binary Data Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response

More information

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

More information

Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY

Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY ABSTRACT PROC FREQ is an essential procedure within BASE

More information

Modeling Lifetime Value in the Insurance Industry

Modeling Lifetime Value in the Insurance Industry Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting

More information

Logistic (RLOGIST) Example #3

Logistic (RLOGIST) Example #3 Logistic (RLOGIST) Example #3 SUDAAN Statements and Results Illustrated PREDMARG (predicted marginal proportion) CONDMARG (conditional marginal proportion) PRED_EFF pairwise comparison COND_EFF pairwise

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA

Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA Paper P-702 Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA ABSTRACT Individual growth models are designed for exploring longitudinal data on individuals

More information

Two Correlated Proportions (McNemar Test)

Two Correlated Proportions (McNemar Test) Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln Log-Rank Test for More Than Two Groups Prepared by Harlan Sayles (SRAM) Revised by Julia Soulakova (Statistics)

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

Statistics 305: Introduction to Biostatistical Methods for Health Sciences

Statistics 305: Introduction to Biostatistical Methods for Health Sciences Statistics 305: Introduction to Biostatistical Methods for Health Sciences Modelling the Log Odds Logistic Regression (Chap 20) Instructor: Liangliang Wang Statistics and Actuarial Science, Simon Fraser

More information

Gamma Distribution Fitting

Gamma Distribution Fitting Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics

More information

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals Summary sheet from last time: Confidence intervals Confidence intervals take on the usual form: parameter = statistic ± t crit SE(statistic) parameter SE a s e sqrt(1/n + m x 2 /ss xx ) b s e /sqrt(ss

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

School of Nursing Faculty Salary Equity Report and Action Plan

School of Nursing Faculty Salary Equity Report and Action Plan July 1, 2015 School of Nursing Faculty Salary Equity Report and Action Plan Shari L. Dworkin, Ph.D., M.S. Associate Dean for Academic Affairs Overview: In 2012, then UC President Mark Yudof charged each

More information

ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node

ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node Enterprise Miner - Regression 1 ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node 1. Some background: Linear attempts to predict the value of a continuous

More information

Scatter Plots with Error Bars

Scatter Plots with Error Bars Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each

More information

Are you looking for the right interactions? Statistically testing for interaction effects with dichotomous outcome variables

Are you looking for the right interactions? Statistically testing for interaction effects with dichotomous outcome variables Are you looking for the right interactions? Statistically testing for interaction effects with dichotomous outcome variables Updated 2-14-2012 for presentation to the Epi Methods group at Columbia Melanie

More information

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

More information

Nonlinear Regression Functions. SW Ch 8 1/54/

Nonlinear Regression Functions. SW Ch 8 1/54/ Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General

More information

Main Effects and Interactions

Main Effects and Interactions Main Effects & Interactions page 1 Main Effects and Interactions So far, we ve talked about studies in which there is just one independent variable, such as violence of television program. You might randomly

More information

Paper 45-2010 Evaluation of methods to determine optimal cutpoints for predicting mortgage default Abstract Introduction

Paper 45-2010 Evaluation of methods to determine optimal cutpoints for predicting mortgage default Abstract Introduction Paper 45-2010 Evaluation of methods to determine optimal cutpoints for predicting mortgage default Valentin Todorov, Assurant Specialty Property, Atlanta, GA Doug Thompson, Assurant Health, Milwaukee,

More information

The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com

The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com 2. Why do I offer this webinar for free? I offer free statistics webinars

More information

The Probit Link Function in Generalized Linear Models for Data Mining Applications

The Probit Link Function in Generalized Linear Models for Data Mining Applications Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications

More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information