QUANTITATIVE METHODS CLASSES WEEK SEVEN Th rgrssion modls studid in prvious classs assum that th rspons variabl is quantitativ. Oftn, howvr, w wish to study social procsss that lad to two diffrnt outcoms. In ths instancs, thn, th rspons variabl is catgorical. For xampl, whn w study unmploymnt, marriag or votr s choic. In th cas of unmploymnt, for xampl, our rspondnts will ithr b mployd or unmployd. Dnot th rspons on Y by 1 if unmployd and 0 if mployd (it is also common to us th trm failur and succss). Th sum of th scors in th sampl is thn th numbr of succsss (i.., unmployd rspondnts). Th man of this rspons variabl (th 0s and 1s scors) quals th proportion of succsss (i.., th proportion of unmployd rspondnts). Obviously, thn, th proportion of mployd rspondnts quals 1-that man. Transforming th catgorical rspons variabl (0,1) to proportion allows us to think in trns of rgrssion analysis, sinc th ordinary rgrssion modls th man of th rspons variabl. Lt π dnot th probability of succss, and it is possibl to writ th following linar quation: π=a+b() This is th linar probability modl, and it implis that th probability of succss is a linar function of. Unfortunatly, this modl is oftn poor. First, it implis probabilitis blow 0 and abov 1, whras probabilitis must fall btwn 0 and 1. Scond, th rspons variabl is not normally distributd, and thus it violats som of th assumptions w mak whn applying OLS rgrssion. Thus, w nd to furthr transform our rspons variabl. In othr words, w nd to dscrib th rlationship btwn π and with a curvilinar rathr than a linar function. This can b achivd by th following quation: π log = α + β 1 π Th ratio π/(1- π) quals th odds. Thus, for xampl, whn th proportion of unmployd individuals (succss) in our sampl quals 0.20 th odds quals 0.25 (0.2/0.8=0.25), which mans that a succss is four tims lss likly as failur. This quation uss th natural log of th odds, and is calld th logistic transformation, or logit for short. Thus, as π incrass from 0 to 1, th odds incrass from 0 to +, and th logit incrass from to +. Tabl 1: Rangs of Probability, Odds and Log Odds Lowst Lvl Mid point Highst Lvl Probability π 0.5 1 Odds π /1- π 0 1 + Log Odds log (π /1- π) or logit (π) 0 + 1
Logistic Rgrssion Th modl: logit (π) =a+b is calld th logistic rgrssion modl. In logistic rgrssion th paramtrs of th modl ar stimatd using th maximumliklihood mthod. That is, th cofficints that mak th obsrvd rsults most likly ar slctd. For ach possibl valu a paramtr might hav, SPSS computs th probability that th obsrvd valu would hav occurrd if it wr th tru valu of th paramtr. Thn, for th stimat, it picks th paramtr for which th probability of th actual obsrvation is gratst. Th quation for logistic rgrssion may b givn in ithr th additiv or multiplicativ forms. Additiv form: log (π/1-π) = a + ß Multiplicativ form: π/1-π = xp (a) *xp ( ß ) π is th proportion with th charactristic (th probability), a is a constant, ß 1,ß 2... ar cofficints and 1, 2... ar prdictor variabls. log π/1-π is known as th log-odds and π/1- π as th odds. Exponntial (ß), ar th odds multiplirs and intrst is in valus that diffr from 1. Running Logistic Rgrssion in SPSS Hr w modl th probability of bing unmployd rathr than bing mployd (EMP86). First, w hav to mak sur that EMP86 is codd 1 and 0. It is important to cod th succss as 1. As w ar intrstd in prdicting unmploymnt, th unmployd should b codd 1 and th mployd codd 0 (w simply crat a nw variabl UNEMP: 0=mployd and 1=unmployd). W ar going to us two prdictors: Class and Ag. As Class is a catgorical variabl, w nd to rcod class to a thr-catgory variabl (CLASS1: 10,20=1 (prof); 31,41,42,43,50=2 (intr); and, 32,60,71,72=3 (working)), and thn to crat dummy variabls for Prof, Intr and working. logistic rgrssion variabls=unmp /mthod = ntr prof intr. 2
Intrprting th Rsults of a Logistic Rgrssion Modl 1. Assssing th Goodnss of Fit of th Modl On way of assssing goodnss of fit is to xamin how likly th sampl rsults ar, givn th paramtr stimats (rmmbr th modl attmpts to gnrat th paramtr stimats that mak th rsults most likly). Th probability of th obsrvd rsults givn th paramtr stimats is known as th Liklihood. Sinc th liklihood is a small numbr lss than 1, it is customary to us -2 tims th log liklihood (-2LL) as an stimat of how wll th modl fits th data. A good modl is on that rsults in a high liklihood of th obsrvd rsults. This translats into a small valu for 2LL (if a modl fits prfctly, th liklihood=1 and 2LL=0) In our modl, -2 Log Liklihood = 949.379 Modl Summary Stp -2 Log liklihood Cox & Snll R Squar Naglkrk R Squar 1 949.379.032.052 This is far from zro, howvr bcaus thr is no uppr boundary for 2LL it is difficult to mak a statmnt about th maning of th scor. It is mor oftn usd to s whthr adding additional variabls to th modl lads to a significant rduction in th 2LL. Th diffrnc btwn th 2LL for two modls, with th diffrnc in th dgrs of frdom (which is qual to th diffrnc btwn th numbr of paramtrs for th two modls) has a chi-squar distribution. Thus, th significanc of this chang is drivd from th chi-squar tabl. To assss th chang btwn diffrnt modls variabls must b addd in stps or blocks, as was don in th OLS rgrssion. If w compar th fit statistics for a modl with just class and on which adds ag w gt th following rsults. Omnibus Tsts of Modl Cofficints Chi-squar df Sig. Stp 1 Stp 12.515 1.000 Block 12.515 1.000 Modl 45.119 3.000 Modl Summary Stp -2 Log liklihood Cox & Snll R Squar Naglkrk R Squar 1 936.864.045.071 Th chi-squar figur for th nw block addd in th scond modl shows a chang of 12.52 (s omnibus tsts tabl) for 1 additional dgr of frdom (AGE). This is significant at th 2.05 lvl (you can chck this on th Chi-sq ( Χ ) tabl). This tst is comparabl to th F- chang tst in th OLS rgrssion. 3
2. Th Cofficints. B S.E. Wald Df Sig. Exp(B) AGE -.027.008 12.104 1.001.974 PROF -1.145.228 25.134 1.000.318 INTER -.722.211 11.701 1.001.486 Constant -.032.296.012 1.913.968 a Variabl(s) ntrd on stp 1: AGE, PROF, INTER. Th Bs rfr to th log-odds of bing unmployd. W can insrt ths into th logistic rgrssion quation as was don in multipl rgrssion. Additiv form: logit( π ) = α + β + β... + β 1 1 2 2 n n =.0323 + (.027)( AGE) + ( 1.145)( PROF) + (.722)( INTER) This tlls us that incrasing ag dcrass th log odds of unmploymnt (controlling for class). Bing in th profssional/managrial class or bing in th intrmdiat class also rducs th log-odds of bing unmployd rlativ to thos in th working class. Th Wald statistic (B/SE). Most softwar rports its squar (B/SE) 2. Th significanc of th Wald statistic is rportd in th column markd Sig. This shows that th thr prdictor variabls ar significant. If th cofficint is vry larg th Wald statistic can bcom unrliabl so you should rfr to th chang in th log liklihood instad (s blow). Howvr, log-odds is not a vry straightforward concpt. It is probably asir to us th multiplicativ form of th quation using xp(b), s last column of th SPSS output. Ths ar th Odds Multiplirs. Multiplicativ form π = (1 π) α β β 1 1... n n 0.323 0.027(AGE) 1.145(PROF) 0.722(INTER) = =.968 0.974(AGE) 0.318(PROF) 0.486(INTER) Rmmbr intrst is in cofficints that diffr from 1. Valus gratr than 1 indicat that th variabl in qustion incrass th odds of th dpndnt vnt occurring and valus lss than 1 (i.. btwn 0 and 1) indicat a dcras in th odds. Effctivly th odds for th bas catgory ar st to 1. Using th odds multiplirs w can mak th mor undrstandabl claims that, whn othr factors in th modl ar hld constant: ach addd yar of ag lads to about 3% rduction in th odds of bing unmployd. 4
bing in th profssional class, compard to bing in th working class, lads to about 68% rduction in th odds of bing unmployd bing in th intrmdiat class, compard to bing in th working class, lads to about 51% rduction in th odds of bing unmployd. 2. Estimating Probabilitis Th original logistic rgrssion quation can b transformd to show th stimatd probability of succss by: (α + β ) π = (α + β 1+ 1 1 + 1 1 β 2 2 + β + 2 2 β ) 3 3 + β ) 3 3 so for any individual th probability of bing unmployd can b calculatd. Exampl: What is th stimatd probability of bing unmployd for a prson agd 30 in th profssional class? ( 0.32 + (0.27)(30) + ( 1.145)(1) 1.987 ).137 π = = = =.12 ( 0.32 + (0.27)(30) + ( 1.145)(1) 1 1.987 1+ + 1.137 Answr: a 30 yar old profssional has an stimatd probability of bing unmployd of.12. 5