Measures of Fit for Logistic Regression
Paper, SAS Global Forum
Paul D. Allison, Statistical Horizons LLC and the University of Pennsylvania

ABSTRACT

One of the most common questions about logistic regression is "How do I know if my model fits the data?" There are many approaches to answering this question, but they generally fall into two categories: measures of predictive power (like R-square) and goodness of fit tests (like the Pearson chi-square). This presentation looks first at R-square measures, arguing that the optional R-squares reported by PROC LOGISTIC might not be optimal. Measures proposed by McFadden and Tjur appear to be more attractive. As for goodness of fit, the popular Hosmer and Lemeshow test is shown to have some serious problems. Several alternatives are considered.

INTRODUCTION

One of the most frequent questions I get about logistic regression is "How can I tell if my model fits the data?" Often the questioner is expressing a genuine interest in knowing whether a model is a good model or a not-so-good model. But a more common motivation is to convince someone else (a boss, an editor, or a regulator) that the model is OK.

There are two very different approaches to answering this question. One is to get a statistic that measures how well you can predict the dependent variable based on the independent variables. I'll refer to these kinds of statistics as measures of predictive power. Typically, they vary between 0 and 1, with 0 meaning no predictive power whatsoever and 1 meaning perfect predictions. Predictive power statistics available in PROC LOGISTIC include R-square, the area under the ROC curve, and several rank-order correlations. Obviously, the higher the better, but there is rarely a fixed cut-off that distinguishes an acceptable model from one that is not acceptable.

The other approach to evaluating model fit is to compute a goodness-of-fit statistic. With PROC LOGISTIC, you can get the deviance, the Pearson chi-square, or the Hosmer-Lemeshow test.
These are formal tests of the null hypothesis that the fitted model is correct, and their output is a p-value, again a number between 0 and 1 with higher values indicating a better fit. In this case, however, a p-value below some specified level (say, .05) would indicate that the model is not acceptable.

What many researchers fail to realize is that measures of predictive power and goodness-of-fit statistics are testing very different things. It's not at all uncommon for models with very high R-squares to produce unacceptable goodness-of-fit statistics. Conversely, models with very low R-squares can fit the data very well according to goodness-of-fit tests. As I'll explain in more detail later, what goodness-of-fit statistics are testing is not how well you can predict the dependent variable, but whether you could do even better by making the model more complicated: specifically, by adding non-linearities, adding interactions, or changing the link function.

The goal of this paper is to discuss several issues that I've been grappling with over the years regarding both predictive power statistics and goodness-of-fit statistics. With one exception, I make no claim to originality here. Rather, I'm simply trying to make sense out of a rather complicated literature, and to distill it into some practical recommendations.

I begin with measures of predictive power, and I'm going to focus exclusively on R-square measures. I don't mean to imply that these are either better or worse than alternative measures like the area under the ROC curve. But I personally happen to like R-square statistics just because they are so familiar from the context of ordinary linear regression.

R² STATISTICS FOR LOGISTIC REGRESSION

There are many different ways to calculate R² for logistic regression and, unfortunately, no consensus on which one is best. Mittlbock and Schemper (1996) reviewed 12 different measures; Menard (2000) considered several others.
The two methods that are most often reported in statistical software appear to be one proposed by McFadden (1974) and another that is usually attributed to Cox and Snell (1989) along with its "corrected" version. The Cox-Snell R² (both corrected and uncorrected) was actually discussed earlier by Maddala (1983) and by Cragg and Uhler (1970). Cox-Snell is the optional R² reported by PROC LOGISTIC. PROC QLIM reports eight different R² measures including both Cox-Snell and McFadden. Among other statistical packages that I'm familiar with, Statistica reports the Cox-Snell measures. JMP reports both McFadden and Cox-Snell. SPSS reports the Cox-Snell measures for binary logistic regression but McFadden's measure for multinomial and ordered logit.
For years, I've been recommending the Cox-Snell R² over the McFadden R², but I've recently concluded that that was a mistake. I now believe that McFadden's R² is a better choice. However, I've also learned about another R² that has good properties, a lot of intuitive appeal, and is easily calculated. At the moment, I like it better than the McFadden R², but I'm not prepared to make a definitive recommendation at this point.

Here are some details. Logistic regression is, of course, estimated by maximizing the likelihood function. Let $L_0$ be the value of the likelihood function for a model with no predictors, and let $L_M$ be the likelihood for the model being estimated. McFadden's R² is defined as

$$R^2_{McF} = 1 - \frac{\ln(L_M)}{\ln(L_0)}$$

where ln(.) is the natural logarithm. The rationale for this formula is that $\ln(L_0)$ plays a role analogous to the residual sum of squares in linear regression. Consequently, this formula corresponds to a proportional reduction in "error variance". It's sometimes referred to as a "pseudo" R².

The Cox and Snell R² is

$$R^2_{C\&S} = 1 - \left(L_0 / L_M\right)^{2/n}$$

where n is the sample size. The rationale for this formula is that, for normal-theory linear regression, it's an identity. In other words, the usual R² for linear regression depends on the likelihoods for the models with and without predictors by precisely this formula. It's appropriate, then, to describe this as a "generalized" R² rather than a pseudo R². By contrast, the McFadden R² does not have the OLS R² as a special case. I've always found this property of the Cox-Snell R² to be very attractive, especially because the formula can be naturally extended to other kinds of regression estimated by maximum likelihood, like negative binomial regression for count data or Weibull regression for survival data.

It's well known, however, that the big problem with the Cox-Snell R² is that it has an upper bound that is less than 1.0. Specifically, the upper bound is $1 - L_0^{2/n}$.
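The two definitions and the upper bound above are easy to check numerically once the log-likelihoods are in hand. The following is a minimal Python sketch (not SAS, and not part of the paper's own code); the function names and the example log-likelihood values are hypothetical, chosen only to illustrate the arithmetic. Working on the log scale avoids underflow, since likelihoods themselves are extremely small numbers.

```python
import math

def mcfadden_r2(loglik_null, loglik_model):
    """McFadden's R-square: 1 - ln(L_M) / ln(L_0)."""
    return 1.0 - loglik_model / loglik_null

def cox_snell_r2(loglik_null, loglik_model, n):
    """Cox-Snell R-square: 1 - (L_0 / L_M)^(2/n), computed from log-likelihoods."""
    return 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)

def cox_snell_upper_bound(loglik_null, n):
    """Largest attainable Cox-Snell value, 1 - L_0^(2/n), reached when L_M = 1."""
    return 1.0 - math.exp(2.0 * loglik_null / n)

# Hypothetical illustration: 100 cases, half of them events, so the
# no-predictor model assigns probability .5 to every case.
n = 100
ll0 = n * math.log(0.5)   # ln(L_0) for the no-predictor model
llm = -50.0               # hypothetical ln(L_M) for a fitted model

print("McFadden:", mcfadden_r2(ll0, llm))
print("Cox-Snell:", cox_snell_r2(ll0, llm, n))
print("Cox-Snell upper bound:", cox_snell_upper_bound(ll0, n))
```

In practice the two log-likelihoods would be taken from the fitted model's output (for example, by halving the reported -2 Log L values and changing the sign).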
This upper bound can be a lot less than 1.0, and it depends only on p, the marginal proportion of cases with events:

$$\text{upper bound} = 1 - \left[p^{p}(1-p)^{(1-p)}\right]^{2}$$

I have not seen this formula anywhere else, so it may be the only original thing in this paper. The upper bound reaches a maximum of .75 when p=.5. By contrast, when p=.9 (or .1), the upper bound is only .48.

For those who want an R² that behaves like a linear-model R², this is deeply unsettling. There is a simple correction, and that is to divide $R^2_{C\&S}$ by its upper bound, which produces the R² attributed to Nagelkerke (1991) and which is labeled in SAS output as the max-rescaled R². But this correction is purely ad hoc, and it greatly reduces the theoretical appeal of the original $R^2_{C\&S}$. I also think that the values it typically produces are misleadingly high, especially compared with what you get from just doing OLS with the binary dependent variable. (Some might view this as a feature, however.)

So, with some reluctance, I've decided to cross over to the McFadden camp. As Menard (2000) argued, it satisfies almost all of Kvalseth's (1985) eight criteria for a good R². When the marginal proportion is around .5, the McFadden R² tends to be a little smaller than the uncorrected Cox-Snell R². When the marginal proportion is nearer to 0 or 1, the McFadden R² tends to be larger.

But there is another R², recently proposed by Tjur (2009), that I'm inclined to prefer over McFadden's. It has a lot of intuitive appeal, its upper bound is 1.0, and it's closely related to R² definitions for linear models. It's also easy to calculate.

The definition is very simple. For each of the two categories of the dependent variable, calculate the mean of the predicted probabilities of an event. Then, take the absolute value of the difference between those two means. That's it! The motivation should be clear. If a model makes good predictions, the cases with events should have high predicted values and the cases without events should have low predicted values.
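Because the definition involves nothing more than two group means, it can be sketched in a few lines. The Python snippet below is only an illustration outside SAS; the function name `tjur_r2` and the toy data are hypothetical, and the predicted probabilities are assumed to come from an already fitted model.

```python
def tjur_r2(y, p_hat):
    """Tjur's R-square: |mean predicted probability among events
    minus mean predicted probability among non-events|."""
    events = [p for yi, p in zip(y, p_hat) if yi == 1]
    nonevents = [p for yi, p in zip(y, p_hat) if yi == 0]
    mean_events = sum(events) / len(events)
    mean_nonevents = sum(nonevents) / len(nonevents)
    return abs(mean_events - mean_nonevents)

# Hypothetical toy data: three events, two non-events
y = [1, 1, 1, 0, 0]
p_hat = [0.9, 0.7, 0.8, 0.2, 0.4]
print(tjur_r2(y, p_hat))
```

The function assumes that both outcome categories are present; with only events or only non-events, one of the means is undefined.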
Tjur also showed that his R² (which he called the coefficient of discrimination) is equal to the arithmetic mean of two R² formulas based on squared residuals, and equal to the geometric mean of two other R²'s based on squared residuals.

Here's an example of how to calculate Tjur's statistic in SAS. I used a well-known data set on labor force participation of 751 married women (Mroz 1987). The dependent variable INLF is coded 1 if a woman was in the labor force, otherwise 0. A logistic regression model was fit with six predictors. Here's the code:

   proc logistic data=my.mroz;
      model inlf(desc) = kidslt6 age educ huswage city exper;
      output out=a pred=yhat;   /* save predicted probabilities as YHAT */
   proc ttest data=a;
      class inlf;               /* compare mean YHAT across INLF categories */
      var yhat;
   run;

The OUTPUT statement produces a new data set called A with predicted probabilities stored in a new variable called YHAT. PROC TTEST is a convenient way to compute the mean of the predicted probabilities for each category of the dependent variable, and to take their difference. The output is shown in Table 1. Ignoring the sign of the difference, the Tjur R² is .2575. By contrast, the Cox-Snell R² is .2477, and the max-rescaled R² is .3322. McFadden's R² is .2082. The squared correlation between the observed and predicted values is .2572.

The TTEST Procedure
Variable: yhat (Estimated Probability)
INLF   N   Mean   Std Dev   Std Err   Minimum   Maximum
Diff (1-2)
Table 1. PROC TTEST Output to Compute Tjur's R²

One possible objection to the Tjur R² is that, unlike Cox-Snell and McFadden, it's not based on the quantity being maximized, namely, the likelihood function. As a result, it's possible that adding a variable to the model could reduce the Tjur R². But Kvalseth (1985) argued that it's actually preferable that R² not be based on a particular estimation method. In that way, it can legitimately be used to compare predictive power for models that generate their predictions using very different methods. For example, one might want to compare predictions based on logistic regression with those based on a linear model or on a classification tree method.

Another potential complaint is that the Tjur R² cannot be easily generalized to ordinal or nominal logistic regression. For McFadden and Cox-Snell, the generalization is trivial.

CLASSIC GOODNESS-OF-FIT STATISTICS

I now turn to goodness-of-fit (GOF) tests, which can help you decide whether your model is correctly specified. GOF tests produce a p-value. If it's low (say, below .05), you reject the model. If it's high, then your model passes the test.

Classic GOF tests are readily available for logistic regression when the data can be aggregated or grouped into unique "profiles". Profiles are groups of cases that have exactly the same values on the predictors.
For example, suppose we fit a model to the Mroz data with just two predictor variables, CITY (1=urban, 0=nonurban) and KIDSLT6, which has integer values ranging from 0 to 3. There are then eight profiles, corresponding to the eight cells in the cross-classification of CITY by KIDSLT6. After fitting the model, we can get an observed number of events and an expected number of events for each profile. There are two well-known statistics for comparing the observed number with the expected number: the deviance and Pearson's chi-square.

Here's how to get them with PROC LOGISTIC:

   proc logistic data=my.mroz;
      model inlf(desc) = kidslt6 city / aggregate scale=none;
   run;

The AGGREGATE option says to aggregate the data into profiles based on the values of the predictor variables. The SCALE=NONE option requests the deviance and the Pearson chi-square, based on those profiles. Here are the results.
Deviance and Pearson Goodness-of-Fit Statistics
Criterion   Value   DF   Value/DF   Pr > ChiSq
Deviance
Pearson
Number of unique profiles: 8
Table 2. PROC LOGISTIC Output of GOF Statistics

For both statistics, the chi-squares are low relative to the degrees of freedom, and the p-values are high. This is exactly what we want to see. There is no evidence to reject the null hypothesis, which is that the fitted model is correct.

Now let's take a closer look at these two statistics. The formula for the deviance is

$$G^2 = 2 \sum_j O_j \log\left(\frac{O_j}{E_j}\right)$$

where each j is a cell in the 2-way contingency table with each row being a profile and each column being one of the two categories of the dependent variable. $O_j$ is the observed frequency and $E_j$ is the expected frequency based on the fitted model. If $O_j = 0$, the entire term in the summation is set to 0. The degrees of freedom is the number of profiles minus the number of estimated parameters.

The Pearson chi-square is calculated as

$$X^2 = \sum_j \frac{(O_j - E_j)^2}{E_j}$$

If the fitted model is correct, both statistics have approximately a chi-square distribution, with the approximation improving as the sample gets larger.

But what exactly are these statistics testing? This is easiest to see for the deviance, which is a likelihood ratio test comparing the fitted model to a "saturated" model that perfectly fits the data. In our example, a saturated model would treat KIDSLT6 as a CLASS variable, and would also include the interaction of KIDSLT6 and CITY. Here's the code for that model, with the GOF output in Table 3.

   proc logistic data=my.mroz;
      class kidslt6;
      model inlf(desc) = kidslt6 city kidslt6*city / aggregate scale=none;
   run;

Deviance and Pearson Goodness-of-Fit Statistics
Criterion   Value   DF   Value/DF   Pr > ChiSq
Deviance
Pearson
Table 3. PROC LOGISTIC Output for a Saturated Model

So the answer to the question "What are GOF tests testing?" is simply this: they are testing whether there are any non-linearities or interactions. You can always produce a satisfactory fit by adding enough interactions and non-linearities. But do you really need them to properly represent the data?
GOF tests are designed to answer that question.

A related issue is whether the "link" function is correct. Is it logit, probit, complementary log-log, or something else entirely? Note that in a saturated model, the link function is irrelevant. It's only when you suppress interactions or non-linearities that the link function becomes an issue. For example, it's possible (although unusual) that interactions that are needed for a logit model could disappear when you fit a complementary log-log model.

Both the deviance and the Pearson chi-square have good properties when the expected number of events and the expected number of non-events for each profile is at least 5. But most contemporary applications of logistic regression use data that do not allow for aggregation into profiles because the model includes one or more continuous (or nearly continuous) predictors. That's certainly true for the Mroz data when you include age, education, husband's wage, and years of experience in the model.

When there is only one case per profile, both the deviance and Pearson chi-square have distributions that depart markedly from a true chi-square distribution, yielding p-values that may be wildly inaccurate. In fact, with only one case per profile, the deviance does not depend on the observed values at all, making it utterly useless as a GOF test (McCullagh 1985).

What can we do? Hosmer and Lemeshow (1980) proposed grouping cases together according to their predicted values from the logistic regression model. Specifically, the predicted values are arrayed from lowest to highest, and then separated into several groups of approximately equal size. Ten groups is the standard recommendation.

For each group, we calculate the observed number of events and non-events, as well as the expected number of events and non-events. The expected number of events is just the sum of the predicted probabilities for all the individuals in the group. And the expected number of non-events is the group size minus the expected number of events. Pearson's chi-square is then applied to compare observed counts with expected counts. The degrees of freedom is the number of groups minus 2. As with the classic GOF tests, low p-values suggest rejection of the model.

For the Mroz data, here is the code for a model with six predictors:

   proc logistic data=my.mroz;
      model inlf(desc) = kidslt6 age educ huswage city exper / lackfit;
   run;

The LACKFIT option requests the Hosmer-Lemeshow (HL) test.
Results are in Table 4.

Partition for the Hosmer and Lemeshow Test
Group   Total   INLF = 1 (Observed, Expected)   INLF = 0 (Observed, Expected)
Hosmer and Lemeshow Goodness-of-Fit Test
Chi-Square   DF   Pr > ChiSq
Table 4. Hosmer-Lemeshow Results from PROC LOGISTIC
The p-value in Table 4 is just below .05, suggesting that we may need some interactions or non-linearities in the model.

The HL test seems like a clever solution, and it has become the de facto standard for almost all software packages. But it turns out to have serious problems. The most troubling problem is that results can depend markedly on the number of groups, and there's no theory to guide the choice of that number. This problem did not become apparent until some software packages (but not SAS) started allowing you to specify the number of groups, rather than just using 10.

When I estimated this model in Stata, for example, with the default number of 10 groups, I got a HL chi-square of 15.52 with 8 df, yielding a p-value of .0499, almost the same as what we just got in SAS. But if we specify 9 groups, the p-value rises to .11. With 11 groups, the p-value is .64. Clearly, it's not acceptable for the results to depend so greatly on minor changes that are completely arbitrary. Examples like this are easy to come by.

But wait, there's more. One would hope that adding a statistically significant interaction or non-linearity to a model would improve its fit, as judged by the HL test. But often that doesn't happen. Suppose, for example, that we add the square of EXPER (labor force experience) to the model, allowing for non-linearity in the effect of experience. The squared term is highly significant (p=.002). But with 9 groups, the HL chi-square increases, and its p-value falls from .11 in the simpler model to .06 in the more complex model. Thus, the HL test suggests that we'd be better off with the model that excludes the squared term.

The reverse can also happen. Quite frequently, adding a non-significant interaction or non-linearity to a model will substantially improve the HL fit. For example, I added the interaction of EDUC and EXPER to the basic model above. The product term had a p-value of .68, clearly not statistically significant. But the HL chi-square (based on 10 groups) declined from 15.52 (p=.05) to 9.19 (p=.33). Again, unacceptable behavior.
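To experiment with this sensitivity to the number of groups, the HL statistic can be computed for any group count from the observed outcomes and predicted probabilities. The Python sketch below follows the grouping recipe described earlier (sort by predicted value, cut into groups of roughly equal size, apply Pearson's chi-square). It is an assumption-laden illustration rather than a reproduction of what SAS or Stata do: the function name is hypothetical, tied predicted values are split by simple sorted order, and every group is assumed to be non-empty with expected counts strictly greater than zero.

```python
def hosmer_lemeshow(y, p_hat, groups=10):
    """Return (chi-square, df) for an HL-style test with the given number of groups."""
    pairs = sorted(zip(p_hat, y))          # order cases by predicted probability
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        # cut the sorted cases into groups of approximately equal size
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        obs_events = sum(yi for _, yi in chunk)
        exp_events = sum(pi for pi, _ in chunk)   # sum of predicted probabilities
        obs_nonevents = size - obs_events
        exp_nonevents = size - exp_events
        chi2 += (obs_events - exp_events) ** 2 / exp_events
        chi2 += (obs_nonevents - exp_nonevents) ** 2 / exp_nonevents
    return chi2, groups - 2

# Hypothetical toy data: eight cases, two distinct predicted probabilities
y = [1, 1, 0, 0, 1, 1, 1, 1]
p_hat = [0.25] * 4 + [0.75] * 4
for k in (2, 4):
    print(k, hosmer_lemeshow(y, p_hat, groups=k))
```

Running the function over a range of group counts on real predicted values is a quick way to see how stable (or unstable) the resulting chi-square is for a given model.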
I am certainly not the first person to point out these problems. In fact, in a 1997 paper, Hosmer, Lemeshow and others acknowledged that the HL test had several drawbacks, although that hasn't stopped other people from using it.

But if the HL test is not good, then how can we assess the fit of the model? It turns out that there has been quite a lot of work on this topic, and many alternative tests have been proposed, so many that it's rather difficult to figure out which ones are useful. In the remainder of this paper, I will review some of the literature on these tests, and I will recommend four of them that I think are worthy of consideration.

NEW GOODNESS-OF-FIT TESTS

Many of the proposed tests are based on alternative ways of grouping the data (Tsiatis 1980, Pigeon and Heyse 1991, Pulkstenis and Robinson 2002, Xie et al. 2008, Liu et al. 2012). Once the data have been grouped, a standard Pearson chi-square is calculated to evaluate the discrepancy between predicted and observed counts within the groups. The main problem with these kinds of tests is that the grouping process usually requires significant effort and attention by the data analyst, and there is a certain degree of arbitrariness in how it is done.

What most analysts want is a test that can be easily and routinely implemented. And since there are several tests that fulfill that requirement, I shall restrict my attention to tests that can be calculated when there is only one case per profile and no grouping of observations. Based on my reading of the literature, I am prepared to recommend four statistics for widespread use:

1. Standardized Pearson Test. With ungrouped data, the formula for the classic Pearson chi-square test is:

$$X^2 = \sum_i \frac{(y_i - \hat{\pi}_i)^2}{\hat{\pi}_i (1 - \hat{\pi}_i)}$$

where $y_i$ is the dependent variable with values of 0 or 1, and $\hat{\pi}_i$ is the predicted probability that $y_i = 1$, based on the fitted model. As we've just discussed, the problem with the classic Pearson GOF test is that it does not have a chi-square distribution when the data are not grouped.
But Osius and Rojek (1992) showed that X² has an asymptotic normal distribution with a mean and standard deviation that they derived. Subtracting the mean and dividing by the standard deviation yields a test statistic that has approximately a standard normal distribution under the null hypothesis. McCullagh (1985) derived a different mean and standard deviation after conditioning on the vector of estimated regression coefficients. In practice, these two versions of the standardized Pearson are nearly identical, especially in larger samples. Farrington (1996) also proposed a modified X² test, but his test does not work when there is only one case per profile. For the remainder of this paper, I shall refer to the standardized Pearson test as simply the Pearson test.
2. Unweighted Sum of Squares. Copas (1989) proposed the test statistic

$$USS = \sum_{i=1}^{n} (y_i - \hat{\pi}_i)^2$$

This statistic also has an asymptotic normal distribution under the null hypothesis, and Hosmer et al. (1997) showed how to get its mean and standard deviation. As with the Pearson test, subtracting the mean and dividing by the standard deviation yields a standard normal test statistic.

3. Information Matrix Test. White (1982) proposed a general approach to testing for model misspecification by comparing two different estimates of the covariance matrix of the parameter estimates (the negative inverse of the information matrix), one based on first derivatives of the log-likelihood function and the other based on second derivatives. If the fitted model is correct, the expected values of these two estimators should be the same. Orme (1988, 1990) showed how to apply this method to test models for binary data. The test statistic is

$$IM = \sum_{i=1}^{n} \sum_{j=0}^{p} (y_i - \hat{\pi}_i)(1 - 2\hat{\pi}_i)\, x_{ij}^2$$

where the $x_{ij}$'s are the p predictor variables in the model and $x_{i0} = 1$. After standardization with an appropriate variance, this statistic has approximately a chi-square distribution with p+1 degrees of freedom under the null hypothesis.

4. Stukel Test. Stukel (1988) proposed a generalization of the logistic regression model that has two additional parameters, thereby allowing either for asymmetry in the curve or for a different rate of approach to the (0,1) bounds. Special cases of the model include (approximately) the complementary log-log model and the probit model. The logistic model can be tested against this more general model by a simple procedure. Let $g_i$ be the linear predictor from the fitted model, that is, $g_i = x_i b$ where $x_i$ is the vector of covariate values for individual i and b is the vector of estimated coefficients. Then create two new variables:

$z_a = g^2$ if $g \ge 0$, otherwise $z_a = 0$
$z_b = g^2$ if $g < 0$, otherwise $z_b = 0$.

Add these two variables to the logistic regression model and test the null hypothesis that both of their coefficients are equal to 0.
Stukel proposed a score test, but there's no obvious reason to prefer that to a Wald test or a likelihood ratio test. Note that in many data sets, g is either never greater than 0 or never less than 0. In such cases, only one z variable will be needed.

IMPLEMENTING THE TESTS

As we'll see, Stukel's test is easily performed in SAS. The others are not quite so easy. Fortunately, Oliver Kuss has written a SAS macro that will calculate these and other tests. In fact, he presented a paper on that macro at SUGI in 2001. Currently, the macro can be downloaded online. Unfortunately, there is a major problem with this macro that I will explain later.

Let's apply these tests to the Mroz data used earlier. Recall that the HL test with ten groups yielded a p-value of .048, suggesting a need for interactions or non-linearities in the model. Here is the code for doing the Stukel test:

   proc logistic data=my.mroz;
      model inlf(desc) = kidslt6 age educ huswage city exper;
      output out=a xbeta=xb;      /* XB = linear predictor */
   data b;
      set a;
      za=xb**2*(xb>=0);           /* squared XB where XB >= 0, else 0 */
      zb=xb**2*(xb<0);            /* squared XB where XB < 0, else 0 */
      num=1;                      /* one trial per case, for GOFLOGIT */
   proc logistic data=b;
      model inlf(desc) = kidslt6 age educ huswage city exper za zb;
      test za=0, zb=0;
   run;

We first fit the model of interest using PROC LOGISTIC. The OUTPUT statement produces a new data set A that contains all the variables in the model plus the new variable XB, which is the linear predictor based on the fitted model. In the DATA step that follows, the two new variables needed for the Stukel test are created. In addition, NUM=1 creates a new variable that will be needed for the GOFLOGIT macro.

The second PROC LOGISTIC step estimates the extended model with the two new variables, and tests the null hypothesis that both ZA and ZB have coefficients of 0. This produced a Wald chi-square of .12 (2 df), yielding a p-value of .94. A likelihood ratio chi-square (the difference in the -2 log L for the two models) produced almost identical results. Clearly there is no evidence against the model.

To calculate the other GOF statistics, we call the GOFLOGIT macro with the following statement:

   %goflogit(data=b, y=inlf, xlist=kidslt6 age educ huswage city exper, trials=num)

The macro fits the logistic regression model with the dependent variable specified in Y= and the independent variables specified in XLIST=. TRIALS=NUM is necessary because the macro is designed to calculate GOF statistics for either grouped or ungrouped data. For ungrouped data, the number of trials must be set to 1, which is why I created the NUM=1 variable in the earlier DATA step. The output is shown in Table 5.

Results from the Goodness-of-Fit Tests
TEST   Value   p-value
Standard Pearson Test
Standard Deviance
Osius-Test
McCullagh-Test
Farrington-Test
IM-Test
RSS-Test
Table 5. Output from GOFLOGIT Macro

The first two tests are the classic Pearson and Deviance statistics, with p-values that can't be trusted with ungrouped data. Osius and McCullagh are two different versions of the standardized Pearson. As noted earlier, the Farrington test is not appropriate for ungrouped data: it's always equal to 0. What the macro labels as the RSS test is what I'm calling the USS test. The only test yielding a p-value less than .05 is the standard deviance but, as I said earlier, this test is useless for ungrouped data because it doesn't depend on the observed values of y.
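For readers who want to see what lies underneath some of these numbers, the raw (unstandardized) quantities are simple sums over cases. The Python sketch below is an illustration outside SAS with hypothetical function names; it computes the ungrouped Pearson X², the unweighted sum of squares, and the two Stukel variables, mirroring the `za`/`zb` assignments in the DATA step. It does not attempt the standardization (means and standard deviations) that turns X² and USS into test statistics, so none of these values is a p-value.

```python
def pearson_x2(y, p_hat):
    """Ungrouped Pearson X^2: sum of squared residuals over p(1 - p)."""
    return sum((yi - pi) ** 2 / (pi * (1.0 - pi)) for yi, pi in zip(y, p_hat))

def unweighted_ss(y, p_hat):
    """Unweighted sum of squares (labeled RSS by the macro): sum of squared residuals."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p_hat))

def stukel_variables(xbeta):
    """Return (za, zb): the squared linear predictor split by its sign."""
    za = [g * g if g >= 0 else 0.0 for g in xbeta]
    zb = [g * g if g < 0 else 0.0 for g in xbeta]
    return za, zb

# Hypothetical toy values
y = [1, 0]
p_hat = [0.8, 0.2]
print(pearson_x2(y, p_hat), unweighted_ss(y, p_hat))
print(stukel_variables([-2.0, 0.0, 3.0]))
```

The predicted probabilities are assumed to lie strictly between 0 and 1; otherwise the Pearson term divides by zero.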
Notice that the Osius and McCullagh tests are very close, which has been the case with every data set that I've looked at. As reported here, the IM test is a chi-square statistic with df=7 (the number of covariates in the model, plus one for the intercept). The RSS value is just the sum of the squared residuals. Calculation of its p-value requires subtracting its approximate mean and dividing by its approximate standard deviation, and referring the result to a standard normal distribution.

PROPERTIES OF THE TESTS

Simulation results show that all these tests have about the right "size". That is, if a correct model is fitted, the proportion of times that the model is rejected is about the same as the chosen alpha level, say, .05. So, in that sense, all the tests are properly testing the same null hypothesis. But then we must ask two related questions: what sorts of departures from the model are these tests sensitive to, and how much power do they have to detect various alternatives? We can learn a little from theory and a little from simulation results.

Theory. When the data are naturally grouped, the classic Pearson and deviance tests are truly omnibus tests. That is, they respond to any non-linearity, interaction or deviation from the specified link function. The newer tests appear to be more specific. For example, by its very design, the Stukel test should do well in detecting departures from the logit link function. Similarly, Osius and Rojek (1992) showed that the Pearson test can be derived as a score test for a parameter in a different generalization of the logit model. They describe this test as "a powerful test against particular alternatives concerning the link [function]".

For the IM test, Chesher (1984) demonstrated that it is equivalent to a score test for the alternative hypothesis that the regression coefficients vary across individuals, rather than being the same for everyone. Similarly, Copas (1989) showed that the USS test can be derived as a score test for the alternative hypothesis that the $\pi_i$'s are independent random draws from a distribution with constant variance and means determined by the x's. These results suggest that both the IM and USS tests should be particularly sensitive to unobserved heterogeneity.

Simulation. There have been three major simulation studies designed to assess the power of goodness-of-fit tests for logistic regression: Hosmer et al. (1997), Hosmer and Hjort (2002) and Kuss (2002). For convenience, I'll refer to them as H+, HH, and K. Of the four statistics under consideration here, H+ includes the Pearson, USS and Stukel. HH only considers the Pearson and USS. K includes Pearson, USS and IM. All three studies use only sample sizes of 100 and 500. Here's a summary of their results for various kinds of departure from the standard logistic model:

Quadratic vs. linear effect of a covariate. H+ report that Pearson and USS have moderate power for N=100 and very good power (above 90% under most conditions) for N=500. Power for Stukel is similar but somewhat lower. HH get similar results for Pearson and USS. K, on the other hand, found no power for Pearson, and moderate to good power for USS and IM, with USS noticeably better than IM for N=100.

Interaction vs. linear effects of two covariates. H+ found virtually no power for all tests under all conditions. But they set up the simulation incorrectly, in my judgment. HH reported power of about 40% for both Pearson and USS at N=100 for a very strong interaction. At N=500, the power was over 90%. For weaker interactions, power ranged between 5% and 70%. K did not examine interactions. None of the simulations examined the power of Stukel or IM for testing interactions.

Alternative link functions. H+ found that Pearson, USS and Stukel generally had very low power at N=100 and only small to moderate power at N=500. Stukel was the best of the three. HH report similar results for USS and Pearson. Comparing logistic with complementary log-log, K found no power for Pearson and moderate power for IM and USS, with IM somewhat better.

Missing covariate and overdispersion.
K found that none of Pearson, USS, or IM had any appreciable power to detect these kinds of departures.

Discussion. The most alarming thing about these simulation studies is the inconsistency between Kuss and the other two studies regarding the performance of the Pearson test. H+ and HH found that Pearson had reasonably good power to test several different kinds of specifications. Kuss, on the other hand, found that the Pearson test had virtually no power to test either a quadratic model or a misspecified link function.

Unfortunately, I believe that Kuss's simulations, which were based on his GOFLOGIT macro, have a major flaw. For the standardized Pearson statistics, he used a one-sided test rather than the two-sided test recommended by Osius and Rojek, and also by Hosmer and Lemeshow in the 2013 edition of their classic text, Applied Logistic Regression. When I replicated Kuss's simulations using a two-sided test, the results (shown below) were consistent with those of H+ and HH.

There is also a flaw in the H+ simulations. For their interaction models, the fitted models removed the interaction and the main effect of one of the two variables. This does not yield a valid test of the interaction.

NEW SIMULATIONS

To address problems with previous simulations, I ran new simulations that included all the GOF tests considered here. Whenever appropriate, I tried to replicate the basic structure of the simulations used in previous studies. As reported in those studies, all the tests rejected the null hypothesis at about the nominal level when the fitted model was correct. So I shall only report the estimated power of the tests to reject the null hypothesis when the fitted model is incorrect. For each condition, 500 samples were drawn.

Linear vs. quadratic. The correct model was

$$\text{logit}(\pi_i) = \beta_0 + \beta_1 x + \beta_2 x^2$$

Values of the coefficients were the same as those used by Hosmer and Hjort. Coefficients were varied to emphasize or deemphasize the quadratic component, in four configurations: very low, low, medium and high. The variable x was uniformly distributed between -3 and 3. The fitted model deleted x².
Sample sizes of 100 and 500 were examined. The linear model was rejected if the p-value for the GOF test fell below .05. In addition to the new GOF tests, I checked the power of the standard Wald chi-square test for $\beta_2$, the coefficient for x². Table 6 shows the proportion of times that the linear model was rejected, i.e., estimates of the power of each test.

When the quadratic effect is very low, none of the tests had any appreciable power. For the low quadratic effect, we see pretty good power at N=500, and some power at N=100. For the medium quadratic effect, N=500 gives near perfect power for all tests, but just moderate power at N=100. In this condition, the Stukel test seems noticeably weaker than the others. For the high quadratic effect, we see very good power at N=100, and 100 percent rejection at N=500. The Wald chi-square for the quadratic term generally does better than the GOF tests, especially at N=500.
Table 6. Power Estimates for Detecting a Quadratic Effect (columns: quadratic effect Very Low, Low, Medium, High, each by N; rows: Osius, McCullagh, USS, IM, Stukel, Wald $\chi^2$)

Linear vs. interaction. The correct model was $\operatorname{logit}(\pi) = \beta_0 + \beta_1 x + \beta_2 d + \beta_3 xd$. For the predictor variables, x was uniformly distributed between -3 and +3, d was dichotomous with values of -1 and +1, and the two variables were independent. Coefficients were chosen to represent varying levels of interaction. The fitted model deleted the product term xd. Sample sizes of 100 and 500 were examined. The linear model was rejected if the p-value for the GOF test fell below .05. Table 7 shows the proportion of times that the linear model was rejected.

Table 7. Power Estimates for Detecting an Interaction (columns: interaction Very Low, Low, Medium, High, Very High, each by N; rows: Osius, McCullagh, USS, IM, Stukel, Wald $\chi^2$)

In Table 7, we see that power to detect interaction with GOF tests is generally on the low side. Of the five new tests, Stukel clearly outperforms the others, especially at N=500. IM generally comes in second. But by comparison, the standard Wald chi-square test for the interaction is far superior to any of these tests. This illustrates the general principle that, while GOF tests may be useful in detecting unanticipated departures from the model, tests that target specific departures from the model are often much more powerful.

Incorrect Link Function. Most software packages for binary regression offer only three link functions: logit, probit and complementary log-log. So the practical issue is whether GOF tests can discriminate among these three. Logit and probit curves are both symmetrical so it is very hard to distinguish them. Instead, I'll focus on logit vs. complementary log-log (which is asymmetrical). The true model was linear in the complementary log-log: $\log(-\log(1-\pi)) = \beta_0 + \beta_1 x$, with x uniformly distributed between -3 and 3, $\beta_0 = 0$ and $\beta_1 = .81$. The fitted model was a standard logistic model. Results for the GOF tests are shown in Table 8. For N=100, none of the tests is any good.
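As a side note on the asymmetry mentioned above, the difference between the two links can be checked numerically. The following is a minimal Python sketch rather than part of the simulations; the function names are hypothetical, and only $\beta_0 = 0$ and $\beta_1 = .81$ are taken from the text.

```python
import math

BETA0, BETA1 = 0.0, 0.81  # true-model coefficients stated in the text


def inv_cloglog(eta):
    """Inverse of log(-log(1 - p)): p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def inv_logit(eta):
    """Inverse of log(p / (1 - p))."""
    return 1.0 / (1.0 + math.exp(-eta))


def asymmetry(inv_link, eta):
    """p(eta) + p(-eta) - 1, which is zero for a link symmetric about p = .5."""
    return inv_link(eta) + inv_link(-eta) - 1.0


if __name__ == "__main__":
    # Probabilities implied by each inverse link at the same linear predictor.
    for x in (-3, -1, 0, 1, 3):
        eta = BETA0 + BETA1 * x
        print(f"x={x:+d}  cloglog={inv_cloglog(eta):.3f}  logit={inv_logit(eta):.3f}")
    print("asymmetry at eta=1:",
          asymmetry(inv_cloglog, 1.0), asymmetry(inv_logit, 1.0))
```

The comparison uses the same linear predictor for both curves purely to show their shapes; a logistic model fitted to complementary log-log data would have its own estimated coefficients.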
For N=500, the standardized Pearson tests are awful, USS is marginal, and IM and Stukel are half decent. Things look a little different in the last two columns, however, where I increased the sample size to 1,000. When the coefficient stays the same, IM and Stukel are still the best, although the others are much improved. But when I reduced the coefficient of x by half, the Pearson statistics look better than IM and Stukel. Why the reversal? As
others have noted, the Pearson statistic may be particularly sensitive to cases where the predicted value is near 1 or 0 and the observed value is in the opposite direction. That is because each residual gets weighted by $1/\sqrt{\hat{\pi}(1-\hat{\pi})}$, which will be large when the predicted values are near 0 or 1. When $\beta_1 = .81$, many of the predicted values are near 0 or 1. But when $\beta_1 = .405$, a much smaller fraction of the predicted values are near 0 or 1. This suggests that the earlier simulations should also explore variation in the range of predicted values.

Table 8. Power Estimates for Detecting an Incorrect Link Function (columns: N=100, N=500 and N=1,000 with $\beta_1 = .81$, and N=1,000 with $\beta_1 = .405$; rows: Osius, McCullagh, USS, IM, Stukel)

CLOSING POINTS

All of the new GOF tests with ungrouped data are potentially useful in detecting misspecification.

For detecting interaction, the Stukel test was markedly better than the others. But it was somewhat weaker for detecting quadratic effects.

None of the tests was great at distinguishing a logistic model from a complementary log-log model.

The Pearson tests were much worse than the others when many predicted probabilities were close to 1 or 0, and better than the others when predicted probabilities were concentrated in the midrange. This suggests that more elaborate simulations are needed for a comparative evaluation of these statistics.

Tests for specific kinds of misspecification may be much more powerful than global GOF tests. This was particularly evident for interactions. For many applications a targeted approach may be the way to go.

I recommend using all these GOF tests. If your model passes all of them, you can feel relieved. If any one of them is significant, it is probably worth doing targeted tests.

As with any GOF tests, when the sample size is quite large, it may not be possible to find any reasonably parsimonious model with a p-value greater than .05.

If you use the GOFLOGIT macro, modify it to calculate two-sided p-values for the Osius and McCullagh versions of the standardized Pearson statistic.

REFERENCES

Chesher, A. (1984) Testing for neglected heterogeneity. Econometrica 52.
Cragg, J.G. and R.S. Uhler (1970) The demand for automobiles. The Canadian Journal of Economics 3.
Copas, J.B. (1989) Unweighted sum of squares test for proportions. Applied Statistics 38.
Cox, D.R. and E.J. Snell (1989) Analysis of Binary Data. Second Edition. Chapman & Hall.
Farrington, C.P. (1996) On assessing goodness of fit of generalized linear models to sparse data. Journal of the Royal Statistical Society, Series B 58.
Hosmer, D.W. and N.L. Hjort (2002) Goodness-of-fit processes for logistic regression: Simulation results. Statistics in Medicine 21.
Hosmer, D.W., T. Hosmer, S. Le Cessie and S. Lemeshow (1997) A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine 16.
Hosmer, D.W. and S. Lemeshow (1980) A goodness-of-fit test for the multiple logistic regression model. Communications in Statistics A10.
Hosmer, D.W. and S. Lemeshow (2013) Applied Logistic Regression, 3rd Edition. New York: Wiley.
Kvalseth, T.O. (1985) Cautionary note about R². The American Statistician 39.
Kuss, O. (2001) A SAS/IML macro for goodness-of-fit testing in logistic regression models with sparse data. Paper 265-26 presented at the SAS User's Group International 26.
Kuss, O. (2002) Global goodness-of-fit tests in logistic regression with sparse data. Statistics in Medicine 21.
Liu, Y., P.I. Nelson and S.S. Yang (2012) An omnibus lack of fit test in logistic regression with sparse data. Statistical Methods & Applications 21.
McFadden, D. (1974) Conditional logit analysis of qualitative choice behavior. In P. Zarembka (ed.), Frontiers in Econometrics. Academic Press.
Maddala, G.S. (1983) Limited Dependent and Qualitative Variables in Econometrics. Cambridge University Press.
McCullagh, P. (1985) On the asymptotic distribution of Pearson's statistics in linear exponential family models. International Statistical Review 53.
Menard, S. (2000) Coefficients of determination for multiple logistic regression analysis. The American Statistician 54.
Mittlbock, M. and M. Schemper (1996) Explained variation in logistic regression. Statistics in Medicine 15.
Mroz, T.A. (1987) The sensitivity of an empirical model of married women's hours of work to economic and statistical assumptions. Econometrica 55.
Orme, C. (1988) The calculation of the information matrix test for binary data models. The Manchester School 54(4).
Orme, C. (1990) The small-sample performance of the information-matrix test. Journal of Econometrics 46.
Osius, G. and D. Rojek (1992) Normal goodness-of-fit tests for multinomial models with large degrees-of-freedom. Journal of the American Statistical Association 87.
Nagelkerke, N.J.D. (1991) A note on a general definition of the coefficient of determination. Biometrika 78.
Pigeon, J.G. and J.F. Heyse (1999) An improved goodness of fit test for probability prediction models. Biometrical Journal 41.
Press, S.J. and S. Wilson (1978) Choosing between logistic regression and discriminant analysis. Journal of the American Statistical Association 73.
Pulkstenis, E. and T.J. Robinson (2002) Two goodness-of-fit tests for logistic regression models with continuous covariates. Statistics in Medicine 21.
Stukel, T.A. (1988) Generalized logistic models. Journal of the American Statistical Association 83.
Tjur, T. (2009) Coefficients of determination in logistic regression models. A new proposal: The coefficient of discrimination. The American Statistician 63.
Tsiatis, A.A. (1980) A note on a goodness-of-fit test for the logistic regression model. Biometrika 67.
White, H. (1982) Maximum likelihood estimation of misspecified models. Econometrica 50:1-25.
Xie, X.J., J. Pendergast and W. Clarke (2008) Increasing the power: A practical approach to goodness-of-fit test for logistic regression models with continuous predictors. Computational Statistics & Data Analysis 52.
CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Name: Paul D. Allison
Organization: University of Pennsylvania and Statistical Horizons LLC
Address: 3718 Locust Walk
City, State ZIP: Philadelphia, PA

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
More informationSupport Vector Machines
Support Vector Machnes Max Wellng Department of Computer Scence Unversty of Toronto 10 Kng s College Road Toronto, M5S 3G5 Canada wellng@cs.toronto.edu Abstract Ths s a note to explan support vector machnes.
More informationMARKET SHARE CONSTRAINTS AND THE LOSS FUNCTION IN CHOICE BASED CONJOINT ANALYSIS
MARKET SHARE CONSTRAINTS AND THE LOSS FUNCTION IN CHOICE BASED CONJOINT ANALYSIS Tmothy J. Glbrde Assstant Professor of Marketng 315 Mendoza College of Busness Unversty of Notre Dame Notre Dame, IN 46556
More informationCredit Limit Optimization (CLO) for Credit Cards
Credt Lmt Optmzaton (CLO) for Credt Cards Vay S. Desa CSCC IX, Ednburgh September 8, 2005 Copyrght 2003, SAS Insttute Inc. All rghts reserved. SAS Propretary Agenda Background Tradtonal approaches to credt
More informationTESTING FOR EVIDENCE OF ADVERSE SELECTION IN DEVELOPING AUTOMOBILE INSURANCE MARKET. Oksana Lyashuk
TESTING FOR EVIDENCE OF ADVERSE SELECTION IN DEVELOPING AUTOMOBILE INSURANCE MARKET by Oksana Lyashuk A thess submtted n partal fulfllment of the requrements for the degree of Master of Arts n Economcs
More informationFinancial Instability and Life Insurance Demand + Mahito Okura *
Fnancal Instablty and Lfe Insurance Demand + Mahto Okura * Norhro Kasuga ** Abstract Ths paper estmates prvate lfe nsurance and Kampo demand functons usng household-level data provded by the Postal Servces
More informationGeneral Iteration Algorithm for Classification Ratemaking
General Iteraton Algorthm for Classfcaton Ratemakng by Luyang Fu and Cheng-sheng eter Wu ABSTRACT In ths study, we propose a flexble and comprehensve teraton algorthm called general teraton algorthm (GIA)
More informationLecture 3: Annuity. Study annuities whose payments form a geometric progression or a arithmetic progression.
Lecture 3: Annuty Goals: Learn contnuous annuty and perpetuty. Study annutes whose payments form a geometrc progresson or a arthmetc progresson. Dscuss yeld rates. Introduce Amortzaton Suggested Textbook
More information1 Example 1: Axis-aligned rectangles
COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 6 Scrbe: Aaron Schld February 21, 2013 Last class, we dscussed an analogue for Occam s Razor for nfnte hypothess spaces that, n conjuncton
More informationOn the Optimal Control of a Cascade of Hydro-Electric Power Stations
On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;
More informationTransition Matrix Models of Consumer Credit Ratings
Transton Matrx Models of Consumer Credt Ratngs Abstract Although the corporate credt rsk lterature has many studes modellng the change n the credt rsk of corporate bonds over tme, there s far less analyss
More informationHigh Correlation between Net Promoter Score and the Development of Consumers' Willingness to Pay (Empirical Evidence from European Mobile Markets)
Hgh Correlaton between et Promoter Score and the Development of Consumers' Wllngness to Pay (Emprcal Evdence from European Moble Marets Ths paper shows that the correlaton between the et Promoter Score
More informationMethod for assessment of companies' credit rating (AJPES S.BON model) Short description of the methodology
Method for assessment of companes' credt ratng (AJPES S.BON model) Short descrpton of the methodology Ljubljana, May 2011 ABSTRACT Assessng Slovenan companes' credt ratng scores usng the AJPES S.BON model
More information