Numerical Issues in Statistical Computing for the Social Scientist
Micah Altman, Jeff Gill, and Michael McDonald

A Wiley-Interscience Publication
John Wiley & Sons, Inc.
New York / Chichester / Weinheim / Brisbane / Singapore / Toronto
9 Convergence Problems in Logistic Regression

Paul Allison

9.1 INTRODUCTION

Anyone who has much practical experience with logistic regression will have occasionally encountered problems with convergence. Such problems are usually both puzzling and exasperating. Most researchers haven't a clue as to why certain models and certain data sets lead to convergence difficulties, and for those who do understand the causes of the problem, it is usually unclear whether and how the problem can be fixed. In this chapter, I explain why numerical algorithms for maximum likelihood estimation of the logistic regression model sometimes fail to converge, and I consider a number of possible solutions. Along the way, I look at the performance of several popular computing packages when they encounter convergence problems of varying kinds.

9.2 OVERVIEW OF LOGISTIC MAXIMUM LIKELIHOOD ESTIMATION

I begin with a review of the logistic regression model and maximum likelihood estimation of its parameters. For a sample of n cases (i = 1, ..., n), there are data on a dummy dependent variable y_i (with values of 1 and 0) and a vector of explanatory variables x_i (including a 1 for the intercept term).
The logistic regression model states that

$$p_i \equiv \Pr(y_i = 1 \mid \mathbf{x}_i) = \frac{1}{1 + e^{-\beta'\mathbf{x}_i}}, \qquad (9.1)$$

where $\beta$ is a vector of coefficients. Equivalently, we may write the model in logit form:

$$\log\!\left(\frac{p_i}{1 - p_i}\right) = \beta'\mathbf{x}_i. \qquad (9.2)$$

Assuming that the n cases are independent, the log-likelihood function for this model is

$$\ell(\beta) = \sum_{i=1}^{n}\left[\,y_i\,\beta'\mathbf{x}_i - \log\!\left(1 + e^{\beta'\mathbf{x}_i}\right)\right]. \qquad (9.3)$$

The goal of maximum likelihood estimation is to find a set of values for $\beta$ that maximizes this function. One well-known approach to maximizing a function like this is to differentiate it with respect to $\beta$, set the derivative equal to 0, and then solve the resulting set of equations. The first derivative of the log-likelihood is

$$\frac{\partial \ell}{\partial \beta} = \sum_{i=1}^{n}\,(y_i - \hat p_i)\,\mathbf{x}_i, \qquad (9.4)$$

where $\hat p_i$ is the predicted value of $y_i$:

$$\hat p_i = \frac{1}{1 + e^{-\beta'\mathbf{x}_i}}. \qquad (9.5)$$

The next step is to set the derivative equal to 0 and solve for $\beta$:

$$\sum_{i=1}^{n}\,(y_i - \hat p_i)\,\mathbf{x}_i = \mathbf{0}. \qquad (9.6)$$

Because $\beta$ is a vector, (9.6) is actually a set of equations, one for each of the parameters to be estimated. These equations are identical to the normal equations for least-squares linear regression, except that the predicted value in (9.5) is a non-linear function of the x's rather than a linear function.

For some models and data (e.g., saturated models), the equations in (9.6) can be solved explicitly for the ML estimator $\hat\beta$. For example, suppose there is a single dichotomous explanatory variable, so that the data can be arrayed in a 2 x 2 table with observed cell frequencies $n_{11}$, $n_{12}$, $n_{21}$, and $n_{22}$. Then the ML estimator of the coefficient of x is given by the logarithm of the cross-product ratio:

$$\hat\beta = \log\!\left(\frac{n_{11}\,n_{22}}{n_{12}\,n_{21}}\right). \qquad (9.7)$$
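As a quick concreteness check on (9.7), the closed-form estimate is just the log of the cross-product ratio of the table's cell counts. The short Python sketch below (with made-up cell counts, since the chapter supplies none at this point) computes it and shows why a zero cell is fatal:

    import math

    # Hypothetical 2 x 2 cell counts (illustrative only): rows are x = 0, 1
    # and columns are y = 0, 1.
    n11, n12 = 30, 10   # x = 0: 30 cases with y = 0, 10 cases with y = 1
    n21, n22 = 15, 25   # x = 1: 15 cases with y = 0, 25 cases with y = 1

    # Equation (9.7): the ML slope estimate is the log cross-product ratio.
    beta_hat = math.log((n11 * n22) / (n12 * n21))
    print(f"ML estimate of the slope: {beta_hat:.4f}")   # log(5) = 1.6094

    # If any cell count were 0, the ratio would be 0 or undefined, and
    # math.log would fail: the numerical face of a non-existent ML estimate.

The same value is what an iterative fit of the logistic regression of y on the dummy x would converge to for these counts.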
For most data and models, however, the equations in (9.6) have no explicit solution. In such cases, the equations must be solved by numerical methods, of which there are many. The most popular numerical method is the Newton-Raphson algorithm. Let $\mathbf{U}(\beta)$ be the vector of first derivatives of the log-likelihood with respect to $\beta$, and let $\mathbf{H}(\beta)$ be the matrix of second derivatives. That is,

$$\mathbf{U}(\beta) = \frac{\partial \ell}{\partial \beta} = \sum_{i=1}^{n}\,(y_i - \hat p_i)\,\mathbf{x}_i, \qquad \mathbf{H}(\beta) = \frac{\partial^2 \ell}{\partial \beta\,\partial \beta'} = -\sum_{i=1}^{n}\,\hat p_i\,(1 - \hat p_i)\,\mathbf{x}_i\,\mathbf{x}_i'. \qquad (9.8)$$

The vector of first derivatives $\mathbf{U}(\beta)$ is sometimes called the gradient or score, while the matrix of second derivatives $\mathbf{H}(\beta)$ is called the Hessian. The Newton-Raphson algorithm is then

$$\beta^{(j+1)} = \beta^{(j)} - \mathbf{H}^{-1}\big(\beta^{(j)}\big)\,\mathbf{U}\big(\beta^{(j)}\big), \qquad (9.9)$$

where $\mathbf{H}^{-1}$ is the inverse of $\mathbf{H}$. Chapter 4 of this volume showed what can go wrong with this process, as well as some remedies.

To operationalize this algorithm, a set of starting values $\beta^{(0)}$ is required. The choice of starting values is not critical; usually, setting $\beta^{(0)} = 0$ works fine. The starting values are substituted into the right-hand side of (9.9), which yields the result of the first iteration, $\beta^{(1)}$. These values are then substituted back into the right-hand side, the first and second derivatives are recomputed, and the result is $\beta^{(2)}$. The process is repeated until the maximum change in each parameter estimate from one iteration to the next is less than some criterion, at which point we say that the algorithm has converged. Once we have the results of the final iteration, $\hat\beta$, a byproduct of the Newton-Raphson algorithm is an estimate of the covariance matrix of the coefficients, which is just $-\mathbf{H}^{-1}(\hat\beta)$. Estimates of the standard errors of the coefficients are obtained by taking the square roots of the main diagonal elements of this matrix.
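A minimal implementation of (9.8) and (9.9) makes the moving parts explicit. This is a sketch in Python with NumPy, not any package's actual routine; the function name, iteration limit, and tolerance are my own choices:

    import numpy as np

    def newton_raphson_logit(X, y, max_iter=25, tol=1e-8):
        """Maximize the logistic log-likelihood by Newton-Raphson.

        X is an (n, k) design matrix whose first column is all 1s, and y
        holds the 0/1 outcomes. Returns the estimates, their covariance
        matrix, and the number of iterations used.
        """
        beta = np.zeros(X.shape[1])              # starting values: beta = 0
        for iteration in range(1, max_iter + 1):
            p = 1.0 / (1.0 + np.exp(-X @ beta))  # predicted values, eq. (9.5)
            U = X.T @ (y - p)                    # gradient (score), eq. (9.8)
            H = -(X.T * (p * (1.0 - p))) @ X     # Hessian, eq. (9.8)
            step = np.linalg.solve(H, U)
            beta = beta - step                   # update, eq. (9.9)
            if np.abs(step).max() < tol:         # converge on parameter change
                break
        cov = np.linalg.inv(-H)                  # covariance estimate
        return beta, cov, iteration

    # Standard errors are the square roots of the diagonal of cov:
    #   beta_hat, cov, iters = newton_raphson_logit(X, y)
    #   se = np.sqrt(np.diag(cov))

Note that the convergence test here is on the change in the parameter estimates rather than in the log-likelihood, a choice whose importance will become clear below.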
9.3 WHAT CAN GO WRONG?

A problem that often occurs in trying to maximize a function is that the function may have local maxima, that is, points that are larger than any nearby point but not as large as some more distant point. In those cases, setting the first derivative equal to zero yields equations that have more than one solution, and if the starting values for the Newton-Raphson algorithm are close to a local maximum, the algorithm will most likely iterate to that point rather than to the global maximum. Fortunately, problems with multiple maxima cannot occur with logistic regression, because the log-likelihood is globally concave, which means that the function can have at most one maximum.

Unfortunately, there are many situations in which the likelihood function has no maximum, in which case we say that the maximum likelihood estimate does not exist. Consider the set of data on 10 observations in Table 9.1. For these data, it can be shown that the ML estimate of the intercept is 0. Figure 9.1 graphs the log-likelihood as a function of the slope beta. It is apparent that, although the log-likelihood is bounded above by 0, it does not reach a maximum as beta increases: we can make the log-likelihood as close to 0 as we choose by making beta sufficiently large. Hence, there is no maximum likelihood estimate.

Table 9.1: Data exhibiting complete separation. [Data values not preserved in this transcription.]

Fig. 9.1: Log-likelihood as a function of the slope.

This is an example of a problem known as complete separation (Albert and Anderson 1984), which occurs whenever there exists some vector of coefficients b such that b'x_i > 0 whenever y_i = 1 and b'x_i < 0 whenever y_i = 0. In other words, complete separation occurs whenever a linear function of x can generate perfect predictions of y. For our hypothetical data set, a simple linear function with this property is a cutoff on x itself: whenever x is greater than the cutoff, y = 1, and whenever x is less than the cutoff, y = 0.
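The failure is easy to see numerically. Because Table 9.1's actual values did not survive transcription, the sketch below uses hypothetical data of the same character (y = 0 for every negative x, y = 1 for every positive x) and evaluates the log-likelihood (9.3), in a numerically stable form, at ever larger slopes:

    import numpy as np

    # Hypothetical completely separated data (not the values of Table 9.1):
    # y = 0 whenever x < 0 and y = 1 whenever x > 0.
    x = np.array([-5., -4., -3., -2., -1., 1., 2., 3., 4., 5.])
    y = np.array([0., 0., 0., 0., 0., 1., 1., 1., 1., 1.])

    def loglik(slope, intercept=0.0):
        """Log-likelihood (9.3), computed stably via logaddexp."""
        eta = intercept + slope * x
        return np.sum(y * eta - np.logaddexp(0.0, eta))

    # The log-likelihood creeps toward its upper bound of 0 as the slope
    # grows; it never attains a maximum, so no ML estimate exists.
    for b in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"slope = {b:4.1f}   log-likelihood = {loglik(b):.6f}")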
A related problem is known as quasi-complete separation. This occurs when the data are not completely separated, but there exists some coefficient vector b such that b'x_i >= 0 whenever y_i = 1 and b'x_i <= 0 whenever y_i = 0, with equality holding for at least one case in each category of the dependent variable. Table 9.2 displays a data set that satisfies this condition.

Table 9.2: Data exhibiting quasi-complete separation. [Data values not preserved in this transcription.]

The difference between this data set and the previous one is that there are two more observations, each with an x value of 0 but with different values of y. The log-likelihood function for these data, shown in Figure 9.2, is similar in shape to that in Figure 9.1. However, the asymptote for the curve is not 0, but a number that is approximately -1.39. In general, the log-likelihood function under quasi-complete separation will not approach 0, but some number lower than that. In any case, the curve has no maximum, so, again, the maximum likelihood estimate does not exist.

Fig. 9.2: Log-likelihood as a function of the slope, quasi-complete separation.

Of the two conditions, complete and quasi-complete separation, the latter is far more common. It most often occurs when an explanatory variable x is a dummy variable and, for one value of x, either every case has the event y = 1 or every case has the event y = 0. Consider a 2 x 2 table of x by y in which the x = 1 row contains a zero cell, say, every case with x = 1 has y = 1, while the x = 0 row has cases with both values of y. If we form the linear combination c = 0 + (1)x, we have c >= 0 whenever y = 1 and c <= 0 whenever y = 0; further, for all the cases in the second row (x = 0), c = 0 for both values of y. So the conditions of quasi-complete separation are satisfied.

To get some intuitive sense of why this leads to non-existence of the maximum likelihood estimator, consider equation (9.7), which gives the maximum likelihood estimator of the slope coefficient for a 2 x 2 table. For our quasi-complete table, this would be undefined because there is a zero in the denominator. The same problem would occur if there were a zero in the numerator, because the logarithm of zero is also undefined.
If the table is altered so that the zeros fall in diagonally opposite cells, say, every case with x = 1 has y = 1 and every case with x = 0 has y = 0, then there is complete separation, with zeros in both the numerator and the denominator. So the general principle is evident: whenever there is a zero in any cell of a 2 x 2 table, the maximum likelihood estimate of the logistic slope coefficient does not exist.

This principle also extends to multiple independent variables: for any dichotomous independent variable in a logistic regression, if there is a zero in the 2 x 2 table formed by that variable and the dependent variable, the ML estimate for the regression coefficient will not exist. This is by far the most common cause of convergence failure in logistic regression. Obviously, it is more likely to occur when the sample size is small. Even in large samples, it will frequently occur when there are extreme splits on the frequency distribution of either the dependent or independent variables.
Consider, for example, a logistic regression predicting the occurrence of a rare disease, and suppose that the explanatory variables include a set of seven dummy variables representing different age levels. It would not be terribly surprising if no one had the disease at one of the age levels, but this would produce quasi-complete separation.

9.4 BEHAVIOR OF THE NEWTON-RAPHSON ALGORITHM UNDER SEPARATION

We just saw that when there are explicit formulas for the maximum likelihood estimate and there is either complete or quasi-complete separation, the occurrence of zeros in the formulas prevents computation. What happens when the Newton-Raphson algorithm is applied to data exhibiting either kind of separation? That depends on the particular implementation of the algorithm. The classic behavior is this: at each iteration, the parameter estimate for the variable (or variables) with separation gets larger in magnitude. Iterations continue until the fixed iteration limit is exceeded. At whatever limit is reached, the parameter estimate is large and the estimated standard error is extremely large. If separation is complete, the log-likelihood will be reported as zero.

What actually happens depends greatly on the software implementation of the algorithm. One issue is the criterion used for determining convergence. Some older or cruder logistic regression programs determine convergence by examining the change in the log-likelihood from one iteration to the next: if that change is very small, convergence is declared and the iterations cease. Unfortunately, as seen in Figures 9.1 and 9.2, with complete or quasi-complete separation the log-likelihood may change imperceptibly from one iteration to the next even though the parameter estimate is rapidly increasing in magnitude. So, to avoid the false appearance of convergence, it is essential that convergence be evaluated by looking at changes in the parameter estimates across iterations, rather than changes in the log-likelihood.

At the other extreme, some logistic regression programs have algorithms that attempt to detect complete or quasi-complete separation and then issue appropriate warnings to the user. Albert and Anderson (1984) proposed one empirical method that has been implemented in PROC LOGISTIC in SAS software (SAS Institute 1999). It has the following steps (a rough coded rendering follows the list):

1. If the convergence criterion is satisfied within eight iterations, conclude that there is no problem.

2. For all iterations after the eighth, compute the predicted probability of the observed response for each observation, which is given by

$$\hat\pi_i = y_i\,\hat p_i + (1 - y_i)(1 - \hat p_i).$$

If this predicted probability is one for all observations, conclude that there is complete separation and stop the iterations.

3. If the probability of the observed response is large for some observations (but not all), examine the estimated standard errors for that iteration. If they exceed some criterion, conclude that there is quasi-complete separation and stop the iterations.
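A rough coded rendering of checks 2 and 3, with the caveat that SAS's actual cutoffs are not reproduced here (both thresholds below are assumptions of mine):

    import numpy as np

    def separation_check(y, p_hat, se, big=0.95, se_limit=100.0):
        """Flag separation from one iteration's fitted values.

        y: 0/1 outcomes; p_hat: current predicted probabilities; se:
        current estimated standard errors. The cutoffs big and se_limit
        are assumed values, not those of any particular package.
        """
        # Predicted probability of the observed response for each case.
        pi = y * p_hat + (1.0 - y) * (1.0 - p_hat)
        if np.all(pi >= 1.0 - 1e-8):
            return "complete separation: stop iterating"
        if np.any(pi >= big) and np.any(se > se_limit):
            return "quasi-complete separation suspected: stop iterating"
        return "no separation detected"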
The check for complete separation is very reliable, but the check for quasi-complete separation is less so. For more reliable checks of quasi-complete separation, methods based on linear programming algorithms have been proposed by Albert and Anderson (1984) and Santner and Duffy (1986).

To determine how available software handles complete and quasi-complete separation, I tried estimating logistic regression models for the data sets in Tables 9.1 and 9.2 using several popular statistical packages. For some packages (SAS, Stata), more than one command or procedure was evaluated. Keep in mind that these tests were run in September 2002, using the software versions that were available to me at the time. Results are summarized in Table 9.3. Several criteria were used in the evaluation.

Warning Messages. Ideally, the program should detect the separation and issue a clear warning message to the user. The program that came closest to this ideal was the SAS procedure LOGISTIC. For complete separation, it printed the message:

    Complete separation of data points detected.
    WARNING: The maximum likelihood estimate does not exist.
    WARNING: The LOGISTIC procedure continues in spite of the above
    warning. Results shown are based on the last maximum likelihood
    iteration. Validity of the model fit is questionable.

For quasi-complete separation, the message was:

    Quasicomplete separation of data points detected.
    WARNING: The maximum likelihood estimate may not exist.
    WARNING: The LOGISTIC procedure continues in spite of the above
    warning. Results shown are based on the last maximum likelihood
    iteration. Validity of the model fit is questionable.

SPSS did a good job for complete separation, presenting the warning:

    Estimation terminated at iteration number 21 because a perfect
    fit is detected. This solution is not unique.
    Warning: Covariance matrix cannot be computed. Remaining
    statistics will be omitted.

But SPSS gave no warning message for quasi-complete separation. For complete separation, the Stata LOGIT command gave the message:

    outcome = x > -1 predicts data perfectly

For quasi-complete separation, the message was:

    outcome = x > 0 predicts data perfectly except for x == 0 subsample:
    x dropped and 10 obs not used
For complete separation, Systat produced the message:

    Failure to fit model or perfect fit

but said nothing about the quasi-complete case. Several other programs produced warning messages that were ambiguous or cryptic. SAS CATMOD marked certain coefficient estimates with a # and said that they were "regarded to be infinite." JMP identified some coefficient estimates as "unstable." R said that some of the fitted probabilities were numerically 0 or 1. For quasi-complete separation, the GENMOD procedure in SAS reported that the negative of the Hessian matrix was not positive definite. And finally, several programs gave no warning messages whatsoever (GLIM, LIMDEP, Stata MLOGIT, Minitab).

9.5 FALSE CONVERGENCE

Strictly speaking, the Newton-Raphson algorithm should not converge under either complete or quasi-complete separation. Nevertheless, the only program that exhibited this classic behavior was Minitab: no matter how high I raised the maximum number of iterations, the program would not converge. With the exception of SAS LOGISTIC and Stata LOGIT (which stopped the iterations once separation had been detected), the remaining programs all reported that the convergence criterion had been met. In some cases (GLIM, R, SPSS, and Systat), it was necessary to increase the maximum iterations beyond the default in order to achieve this apparent convergence. Lacking information on the convergence criterion for most of these programs, I don't have an explanation for the false convergence. One possibility is that the convergence criterion is based on the log-likelihood rather than the parameter estimates, but in the case of SAS GENMOD the SAS documentation is quite explicit that the criterion is based on parameter estimates. In any case, the combination of apparent convergence and the lack of clear warning messages in many programs means that some users are likely to be misled about the viability of their parameter estimates.
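The mechanism behind this false convergence is easy to reproduce. The sketch below reuses the hypothetical separated data from earlier (not the chapter's tables), runs Newton-Raphson, and prints the two candidate criteria side by side; the log-likelihood change quickly falls below any reasonable tolerance while the slope estimate is still marching toward infinity:

    import numpy as np

    x = np.array([-5., -4., -3., -2., -1., 1., 2., 3., 4., 5.])
    y = np.array([0., 0., 0., 0., 0., 1., 1., 1., 1., 1.])
    X = np.column_stack([np.ones_like(x), x])    # intercept plus slope

    def loglik(beta):
        eta = X @ beta
        return np.sum(y * eta - np.logaddexp(0.0, eta))

    beta = np.zeros(2)
    prev = loglik(beta)
    for it in range(1, 13):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        step = np.linalg.solve(-(X.T * (p * (1.0 - p))) @ X, X.T @ (y - p))
        beta = beta - step                       # update, eq. (9.9)
        ll = loglik(beta)
        # A log-likelihood criterion soon looks converged; a parameter-
        # change criterion correctly never does.
        print(f"iter {it:2d}  slope = {beta[1]:7.3f}  "
              f"|change in loglik| = {abs(ll - prev):.1e}  "
              f"max |change in beta| = {np.abs(step).max():.1e}")
        prev = ll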
9.6 REPORTING OF PARAMETER ESTIMATES AND STANDARD ERRORS

Some programs that do a good job of detecting and warning about complete separation then fail to report any parameter estimates or standard errors (SPSS, Stata LOGIT). This might seem sensible, since non-convergent estimates are essentially worthless as parameter estimates. However, they may still serve a useful diagnostic purpose in determining which variables have complete or quasi-complete separation.

Table 9.3: Performance of packages under complete and quasi-complete separation

                      Warning        False          Report         LR
                      Messages       Convergence    Estimates      Statistics
                      Comp   Quasi   Comp   Quasi   Comp   Quasi   Comp   Quasi
    GLIM               -      -       *      *       *      *       -      -
    JMP                A      A       *      *       *      *       *      *
    LIMDEP             -      -       *      *       *      *       -      -
    Minitab            -      -       -      -       *      *       -      -
    R                  A      A       *      *       *      *       *      *
    SAS GENMOD         -      A       *      *       *      -       *      *
    SAS LOGISTIC       C      C       -      -       *      *       -      -
    SAS CATMOD         A      A       *      *       *      *       -      -
    SPSS               C      -       -      *       -      *       -      -
    Stata LOGIT        C      C       -      -       -      *       -      -
    Stata MLOGIT       -      -       *      *       *      *       -      -
    Systat             C      -       -      *       *      *       -      -

    Note: C = clear warning, A = ambiguous warning, * = yes, - = no.

Likelihood Ratio Statistics. Some programs (SAS GENMOD, JMP) can report optional likelihood-ratio chi-square tests for each of the coefficients in the model. Unlike Wald chi-squares, which are essentially useless under complete or quasi-complete separation, the likelihood ratio test is still a valid test of the null hypothesis that a coefficient is equal to 0. Thus, even if a certain parameter can't be estimated, we can still judge whether or not it is significantly different from 0.

Diagnosis of Separation Problems

We are now in a position to make some recommendations about how the statistical analyst should detect problems of complete or quasi-complete separation. If you're using software that gives clear diagnostic messages (SAS LOGISTIC, Stata LOGIT), then half the battle is won. But there is still a need to determine which variables are causing the problem, and to get a better sense of the nature of the problem.

The second step (or the first step with programs that do not give good warning messages) is to carefully examine the estimated coefficients and their standard errors. Variables with non-existent coefficients will invariably have large parameter estimates, typically greater than 5.0, and huge standard errors, producing Wald chi-square statistics that are near 0. If any of these variables is a dummy (indicator) variable, the next step is to compute the 2 x 2 table for each dummy variable with the dependent variable. A frequency of zero in any single cell of the table means quasi-complete separation. Less commonly, if there are two zeros (diagonally opposed) in the table, the condition is complete separation. A small scanning routine along these lines appears below.
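Here is one way to automate that screen, as a sketch (function and argument names are mine, not any package's):

    import numpy as np

    def find_separation_suspects(X, y, names):
        """Cross-tabulate each 0/1 column of X against y, reporting columns
        whose 2 x 2 table contains an empty cell. One empty cell signals
        quasi-complete separation; two diagonally opposed empty cells
        signal complete separation.
        """
        suspects = []
        for j, name in enumerate(names):
            d = X[:, j]
            # 2 x 2 table: rows are d = 0, 1; columns are y = 0, 1.
            t = np.array([[np.sum((d == a) & (y == b)) for b in (0, 1)]
                          for a in (0, 1)])
            if (t == 0).any():
                diagonal = ((t[0, 0] == 0 and t[1, 1] == 0) or
                            (t[0, 1] == 0 and t[1, 0] == 0))
                suspects.append((name, "complete" if diagonal
                                 else "quasi-complete"))
        return suspects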
Once you have determined which variables are causing separation problems, it's time to consider possible solutions. The potential solutions are somewhat different for complete and quasi-complete separation, so I will treat them separately. I begin with the more common problem of quasi-complete separation.

Solutions for Quasi-Complete Separation

Deletion of Problem Variables. In practice, the most widely used method for dealing with quasi-complete separation is simply to delete from the model any variables whose coefficients did not converge. I do not recommend this method. If a variable has quasi-complete separation with the dependent variable, it is natural to suspect that the variable has a strong (albeit non-infinite) effect on the dependent variable. Deleting variables with strong effects will certainly obscure the effects of those variables, and is also likely to bias the coefficients of the other variables in the model.

Combining Categories. As noted earlier, the most common cause of quasi-complete separation is a dummy predictor variable such that, for one level of the variable, either every observation has the event or no observation has the event. For those cases in which the problem variable is one of a set of variables representing a single categorical variable, the problem can often be easily solved by combining categories. For example, suppose that marital status has five categories: never married, currently married, divorced, separated, and widowed. This variable could be represented by four dummy variables, with currently married as the reference category. Suppose, further, that the sample contains 50 persons who are divorced but only 10 who are separated. If the dependent variable is 1 for employed and 0 for unemployed, it's quite plausible that all 10 of the separated persons would be employed, leading to quasi-complete separation. A natural and simple solution is to combine the divorced and separated categories, turning two dummy variables into a single dummy variable.

Similar problems often arise when a quantitative variable, like age, is chopped into a set of categories, with dummy variables for all but one of the categories. Although this can be a useful device for representing non-linear effects, it can easily lead to quasi-complete separation if the number of categories is large and the number of cases within some categories is small. The solution is to use a smaller number of categories, or perhaps to revert to the original quantitative representation of the variable. If the dummy variable represents an irreducible dichotomy, like sex, then this solution is clearly not feasible. However, there is another simple method that often provides a very satisfactory solution.

Do Nothing and Report Likelihood Ratio Chi-Squares. Just because maximum likelihood estimates don't exist for some coefficients because of quasi-complete separation, that doesn't mean they don't exist for the other variables in the logistic regression model. In fact, if one leaves the offending variables in the model, the coefficients, standard errors, and test statistics for the remaining variables are still valid maximum likelihood estimates. So one attractive strategy is just to leave the problem variables in the model. The coefficients for those variables could be reported as plus or minus infinity. The standard errors and Wald statistics for the problem variables will certainly be incorrect but, as noted above, likelihood ratio tests of the null hypothesis that the coefficient is zero are still valid. If these statistics are not available as options in the computer program, they can be easily obtained by fitting the model with and without each problem variable, then taking twice the positive difference in the log-likelihoods, as in the sketch below.
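The refitting recipe takes only a few lines. In this sketch, the two log-likelihoods are assumed to come from fits with and without the problem variable; under separation, the full-model value is the one reported at the final iteration:

    from scipy.stats import chi2

    def lr_test(loglik_full, loglik_reduced, df=1):
        """Likelihood ratio chi-square: twice the positive difference in
        the log-likelihoods of the full and reduced models, referred to a
        chi-square distribution with df degrees of freedom."""
        stat = 2.0 * (loglik_full - loglik_reduced)
        return stat, chi2.sf(stat, df)

    # Hypothetical example: lr_test(-45.2, -49.8) returns (9.2, ~0.0024).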
If the problem variable is a dummy variable, then the estimates you get for the non-problem variables have a special interpretation. They are the ML estimates for the subsample of cases that fall into the category of the dummy variable in which observations differ on the dependent variable. For example, suppose that the dependent variable is whether or not a person smokes cigars. A dummy variable for sex is included in the model, but none of the women smoke cigars, producing quasi-complete separation. If sex is left in the model, the coefficients for the remaining variables (e.g., age, income, education) represent the effects of those variables among men only. (This can easily be verified by actually running the model for men only.) The advantage of doing it in the full sample with sex as a covariate is that one also gets a test of the sex effect (using the likelihood ratio chi-square) while controlling for the other predictor variables.

Exact Inference. As previously mentioned, problems of separation are most likely to occur in small samples and/or when there is an extreme split on the dependent variable. Of course, even without separation problems, maximum likelihood estimates may not have good properties in small samples. One possible solution is to abandon maximum likelihood entirely and do exact logistic regression. This method was originally proposed by Cox (1970), but it was not computationally feasible until the advent of the LogXact program and, more recently, the introduction of exact methods into the LOGISTIC procedure in SAS.

Exact logistic regression is designed to produce exact p-values for the null hypothesis that each predictor variable has a coefficient of 0, conditional on all the other predictors. These p-values, based on permutations of the data rather than on large-sample chi-square approximations, are essentially unaffected by complete or quasi-complete separation. The coefficient estimates reported with this method are usually conditional maximum likelihood estimates, and these can break down when there is separation. In that event, both LogXact and PROC LOGISTIC report median unbiased estimates for the problem coefficients. If the true value is $\beta$, a median unbiased estimator $\tilde\beta$ has the property

$$\Pr(\tilde\beta \ge \beta) \ge \tfrac{1}{2} \quad \text{and} \quad \Pr(\tilde\beta \le \beta) \ge \tfrac{1}{2}. \qquad (9.10)$$

Hirji et al. (1989) demonstrated that the median unbiased estimator is generally more accurate than the maximum likelihood estimator for small sample sizes.
I used PROC LOGISTIC to do exact estimation for the data in Tables 9.1 and 9.2. For the completely separated data in Table 9.1, the p-value for the coefficient of x was [...], and the median unbiased estimate was [...]. For the quasi-completely separated data in Table 9.2, the p-value was [...], with a median unbiased estimate of [...].

Despite the attractiveness of exact logistic regression, it's essential to emphasize that it is computationally feasible only for quite small samples. For example, I recently tried to estimate a model for 150 cases with a 2-to-1 split on the dependent variable and five independent variables. The standard version of LogXact was unable to handle the problem. An experimental version of LogXact using a Markov chain Monte Carlo method took three months of computation to produce the p-value for just one of the five independent variables.

Bayesian Estimation. In those situations where none of the preceding solutions is appropriate, a natural approach is to do Bayesian estimation with a prior distribution on the regression coefficients (Hsu and Leonard 1997; Kahn and Raftery 1996). This should be easily accomplished with widely available software like BUGS, though I have yet to try that route. My very limited experience with this approach suggests that the results obtained are extremely sensitive to the choice of prior distribution.

Solutions for Complete Separation

It is fortunate that complete separation is less common than quasi-complete separation because, when it occurs, it is considerably more difficult to deal with. For example, it's not feasible to leave the problem variable in the model, because that makes it impossible to get maximum likelihood estimates for any of the other variables. And combining categories of dummy variables will not solve the problem either. Exact logistic regression may work for small-sample problems, but even then complete separation can lead to breakdowns in both conditional maximum likelihood and median unbiased estimation. Bayesian estimation may be a feasible solution, but it does require an informative prior distribution on the problem parameters, and the results may be sensitive to the choice of that distribution.

With conventional logistic regression, about the only practical approach to complete separation is to delete the problem variable from the model. That allows one to get estimates for the remaining variables but, as noted earlier, the exclusion of the problem variable could lead to biased estimates for the other variables. If one chooses to go this route, it is also essential to compute and report likelihood ratio chi-squares or exact p-values (where feasible) for the problem variable, so that the statistical significance of this variable can be assessed.
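For readers who want to experiment with the Bayesian route, the cheapest version is the posterior mode under independent normal priors, which already yields finite estimates under complete separation. The Python sketch below is my own illustration, not the hierarchical methods cited above, and the prior scale is an arbitrary choice:

    import numpy as np
    from scipy.optimize import minimize

    def map_logit(X, y, prior_sd=2.5):
        """Posterior mode with independent Normal(0, prior_sd^2) priors on
        all coefficients. The prior keeps the objective bounded, so a
        finite estimate exists even under complete separation; prior_sd
        is an assumed value, and results can be sensitive to it.
        """
        def neg_log_posterior(beta):
            eta = X @ beta
            loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
            return -loglik + np.sum(beta ** 2) / (2.0 * prior_sd ** 2)
        fit = minimize(neg_log_posterior, np.zeros(X.shape[1]), method="BFGS")
        return fit.x

A full Bayesian analysis would summarize the entire posterior rather than just its mode, which is what software like BUGS is designed to do.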
Extensions

In this chapter, I have focused entirely on problems of non-convergence with binary logistic regression. But it's important to stress that complete and quasi-complete separation also lead to non-existence of maximum likelihood estimates under other link functions for binary dependent variables, including the probit model and the complementary log-log model. For the most part, software treatment of data with separation is the same with these link functions as with the logit link. The possible solutions I described for the logistic model should also work for these alternative link functions, with one exception: the computation of exact p-values is available only for the logit link function.

Data separation can also occur for the unordered multinomial logit model; in fact, complete and quasi-complete separation were first defined in this more general setting (Albert and Anderson 1984). Separation problems can also occur for the ordered (cumulative) logit model although, to my knowledge, separation has not been rigorously defined for this model. Table 9.4 displays data with complete separation for a three-valued, ordered dependent variable.

Table 9.4: Ordered data exhibiting complete separation. [Data values not preserved in this transcription.]

These data could be modified to produce quasi-complete separation by adding a new observation with x = [...] and y = [...]. PROC LOGISTIC in SAS correctly identifies both of these conditions and issues the same warning messages we saw earlier.

REFERENCES

Albert, A., and J. A. Anderson. (1984). On the Existence of Maximum Likelihood Estimates in Logistic Regression Models. Biometrika 71.

Amemiya, Takeshi. (1985). Advanced Econometrics. Cambridge, MA: Harvard University Press.

Cox, D. R. (1970). The Continuity Correction. Biometrika 57.

Hirji, Karim F., Anastasios A. Tsiatis, and Cyrus R. Mehta. (1989). Median Unbiased Estimation for Binary Data. The American Statistician 43.

Hsu, John S. J., and Tom Leonard. (1997). Hierarchical Bayesian Semiparametric Procedures for Logistic Regression. Biometrika 84.

Kahn, Michael J., and Adrian E. Raftery. (1996). Discharge Rates of Medicare Stroke Patients to Skilled Nursing Facilities: Bayesian Logistic Regression with Unobserved Heterogeneity. Journal of the American Statistical Association 91.
Santner, Thomas J., and Diane E. Duffy. (1986). A Note on A. Albert and J. A. Anderson's Conditions for the Existence of Maximum Likelihood Estimates in Logistic Regression Models. Biometrika 73.