Logit, Probit and Tobit: Models for Categorical and Limited Dependent Variables
1 Logit, Probit and Tobit: Models for Categorical and Limited Dependent Variables By Rajulton Fernando Presented at PLCS/RDC Statistics and Data Series at Western March 23, 2011
2 Introduction In social science research, categorical data are often collected through surveys. Categorical variables, nominal and ordinal, take only a few values that do NOT have a metric. A) Binary Case Many dependent variables of interest take only two values (a dichotomous variable), denoting an event or non-event and coded as 1 and 0 respectively. Some examples: the labor force status of a person; the voting behavior of a person (in favor of a new policy); whether a person got married or divorced; whether a person was involved in criminal behaviour, etc.
3 Introduction With such variables, we can build models that describe the response probabilities, say $P(y_i = 1)$, of the dependent variable $y_i$. For a sample of N independently and identically distributed observations $i = 1, \ldots, N$ and a (K+1)-dimensional vector $x_i$ of explanatory variables, the probability that $y_i$ takes value 1 is modeled as
$$P(y_i = 1 \mid x_i) = F(x_i'\beta) = F(z_i)$$
where $\beta$ is a (K+1)-dimensional column vector of parameters. The transformation function F is crucial. It maps the linear combination into [0,1] and satisfies in general $F(-\infty) = 0$, $F(+\infty) = 1$, and $\partial F(z)/\partial z > 0$ [that is, it is a cumulative distribution function].
4 The Logit and Probit Models When the transformation function F is the logistic function, the response probabilities are given by
$$P(y_i = 1 \mid x_i) = \frac{e^{x_i'\beta}}{1 + e^{x_i'\beta}}$$
And, when the transformation function F is the cumulative distribution function (cdf) of the standard normal distribution, the response probabilities are given by
$$P(y_i = 1 \mid x_i) = \Phi(x_i'\beta) = \int_{-\infty}^{x_i'\beta} \phi(s)\,ds = \int_{-\infty}^{x_i'\beta} \frac{1}{\sqrt{2\pi}}\, e^{-s^2/2}\,ds$$
The logit and probit models are almost identical (see the Figure on the next slide) and the choice of the model is arbitrary, although the logit model has certain advantages (simplicity and ease of interpretation).
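The two response functions can be sketched in a few lines of Python. This is only an illustration using the standard library; the coefficient values and the single observation are made up and are not taken from the presentation.

```python
import math

def logit_prob(z):
    """Logistic cdf: P(y = 1 | x) when z = x'beta."""
    return 1.0 / (1.0 + math.exp(-z))

def probit_prob(z):
    """Standard normal cdf Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical coefficients (intercept, slope) and one observation x = (1, 2.0)
beta = (-1.0, 0.8)
x = (1.0, 2.0)
z = sum(b * v for b, v in zip(beta, x))  # linear index x'beta

p_logit = logit_prob(z)
p_probit = probit_prob(z)
print(round(p_logit, 4), round(p_probit, 4))
```

Both functions return 0.5 at z = 0 and differ elsewhere because the same index z is read on two different scales.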
5 [Figure: logit and probit response curves compared.] Source: J.S. Long, 1997
6 The Logit and Probit Models However, the parameters of the two models are scaled differently. The parameter estimates in a logistic regression tend to be 1.6 to 1.8 times larger than they are in a corresponding probit model. The probit and logit models are estimated by maximum likelihood (ML), assuming independence across observations. The ML estimator of β is consistent and asymptotically normally distributed. However, the estimation rests on the strong assumption that the latent error term follows the assumed distribution (normal for probit, logistic for logit) and is homoscedastic. If homoscedasticity is violated, there is no easy solution.
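The scaling difference can be checked numerically. The sketch below assumes a factor of 1.7, chosen from inside the 1.6 to 1.8 range quoted above, and compares the standard normal cdf with the logistic cdf evaluated at the rescaled index.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

SCALE = 1.7  # assumed value inside the 1.6-1.8 range quoted above

grid = [i / 100.0 for i in range(-400, 401)]  # z from -4 to 4

# Largest gap between Phi(z) and the logistic cdf at SCALE * z
max_gap = max(abs(normal_cdf(z) - logistic_cdf(SCALE * z)) for z in grid)
# Same comparison without rescaling, for contrast
unscaled_gap = max(abs(normal_cdf(z) - logistic_cdf(z)) for z in grid)

print(f"with scaling:    {max_gap:.4f}")
print(f"without scaling: {unscaled_gap:.4f}")
```

After rescaling, the two curves stay within about one percentage point of each other on this grid, which is why the two models usually give very similar fitted probabilities.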
7 The Logit and Probit Models Note: The response function (logistic or probit) is S-shaped, which implies that a fixed change in X has a smaller impact on the probability when the probability is near zero (or near one) than when it is near the middle. Thus, it is a non-linear response function. How to interpret the coefficients: In both models, if b > 0, p increases as X increases; if b < 0, p decreases as X increases. As mentioned above, b cannot be interpreted as a simple slope as in ordinary regression, because the rate at which the curve ascends or descends changes according to the value of X. In other words, it is not a constant change as in ordinary regression. The greatest rate of change is at p = 0.5.
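For the logit model the slope of the probability curve is dp/dX = b·p·(1 − p), which makes the non-constant rate of change easy to see. The sketch below uses a hypothetical intercept and slope.

```python
import math

def logit_prob(a, b, x):
    """P(y = 1) for a one-regressor logit with index a + b*x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def logit_slope(a, b, x):
    """dp/dx for the logit model: b * p * (1 - p)."""
    p = logit_prob(a, b, x)
    return b * p * (1.0 - p)

a, b = 0.0, 0.5  # hypothetical intercept and slope
for x in (-6.0, -2.0, 0.0, 2.0, 6.0):
    print(x, round(logit_prob(a, b, x), 3), round(logit_slope(a, b, x), 4))
```

The slope peaks where p = 0.5 (here at X = 0, where it equals b/4) and shrinks toward zero in both tails.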
8 The Logit and Probit Models In the logit model, we can interpret b as an effect on the odds. That is, every unit increase in X results in a multiplicative effect of $e^b$ on the odds. Example: If b = 0.25, then $e^{0.25} = 1.28$. Thus, when X changes by one unit, the odds increase by a factor of 1.28, that is, by 28%. - In the probit model, use the Z-score terminology. For every unit increase in X, the Z-score (or the probit of "success") increases by b units. [Or, we can also say that a unit increase in X changes Z by b standard deviation units.] - If you like, you can convert the z-score to the probability p using the normal table.
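A short calculation with b = 0.25 shows that the factor of about 1.28 applies to the odds, while the implied change in the probability itself depends on the starting point. The baseline probabilities below are arbitrary illustrations.

```python
import math

b = 0.25                  # logit coefficient from the example above
odds_ratio = math.exp(b)  # multiplicative effect on the odds

def prob_after_unit_increase(p):
    """New probability after the odds p/(1-p) are multiplied by exp(b)."""
    odds = p / (1.0 - p) * odds_ratio
    return odds / (1.0 + odds)

print(f"exp({b}) = {odds_ratio:.3f}")
for p0 in (0.10, 0.50, 0.90):
    p1 = prob_after_unit_increase(p0)
    print(f"p: {p0:.2f} -> {p1:.3f} (ratio {p1 / p0:.3f})")
```

The ratio of probabilities is always smaller than the odds ratio and approaches 1 as the baseline probability approaches 1.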
9 Models for Polytomous Data B) Polytomous Case Here we need to distinguish between purely nominal variables and really ordinal variables. When the variable is purely nominal, we can extend the dichotomous logit model, using one of the m categories as reference and modeling the other responses j = 1, 2, ..., m-1 compared to the reference. Example: In the case of 3 categories, using the 3rd category as the reference, logit $p_1 = \ln(p_1/p_3)$ and logit $p_2 = \ln(p_2/p_3)$, which will give two sets of parameter estimates.
$$P(y = 1) = \frac{\exp(\beta_1 x)}{1 + \exp(\beta_1 x) + \exp(\beta_2 x)}$$
$$P(y = 2) = \frac{\exp(\beta_2 x)}{1 + \exp(\beta_1 x) + \exp(\beta_2 x)}$$
$$P(y = 3) = \frac{1}{1 + \exp(\beta_1 x) + \exp(\beta_2 x)}$$
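The three-category formulas generalize directly to m categories with the last one as reference. The sketch below uses a single regressor and hypothetical (intercept, slope) pairs; none of the numbers come from the presentation.

```python
import math

def multinomial_probs(x, betas):
    """Probabilities for m categories; the last category is the reference.

    betas holds one (intercept, slope) pair per non-reference category.
    """
    scores = [math.exp(b0 + b1 * x) for b0, b1 in betas]
    denom = 1.0 + sum(scores)  # the 1 is exp(0) for the reference category
    return [s / denom for s in scores] + [1.0 / denom]

betas = [(0.2, 0.5), (-0.3, 1.0)]  # hypothetical parameters for categories 1 and 2
probs = multinomial_probs(1.5, betas)
print([round(p, 4) for p in probs], round(sum(probs), 4))
```

By construction the probabilities sum to one, and the log of each non-reference probability over the reference probability recovers the corresponding linear index.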
10 Polytomous Case When the variable is really ordinal, we use cumulative logits (or probits). The logits in this model are for cumulative categories at each point, contrasting categories above with categories below. Example: Suppose Y has 4 categories; then,
$$\text{logit}(p_1) = \ln\{p_1/(1 - p_1)\} = a_1 + bx$$
$$\text{logit}(p_1 + p_2) = \ln\{(p_1 + p_2)/(1 - p_1 - p_2)\} = a_2 + bx$$
$$\text{logit}(p_1 + p_2 + p_3) = \ln\{(p_1 + p_2 + p_3)/(1 - p_1 - p_2 - p_3)\} = a_3 + bx$$
Since these are cumulative logits, the probabilities are attached to being in category j and lower. Since the right side changes only in the intercepts, and not in the slope coefficient, this model is known as the proportional odds model. Thus, in ordered logistic regression, we need to test the assumption of proportionality as well.
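Category probabilities follow from the cumulative logits by differencing. The sketch below assumes the a + bx form written above, with hypothetical intercepts and a hypothetical common slope.

```python
import math

def cumulative_probs(x, thresholds, b):
    """P(Y <= j) for j = 1..m-1 under logit P(Y <= j) = a_j + b*x."""
    return [1.0 / (1.0 + math.exp(-(a + b * x))) for a in thresholds]

def category_probs(x, thresholds, b):
    """Individual category probabilities obtained by differencing."""
    cum = cumulative_probs(x, thresholds, b) + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

a = [-1.0, 0.5, 2.0]  # hypothetical intercepts, a1 < a2 < a3
b = 0.7               # hypothetical common slope
probs = category_probs(1.0, a, b)
print([round(p, 4) for p in probs])
```

Because the slope is shared, a one-unit change in x shifts every cumulative logit by the same amount b, which is the proportionality assumption being tested.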
11 Ordinal Logistic $a_1$, $a_2$, $a_3$ are the intercepts, which satisfy the property $a_1 < a_2 < a_3$ and are interpreted as thresholds of the latent variable. Interpretation of parameter estimates depends on the software used! Check the software manual. If the RHS = a + bx, a positive coefficient is associated more with lower-order categories and a negative coefficient is associated more with higher-order categories. If the RHS = a − bx, a negative coefficient is associated more with lower-order categories, and a positive coefficient is associated more with higher-order categories.
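The effect of the sign convention can be illustrated by tracking the probability of the highest category under both parameterizations. The threshold and coefficient below are hypothetical, and the cumulative-logit form P(Y ≤ j) is assumed.

```python
import math

def prob_top_category(x, top_threshold, b, sign):
    """P(Y = highest category) = 1 - P(Y <= m-1), with RHS = a + sign*b*x."""
    cum = 1.0 / (1.0 + math.exp(-(top_threshold + sign * b * x)))
    return 1.0 - cum

a3, b = 2.0, 0.7  # hypothetical top threshold and positive coefficient
for sign, label in ((+1, "a + bx"), (-1, "a - bx")):
    low = prob_top_category(0.0, a3, b, sign)
    high = prob_top_category(1.0, a3, b, sign)
    direction = "rises" if high > low else "falls"
    print(f"RHS = {label}: P(top category) {direction} as x increases")
```

The same positive b therefore points in opposite directions depending on how the software writes the right-hand side.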
12 Model for Limited Dependent Variable C) Tobit Model This model is for a metric dependent variable that is limited in the sense that we observe it only if it is above or below some cut-off level. For example: wages may be limited from below by the minimum wage; the donation amount given to charity; top-coding income at, say, $300,000; time use and leisure activity of individuals; extramarital affairs. It is also called the censored regression model. Censoring can be from below or from above, also called left and right censoring. [Do not confuse the term censoring with the one used in dynamic modeling.]
13 The Tobit Model The model is called Tobit because it was first proposed by Tobin (1958) and involves aspects of probit analysis; the term was coined by Goldberger for "Tobin's probit". Reasoning behind it: If we include the censored observations as y = 0, the censored observations on the left will pull down the end of the line, resulting in underestimates of the intercept and overestimates of the slope. If we exclude the censored observations and just use the observations for which y > 0 (that is, truncating the sample), it will overestimate the intercept and underestimate the slope. The degree of bias in both will increase as the number of observations that take on the value of zero increases (see the Figure on the next slide).
14 [Figure: effects of censoring and truncation on the fitted regression line.] Source: J.S. Long
15 The Tobit Model The Tobit model uses all of the information, including information on censoring, and provides consistent estimates. It is also a nonlinear model and similar to the probit model. It is estimated using maximum likelihood estimation techniques. The likelihood function for the tobit model is unusual in that it consists of two terms: the first for non-censored observations (it is the pdf), and the second for censored observations (it is the cdf).
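For reference, the standard textbook form of this two-part likelihood is given below for the common case of left-censoring at zero with normal errors. The slide's own equation did not survive transcription, so this is an assumed reconstruction; $\phi$ and $\Phi$ denote the standard normal pdf and cdf, and $\sigma$ is the standard deviation of the latent error.

```latex
L(\beta, \sigma) =
  \prod_{y_i > 0} \frac{1}{\sigma}\,
    \phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right)
  \;\times\;
  \prod_{y_i = 0}
    \Phi\!\left(\frac{-\,x_i'\beta}{\sigma}\right)
```

The first product runs over the uncensored observations and the second over the censored ones, where $\Phi(-x_i'\beta/\sigma)$ is the probability that the latent $y_i^*$ falls at or below zero.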
16 The Tobit Model The estimated tobit coefficients are the marginal effects of a change in $x_j$ on y*, the unobservable latent variable, and can be interpreted in the same way as in a linear regression model. But such an interpretation may not be useful since we are interested in the effect of X on the observable y (or change in the censored outcome). It can be shown that the change in y is found by multiplying the coefficient by Pr(a < y* < b), that is, the probability of being uncensored. Since this probability is a fraction, the marginal effect is actually attenuated. In the above, a and b denote the lower and upper censoring points. For example, in left censoring, the limits will be a = 0, b = +∞.
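The attenuation can be sketched numerically, assuming normally distributed latent errors. The coefficient, linear index and σ below are hypothetical values chosen only for illustration.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_uncensored(xb, sigma, lower=0.0, upper=math.inf):
    """Pr(lower < y* < upper) when y* ~ Normal(xb, sigma^2)."""
    hi = 1.0 if upper == math.inf else normal_cdf((upper - xb) / sigma)
    lo = 0.0 if lower == -math.inf else normal_cdf((lower - xb) / sigma)
    return hi - lo

beta_j, xb, sigma = 0.5, 1.0, 2.0  # hypothetical coefficient, index, and scale
p = prob_uncensored(xb, sigma)     # left censoring at 0, no upper limit
print(f"Pr(uncensored) = {p:.4f}, effect on y = {beta_j * p:.4f}")
```

With these values roughly 69% of the latent distribution lies above zero, so the effect on the observed outcome is about 69% of the latent coefficient.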
17 Illustrations for logit, probit and tobit models, using womenwk.dta from Baum (numeric values in the output tables were not preserved; only the table layouts remain).
Descriptive Statistics
Columns: N, Minimum, Maximum, Mean, Std. Deviation
Rows: age, education, married, children, wagefull, wage, lw, work, lwf
Valid N (listwise): 1343
Binary Logistic Regression
Model Summary columns: Step, -2 Log likelihood, Cox & Snell R Square, Nagelkerke R Square
a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.
Hosmer and Lemeshow Test columns: Step, Chi-square, df, Sig.
Variables in the Equation columns: B, S.E., Wald, df, Sig., Exp(B)
Rows (Step 1): age, education, married, children, Constant
a. Variable(s) entered on step 1: age, education, married, children.
18 Binary Probit Regression (in SPSS, use the ordinal regression menu and select the probit link function. Ignore the test of parallel lines, etc.)
Model Fitting Information columns: Model, -2 Log Likelihood, Chi-Square, df, Sig.
Rows: Intercept Only, Final. Link function: Probit.
Parameter Estimates columns: Estimate, Std. Error, Wald, df, Sig., 95% Confidence Interval (Lower Bound, Upper Bound)
Rows: Threshold [work = 0]; Location age, education, children, [married=0], [married=1] (0, see note a). Link function: Probit.
a. This parameter is set to zero because it is redundant.
Tobit regression cannot be done in SPSS. Use Stata. Here are the Stata commands. First, fit a simple OLS regression of the variable lwf (just to check).
regress lwf age married children education
Output header: Source, SS, df, MS; Number of obs, F(4, 1995), Prob > F, R-squared, Adj R-squared, Root MSE
Coefficient table columns: lwf, Coef., Std. Err., t, P>|t|, [95% Conf. Interval]
Rows: age, married, children, education, _cons
tobit lwf age married children education, ll(0)
19 Tobit regression
Number of obs = 2000; also reported: LR chi2(4), Prob > chi2, Log likelihood, Pseudo R2 (values not preserved)
Coefficient table columns: lwf, Coef., Std. Err., t, P>|t|, [95% Conf. Interval]
Rows: age, married, children, education, _cons, /sigma
Obs. summary: 657 left-censored observations at lwf<=0; 1343 uncensored observations; 0 right-censored observations.
mfx compute, predict(pr(0,.))
Marginal effects after tobit: y = Pr(lwf>0) (predict, pr(0,.))
Columns: variable, dy/dx, Std. Err., z, P>|z|, [95% C.I.], X
Rows: age, married*, children, educat~n
(*) dy/dx is for discrete change of dummy variable from 0 to 1.
mfx compute, predict(e(0,.))
Marginal effects after tobit: y = E(lwf | lwf>0) (predict, e(0,.))
Columns: variable, dy/dx, Std. Err., z, P>|z|, [95% C.I.], X
Rows: age, married*, children, educat~n
(*) dy/dx is for discrete change of dummy variable from 0 to 1.
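To make concrete what a left-censored tobit fit maximizes, here is a minimal Python sketch of the log-likelihood for a single regressor with normal errors. It is not Stata's implementation; the function name, the toy data and the parameter values are all hypothetical.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, x, b0, b1, sigma, lower=0.0):
    """Log-likelihood for a one-regressor tobit, left-censored at `lower`."""
    total = 0.0
    for yi, xi in zip(y, x):
        xb = b0 + b1 * xi
        if yi <= lower:
            # censored observation: probability that y* <= lower (cdf term)
            total += math.log(normal_cdf((lower - xb) / sigma))
        else:
            # uncensored observation: normal density of y (pdf term)
            total += math.log(normal_pdf((yi - xb) / sigma) / sigma)
    return total

# Tiny made-up sample; zeros are the censored observations
y = [0.0, 0.0, 1.2, 2.5, 3.1]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(round(tobit_loglik(y, x, b0=-1.0, b1=0.8, sigma=1.0), 4))
```

An optimizer would search over b0, b1 and sigma for the values that make this quantity as large as possible.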
More informationCurve Fitting. Before You Begin
Curve Fitting Chapter 16: Curve Fitting Before You Begin Selecting the Active Data Plot When performing linear or nonlinear fitting when the graph window is active, you must make the desired data plot
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationTitle. Syntax. stata.com. fp Fractional polynomial regression. Estimation
Title stata.com fp Fractional polynomial regression Syntax Menu Description Options for fp Options for fp generate Remarks and examples Stored results Methods and formulas Acknowledgment References Also
More informationCHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA
Examples: Mixture Modeling With Longitudinal Data CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA Mixture modeling refers to modeling with categorical latent variables that represent subpopulations
More informationStatistics in Retail Finance. Chapter 2: Statistical models of default
Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision
More informationLecture 2 ESTIMATING THE SURVIVAL FUNCTION. One-sample nonparametric methods
Lecture 2 ESTIMATING THE SURVIVAL FUNCTION One-sample nonparametric methods There are commonly three methods for estimating a survivorship function S(t) = P (T > t) without resorting to parametric models:
More informationA Simple Feasible Alternative Procedure to Estimate Models with High-Dimensional Fixed Effects
DISCUSSION PAPER SERIES IZA DP No. 3935 A Simple Feasible Alternative Procedure to Estimate Models with High-Dimensional Fixed Effects Paulo Guimarães Pedro Portugal January 2009 Forschungsinstitut zur
More informationMultiple Choice Models II
Multiple Choice Models II Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationAddressing Alternative. Multiple Regression. 17.871 Spring 2012
Addressing Alternative Explanations: Multiple Regression 17.871 Spring 2012 1 Did Clinton hurt Gore example Did Clinton hurt Gore in the 2000 election? Treatment is not liking Bill Clinton 2 Bivariate
More informationCategorical Data Analysis
Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods
More informationIt is important to bear in mind that one of the first three subscripts is redundant since k = i -j +3.
IDENTIFICATION AND ESTIMATION OF AGE, PERIOD AND COHORT EFFECTS IN THE ANALYSIS OF DISCRETE ARCHIVAL DATA Stephen E. Fienberg, University of Minnesota William M. Mason, University of Michigan 1. INTRODUCTION
More informationMORE ON LOGISTIC REGRESSION
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 MORE ON LOGISTIC REGRESSION I. AGENDA: A. Logistic regression 1. Multiple independent variables 2. Example: The Bell Curve 3. Evaluation
More informationRockefeller College University at Albany
Rockefeller College University at Albany PAD 705 Handout: Hypothesis Testing on Multiple Parameters In many cases we may wish to know whether two or more variables are jointly significant in a regression.
More informationEnd User Satisfaction With a Food Manufacturing ERP
Applied Mathematical Sciences, Vol. 8, 2014, no. 24, 1187-1192 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2014.4284 End-User Satisfaction in ERP System: Application of Logit Modeling Hashem
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
More informationPARALLEL LINES ASSUMPTION IN ORDINAL LOGISTIC REGRESSION AND ANALYSIS APPROACHES
International Interdisciplinary Journal of Scientific Research ISSN: 2200-9833 www.iijsr.org PARALLEL LINES ASSUMPTION IN ORDINAL LOGISTIC REGRESSION AND ANALYSIS APPROACHES Erkan ARI 1 and Zeki YILDIZ
More informationOdds ratio, Odds ratio test for independence, chi-squared statistic.
Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review
More information