Multiple Choice Models II
1 Multiple Choice Models II Laura Magazzini University of Verona Laura Magazzini Multiple Choice Models II 1 / 28
2 Categorical data
Categorical variable models: Y is the result of a single decision among more than 2 alternatives.
- Unordered choice set (categories/qualitative choices): multinomial logit, conditional logit, nested logit
- Ordered choice set (rankings): models for ordered data, e.g. ordered probit
3 Example: Education and Occupational Choice
Rows: occupation. Columns: education. Row percentages in parentheses.

Occupation    Primary/Secondary School   University or more   Total
Menial        23 (74.19%)                8 (25.81%)           31 (100%)
Blue Collar   60 (86.96%)                9 (13.04%)           69 (100%)
Craft         65 (77.38%)                19 (22.62%)          84 (100%)
WhiteCol      27 (65.85%)                14 (34.15%)          41 (100%)
Prof          27 (24.11%)                85 (75.89%)          112 (100%)
Total         202 (59.94%)               135 (40.06%)         337 (100%)
4 Multinomial distribution
- $Y_i$: qualitative random variable with $J$ categories
- $P_{ij} = \Pr(Y_i = j)$, $j = 1, 2, \ldots, J$: probability that individual $i$ will choose alternative $j$
- Categories are mutually exclusive and exhaustive: $\sum_j P_{ij} = 1$, $i = 1, 2, \ldots, N$
- Let $d_i = (d_{i1}, d_{i2}, \ldots, d_{iJ})$, where $d_{ij} = 1$ if $Y_i = j$ and 0 otherwise, so that $\sum_j d_{ij} = 1$, $i = 1, 2, \ldots, N$
5 Multinomial logit model (MNL)
- $Y$: result of a choice among $J$ alternatives ($J > 2$)
- $d_i = (d_{i1}, d_{i2}, \ldots, d_{iJ})$, where $d_{ij} = 1$ if $Y_i = j$
- $P_{ij} = \Pr(Y_i = j)$, with $\sum_j P_{ij} = 1$
- Logit model: $\Pr(Y_i = j) = \dfrac{\exp(\eta_{ij})}{\sum_{l=1}^{J} \exp(\eta_{il})}$
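The logit formula above can be sketched in a few lines of Python. This is only an illustration: the function name `mnl_probabilities` and the linear predictors in `eta_i` are made up, not taken from the slides.

```python
import math

def mnl_probabilities(eta):
    """Logit choice probabilities for one individual, given eta_i1..eta_iJ."""
    shift = max(eta)  # subtracting a common constant leaves all ratios unchanged
    weights = [math.exp(e - shift) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]

eta_i = [0.5, 1.0, -0.2]  # hypothetical linear predictors for J = 3 alternatives
p_i = mnl_probabilities(eta_i)
print(p_i, sum(p_i))
```

Subtracting the largest predictor before exponentiating is a numerical-stability choice; it does not change the probabilities.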
6 Properties of MNL
- $0 \le P_{ij} \le 1$
- $\sum_j P_{ij} = 1$ (by definition)
- For every pair of alternatives $(k, l)$, the probability ratio is $\dfrac{P_{ik}}{P_{il}} = \dfrac{\exp(\eta_{ik})}{\exp(\eta_{il})}$, so that $\log \dfrac{P_{ik}}{P_{il}} = \eta_{ik} - \eta_{il}$
- The model can be motivated by a random utility model
7 Random Utility Models (1)
McFadden (1973, 2001)
- $J$ alternatives: mutually exclusive, exhaustive, finite set
- Examples: competing brands, different means of transport, different occupations, ...
- Categories can be ordered or unordered; different techniques are employed according to the nature of the alternatives
- Assume non-ordered alternatives
- A rational agent chooses the alternative that maximizes his/her utility: $Y_i = j$ if $U_{ij} > U_{ik}$ for each $k \ne j$
8 Random Utility Models (2)
McFadden (1973, 2001)
- Linear utility model: $U_{ij} = \eta_{ij} + \epsilon_{ij}$ with $\eta_{ij} = LC(z_{ij}, \theta)$ (a linear combination of $z_{ij}$ with parameters $\theta$)
- $\eta_{ij}$ links the agent's utility to factors that can be observed
- $\eta_{ij}$ is different from $U_{ij}$ since there are factors that cannot be observed by the researcher
- Choice probability:
  $\Pr(Y_i = j) = \Pr(U_{ij} > U_{ik}, \forall k \ne j)$
  $= \Pr(\eta_{ij} + \epsilon_{ij} > \eta_{ik} + \epsilon_{ik}, \forall k \ne j)$
  $= \Pr(\epsilon_{ik} - \epsilon_{ij} < \eta_{ij} - \eta_{ik}, \forall k \ne j)$
  $= \int_{\epsilon} I(\epsilon_{ik} - \epsilon_{ij} < \eta_{ij} - \eta_{ik}, \forall k \ne j)\, f(\epsilon)\, d\epsilon$
  with $f$ the probability density function of $\epsilon$
- The model is made operational by a particular choice of distribution for the disturbance
- Closed functional forms exist only for a few specifications (e.g. logit)
9 How to specify $\eta_{ij}$?
- Standard MNL: $\eta_{ij} = x_i'\beta_j$, with $x_i$ individual characteristics, constant across all the alternatives $j$
- Conditional logit model (CLM): $\eta_{ij} = z_{ij}'\gamma$, with $z_{ij}$ characteristics of the choice $j$ and individual $i$
- Datasets typically analyzed by economists do not contain mixtures of individual and choice-specific attributes
- CLM is usually applied when the interest is in the effect of choice-specific attributes
- A custom transformation is needed for variables containing individual-specific attributes
10 Standard MNL
- $\Pr(Y_i = j \mid x_i) = \dfrac{\exp(x_i'\beta_j)}{\sum_{l=1}^{J} \exp(x_i'\beta_l)}$
- It is not possible to estimate all of $\beta_1, \ldots, \beta_J$: adding the same constant vector to all the $\beta$s does not change the probabilities
- The indeterminacy is removed by letting $\beta_1 = 0$, so that $j = 1$ is the reference category:
  $\Pr(Y_i = j \mid x_i) = \dfrac{\exp(x_i'\beta_j)}{1 + \sum_{l=2}^{J} \exp(x_i'\beta_l)}$
- An intercept is allowed by setting the first element of $x_i$ equal to 1 for every $i$
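A small numerical check of the identification argument, as a sketch only: the coefficient values, the vector `shift`, and the helper `mnl_probs` are hypothetical. Adding the same vector to every $\beta_j$, or subtracting $\beta_1$ from every $\beta_j$ so that $\beta_1 = 0$, should leave the probabilities unchanged.

```python
import math

def mnl_probs(x, betas):
    """Standard MNL probabilities: exp(x'beta_j) / sum_l exp(x'beta_l)."""
    etas = [sum(xk * bk for xk, bk in zip(x, beta)) for beta in betas]
    top = max(etas)
    weights = [math.exp(e - top) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]

x_i = [1.0, 2.0]                                # first entry plays the role of the intercept
betas = [[0.3, -0.1], [0.8, 0.4], [-0.5, 0.2]]  # hypothetical beta_1..beta_3
shift = [1.5, -0.7]                             # the same vector added to every beta_j

shifted = [[b + s for b, s in zip(beta, shift)] for beta in betas]
# Normalization beta_1 = 0: subtract beta_1 from every coefficient vector
normalized = [[b - b1 for b, b1 in zip(beta, betas[0])] for beta in betas]

p_original = mnl_probs(x_i, betas)
p_shifted = mnl_probs(x_i, shifted)
p_normalized = mnl_probs(x_i, normalized)
print(p_original, p_shifted, p_normalized)
```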
11 Estimation: MLE
- The log likelihood can be written as $\ln L = \sum_{i=1}^{n} \sum_{j=1}^{J} d_{ij} \ln \Pr(Y_i = j)$, with $d_{ij} = 1$ if $Y_i = j$, 0 otherwise
- The derivatives have the characteristically simple form: $\dfrac{\partial \ln L}{\partial \beta_j} = \sum_i (d_{ij} - P_{ij})\, x_i = 0$
- As a consequence, if the model is estimated with an intercept, $\sum_i d_{ij} / \sum_i P_{ij} = 1$: for each alternative, the fitted probabilities sum to the observed number of choices
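The log likelihood and its derivative can be written down directly from these formulas. The sketch below uses a tiny invented data set (`X`, `y`) and invented coefficients; it evaluates the score at an arbitrary point rather than at the maximum, so the score is not zero here.

```python
import math

def mnl_probs(x, betas):
    etas = [sum(xk * bk for xk, bk in zip(x, beta)) for beta in betas]
    top = max(etas)
    weights = [math.exp(e - top) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(X, y, betas):
    """ln L = sum_i sum_j d_ij ln P_ij; only the chosen alternative contributes."""
    return sum(math.log(mnl_probs(x, betas)[yi]) for x, yi in zip(X, y))

def score(X, y, betas, j):
    """d ln L / d beta_j = sum_i (d_ij - P_ij) x_i."""
    grad = [0.0] * len(X[0])
    for x, yi in zip(X, y):
        p_ij = mnl_probs(x, betas)[j]
        d_ij = 1.0 if yi == j else 0.0
        for k, xk in enumerate(x):
            grad[k] += (d_ij - p_ij) * xk
    return grad

# Hypothetical data: intercept plus one regressor, choices coded 0, 1, 2
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.3]]
y = [0, 2, 1, 1]
betas = [[0.0, 0.0], [0.2, 0.5], [-0.3, 0.1]]  # beta for alternative 0 fixed at zero

print(log_likelihood(X, y, betas), score(X, y, betas, 1))
```

In practice the score would be passed to a numerical optimizer; that step is omitted here.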
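12 Interpretation of the parameters
- The partial effects for this model are complicated: $\dfrac{\partial P_j}{\partial x_i} = P_j \left[ \beta_j - \sum_{k=1}^{J} P_k \beta_k \right] = P_j [\beta_j - \bar{\beta}]$
- The coefficients in this model are difficult to interpret: $\partial P_j / \partial x_k$ need not have the same sign as $\beta_{jk}$
- A simpler interpretation is obtained by considering the log odds: $\ln \dfrac{P_{ij}}{P_{i1}} = x_i'\beta_j$ and $\ln \dfrac{P_{ij}}{P_{ik}} = x_i'(\beta_j - \beta_k)$ if $k \ne 1$
- In the case of a dummy variable (coded as 0 or 1), switching it from 0 to 1 changes the log odds by the corresponding coefficient: $\ln \dfrac{P_{ij}(x_i = 1)/P_{i1}(x_i = 1)}{P_{ij}(x_i = 0)/P_{i1}(x_i = 0)} = \beta_j$ and $\ln \dfrac{P_{ij}(x_i = 1)/P_{ik}(x_i = 1)}{P_{ij}(x_i = 0)/P_{ik}(x_i = 0)} = \beta_j - \beta_k$ if $k \ne 1$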
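To see how a partial effect can have the opposite sign to its coefficient, here is a minimal sketch with a single regressor and invented coefficients (0, 1, 2 for three alternatives). The names `mnl_probs_scalar` and `partial_effects` are illustrative only.

```python
import math

def mnl_probs_scalar(x, betas):
    """MNL probabilities with one regressor x and one coefficient per alternative."""
    etas = [b * x for b in betas]
    top = max(etas)
    weights = [math.exp(e - top) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]

def partial_effects(x, betas):
    """dP_j/dx = P_j * (beta_j - beta_bar), with beta_bar = sum_k P_k beta_k."""
    p = mnl_probs_scalar(x, betas)
    beta_bar = sum(pk * bk for pk, bk in zip(p, betas))
    return [pj * (bj - beta_bar) for pj, bj in zip(p, betas)]

betas = [0.0, 1.0, 2.0]  # hypothetical coefficients; the first is the reference
effects = partial_effects(2.0, betas)
print(effects)
```

At x = 2 the middle alternative has a positive coefficient but a negative partial effect, because the probability-weighted average coefficient exceeds 1.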
13 Conditional logit model
- $\Pr(Y_i = j \mid z) = \dfrac{\exp(z_j'\beta)}{\sum_{k=1}^{J} \exp(z_k'\beta)}$
- The model contains choice-specific attributes
- The coefficients of individual-specific attributes (that do not vary across categories) are not identified
- Individual-specific variables can be inserted in the model, but need to be properly transformed
- Not all the coefficients of the choice-specific attributes can be separately identified: adding a constant to all the coefficients does not change the estimated probability
- The intercept is set to zero
14 Marginal effects
- General form: $\dfrac{\partial P_j(z)}{\partial z_k} = \beta_z \left[ P_j(z) \left( I(j = k) - P_k(z) \right) \right]$
- Own effect: $\dfrac{\partial P_j(z)}{\partial z_j} = \beta_z \left[ P_j(z) (1 - P_j(z)) \right]$
- Cross effect: $\dfrac{\partial P_j(z)}{\partial z_h} = -\beta_z P_j(z) P_h(z)$ for $j \ne h$
- $P_j$ changes monotonically with respect to $z$
- The sign of the derivative depends on the sign of $\beta_z$, with opposite effects when considering $z_j$ or $z_h$
- Symmetry: $\dfrac{\partial P_j}{\partial z_h} = \dfrac{\partial P_h}{\partial z_j}$
- $P_j$ does not change if the attribute changes by the same amount for every alternative (the ranking of $U_{ij}$ is unchanged!)
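The own and cross effects can be checked numerically. The following sketch assumes a single choice-specific attribute per alternative (for instance a cost) with an invented coefficient `beta_z`; the function names are hypothetical.

```python
import math

def clogit_probs(z, beta_z):
    """Conditional logit probabilities with one attribute z_j per alternative."""
    etas = [beta_z * zj for zj in z]
    top = max(etas)
    weights = [math.exp(e - top) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]

def marginal_effect(z, beta_z, j, k):
    """dP_j/dz_k = beta_z * P_j * (1(j == k) - P_k)."""
    p = clogit_probs(z, beta_z)
    indicator = 1.0 if j == k else 0.0
    return beta_z * p[j] * (indicator - p[k])

z = [1.0, 2.5, 0.4]   # hypothetical attribute values for three alternatives
beta_z = -0.8         # hypothetical coefficient (negative, as for a cost)

own = marginal_effect(z, beta_z, 0, 0)
cross_01 = marginal_effect(z, beta_z, 0, 1)
cross_10 = marginal_effect(z, beta_z, 1, 0)
print(own, cross_01, cross_10)
```

With a negative coefficient the own effect is negative and the cross effects are positive, and the two cross effects coincide, as the symmetry property states.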
15 Multinomial logit (MNL) vs conditional logit (CNL)
Similar response probabilities, but the two models differ in some important respects.
- MNL: the conditioning variables do not change across alternatives
- Characteristics of the alternatives are unimportant or not of interest, or data are not available
- Example: occupational choice. We do not know how much someone could make in every occupation, but we can collect data on factors affecting individual productivity and tastes, e.g. education and past experience
- MNL: factors can have different effects on relative probabilities (different $\beta_j$ for different choices)
- CNL: choices are made on the basis of observable attributes of each alternative, with a common $\beta$
- MNL can be obtained as a special case of CNL
- Important limitation: the independence from irrelevant alternatives (IIA) assumption
16 Independence from irrelevant alternatives (logit)
- For every pair of alternatives $(k, l)$, the probability ratio (odds) is $\omega = \dfrac{\Pr(Y_i = k \mid x_{ik})}{\Pr(Y_i = l \mid x_{il})} = \dfrac{\exp(\eta_{ik})}{\exp(\eta_{il})}$
- $\omega$ depends only on the linear predictors ($\eta$) of the two alternatives considered, not on the whole set of alternatives
- From the point of view of estimation, it is useful that the odds ratio does not depend on the other choices
- But it is not a particularly appealing restriction to place on consumer behaviour
17 IIA: example by McFadden (1984)
- Commuters initially choose between cars and red buses with equal probabilities
- Suppose a third mode (blue buses) is added and commuters do not care about the colour of the bus (i.e. they will choose between the two buses with equal probability)
- IIA implies that the fraction of commuters taking a car would fall from $\frac{1}{2}$ to $\frac{1}{3}$, a result that is not very realistic
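The arithmetic of this example can be reproduced with the logit formula, as a sketch under the assumption that all modes have the same linear predictor (set to 0 here); `logit_shares` is an illustrative name.

```python
import math

def logit_shares(eta):
    """Logit choice shares for a list of linear predictors."""
    weights = [math.exp(e) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]

before = logit_shares([0.0, 0.0])       # car, red bus
after = logit_shares([0.0, 0.0, 0.0])   # car, red bus, blue bus

print(before[0], after[0])              # share of commuters taking the car
print(before[0] / before[1], after[0] / after[1])  # car/red-bus odds stay the same
```

The car share drops from 1/2 to 1/3 because the logit keeps the car to red bus odds fixed at 1. If commuters only split the existing bus share between the two colours, the car share would stay at 1/2 and each bus would get 1/4.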
18 Testing IIA
Hausman and McFadden (1984)
- If a subset of the choice set is truly irrelevant, omitting it from the model altogether will not change the parameter estimates systematically
- Exclusion of these choices will be inefficient but will not lead to inconsistency
- But if the remaining odds are not truly independent of these alternatives, then the parameter estimates obtained when these choices are included will be inconsistent
- Therefore, Hausman's specification test can be applied
19 Hausman's specification test
- Consider two different estimators $\hat{\theta}_E$ and $\hat{\theta}_I$
- Under $H_0$, $\hat{\theta}_E$ and $\hat{\theta}_I$ are both consistent and $\hat{\theta}_E$ is efficient relative to $\hat{\theta}_I$
- Under $H_1$, $\hat{\theta}_I$ remains consistent while $\hat{\theta}_E$ is inconsistent
- Then $H_0$ can be tested by using the Hausman statistic:
  $H = (\hat{\theta}_I - \hat{\theta}_E)' \left[ \text{Est.Asy.Var}(\hat{\theta}_I - \hat{\theta}_E) \right]^{-1} (\hat{\theta}_I - \hat{\theta}_E)$
  $= (\hat{\theta}_I - \hat{\theta}_E)' \left[ \text{Est.Asy.Var}(\hat{\theta}_I) - \text{Est.Asy.Var}(\hat{\theta}_E) \right]^{-1} (\hat{\theta}_I - \hat{\theta}_E) \xrightarrow{d} \chi^2_J$
- The appropriate degrees of freedom for the test depend on the context
- In the case of MNL, $J$ is the number of parameters in the estimating equation of the restricted choice set
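A minimal sketch of the statistic's computation follows. All numbers are invented for illustration (they are not estimates from any data set), and the small linear solver is written out only to keep the example free of external libraries.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def hausman(theta_i, theta_e, var_i, var_e):
    """H = d' [Var(theta_I) - Var(theta_E)]^{-1} d, with d = theta_I - theta_E."""
    d = [a - b for a, b in zip(theta_i, theta_e)]
    v = [[a - b for a, b in zip(ri, re)] for ri, re in zip(var_i, var_e)]
    return sum(di * si for di, si in zip(d, solve(v, d)))

# Hypothetical estimates: theta_E from the full choice set, theta_I from the restricted one
theta_e = [1.0, 0.5]
theta_i = [1.1, 0.3]
var_e = [[0.02, 0.0], [0.0, 0.02]]
var_i = [[0.03, 0.0], [0.0, 0.06]]

H = hausman(theta_i, theta_e, var_i, var_e)
print(H)  # compare with a chi-square critical value, here with 2 degrees of freedom
```

In finite samples the difference of the two estimated variance matrices need not be positive definite, in which case the statistic computed this way can be negative or unreliable.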
20 What if the IIA hypothesis is not satisfied? (1)
Multinomial probit (MNP) model
- $U_j = \beta' x_j + \epsilon_j$, $j = 1, \ldots, J$, with $[\epsilon_1, \epsilon_2, \ldots, \epsilon_J] \sim N(0, \Sigma)$
- $\Pr(Y_i = j) = \Pr(U_j > U_k,\ k = 1, 2, \ldots, J,\ k \ne j)$
- Main obstacle: difficulty in computing the multivariate normal probability for any dimensionality higher than 2
- Recent advances in accurate simulation of multinormal integrals have made estimation of MNP more feasible
- Simulation-based estimation
21 What if the IIA hypothesis is not satisfied? (2)
Generalized extreme value: nested logit models
- Very appealing if it is possible to assume sequential choices
- The $J$ alternatives are grouped into $L$ subgroups: (1) first the group of alternatives is chosen; (2) then, one alternative is chosen within the group
- IIA is maintained within groups, but does not need to hold across groups
- Main limitations: results can depend on the way in which groups are formed, and there is no specification test to discriminate among different groupings
22 Treatment of rankings: ordered data
- $Y$ can assume a limited number of categories $y_c$, $c = 0, 1, \ldots, C$
- Categories are inherently ordered: $y_0 < y_1 < y_2 < \cdots < y_C$
- Examples:
  - Bond rating: AAA-D
  - Symptoms: none, minor, serious
  - Drug effect: worsen, none, partial recovery, full recovery
  - Customer satisfaction: very unsatisfied, unsatisfied, satisfied, very satisfied
- Ordered probit and logit models
- Multinomial models would fail to account for the ordinal nature of the dependent variable
- OLS would attach a meaning to the difference between the category codings
23 Latent regression
- We consider a continuous latent variable $y^*$ (unobserved), a linear function of $x$ and $\epsilon$: $y^* = x'\beta + \epsilon$
- We observe $y = c \iff \gamma_c < y^* \le \gamma_{c+1}$, with $\gamma_0 = -\infty$ and $\gamma_{C+1} = +\infty$
- The latent response is specified by a linear regression model without the intercept
24 Ordered Probit Model
$y^* = x'\beta + \epsilon$ with $\epsilon \sim N(0, 1)$
- $\Pr(y_i = 0 \mid x) = \Pr(y_i^* \le \gamma_1) = \Pr(\epsilon_i \le \gamma_1 - x'\beta \mid x) = \Phi(\gamma_1 - x'\beta)$
- $\Pr(y_i = 1 \mid x) = \Pr(\gamma_1 < y_i^* \le \gamma_2) = \Phi(\gamma_2 - x'\beta) - \Phi(\gamma_1 - x'\beta)$
- ...
- $\Pr(y_i = C \mid x) = \Pr(y_i^* > \gamma_C) = 1 - \Phi(\gamma_C - x'\beta)$
- Usually $y^*$ has no real meaning: the interest is in $\Pr(y \mid x)$ rather than $E(y^* \mid x)$
- To identify the parameters, $x$ cannot contain the intercept
- If you have to specify a model with an intercept, set $\gamma_1 = 0$
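These category probabilities are differences of the standard normal CDF evaluated at consecutive thresholds. A minimal sketch, with made-up thresholds and linear index, and with the hypothetical helper names `norm_cdf` and `ordered_probit_probs`:

```python
import math

def norm_cdf(v):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def ordered_probit_probs(xb, gammas):
    """Pr(y = c | x) for c = 0..C, given x'beta and gammas = [gamma_1, ..., gamma_C]."""
    # CDF at gamma_0 = -inf is 0 and at gamma_{C+1} = +inf is 1
    cdf = [0.0] + [norm_cdf(g - xb) for g in gammas] + [1.0]
    return [cdf[c + 1] - cdf[c] for c in range(len(gammas) + 1)]

probs = ordered_probit_probs(0.0, [-1.0, 1.0])  # three categories, x'beta = 0
print(probs, sum(probs))
```

The thresholds must be supplied in increasing order, otherwise some differences would be negative.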
25 Marginal effects
Coefficients are difficult to interpret:
- $\dfrac{\partial \Pr(y_i = 0 \mid x)}{\partial x_j} = -\beta_j\, \phi(\gamma_1 - x'\beta)$: sign opposite to the sign of $\beta_j$
- $\dfrac{\partial \Pr(y_i = c \mid x)}{\partial x_j} = -\beta_j \left[ \phi(\gamma_{c+1} - x'\beta) - \phi(\gamma_c - x'\beta) \right]$: ambiguous sign!
- $\dfrac{\partial \Pr(y_i = C \mid x)}{\partial x_j} = \beta_j\, \phi(\gamma_C - x'\beta)$: same sign as $\beta_j$
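The three cases can be computed with one expression by treating the normal density at the infinite thresholds as zero. The sketch below uses invented values for the index, thresholds and coefficient; `ordered_probit_effects` is an illustrative name.

```python
import math

def norm_pdf(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

def ordered_probit_effects(xb, gammas, beta_j):
    """dPr(y = c | x)/dx_j for c = 0..C in the ordered probit model."""
    # Density at the lower and upper threshold of each category; 0 at -inf and +inf
    dens = [0.0] + [norm_pdf(g - xb) for g in gammas] + [0.0]
    return [beta_j * (dens[c] - dens[c + 1]) for c in range(len(gammas) + 1)]

effects = ordered_probit_effects(0.3, [-0.5, 0.4, 1.2], 0.7)  # hypothetical values
print(effects)
```

Because the probabilities always sum to one, the effects across categories sum to zero; only the signs for the lowest and highest categories are determined by the sign of the coefficient.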
26 Changes in $y^*$ and $y$ in response to changes in $x$
Increasing one of the $x$'s while holding $\beta$ and $\gamma$ constant is equivalent to shifting the distribution of $y^*$ to the right (from the solid to the dashed curve in the slide's figure)
27 Ordered Logistic Regression: $\epsilon_i$ logistic
Proportional odds model
- $\Pr(y_i > c) = \dfrac{\exp(x_i'\beta - \gamma_c)}{1 + \exp(x_i'\beta - \gamma_c)}$
- $\log\left( \dfrac{\Pr(y_i > c)}{1 - \Pr(y_i > c)} \right) = x_i'\beta - \gamma_c$
- $\dfrac{\Pr(y_i > c) / [1 - \Pr(y_i > c)]}{\Pr(y_j > c) / [1 - \Pr(y_j > c)]} = \exp[(x_i - x_j)'\beta]$, which does not depend on the threshold
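The proportional odds property can be verified numerically: the odds ratio between two individuals should be the same at every threshold. The coefficients, covariates and thresholds below are invented for illustration.

```python
import math

def prob_above(xb, gamma_c):
    """Pr(y > c) = exp(xb - gamma_c) / (1 + exp(xb - gamma_c))."""
    return 1.0 / (1.0 + math.exp(-(xb - gamma_c)))

def odds_above(xb, gamma_c):
    p = prob_above(xb, gamma_c)
    return p / (1.0 - p)

beta = [0.8, -0.5]        # hypothetical coefficients
x_i = [1.0, 2.0]          # hypothetical covariates of individual i
x_j = [0.0, 1.0]          # hypothetical covariates of individual j
thresholds = [-1.0, 0.2, 1.5]

xb_i = sum(b * v for b, v in zip(beta, x_i))
xb_j = sum(b * v for b, v in zip(beta, x_j))

ratios = [odds_above(xb_i, g) / odds_above(xb_j, g) for g in thresholds]
expected = math.exp(sum(b * (vi - vj) for b, vi, vj in zip(beta, x_i, x_j)))
print(ratios, expected)
```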
28 Ordered Probit vs. Ordered Logit
- Coefficients and threshold parameters are different due to different scale factors ($\sigma^2_{probit} = 1$, whereas $\sigma^2_{logit} = \pi^2/3$)
- Predicted probabilities are similar
- Marginal effects are similar
- If the logit is chosen, estimated coefficients can be interpreted in terms of odds
More informationTHE SELECTION OF RETURNS FOR AUDIT BY THE IRS. John P. Hiniker, Internal Revenue Service
THE SELECTION OF RETURNS FOR AUDIT BY THE IRS John P. Hiniker, Internal Revenue Service BACKGROUND The Internal Revenue Service, hereafter referred to as the IRS, is responsible for administering the Internal
More informationStatistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
More informationAccurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios
Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios By: Michael Banasiak & By: Daniel Tantum, Ph.D. What Are Statistical Based Behavior Scoring Models And How Are
More informationAutomated Statistical Modeling for Data Mining David Stephenson 1
Automated Statistical Modeling for Data Mining David Stephenson 1 Abstract. We seek to bridge the gap between basic statistical data mining tools and advanced statistical analysis software that requires
More informationLecture notes: single-agent dynamics 1
Lecture notes: single-agent dynamics 1 Single-agent dynamic optimization models In these lecture notes we consider specification and estimation of dynamic optimization models. Focus on single-agent models.
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationWeight of Evidence Module
Formula Guide The purpose of the Weight of Evidence (WoE) module is to provide flexible tools to recode the values in continuous and categorical predictor variables into discrete categories automatically,
More informationAgenda. Mathias Lanner Sas Institute. Predictive Modeling Applications. Predictive Modeling Training Data. Beslutsträd och andra prediktiva modeller
Agenda Introduktion till Prediktiva modeller Beslutsträd Beslutsträd och andra prediktiva modeller Mathias Lanner Sas Institute Pruning Regressioner Neurala Nätverk Utvärdering av modeller 2 Predictive
More informationRegression III: Advanced Methods
Lecture 4: Transformations Regression III: Advanced Methods William G. Jacoby Michigan State University Goals of the lecture The Ladder of Roots and Powers Changing the shape of distributions Transforming
More informationIAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results
IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the
More informationImproving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation
More informationFree Trial - BIRT Analytics - IAAs
Free Trial - BIRT Analytics - IAAs 11. Predict Customer Gender Once we log in to BIRT Analytics Free Trial we would see that we have some predefined advanced analysis ready to be used. Those saved analysis
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationPenalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
More informationDeterministic and Stochastic Modeling of Insulin Sensitivity
Deterministic and Stochastic Modeling of Insulin Sensitivity Master s Thesis in Engineering Mathematics and Computational Science ELÍN ÖSP VILHJÁLMSDÓTTIR Department of Mathematical Science Chalmers University
More informationRegression 3: Logistic Regression
Regression 3: Logistic Regression Marco Baroni Practical Statistics in R Outline Logistic regression Logistic regression in R Outline Logistic regression Introduction The model Looking at and comparing
More informationInstitut für Soziologie Eberhard Karls Universität Tübingen www.maartenbuis.nl
from Indirect Extracting from Institut für Soziologie Eberhard Karls Universität Tübingen www.maartenbuis.nl from Indirect What is the effect of x on y? Which effect do I choose: average marginal or marginal
More informationCHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES
Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical
More informationChapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
More informationMarginal Person. Average Person. (Average Return of College Goers) Return, Cost. (Average Return in the Population) (Marginal Return)
1 2 3 Marginal Person Average Person (Average Return of College Goers) Return, Cost (Average Return in the Population) 4 (Marginal Return) 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
More informationASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS
DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.
More informationSUGI 29 Statistics and Data Analysis
Paper 194-29 Head of the CLASS: Impress your colleagues with a superior understanding of the CLASS statement in PROC LOGISTIC Michelle L. Pritchard and David J. Pasta Ovation Research Group, San Francisco,
More informationHandling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza
Handling missing data in large data sets Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza The problem Often in official statistics we have large data sets with many variables and
More informationMULTIPLE REGRESSION WITH CATEGORICAL DATA
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting
More informationMultilevel Modeling of Complex Survey Data
Multilevel Modeling of Complex Survey Data Sophia Rabe-Hesketh, University of California, Berkeley and Institute of Education, University of London Joint work with Anders Skrondal, London School of Economics
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More information