Paired Comparison Models. Paired Comparison Preference Models. The prefmod Package: Part I Some Examples. Paired Comparisons.
|
|
|
- Curtis Berry
- 9 years ago
- Views:
Transcription
1 Paired Comparison Models Paired Comparison Models Paired Comparison Models General Considerations: Paired Comparison Preference Models The prefmod Package: Part I Some Examples Regina Dittrich & Reinhold Hatzinger Department of Statistics and Mathematics, WU Vienna the paired comparison models we consider in this talk are based on the Bradley-Terry Model (Biometrika, 1952) and will be used as statistical models describing and explaining data the models are similar to regression models (ANOVA) other types of dependent variables (multivariate, nominal) Generalised Linear Models but will not be considered as measurement models in this case one should use another approach e.g. taken by Winkelmaier in the R package eba Paired Comparison Models Paired Comparisons method of data collection given a set of J items Paired Comparison Models Overview LLBT models: loglinear Bradly-Terry models Basic LLBT individuals are asked to judge pairs of objects j preferred to k k preferred to j aim is to rank objects into a preference order obtain an overall ranking of the objects Extended LLBT undecided response subject covariates object specic covariates Pattern Models Paired comparison pattern models Ranking pattern models Rating pattern models
2 LLBT The Basic Bradley-Terry Model (BT) LLBT The Basic Loglinear BT Model (LLBT) for the each comparison (jk) of object j to object k we observe: n (j k)... the number of times j is preferred to k n (k j)... the number of times k is preferred to j the model can be formulated as a log-linear model following the usual Multinomial / Poisson - equivalence. the expected value m (j k) of n (j k) is m (j k) = N (jk) p (j k) N (jk) = n (j k) + n (k j) total number of responses to comparison (jk) P (j k) = π j πj = c π j + π (jk) k πk where c (jk) is constant for a given comparison the probability that j is preferred to k in comparison (jk) then our basic paired comparison model for one comparison is P (j k) = π j π j + π k π's are a called worth parameters and are non-negative numbers describing the location of the objects ln m (j k) = µ (jk) + λ j λ k λ's are the object parameters µ's are nuisance parameters this model formulation is feasible for further extensions LLBT terms and relations relation between π and λ: λ j = ln π j π j = exp 2λ j identiability of πs is obtained by the restriction π J = 1 via λ J = 0 the worth parameters are calculated by π j = exp(2λ j), j = 1, 2,..., J j exp(2λ j ) where j π j = 1 LLBT LLBT Design Structure for 3 objects we have 3 comparisons PC decision counts µ λ 1 λ 2 λ 3 (12) O 1 n (1 2) (12) O 2 n (2 1) (13) O 1 n (1 3) (13) O 3 n (3 1) (23) O 2 n (2 3) (23) O 3 n (3 2) the design structure consists of counts (dependent variable) and the design matrix X with: µ which is a factor (dummies for µ 1, µ 2, µ 3 ) and variates for the objects O 1, O 2, O
3 LLBT LLBT Model Structure model formula for the rst line is: LLBT Example Wines WINE example (cticious) 4 sicilian red wines have been tasted pairwise by 1000 people (all of them prefer red to white wines) ln m (1 2) = µ 1 + λ 1 λ 2 general matrix notation: ln m (1 2) µ 1 ln m (2 1) µ ln m 2 (1 3) = µ 3 ln m (3 1) λ 1 ln m (2 3) λ 2 ln m λ (3 2) 3 aim of the study: preference order for dierent wines data: PC-responses about the choices of 4 selected wines the wines are: O1 = Nero D'Avola (also called 'Calabrese') O2 = Gaglioppo (a red of Calabrian origin frequently grown in Sicily O3 = Perricone (Pignatello - esoteric, robust red) O4 = Nerello (Mascalese - strong red) Data - Coding Data preparation Data Coding Coding 4 objects (items) A,G,P,N number of comparisons is 4 3/2 = 6 for data input the ordering of comparisons is important wine rst in comparison wines A 1 G 2 P 3 N 4 A 1 G 2 V1 P 3 V2 V3 N 4 V4 V5 V6 V1 V2 V3 V4 V5 V6 (12) (13) (23) (14) (24) (34) (A,G) (A,P) (G,P) (A,N) (G,N) (P,N) One possible coding for each comparison (V1,... V6 ) is (jk) = { 1 if rst object is preferred to second object (j k) 1 if second object is preferred to rst object (k j) First respondent gave the following responses: NOTE: V1 V2 V3 V4 V5 V6 (12) (13) (23) (14) (24) (34) (A,G) (A,P) (G,P) (A,N) (G,N) (P,N) missing responses should be coded with NA if the coding is not (1, 1) but any other two numbers, the smaller number means the rst object is preferred e.g. a coding with (0, 1) 0 means rst object preferred
4 prefmod LLBT WINE data Wine data preparations > library(prefmod) > load("wine.rdata") > wnames <- c("a", "G", "P", "N") produce barplots for all comparisons > par(mfrow = c(2, 3)) > for (i in 1:6) barplot(table(wine[, i])) > par(mfrow = c(1, 1)) prefmod LLBT WINE Example Function: llbtpc.fit() preparations > library(prefmod) > load("wine.rdata") > wnames <- c("a", "G", "P", "N") t simple model > m1 <- llbtpc.fit(wine, nitems = 4, obj.names = wnames) > m prefmod LLBT Wine Example - Result > m1 Call: gnm(formula = formula, eliminate = elim, family = poisson, data = dfr) Coefficients of interest: A G P N NA Deviance: Pearson chi-squared: Residual df: 3 prefmod LLBT Wine Example - Output model t was done by the package gnm() We see (parameter) estimates for λ 1, λ 2, λ 3 and λ 4 = NA (is set to zero because of restriction for estimation) We do not see parameter estimates for µ 1, µ 2, µ 3, µ 4 these are nuisance parameters and they are eliminated by the procedure (tted, but not displayed) Does the model t? We can use the Deviance degrees of freedom 3 > dev1 <- round(m1$deviance, digits = 5) > df1 <- m1$df.residual > prob1 <- 1 - pchisq(dev1, df1) > print(prob1) [1] Model t is OK (probability is > 0.05 )
5 deviance prefmod LLBT Wine Example - Worth Functions: llbt.worth(), plotworth() Deviance: oij ln ( o ij e ij ) = n ij ln ( n ij m ij ) calculate worth parameters under full modell n ij = m ij > worth1 <- llbt.worth(m1) χ 2 statistic: (o ij e ij ) 2 e ij = (n ij m ij ) 2 m ij Remember: π j = exp(2λj) j exp(2λj) λ A = 1.118, λ G = 1.073, λ P = 0.794, λ N = 0 or get estimates > est1 <- llbt.worth(m1, outmat = "lambda") plot worth or estimates > plotworth(worth1, ylab = "estimated worth") > plotworth(est1) prefmod LLBT Wine Example - Plot extraction of parameter estimates estimated worth Preferences A G P N How to get p.e. from m1 (which is a gnm object ) all parameter estimates (µ 1, µ 2,... λ 1, λ 2,...) > m1$coefficients A G P N NA attr(,"eliminated") [1] How to get parameter estimates ofinterest only?(λ 1, λ 2,...) > c1 <- coef(m1) > c1 Coefficients of interest: A G P N NA estimate
6 extraction of parameter estimates extraction of parameter estimates Replace last parameter (NA) with 0 and give names > c1 <- c(c1[1:3], 0) > names(c1) <- c("a", "G", "P", "N") How to get the µ's?(µ 1, µ 2,...) How to get s.e. from m1 > cov <- vcov(m1) > se <- sqrt(diag(cov)) How to make a condence interval plot > mu <- attr(m1$coefficients, "eliminated") > mu[1:6] [1] > mu[4] [1] > up1 <- c1 + se * 1.96 > lo1 <- c1 - se * 1.96 > cbind(lo1, c1, up1) lo1 c1 up1 A G P N Plot estimates and condence intevals > plot(1:4, c1, type = "b", ylim = c(0, 1.4)) > for (i in 1:4) { + lines(rep(i, 2), c(lo1[i], up1[i]), col = "red", lty = "dashed") + text(rep(i, 2), c1[i], names(c1)[i], pos = 2) + } c A G P N :4 LLBT Interpretation of parameters Parameter Interpretation log odds for preferring A to N ln P (A N) = ln P (N A) m (A N) N (A,N) m (N B) N (A,N) = ln m (A N) = ln m m (A N) ln m (N A) (N B) insert equations ln m (A N) = µ (A,N) + λ A λ N ln m (N A) = µ (A,N) + λ A λ N log odds preferring A to N are parameter estimates are stored in c2[1] and c2[4] λ A is and λ N is 0 (reference object) 2(λ A λ N ) is the odds for preferring A to N are exp(2.237) =
7 LLBT expected preferences LLBT Comparing observed and expected preferences How would you calculate the expected preferences for m (A G) and for m (G A)? the λ parameter estimates are: > coef(m1) Coefficients of interest: A G P N NA the estimates for the µs are: > attr(m1$coefficients, "eliminated") [1] Hint 1: Use model formula to calculate the logarithm of the expected numbers with ln m (A G) = µ (A,G) + λ A λ G Hint 2: the expected numbers are exp(ln m (A G) ) The observed preferences (counts) n (i j) are stored in > m1$y 1.V11 1.V12 1.V21 1.V22 1.V31 1.V32 1.V41 1.V42 1.V51 1.V52 1.V61 1.V The expected preferences m (i j) are stored in > m1$fitted.values 1.V11 1.V12 1.V21 1.V22 1.V31 1.V32 1.V41 1.V42 1.V51 1.V V61 1.V The raw residuals are the dierence between observed and expected preferences > m1$y - m1$fitted.values 1.V11 1.V12 1.V21 1.V22 1.V31 1.V32 1.V V42 1.V51 1.V52 1.V61 1.V LLBT Example CEMS Example: CEMS exchange programme LLBT Extensions LLBT Extensions Overview students of the WU can study abroad visiting one of currently 17 CEMS universities undecided responses (ties) subject covariates aim of the study: preference orderings of students for dierent locations identify reasons for these preferences object specic covariates data: PC-responses about their choices of 6 selected CEMS universities for the semester abroad (London, Paris, Milan, Barcelona, St.Gall, Stockholm) answer: can not decide was allowed several covariates (e.g., gender, working status, language abilities, etc.)
8 LLBT Extensions undecided Response Undecided Response undecided 3 possible responses for each comparison between two objects j and k the response can be: j k k j j = k In our CEMSexample: comparing London (LO) to Paris (PA) the response can be: LO PA PA LO LO = PA LLBT Extensions undecided Response LLBT with Undecided Response Using the respecication of the probabilities suggested by Davidson and Beaver (1977): the LLBT model formulas for the comparison (jk) are now: ln m (j k) = µ (jk) + λ j λ k ln m (k j) = µ (jk) λ j + λ k ln m (j=k) = µ (jk) + γ where γ is the parameter for undecided response (could also be γ (jk) ) λ's are the object parameters µ's are nuisance parameters LLBT Interpretation of parameters Interpretation of parameter γ log odds for preferring (j k) to "no decision" (j = k) ln P (j k) = ln m P (j k) ln m (j=k) (j=k) insert equations ln m (j k) = µ (jk) + λ j λ k ln m (j=k) = µ (jk) γ for λ j = λ k log odds preferring (j k) to "no decision" (j = k) is γ if γ < 0 : there is an advantage in favour of a decision xxx LLBT Design Structure with undecided PC decision counts µ γ λ 1 λ 2 λ 3 (12) O 1 n (1 2) (12) no n (1=2) (12) O 2 n (2 1) (13) O 1 n (1 3) (13) no n (1=3) (13) O 3 n (3 1) (23) O 2 n (2 3) (23) no n (2=3) (23) O 3 n (3 2) Design matrix consists of µ, λ 1, λ 2, λ 3 and additionally γ where µ is a factor and the n's are the counts
9 prefmod LLBT CEMS Example Function: llbtpc.fit() preparations > library(prefmod) > load("cpc.rdata") > cities <- c("lo", "PA", "MI", "SG", "BA", "ST") t simple model > m2 <- llbtpc.fit(cpc, nitems = 6, obj.names = cities) > summary(m2) t model including eect for undecided prefmod LLBT CEMS Example Functions: llbt.worth(), plotworth() calculate worth parameters > worth2 <- llbt.worth(m2) > worth3 <- llbt.worth(m3) > worthmat <- cbind(worth2, worth3) plot them > plotworth(worthmat) > m3 <- llbtpc.fit(cpc, nitems = 6, undec = TRUE, obj.names = cities) > summary(m3) compare models > anova(m3, m2) LLBT Extension Subject Covariates Subject Covariates Are the preference orderings dierent for dierent groups of subjects? For one subject covariate on s levels we have now ln m (j k) s = µ (jk)s + λ S s + (λ O j j where λ O λ OS + λ O js js ) (λ O k k + λo ks ks ) object parameters (for subject baseline group) interaction parameter between objects and subject category xxx LLBT Design structure with 1 subject covariate PC decison counts µ λ S γ λ O 1 λ O 2 λ O 3 λ OS 12 λ OS 22 λ OS 32 (12) O 1 n (1 2) (12) no n (1=2) (12) O 2 n (2 1) (12) O 1 n (1 2) (12) no n (1=2) (12) O 2 n (2 1) O and OS are parameter of interest λ S s µ's xing the margin for category s of covariate S (nuisance) nuisance parameters MU S not related to objects nuisance parameters
10 prefmod LLBT: One Subject Covariates CEMS Example prefmod LLBT CEMS Example - Plot Options for llbtpc.fit(): formel, elim Preferences t null model (without subject covariates) > mb0 <- llbtpc.fit(cpc, nitems = 6, undec = TRUE, formel = ~1, + elim = ~SEX, obj.names = cities) t model with dierent preference scales for SEX > mbsex <- llbtpc.fit(cpc, nitems = 6, undec = TRUE, formel = ~SEX, + elim = ~SEX, obj.names = cities) Plot for dierent preference scales for SEX female=1 and male =2 estimated worth LO PA MI SG BA ST LO PA SG BA ST MI > wbsex <- llbt.worth(mbsex) > colnames(wbsex) <- c("female", "male") > plotworth(wbsex, ylab = "estimated worth") female male LLBT Interpretation of parameters ofinterest λ O j j parameter estimate for O j for the reference group (all subjects covariates on level = 0 in dummy coding ) λ O js js change of λ O j j model: SEX for group s estimates for object PA reference group SEX1 λ P A SEX2 λ P A +λ P A:SEX2 LLBT Interpretation of parameters ofinterest > cmsw <- coef(mbsex) > cmsw Coefficients of interest: LO PA MI SG BA ST NA u LO:SEX2 PA:SEX2 MI:SEX2 SG:SEX2 BA:SEX ST:SEX2 NA model: SEX estimates for object PA model in mbsex λ P A +λ P A:SEX2 SEX λ P female A = SEX λ P male A =
A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn
A Handbook of Statistical Analyses Using R Brian S. Everitt and Torsten Hothorn CHAPTER 6 Logistic Regression and Generalised Linear Models: Blood Screening, Women s Role in Society, and Colonic Polyps
Notating the Multilevel Longitudinal Model. Multilevel Modeling of Longitudinal Data. Notating (cont.) Notating (cont.)
Notating the Multilevel Modeling of Longitudinal Data Recall the typical -level model Y ij = γ 00 + (γ 0 + u j )X ij + u 0j + e ij Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education
Multivariate Logistic Regression
1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation
Chapter 5 Analysis of variance SPSS Analysis of variance
Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma [email protected] The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2
University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages
Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)
Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and
Poisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
13. Poisson Regression Analysis
136 Poisson Regression Analysis 13. Poisson Regression Analysis We have so far considered situations where the outcome variable is numeric and Normally distributed, or binary. In clinical work one often
Nominal and ordinal logistic regression
Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome
Probability Calculator
Chapter 95 Introduction Most statisticians have a set of probability tables that they refer to in doing their statistical wor. This procedure provides you with a set of electronic statistical tables that
Basic Statistical and Modeling Procedures Using SAS
Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom
SPSS Guide: Regression Analysis
SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
Ordinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
HLM software has been one of the leading statistical packages for hierarchical
Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush
GLMs: Gompertz s Law. GLMs in R. Gompertz s famous graduation formula is. or log µ x is linear in age, x,
Computing: an indispensable tool or an insurmountable hurdle? Iain Currie Heriot Watt University, Scotland ATRC, University College Dublin July 2006 Plan of talk General remarks The professional syllabus
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
MULTIPLE REGRESSION WITH CATEGORICAL DATA
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting
3. Regression & Exponential Smoothing
3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a
It is important to bear in mind that one of the first three subscripts is redundant since k = i -j +3.
IDENTIFICATION AND ESTIMATION OF AGE, PERIOD AND COHORT EFFECTS IN THE ANALYSIS OF DISCRETE ARCHIVAL DATA Stephen E. Fienberg, University of Minnesota William M. Mason, University of Michigan 1. INTRODUCTION
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
R 2 -type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models
Faculty of Health Sciences R 2 -type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models Inference & application to prediction of kidney graft failure Paul Blanche joint work with M-C.
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate
Lecture 6: Poisson regression
Lecture 6: Poisson regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction EDA for Poisson regression Estimation and testing in Poisson regression
Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents
Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén
Directions for using SPSS
Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...
12.5: CHI-SQUARE GOODNESS OF FIT TESTS
125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability
Normality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. [email protected]
VI. Introduction to Logistic Regression
VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models
This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.
One-Degree-of-Freedom Tests Test for group occasion interactions has (number of groups 1) number of occasions 1) degrees of freedom. This can dilute the significance of a departure from the null hypothesis.
Analyzing Ranking and Rating Data from Participatory On- Farm Trials
44 Analyzing Ranking and Rating Data from Participatory On- Farm Trials RICHARD COE Abstract Responses in participatory on-farm trials are often measured as ratings (scores on an ordered but arbitrary
Factor Analysis. Chapter 420. Introduction
Chapter 420 Introduction (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated.
KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
Multiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R.
ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R. 1. Motivation. Likert items are used to measure respondents attitudes to a particular question or statement. One must recall
Multinomial and Ordinal Logistic Regression
Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,
DATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
Standard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
Introduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
Categorical Data Analysis
Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods
SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.
SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation
Nominal and Real U.S. GDP 1960-2001
Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 318- Managerial Economics Use the data set for gross domestic product (gdp.xls) to answer the following questions. (1) Show graphically
Simple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
Linear Discrimination. Linear Discrimination. Linear Discrimination. Linearly Separable Systems Pairwise Separation. Steven J Zeil.
Steven J Zeil Old Dominion Univ. Fall 200 Discriminant-Based Classification Linearly Separable Systems Pairwise Separation 2 Posteriors 3 Logistic Discrimination 2 Discriminant-Based Classification Likelihood-based:
Parametric and non-parametric statistical methods for the life sciences - Session I
Why nonparametric methods What test to use? Rank Tests Parametric and non-parametric statistical methods for the life sciences - Session I Liesbeth Bruckers Geert Molenberghs Interuniversity Institute
Multiple Choice Models II
Multiple Choice Models II Laura Magazzini University of Verona [email protected] http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical
Examples of Using R for Modeling Ordinal Data
Examples of Using R for Modeling Ordinal Data Alan Agresti Department of Statistics, University of Florida Supplement for the book Analysis of Ordinal Categorical Data, 2nd ed., 2010 (Wiley), abbreviated
ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS
DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.
Nonlinear Regression:
Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day : Improved Inference
The Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
Analysing Questionnaires using Minitab (for SPSS queries contact -) [email protected]
Analysing Questionnaires using Minitab (for SPSS queries contact -) [email protected] Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:
Use of deviance statistics for comparing models
A likelihood-ratio test can be used under full ML. The use of such a test is a quite general principle for statistical testing. In hierarchical linear models, the deviance test is mostly used for multiparameter
MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables
MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of slides,
Illustration (and the use of HLM)
Illustration (and the use of HLM) Chapter 4 1 Measurement Incorporated HLM Workshop The Illustration Data Now we cover the example. In doing so we does the use of the software HLM. In addition, we will
Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science [email protected]
Dept of Information Science [email protected] October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic
Generalized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
ANALYSIS OF TREND CHAPTER 5
ANALYSIS OF TREND CHAPTER 5 ERSH 8310 Lecture 7 September 13, 2007 Today s Class Analysis of trends Using contrasts to do something a bit more practical. Linear trends. Quadratic trends. Trends in SPSS.
Row vs. Column Percents. tab PRAYER DEGREE, row col
Bivariate Analysis - Crosstabulation One of most basic research tools shows how x varies with respect to y Interpretation of table depends upon direction of percentaging example Row vs. Column Percents.
Using Excel for inferential statistics
FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied
The Latent Variable Growth Model In Practice. Individual Development Over Time
The Latent Variable Growth Model In Practice 37 Individual Development Over Time y i = 1 i = 2 i = 3 t = 1 t = 2 t = 3 t = 4 ε 1 ε 2 ε 3 ε 4 y 1 y 2 y 3 y 4 x η 0 η 1 (1) y ti = η 0i + η 1i x t + ε ti
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
Modelling and added value
Modelling and added value Course: Statistical Evaluation of Diagnostic and Predictive Models Thomas Alexander Gerds (University of Copenhagen) Summer School, Barcelona, June 30, 2015 1 / 53 Multiple regression
Independent t- Test (Comparing Two Means)
Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent
MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
Handling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
Examining a Fitted Logistic Model
STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
STRUTS: Statistical Rules of Thumb. Seattle, WA
STRUTS: Statistical Rules of Thumb Gerald van Belle Departments of Environmental Health and Biostatistics University ofwashington Seattle, WA 98195-4691 Steven P. Millard Probability, Statistics and Information
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Module 4 - Multiple Logistic Regression
Module 4 - Multiple Logistic Regression Objectives Understand the principles and theory underlying logistic regression Understand proportions, probabilities, odds, odds ratios, logits and exponents Be
Automated Biosurveillance Data from England and Wales, 1991 2011
Article DOI: http://dx.doi.org/10.3201/eid1901.120493 Automated Biosurveillance Data from England and Wales, 1991 2011 Technical Appendix This online appendix provides technical details of statistical
Chapter 3 Quantitative Demand Analysis
Managerial Economics & Business Strategy Chapter 3 uantitative Demand Analysis McGraw-Hill/Irwin Copyright 2010 by the McGraw-Hill Companies, Inc. All rights reserved. Overview I. The Elasticity Concept
Lean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY
TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY Before we begin: Turn on the sound on your computer. There is audio to accompany this presentation. Audio will accompany most of the online
Logit Models for Binary Data
Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response
Lesson 14 14 Outline Outline
Lesson 14 Confidence Intervals of Odds Ratio and Relative Risk Lesson 14 Outline Lesson 14 covers Confidence Interval of an Odds Ratio Review of Odds Ratio Sampling distribution of OR on natural log scale
containing Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics.
Getting Correlations Using PROC CORR Correlation analysis provides a method to measure the strength of a linear relationship between two numeric variables. PROC CORR can be used to compute Pearson product-moment
We extended the additive model in two variables to the interaction model by adding a third term to the equation.
Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic
7 Generalized Estimating Equations
Chapter 7 The procedure extends the generalized linear model to allow for analysis of repeated measurements or other correlated observations, such as clustered data. Example. Public health of cials can
Simple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
T-test & factor analysis
Parametric tests T-test & factor analysis Better than non parametric tests Stringent assumptions More strings attached Assumes population distribution of sample is normal Major problem Alternatives Continue
International College of Economics and Finance Syllabus Probability Theory and Introductory Statistics
International College of Economics and Finance Syllabus Probability Theory and Introductory Statistics Lecturer: Mikhail Zhitlukhin. 1. Course description Probability Theory and Introductory Statistics
A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA
REVSTAT Statistical Journal Volume 4, Number 2, June 2006, 131 142 A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA Authors: Daiane Aparecida Zuanetti Departamento de Estatística, Universidade Federal de São
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
GLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
This chapter will demonstrate how to perform multiple linear regression with IBM SPSS
CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the
Computer exercise 4 Poisson Regression
Chalmers-University of Gothenburg Department of Mathematical Sciences Probability, Statistics and Risk MVE300 Computer exercise 4 Poisson Regression When dealing with two or more variables, the functional
TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics
UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002
Lecture 5 : The Poisson Distribution
Lecture 5 : The Poisson Distribution Jonathan Marchini November 10, 2008 1 Introduction Many experimental situations occur in which we observe the counts of events within a set unit of time, area, volume,
