Continuous Outcomes. Objectives. Review the linear regression model (LRM) Discuss the idea of identification Present the method of maximum likelihood
- Wilfrid Weaver
Transcription
1 Continuous Outcomes
Objectives
- Review the linear regression model (LRM)
- Discuss the idea of identification
- Present the method of maximum likelihood
Continuous LHS \ 1
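2 The Linear Regression Model
$y_i = \mathbf{x}_i \boldsymbol{\beta} + \varepsilon_i$
If $\mathbf{x}_i = (1 \;\; x_{i1} \;\; x_{i2} \;\; x_{i3})$, then
$y_i = [1 \;\; x_{i1} \;\; x_{i2} \;\; x_{i3}] \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} + \varepsilon_i$
$\phantom{y_i} = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$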
3 Graphically
[Figure not reproduced in the transcription.]
4 Assumptions
- Linearity
- Linear independence of the x's
- Errors:
  o Zero conditional mean
  o Homoscedastic
  o Uncorrelated for any pair of x's
  o [Normality]
5 Linearity
$y$ is linearly related to the x's through the $\beta$'s.
6 Collinearity
The $x_k$'s are not perfectly collinear.
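7 Zero conditional mean
$E(\varepsilon_i \mid \mathbf{x}_i) = 0$
This identifying assumption implies:
$E(y_i \mid \mathbf{x}_i) = E(\mathbf{x}_i \boldsymbol{\beta} + \varepsilon_i \mid \mathbf{x}_i) = \mathbf{x}_i \boldsymbol{\beta} + E(\varepsilon_i \mid \mathbf{x}_i) = \mathbf{x}_i \boldsymbol{\beta}$
8 Homoscedastic errors
$\mathrm{Var}(\varepsilon_i \mid \mathbf{x}_i) = \sigma^2$ for all $i$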
9 Uncorrelated errors
For two observations $i$ and $j$, the covariance between $\varepsilon_i$ and $\varepsilon_j$ is 0.
What common situations violate this assumption?
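10 Estimation by OLS
The OLS estimator of $\boldsymbol{\beta}$ is the value $\hat{\boldsymbol{\beta}}$ that minimizes the sum of the squared residuals:
$\sum_{i=1}^{N} (y_i - \mathbf{x}_i \hat{\boldsymbol{\beta}})^2$
The solution is
$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1} \mathbf{X}'\mathbf{y}$
When the assumptions of the model hold, $\hat{\boldsymbol{\beta}}$ is the best linear unbiased estimator.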
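As a rough illustration of the least-squares idea on this slide, the following Python sketch (not the Stata used in these slides) fits a one-regressor model using the closed-form solution for simple regression. The data values and the helper names `ols_simple` and `ssr` are made up for the example.

```python
def ols_simple(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ssr(x, y, intercept, slope):
    """Sum of squared residuals for a candidate intercept and slope."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_simple(x, y)
print("intercept:", round(b0, 4), "slope:", round(b1, 4))
print("SSR at OLS estimates:", round(ssr(x, y, b0, b1), 4))
print("SSR at a nearby slope:", round(ssr(x, y, b0, b1 + 0.1), 4))
```

Any other intercept or slope gives a larger sum of squared residuals, which is what "minimizes" means here.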
11 Estimation by Maximum Likelihood
- Instead of minimizing the sum of squared errors, we maximize the likelihood.
- The ML estimate is the value of the parameter that makes the observed (sample) data most likely.
12 A simple example
Let
o $s$ be the number of men in the sample
o $N$ be the sample size
o $\pi$ be the population probability of being male
We know $s$ and $N$.
We want an estimate of $\pi$: $L(\pi \mid s, N)$
13 The Likelihood Function
Binomial formula:
$\Pr(s \mid \pi, N) = \frac{N!}{s!\,(N-s)!} \, \pi^{s} (1-\pi)^{N-s}$
Note that $k! = k \cdot (k-1) \cdots 2 \cdot 1$.
Rewrite as a likelihood function for $s = 3$ and $N = 10$:
$L(\pi \mid s = 3, N = 10) = \frac{10!}{3!\,7!} \, \pi^{3} (1-\pi)^{7}$
14 Probability of $s$ given fixed $N$ and $\pi$ (from Binomial formula)
[Figure: $\Pr(s \mid \pi = .3, N = 10)$; plot not reproduced in the transcription.]
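The plot on this slide is missing from the transcription, but the probabilities it showed follow directly from the binomial formula. A small Python sketch (the function name `binom_pmf` is an assumption for this example) tabulates $\Pr(s \mid \pi = .3, N = 10)$ for every possible $s$:

```python
import math


def binom_pmf(s, n, pi):
    """Binomial probability of s successes in n trials with success probability pi."""
    return math.comb(n, s) * pi ** s * (1 - pi) ** (n - s)


n, pi = 10, 0.3
pmf = [binom_pmf(s, n, pi) for s in range(n + 1)]

# Most probable count of successes for these fixed parameters.
mode = max(range(n + 1), key=lambda s: pmf[s])

for s, p in enumerate(pmf):
    print(f"s={s:2d}  Pr={p:.4f}")
print("most probable s:", mode)
```

Here $\pi$ and $N$ are held fixed and $s$ varies, so the values sum to one. The likelihood on the next slide reverses the roles: $s$ and $N$ are fixed and $\pi$ varies.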
15 Probability of $\pi$ given fixed $N$ and $s$ (from Likelihood formula)
[Figure not reproduced in the transcription.]
16 Maximize the likelihood
The maximum occurs when the derivative (or gradient) is zero:
$\frac{\partial L(\pi \mid s = 3, N = 10)}{\partial \pi} = 0$
The value that maximizes the likelihood function also maximizes the log of the likelihood (which is easier to calculate):
$\frac{\partial \ln L(\pi \mid s = 3, N = 10)}{\partial \pi} = 0$
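17 For our example,
$\frac{\partial \ln L(\pi \mid s = 3, N = 10)}{\partial \pi} = \frac{\partial}{\partial \pi} \ln\left[ \frac{10!}{3!\,7!} \, \pi^{3} (1-\pi)^{7} \right]$
$= \frac{\partial \ln \frac{10!}{3!\,7!}}{\partial \pi} + \frac{\partial \, 3 \ln \pi}{\partial \pi} + \frac{\partial \, 7 \ln(1-\pi)}{\partial \pi}$
$= \frac{\partial \, 3 \ln \pi}{\partial \pi} + \frac{\partial \, 7 \ln(1-\pi)}{\partial \pi}$
$= \frac{3}{\pi} - \frac{7}{1-\pi} = 0$
so $\hat{\pi} = .3$ maximizes the likelihood.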
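The analytic result can be checked numerically. The Python sketch below evaluates the binomial log-likelihood on a grid of candidate values for $\pi$ and picks the largest; the grid resolution and the name `binom_log_lik` are choices made for this illustration only.

```python
import math


def binom_log_lik(pi, s, n):
    """ln L(pi | s, n) for the binomial model."""
    return (math.log(math.comb(n, s))
            + s * math.log(pi)
            + (n - s) * math.log(1 - pi))


s, n = 3, 10

# Candidate values 0.001, 0.002, ..., 0.999 (endpoints excluded: log(0) is undefined).
grid = [k / 1000 for k in range(1, 1000)]
pi_hat = max(grid, key=lambda p: binom_log_lik(p, s, n))

# First-order condition from the slide: 3/pi - 7/(1 - pi) should be zero at the maximum.
score_at_max = s / pi_hat - (n - s) / (1 - pi_hat)

print("grid maximizer:", pi_hat)
print("derivative of ln L at maximizer:", score_at_max)
```

The grid search lands on $s/N = .3$, the same value the first-order condition gives.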
18 Question {for you}
Does anything about this approach seem strange to you?
[HINT: What if our data represented coin flips, with s = HEADs?]
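19 ML estimation of the Sample Mean
PDF for $y$:
$f(y_i \mid \mu, \sigma = 1) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(y_i - \mu)^2}{2} \right)$
Rewrite in terms of $\mu$:
$L(\mu \mid y_i, \sigma = 1) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(y_i - \mu)^2}{2} \right)$
For three independent observations, the likelihood is
$L(\mu \mid \mathbf{y}, \sigma = 1) = \prod_{i=1}^{3} L(\mu \mid y_i, \sigma = 1)$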
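To make the three-observation case concrete, here is a Python sketch that evaluates the log-likelihood of $\mu$ (with $\sigma = 1$) over a grid and compares the maximizer with the sample mean. The three data values are invented for the example, and `normal_log_lik` is a hypothetical helper name.

```python
import math


def normal_log_lik(mu, ys):
    """ln L(mu | y, sigma = 1): sum of log normal densities over independent observations."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2 for y in ys)


# Three hypothetical observations.
ys = [1.0, 2.0, 6.0]

# Candidate values 0.00, 0.01, ..., 8.00.
grid = [k / 100 for k in range(0, 801)]
mu_hat = max(grid, key=lambda m: normal_log_lik(m, ys))

sample_mean = sum(ys) / len(ys)
print("grid maximizer of ln L:", mu_hat)
print("sample mean:", sample_mean)
```

Working with the log turns the product of densities into a sum, and for this model the maximizer coincides with the sample mean.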
20 Graphically
[Figure not reproduced in the transcription.]
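21 ML estimation for the LRM
The pdf:
$f(y_i \mid \alpha + \beta x_i, \sigma) = \frac{1}{\sigma} \, \phi\left( \frac{y_i - [\alpha + \beta x_i]}{\sigma} \right)$
Rewrite as the likelihood equation:
$L(\alpha, \beta, \sigma \mid \mathbf{y}, \mathbf{x}) = \prod_{i=1}^{N} \frac{1}{\sigma} \, \phi\left( \frac{y_i - [\alpha + \beta x_i]}{\sigma} \right)$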
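A minimal Python sketch of this likelihood, assuming a single regressor and made-up data, is shown below. It codes the log of the likelihood equation and checks that the least-squares intercept and slope, together with $\hat{\sigma} = \sqrt{SSR/N}$, give a higher log-likelihood than nearby parameter values. The function name `lrm_log_lik` is an assumption for the example.

```python
import math


def lrm_log_lik(alpha, beta, sigma, x, y):
    """ln L(alpha, beta, sigma | y, x) for the normal linear regression model."""
    total = 0.0
    for xi, yi in zip(x, y):
        z = (yi - (alpha + beta * xi)) / sigma
        # ln[(1/sigma) * phi(z)], where phi is the standard normal density
        total += -math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
    return total


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least-squares intercept and slope (closed form for one regressor).
x_bar, y_bar = sum(x) / n, sum(y) / n
beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
alpha_hat = y_bar - beta_hat * x_bar

# ML estimate of sigma divides the sum of squared residuals by N, not N - K.
sigma_hat = math.sqrt(sum((yi - alpha_hat - beta_hat * xi) ** 2
                          for xi, yi in zip(x, y)) / n)

best = lrm_log_lik(alpha_hat, beta_hat, sigma_hat, x, y)
print("ln L at (alpha_hat, beta_hat, sigma_hat):", round(best, 4))
print("ln L with slope shifted by 0.05:",
      round(lrm_log_lik(alpha_hat, beta_hat + 0.05, sigma_hat, x, y), 4))
```

With normal errors, maximizing this likelihood over $\alpha$ and $\beta$ is the same problem as minimizing the sum of squared residuals, which is why the OLS values appear here.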
22 Graphically
[Figure not reproduced in the transcription.]
23 The Properties of ML Estimators
Under very general conditions, the ML estimator is:
- Consistent
- Asymptotically efficient
- Asymptotically normally distributed
These are asymptotic properties; they describe the ML estimator as the sample size approaches infinity.
But how big must N be to be approximately infinite?
24 Guidelines
- It is risky to use ML for N < 100; N > 500 seems safe.
- These values should be raised depending on characteristics of the model and the data.
- Some models seem to require more observations, for example, the ordinal regression model.
25 Identification
Occurs before estimation
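26 Demonstration of Identification
In the LRM, the structural model is:
$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_K x_K + \varepsilon$, where $E(\varepsilon \mid \mathbf{x}) = 0$
If we assume instead that
$E(\varepsilon \mid \mathbf{x}) = \delta$
the structural equation can be modified to create an error with mean zero:
$y = 0 + \beta_0 + \beta_1 x_1 + \cdots + \beta_K x_K + \varepsilon$
$= (\delta - \delta) + \beta_0 + \beta_1 x_1 + \cdots + \beta_K x_K + \varepsilon$
$= (\beta_0 + \delta) + \beta_1 x_1 + \cdots + \beta_K x_K + (\varepsilon - \delta)$
$= \beta_0^{*} + \beta_1 x_1 + \cdots + \beta_K x_K + \varepsilon^{*}$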
27 This equation has all of the properties of the LRM, including
$E(\varepsilon^{*} \mid \mathbf{x}) = 0$
But note:
$\beta_0^{*} = \beta_0 + \delta$
No matter how large the sample, it is impossible to disentangle estimates of $\beta_0$ and $\delta$.
$\beta_0$ and $\delta$ are not identified individually, although their sum $\beta_0 + \delta$ is identified.
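A small simulation can make the point tangible. The Python sketch below generates data from two different pairs of values for $\beta_0$ and $\delta$ that share the same sum, then fits a one-regressor model to each. All parameter values, the seed, and the helper names `simulate` and `ols_simple` are assumptions made for this illustration.

```python
import random


def simulate(beta0, beta1, delta, n, seed):
    """Draw (x, y) from y = beta0 + beta1*x + eps, where E(eps | x) = delta."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    eps = [delta + rng.gauss(0, 1) for _ in range(n)]
    y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, eps)]
    return x, y


def ols_simple(x, y):
    """Return (intercept, slope) from least squares with one regressor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


# Two different structures with the same sum beta0 + delta = 3.
x_a, y_a = simulate(beta0=1.0, beta1=0.5, delta=2.0, n=5000, seed=1)
x_b, y_b = simulate(beta0=3.0, beta1=0.5, delta=0.0, n=5000, seed=1)

int_a, slope_a = ols_simple(x_a, y_a)
int_b, slope_b = ols_simple(x_b, y_b)
print("structure A: intercept", round(int_a, 3), "slope", round(slope_a, 3))
print("structure B: intercept", round(int_b, 3), "slope", round(slope_b, 3))
```

Both structures produce the same data up to rounding, so the estimated intercept recovers only the sum $\beta_0 + \delta$. Collecting more observations would not separate the two components.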
28 Graphically
[Figure: $E(\varepsilon \mid \mathbf{x}) = \delta$; plot not reproduced in the transcription.]
29 Basic Ideas about Identification
- A parameter is unidentified when it is impossible to estimate the parameter regardless of the data available.
- Models become identified by adding assumptions, not by increasing the sample size. For example, $E(\varepsilon) = 0$.
- It is possible for some parameters of a model to be identified while others are not. For example, $\beta$ but not $\alpha$.
- While individual parameters may not be identified, combinations of those parameters may be identified. For example, $\alpha + \delta$ but not $\delta$ and $\alpha$.
30 Interpreting Regression Coefficients
- Slopes as marginal change (partial derivative)
- Slopes as discrete change (first difference)
- Relationship between discrete and marginal change
31 Partial or Marginal Change
The partial derivative of $y$ with respect to $x_k$:
$\frac{\partial E(y \mid \mathbf{x})}{\partial x_k} = \frac{\partial \mathbf{x}\boldsymbol{\beta}}{\partial x_k} = \beta_k$
32 Discrete Change Notation
Before: $E(y \mid \mathbf{x}, x_2)$ is the expected value of $y$ given $\mathbf{x}$, explicitly noting a specific value of $x_2$.
After: $E(y \mid \mathbf{x}, x_2 + 1)$ is the expected value of $y$ given $\mathbf{x}$ when $x_2$ increases by 1.
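33 $\frac{\Delta E(y \mid \mathbf{x}, x_2)}{\Delta x_2}$ = After $-$ Before
$= E(y \mid \mathbf{x}, x_2 + 1) - E(y \mid \mathbf{x}, x_2)$
$= [\beta_0 + \beta_1 x_1 + \beta_2 (x_2 + 1) + \beta_3 x_3] - [\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3]$
$= [\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_2 + \beta_3 x_3] - [\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3]$
$= \beta_2$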
34 Equality of Discrete and Partial Change
In the LRM,
$\frac{\partial E(y \mid \mathbf{x})}{\partial x_k} = \frac{\Delta E(y \mid \mathbf{x}, x_k)}{\Delta x_k} = \beta_k$
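The equality can be verified with a few lines of Python. In this sketch the coefficient values and the two starting points are arbitrary, and `expected_y` is a hypothetical helper that evaluates the three-regressor expectation used on the preceding slides.

```python
def expected_y(b, x):
    """E(y | x) = b0 + b1*x1 + b2*x2 + b3*x3 for the LRM."""
    return b[0] + sum(bk * xk for bk, xk in zip(b[1:], x))


# Hypothetical coefficients (b0, b1, b2, b3).
b = [1.0, 0.5, -2.0, 0.25]

discrete_changes = []
for x in ([0.0, 0.0, 0.0], [3.0, 7.0, -1.0]):
    before = expected_y(b, x)
    after = expected_y(b, [x[0], x[1] + 1, x[2]])  # x2 increases by 1
    discrete_changes.append(after - before)

print("discrete changes in E(y) for a unit increase in x2:", discrete_changes)
print("beta_2:", b[2])
```

The discrete change equals $\beta_2$ at both starting points, which is the sense in which the effect of $x_2$ in the LRM does not depend on the level of any variable.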
35 Simple Interpretation
- For a unit increase in $x_k$, the expected change in $y$ equals $\beta_k$, holding all other variables constant.
- Having characteristic $x_k$ (as opposed to not having the characteristic) results in an expected change of $\beta_k$ in $y$, holding all other variables constant.
36 Data
Career data on biochemists who obtained their Ph.D.s in 1957, 1958, 1962, and 1963 (n=408)
37 Descriptive Information
[The Mean, Std Dev, Min, and Max columns were lost in the transcription.]
Name   Description
JOB    Prestige of job (from 1 to 5).
FEM    1 if female; 0 if male.
PHD    Prestige of Ph.D. department.
MENT   Citations received by mentor.
FEL    1 if held fellowship; else 0.
ART    Number of articles published.
CIT    Number of citations received.
38 . use regjob3, clear
(Long's data on academic jobs of biochemists \ )
. codebook job fem phd ment fel art cit, compact
[The Obs, Unique, Mean, Min, and Max columns were lost in the transcription.]
Variable   Label
job        Prestige of 1st job on 1 to 5 scale
fem        Gender: 1=female 0=male
phd        PhD prestige on 1 to 5 scale
ment       Citations received by mentor
fel        Fellow: 1=yes 0=no
art        # of articles published
cit        # of citations received
job        Prestige of 1st job on 100 to 500 scale
phd        PhD prestige on 100 to 500 scale
39 . summarize job fem phd ment fel art cit
[Output columns: Variable, Obs, Mean, Std. Dev., Min, Max; rows: job, fem, phd, ment, fel, art, cit. The numeric values were lost in the transcription.]
40 Stata: Estimating the LRM
Our LRM is:
$JOB = \beta_0 + \beta_1 FEM + \beta_2 PHD + \beta_3 MENT + \beta_4 FEL + \beta_5 ART + \beta_6 CIT + \varepsilon$
In Stata:
<command> <y> <x x x x>, <options>
. regress job fem phd ment fel art cit
[Output header: Source, SS, df, MS; Number of obs; F(6, 401); Prob > F; R-squared; Adj R-squared; Root MSE.]
[Coefficient table columns: job, Coef., Std. Err., t, P>|t|, [95% Conf. Interval]; rows: fem, phd, ment, fel, art, cit, _cons. The numeric values were lost in the transcription.]
41 Simple Interpretation
[Coefficient table repeated from the previous slide; numeric values lost in the transcription.]
- For every additional citation, the prestige of the first job is expected to increase by .004 units, holding all other variables constant.
- The expected prestige of the first job is .14 points lower for females as compared to their male counterparts.
42 Comparison of Linear & Nonlinear Models
[Figure not reproduced in the transcription.]
43 In nonlinear models, partial and discrete change are not equal:
$\frac{\partial E(y \mid \mathbf{x})}{\partial x_k} \neq \frac{\Delta E(y \mid \mathbf{x})}{\Delta x_k}$
In nonlinear models, both discrete and partial change depend on:
- the value of $x_k$, and
- the values of the other x's in the model
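For contrast with the LRM, the following Python sketch uses a logistic curve as a stand-in nonlinear model. The slides do not specify a particular nonlinear model here, so the functional form, the coefficient defaults, and the function names are all assumptions chosen for illustration, and only one regressor is used to keep the example short.

```python
import math


def expected_y(x, b0=0.0, b1=1.0):
    """Hypothetical nonlinear model: E(y | x) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def marginal_change(x, b0=0.0, b1=1.0):
    """Partial derivative of E(y | x) with respect to x for the logistic curve."""
    p = expected_y(x, b0, b1)
    return b1 * p * (1.0 - p)


def discrete_change(x, b0=0.0, b1=1.0):
    """Change in E(y | x) when x increases by one unit."""
    return expected_y(x + 1.0, b0, b1) - expected_y(x, b0, b1)


for x in (0.0, 2.0):
    print(f"x={x}: marginal={marginal_change(x):.4f}  discrete={discrete_change(x):.4f}")
```

The two measures differ from each other at a given $x$, and both shrink as $x$ moves from 0 to 2, so neither can be summarized by a single coefficient the way $\beta_k$ summarizes the LRM.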
44 Standardized and Semi-Standardized Coefficients
- y-standardized
- x-standardized
- fully standardized
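45 y-standardized coefficients
Standardizing $y$ to a unit variance:
$\frac{y}{\sigma_y} = \frac{\beta_0}{\sigma_y} + \frac{\beta_1}{\sigma_y} x_1 + \frac{\beta_2}{\sigma_y} x_2 + \frac{\beta_3}{\sigma_y} x_3 + \frac{\varepsilon}{\sigma_y}$
Adding new notation:
$y^{S} = \beta_0^{S_y} + \beta_1^{S_y} x_1 + \beta_2^{S_y} x_2 + \beta_3^{S_y} x_3 + \varepsilon^{S_y}$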
46 Interpretation
For a continuous variable:
o For a unit increase in $x_k$, $y$ is expected to change by $\beta_k^{S_y}$ standard deviations, holding all other variables constant.
For a dummy variable:
o Having characteristic $x_k$ (as opposed to not having the characteristic) results in an expected change in $y$ of $\beta_k^{S_y}$ standard deviations, holding all other variables constant.
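47 x-standardized coefficients
Standardizing the x's to a unit variance:
$y = \beta_0 + (\sigma_1 \beta_1) \frac{x_1}{\sigma_1} + (\sigma_2 \beta_2) \frac{x_2}{\sigma_2} + (\sigma_3 \beta_3) \frac{x_3}{\sigma_3} + \varepsilon$
Adding new notation:
$y = \beta_0 + \beta_1^{S_x} x_1^{S} + \beta_2^{S_x} x_2^{S} + \beta_3^{S_x} x_3^{S} + \varepsilon$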
48 Interpretation
For a continuous variable:
o For a standard deviation increase in $x_k$, $y$ is expected to change by $\beta_k^{S_x}$ units, holding all other variables constant.
For a dummy variable:
o The meaning of a standard deviation change is unclear.
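49 Fully standardized coefficients
Standardizing both $y$ and the x's:
$\frac{y}{\sigma_y} = \frac{\beta_0}{\sigma_y} + \frac{\sigma_1 \beta_1}{\sigma_y} \frac{x_1}{\sigma_1} + \frac{\sigma_2 \beta_2}{\sigma_y} \frac{x_2}{\sigma_2} + \frac{\sigma_3 \beta_3}{\sigma_y} \frac{x_3}{\sigma_3} + \frac{\varepsilon}{\sigma_y}$
Adding new notation:
$y^{S} = \beta_0^{S} + \beta_1^{S} x_1^{S} + \beta_2^{S} x_2^{S} + \beta_3^{S} x_3^{S} + \varepsilon^{S_y}$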
50 Interpretation
For a continuous variable:
o For a standard deviation increase in $x_k$, $y$ is expected to change by $\beta_k^{S}$ standard deviations, holding all other variables constant.
For a dummy variable:
o The meaning of a standard deviation change is unclear.
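The three kinds of standardized coefficient are simple rescalings of the raw coefficient. A Python sketch of the arithmetic follows; the input numbers are invented, and the dictionary keys merely borrow the column names `bstdx`, `bstdy`, and `bstdxy` from listcoef output as labels (this is not a reimplementation of that command).

```python
def standardized_coefficients(b, sd_x, sd_y):
    """Rescale a raw coefficient b using the standard deviations of x and y."""
    return {
        "bstdx": b * sd_x,          # x-standardized: change in y per SD of x
        "bstdy": b / sd_y,          # y-standardized: SDs of y per unit of x
        "bstdxy": b * sd_x / sd_y,  # fully standardized: SDs of y per SD of x
    }


# Hypothetical raw coefficient and standard deviations.
result = standardized_coefficients(b=0.5, sd_x=2.0, sd_y=4.0)
for name, value in result.items():
    print(name, value)
```

With these inputs, a one standard deviation increase in x is associated with a 1.0 unit change in y, which is 0.25 standard deviations of y.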
51 Stata: Using listcoef to compute standardized effects
After running regress, the listcoef command is used:
. listcoef, cons help
regress (N=408): Unstandardized and Standardized Estimates
Observed SD:
SD of Error:
[Table columns: job, b, t, P>|t|, bstdx, bstdy, bstdxy, SDofX; rows: fem, phd, ment, fel, art, cit, _cons. The numeric values were lost in the transcription.]
b = raw coefficient
t = t-score for test of b=0
P>|t| = p-value for t-test
bstdx = x-standardized coefficient
bstdy = y-standardized coefficient
bstdxy = fully standardized coefficient
SDofX = standard deviation of X
52 Interpreting standardized coefficients
[listcoef table repeated from the previous slide; numeric values lost in the transcription.]
Your turn
53 End LRM
More informationPredictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.
Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged
More informationData Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through
More informationPenalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
More informationEconomics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis
Economics of Strategy (ECON 4550) Maymester 015 Applications of Regression Analysis Reading: ACME Clinic (ECON 4550 Coursepak, Page 47) and Big Suzy s Snack Cakes (ECON 4550 Coursepak, Page 51) Definitions
More information5. Multiple regression
5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More informationModeling Lifetime Value in the Insurance Industry
Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting
More informationFrom the help desk: Demand system estimation
The Stata Journal (2002) 2, Number 4, pp. 403 410 From the help desk: Demand system estimation Brian P. Poi Stata Corporation Abstract. This article provides an example illustrating how to use Stata to
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationHomework 11. Part 1. Name: Score: / null
Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is
More informationSPSS Guide: Regression Analysis
SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar
More informationMultiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005
Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005 Philip J. Ramsey, Ph.D., Mia L. Stephens, MS, Marie Gaudard, Ph.D. North Haven Group, http://www.northhavengroup.com/
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
More informationLecture 6: Poisson regression
Lecture 6: Poisson regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction EDA for Poisson regression Estimation and testing in Poisson regression
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationChapter 7: Dummy variable regression
Chapter 7: Dummy variable regression Why include a qualitative independent variable?........................................ 2 Simplest model 3 Simplest case.............................................................
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationExamining a Fitted Logistic Model
STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic
More information10. Analysis of Longitudinal Studies Repeat-measures analysis
Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.
More information