Interpretation of the Fitted Logistic Regression Model
- Harold Gilbert
1 Interpretation of the Fitted Logistic Regression Model April 19, 2016
2 Table of contents: 1 Introduction; 2 Odds ratio, Example; 3 Cell Coding, Deviation from means; 4; 5 Statistical Adjustment, Statistical Interaction, Examples; 6; 7
3 Assumptions Before interpreting the logistic regression model we assume that the following conditions are met: the model has been fit (Chapter 5); the variables in the model are significant; the model fits according to a statistical measure of fit.
4 Introduction What do the estimated coefficients in the model tell us about the research questions that motivated the study? In linear regression $y = \beta_0 + \beta_1 x$ the estimated coefficient $\beta_1$ represents the slope, that is, the rate of change of the outcome per unit of change in the independent variable. (The intercept $\beta_0$ is of little interest in most models.)
5 Introduction Thus, interpretation of the logistic regression model involves two issues: 1 Determining the functional relationship between the dependent and independent variable 2 Defining the unit of change for the independent variable
6 Link function Definition: A link function is a function of the dependent variable which yields a linear function of the independent variable. Examples of link functions: in linear regression it is the identity function; in logistic regression it is the logit transformation $g(x) = \ln\left(\frac{\pi(x)}{1-\pi(x)}\right) = \beta_0 + \beta_1 x$ with $\pi(x) = \frac{e^{\beta_0+\beta_1 x}}{1+e^{\beta_0+\beta_1 x}}$.
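The logit and its inverse can be sketched in a few lines of Python. This is a minimal illustration only; the function names `logit`, `inv_logit` and `fitted_probability` are assumptions made for this sketch and do not come from the slides.

```python
import math

def logit(p):
    """Log-odds g = ln(p / (1 - p)) for a probability 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(g):
    """Probability pi = e^g / (1 + e^g) for a logit value g."""
    return math.exp(g) / (1.0 + math.exp(g))

def fitted_probability(beta0, beta1, x):
    """pi(x) for the simple model g(x) = beta0 + beta1 * x."""
    return inv_logit(beta0 + beta1 * x)
```

Applying `inv_logit` to `logit(p)` returns `p` again, which is the sense in which the logit links the probability scale to the linear predictor.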
7 Logit function While the slope $\beta_1$ in linear regression is the change in the outcome variable corresponding to a one-unit change in the independent variable, $\beta_1 = y(x+1) - y(x)$, in logistic regression $\beta_1$ is the analogous change in the logit, $\beta_1 = g(x+1) - g(x)$. Hence, for an interpretation it is necessary to place meaning on the difference between two values of the logit function.
8 Odds ratio Example We assume that the independent variable $x$ is nominal scaled and dichotomous. In most cases that means it is coded as either 0 or 1. Calculating the slope $\beta_1$ is then straightforward and is done in a few steps.
9 $g(1) - g(0) = (\beta_0 + \beta_1 \cdot 1) - (\beta_0 + \beta_1 \cdot 0) = (\beta_0 + \beta_1) - \beta_0 = \beta_1$. (1) Define two values of the covariate to be compared ($x = 1$, $x = 0$). (2) Substitute those two values into the logit ($g(1)$, $g(0)$). (3) Calculate the difference ($g(1) - g(0)$).
10 Odds ratio The practical problem is that a change on the scale of the log-odds is hard to explain and may not be especially meaningful to a subject-matter audience. To provide a more meaningful interpretation we introduce the odds ratio as a measure of association. Definition: The odds ratio OR is the ratio of the odds for $x = 1$ to the odds for $x = 0$, $OR = \frac{\pi(1)/[1-\pi(1)]}{\pi(0)/[1-\pi(0)]}$.
11 Odds ratio Substituting the logistic regression model probabilities into the OR, we obtain the relationship between the odds ratio and the regression coefficient: $OR = \frac{\left(\frac{e^{\beta_0+\beta_1}}{1+e^{\beta_0+\beta_1}}\right) \Big/ \left(\frac{1}{1+e^{\beta_0+\beta_1}}\right)}{\left(\frac{e^{\beta_0}}{1+e^{\beta_0}}\right) \Big/ \left(\frac{1}{1+e^{\beta_0}}\right)} = \frac{e^{\beta_0+\beta_1}}{e^{\beta_0}} = e^{\beta_1}$.
12 Odds ratio Thus, with a fourth and final step we have: (1) Define two values of the covariate to be compared ($x = 1$, $x = 0$). (2) Substitute those two values into the logit ($g(1)$, $g(0)$). (3) Calculate the difference ($g(1) - g(0)$). (4) Exponentiate the logit difference to obtain an odds ratio.
13 Odds ratio The odds ratio approximates how much more likely or unlikely (in terms of odds) it is for the outcome to be present among those subjects with $x = 1$ as compared to those subjects with $x = 0$. Example: Assume that $Y$ is the presence or absence of heart disease and $X$ denotes whether or not the person engages in regular strenuous physical exercise. If the odds ratio is $OR = 0.5$ ($OR = 2$), then the odds of heart disease among those subjects who exercise are one-half (twice) the odds of heart disease for those subjects who do not exercise in the study population.
14 Odds ratio $\widehat{OR}$ tends to have a distribution that is highly skewed to the right, because its range is between 0 and $\infty$ (only for extremely large sample sizes would the distribution be approximately normal). Hence, inferences are usually based on the sampling distribution of $\ln(\widehat{OR}) = \hat\beta_1$, which tends to follow a normal distribution for much smaller sample sizes. A $100(1-\alpha)\%$ confidence interval for the odds ratio is given by $\exp[\hat\beta_1 \pm z_{1-\alpha/2}\,\widehat{SE}(\hat\beta_1)]$.
15 Odds ratio What if $x$ is coded using values $a$ and $b$? Then the four steps are: (1) Choose $a$ and $b$ to be compared. (2) Substitute those into the logit and obtain $\hat g(a) = \hat\beta_0 + \hat\beta_1 a$ and $\hat g(b) = \hat\beta_0 + \hat\beta_1 b$. (3) Calculate the difference $\hat g(a) - \hat g(b) = \hat\beta_1 (a - b)$. (4) Exponentiate the logit difference to obtain the odds ratio $\widehat{OR}(a, b) = e^{\hat\beta_1 (a - b)}$.
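The four-step method for an arbitrary coding can be sketched as follows. The function names and the numeric coefficients used in the usage note are hypothetical and serve only to illustrate the steps.

```python
import math

def logit_value(beta0, beta1, x):
    """Step 2: the estimated logit g(x) = beta0 + beta1 * x."""
    return beta0 + beta1 * x

def odds_ratio(beta0, beta1, a, b):
    """Steps 3 and 4: exponentiate the logit difference g(a) - g(b)."""
    difference = logit_value(beta0, beta1, a) - logit_value(beta0, beta1, b)
    return math.exp(difference)
```

With the usual 0/1 coding, `odds_ratio(beta0, beta1, 1, 0)` equals `math.exp(beta1)`; with a coding of 1 and -1 it equals `math.exp(2 * beta1)`. The intercept cancels in the difference, so its value does not affect the result.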
16 Example
17 Example Since $\hat\beta_1 = 1.064$, the estimated odds ratio is (step 4) $\widehat{OR} = e^{\hat\beta_1} = e^{1.064} = 2.9$. Another way of obtaining the odds ratio would be $\widehat{OR} = \ldots = 2.897$, and to go full circle we verify $\hat\beta_1 = \ln(2.897) = 1.064$. The 95% confidence interval for the odds ratio is $\exp[1.064 \pm 1.96 \times \widehat{SE}(\hat\beta_1)] = (1.87, 4.49)$. This suggests that the odds of a fracture during follow-up among women with a prior fracture could be as little as 1.9 times or as much as 4.5 times the odds for women without a prior fracture, at the 95% level of confidence.
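The interval calculation can be sketched as below. The coefficient 1.064 is the one quoted in the example; the standard error 0.22 is an assumed value chosen only for illustration, since the value used on the slide did not survive, and the function name is hypothetical.

```python
import math

def odds_ratio_with_ci(beta_hat, se_hat, z=1.96):
    """Return (OR, lower, upper), with the limits computed on the log-odds scale."""
    point = math.exp(beta_hat)
    lower = math.exp(beta_hat - z * se_hat)
    upper = math.exp(beta_hat + z * se_hat)
    return point, lower, upper

# beta_hat = 1.064 is quoted in the example; se_hat = 0.22 is an assumed,
# illustrative standard error and is NOT taken from the slides.
point, lower, upper = odds_ratio_with_ci(1.064, 0.22)
print(round(point, 2), round(lower, 2), round(upper, 2))
```

Because the limits are formed on the log scale and then exponentiated, the interval is not symmetric around the point estimate, which reflects the right skew of the odds ratio.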
18 Cell Coding Deviation from means Now we will assume that the independent variable has $k > 2$ distinct values. For example, $x$ could denote the county of residence, the clinic used for health care, etc.
19 Cell Coding Since we cannot model a nominal scale variable as an interval scale variable, we have to use design variables (see Chapter 2). The method for specifying the design variables involves setting all of them equal to 0 for the reference group, and then setting a single design variable equal to 1 for each of the other groups. This method, called cell coding, is illustrated in Table 3.6.
20 Cell Coding Example
21 Cell Coding Example Comparing the estimated coefficients in Table 3.7 to the log-odds ratios in Table 3.5 we find that $\ln[\widehat{OR}(\text{same}, \text{less})] = \hat\beta_1 = 0.546$ and $\ln[\widehat{OR}(\text{more}, \text{less})] = \hat\beta_2 = \ldots$ To check: (1) Compare level "same" to level "less". (2) The logits for "same" and "less" respectively are $\hat g(\text{same}) = \hat\beta_0 + \hat\beta_1 (1) + \hat\beta_2 (0)$ and $\hat g(\text{less}) = \hat\beta_0 + \hat\beta_1 (0) + \hat\beta_2 (0)$. (3) The logit difference is $\ln[\widehat{OR}(\text{same}, \text{less})] = \hat g(\text{same}) - \hat g(\text{less}) = \hat\beta_1$.
22 Cell Coding Example Generally, the estimates of the standard errors found in the logistic regression output are identical to the estimates obtained using the cell frequencies from the contingency table: $\widehat{SE}(\hat\beta_1) = \ldots$ The $100(1-\alpha)\%$ confidence intervals for $\hat\beta_j$ are calculated as $\hat\beta_j \pm z_{1-\alpha/2}\,\widehat{SE}(\hat\beta_j)$, and analogously for the odds ratio as $\exp[\hat\beta_j \pm z_{1-\alpha/2}\,\widehat{SE}(\hat\beta_j)]$.
23 Deviation from means This coding expresses an effect as the deviation of the group mean (the logit for the group) from the overall mean (the average logit over all groups). This method of coding is obtained by setting the value of all the design variables equal to $-1$ for one of the categories, and then using the 0, 1 coding for the remainder of the categories.
24 Deviation from means Example
25 Deviation from means Example The logits for the three categories of RATERISK are $\hat g_1 = \ln(\ldots) = -1.602$, $\hat g_2 = -1.056$, $\hat g_3 = \ldots$, with a mean of $\bar g = \ldots$ With that, the estimated coefficients in Table 3.9 are $\hat g_2 - \bar g = \ldots$ and $\hat g_3 - \bar g = \ldots$ ($\hat g_j - \bar g$ for the $j$th design variable).
26 Deviation from means Example - Interpretation Exponentiation of the estimated coefficients yields the ratio of the odds for the particular group to the geometric mean of the odds: $\exp(\hat\beta_1) = \exp(\hat g_2 - \bar g) = 1.06$. This is not a true odds ratio, because the quantities in the numerator and denominator do not represent the odds for two distinct categories. The exponentiated coefficient expresses the odds relative to the geometric mean odds. The interpretation of this value depends on whether the geometric mean odds is at all meaningful in the context of the study.
27 Deviation from means Example - Interpretation Estimating the odds ratio for one category relative to a reference category with the estimated coefficients: (1) Compare level "same" (RATERISK=2) to level "less" (RATERISK=1). (2) The logits for "same" and "less", using the deviation from means coding, are $\hat g(\text{same}) = \hat\beta_0 + \hat\beta_1 (\text{RATERISK2D} = 1) + \hat\beta_2 (\text{RATERISK3D} = 0)$ and $\hat g(\text{less}) = \hat\beta_0 + \hat\beta_1 (\text{RATERISK2D} = -1) + \hat\beta_2 (\text{RATERISK3D} = -1)$. (3) The logit difference is $\ln[\widehat{OR}(\text{same}, \text{less})] = \hat g(\text{same}) - \hat g(\text{less}) = 2\hat\beta_1 + \hat\beta_2$. (4) $\widehat{OR}(\text{same}, \text{less}) = e^{2\hat\beta_1 + \hat\beta_2}$.
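The four steps under deviation from means coding can be checked with a short sketch. The design-variable table below encodes the coding described for a three-level factor with "less" as the category coded -1; the function names and any coefficient values passed in are hypothetical.

```python
import math

# Deviation-from-means design variables (D2, D3) for a three-level factor.
# "less" is the category for which every design variable is set to -1.
DESIGN = {
    "less": (-1, -1),
    "same": (1, 0),
    "more": (0, 1),
}

def logit_for(level, beta0, beta1, beta2):
    """Estimated logit for one level of the factor."""
    d2, d3 = DESIGN[level]
    return beta0 + beta1 * d2 + beta2 * d3

def odds_ratio(level_a, level_b, beta0, beta1, beta2):
    """Exponentiated logit difference between two levels."""
    diff = (logit_for(level_a, beta0, beta1, beta2)
            - logit_for(level_b, beta0, beta1, beta2))
    return math.exp(diff)
```

For any coefficients, `odds_ratio("same", "less", b0, b1, b2)` equals `math.exp(2 * b1 + b2)`, which is the expression obtained in step 4; the analogous comparison of "more" with "less" gives `math.exp(b1 + 2 * b2)`.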
28 With a continuous independent variable, interpretation of the estimated coefficient depends on how the variable is entered into the model and on its particular units. We assume that the logit is linear in the variable (see Chapter 4).
29 Again the four-step method is used to calculate the estimator of the odds ratio with the logit $\hat g(x) = \hat\beta_0 + \hat\beta_1 x$: (1) A one-unit increase: $x + 1$ compared to $x$. (2) Obtain the corresponding logit functions $\hat g(x)$ and $\hat g(x + 1)$. (3) The logit difference is $\ln[\widehat{OR}] = \hat g(x + 1) - \hat g(x) = \hat\beta_1 (x + 1 - x) = \hat\beta_1$. (4) $\widehat{OR} = e^{\hat\beta_1}$. Thus the estimator has exactly the same form as the one for a dichotomous variable.
30 Substituting an arbitrary increment $c$ we obtain analogously: (1) A $c$-unit increase: $x + c$ compared to $x$. (2) Obtain the corresponding logit functions $\hat g(x)$ and $\hat g(x + c)$. (3) The logit difference is $\ln[\widehat{OR}] = \hat g(x + c) - \hat g(x) = \hat\beta_1 (x + c - x) = c\hat\beta_1$. (4) $\widehat{OR}(c) = e^{c\hat\beta_1}$.
31 Again we can calculate the standard error $\widehat{SE}(c\hat\beta_1) = |c|\,\widehat{SE}(\hat\beta_1)$ and a $100(1-\alpha)\%$ confidence interval as $\exp[c\hat\beta_1 \pm z_{1-\alpha/2}\,|c|\,\widehat{SE}(\hat\beta_1)]$. Thus, the choice of $c$ is crucial. As a rule of thumb, multiples of 2, 5 or 10 are most meaningful and easily understood.
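A sketch of the odds ratio and interval for a c-unit change follows. The function name and the coefficient and standard error in the usage note are hypothetical.

```python
import math

def odds_ratio_for_change(beta1, se_beta1, c, z=1.96):
    """Return (OR(c), lower, upper) for a c-unit change in a continuous covariate."""
    log_or = c * beta1
    half_width = z * abs(c) * se_beta1  # SE(c * beta1) = |c| * SE(beta1)
    return (math.exp(log_or),
            math.exp(log_or - half_width),
            math.exp(log_or + half_width))
```

For instance, `odds_ratio_for_change(0.1, 0.02, 10)` gives the estimate and limits for a 10-unit increase when the per-unit coefficient is 0.1 with standard error 0.02. Note that `OR(c)` is the one-unit odds ratio raised to the power `c`, not multiplied by `c`.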
32 Example The estimated logit is $\hat g(\text{AGE}) = \hat\beta_0 + \hat\beta_1\,\text{AGE}$ and the estimated odds ratio for an increase of 10 years is $\widehat{OR}(10) = \exp(10\,\hat\beta_1) = 3.03$.
33 Example Thus, for every increase of 10 years in age, the odds of CHD being present are estimated to increase 3.03 times. The unavoidable consequence of a logit that is linear in age is that the increase in the odds of CHD for a 30 year old compared to a 20 year old is the same as for a 60 year old compared to a 50 year old.
34 Statistical Adjustment Statistical Interaction Examples Univariable models rarely provide a complete analysis of the data, since the independent variables are usually associated with one another. Thus multivariable analysis is used. Two questions are of importance: (1) What effect does each variable have on the estimated effects of the other independent variables? (2) To what extent does the estimate of the log-odds for one independent variable change depending on the value of another independent variable?
35 Statistical Adjustment
36 Statistical Adjustment The difference in the mean outcome $w$ between the two groups, each evaluated at its own covariate mean $\bar a_1$, $\bar a_2$, is $(w_2 - w_1) = (\beta_0 + \beta_1 + \beta_2 \bar a_2) - (\beta_0 + \beta_2 \bar a_1) = \beta_1 + \beta_2 (\bar a_2 - \bar a_1)$. If the comparison is made at a common mean $\bar a$ we obtain $(w_4 - w_3) = \beta_1$.
37 Statistical Adjustment Now, instead of weight being the outcome variable, assume the outcome is a dichotomous variable and that the vertical axis denotes the logit or log-odds of the outcome. That is, the logit of the outcome is given by the equation $g(x, a) = \beta_0 + \beta_1 x + \beta_2 a$.
38 Statistical Adjustment The unadjusted difference $(w_2 - w_1)$ and the adjusted difference $(w_4 - w_3)$ of the log-odds are $(w_2 - w_1) = \beta_1 + \beta_2 (\bar a_2 - \bar a_1)$ and $(w_4 - w_3) = \beta_1$. What conditions are required for them to be the same? $\beta_1 + \beta_2 (\bar a_2 - \bar a_1) = \beta_1$ holds if and only if $(\bar a_2 - \bar a_1) = 0$ or $\beta_2 = 0$. How can we test for those two conditions?
39 Statistical Adjustment (1) Fit a model containing only $d$ (excluding the adjustment covariate $a$), with $\hat\theta_1$ being the coefficient of $d$. (2) Fit a full model with $\hat\beta_1$, $\hat\beta_2$ as the estimated coefficients. (3) Now $\hat\theta_1 \approx \hat\beta_1 + \hat\beta_2 (\bar a_2 - \bar a_1)$, thus $(\hat\theta_1 - \hat\beta_1) \approx \hat\beta_2 (\bar a_2 - \bar a_1)$. (4) As the amount of adjustment is relative, we scale it by dividing by $\hat\beta_1$ and obtain the delta-beta-hat-percent $\Delta\hat\beta\% = 100\,\frac{\hat\theta_1 - \hat\beta_1}{\hat\beta_1}$.
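The delta-beta-hat-percent is a one-line computation once both models have been fitted. The sketch below assumes the two coefficients are already available; the function name and the numbers in the usage note are hypothetical.

```python
def delta_beta_hat_percent(theta1_hat, beta1_hat):
    """Percentage change 100 * (theta1 - beta1) / beta1.

    theta1_hat: coefficient of d in the model without the covariate.
    beta1_hat:  coefficient of d in the model that includes the covariate.
    """
    return 100.0 * (theta1_hat - beta1_hat) / beta1_hat
```

For example, an unadjusted coefficient of 1.2 and an adjusted coefficient of 1.0 give a change of about 20 percent of the adjusted log-odds ratio. The sign shows the direction of the adjustment, so the magnitude is what gets compared with a threshold.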
40 Statistical Adjustment The amount of adjustment is expressed as a percentage of the adjusted log-odds ratio. If $\Delta\hat\beta\%$ exceeds a chosen threshold (for example 10, 20 or 25), then the covariate is needed to adjust the effect of the other covariate (Chapter 4). In the case of statistical adjustment we speak of the adjusting covariate (here AGE) as a confounder.
41 Statistical Interaction
42 Statistical Interaction The outcome variable is the presence or absence of CHD, the risk factor is GENDER, and the covariate is AGE. Let $l_1$ and $l_2$ be the logits for females and males respectively. Since the difference $l_2 - l_1$ is the same for any age, there is no interaction. If instead $l_3$ is the logit for males, then the difference $l_3 - l_1$ varies with age, indicating that the relationship between AGE and CHD for males is different from that for females. When this occurs we say that there is an interaction between AGE and GENDER.
43 Effect Modifier We cannot estimate the odds ratio for GENDER without first specifying the AGE at which the comparison is being made. In other words, age modifies the effect of gender, so in this terminology age is called an effect modifier.
44 Examples Consider models with the pair of independent variables $d$ (dichotomous) and $x$ (continuous). The role of $x$ with respect to the effect of $d$ in the model can be one of three possibilities: (1) No statistical adjustment or interaction. (2) There is statistical adjustment, but no interaction. Covariate $x$ is a confounder. (3) There is statistical interaction. Covariate $x$ is an effect modifier.
45 1. No statistical adjustment or interaction
46 1. No statistical adjustment or interaction There is little change in the estimate of the coefficient for PRIORFRAC in Model 2, as $\Delta\hat\beta\% = 100\,\frac{\hat\theta_1 - \hat\beta_1}{\hat\beta_1} = 5.1$, which indicates that inclusion of HEIGHT does not statistically adjust the coefficient of PRIORFRAC. Hence HEIGHT is not a confounder of PRIORFRAC.
47 1. No statistical adjustment or interaction The statistical interaction of prior fracture (PRIORFRAC) and height (HEIGHT) is added to Model 2 to obtain Model 3. The Wald statistic for the added product term has $p = 0.492$, and thus is not significant. In these data HEIGHT is not an effect modifier of prior fracture.
48 1. No statistical adjustment or interaction Hence, the choice is between Model 1 and Model 2. Even though the estimate of the effect of prior fracture is basically the same for the two models, we would choose Model 2, as height (HEIGHT) is not only statistically significant in Model 2 but is an important clinical covariate as well.
49 2. Statistical adjustment, no interaction
50 2. Statistical adjustment, no interaction In Model 1 the Wald test for the coefficient of GENDER is not significant, with $p = \ldots$ We know that gender can be an important covariate, but it is not significant in the univariable model. Thus, under some model building methods it might not be considered for a multivariable model. This is a common dilemma, which is addressed in Chapter 4.
51 2. Statistical adjustment, no interaction There is a considerable change in the estimate of the coefficient for GENDER in Model 2, which is now significant, as $\Delta\hat\beta\% = 37.6$. This indicates that inclusion of SPHEQ does statistically adjust the coefficient of GENDER. Hence SPHEQ is a confounder of GENDER.
52 2. Statistical adjustment, no interaction The statistical interaction of SPHEQ and GENDER is added to Model 2 to obtain Model 3. The Wald statistic for the added product term has $p = 0.185$, and thus is not significant. In these data SPHEQ is not an effect modifier of GENDER.
53 2. Statistical adjustment, no interaction In this case we use Model 2, as it adjusts for SPHEQ.
54 3. Statistical interaction
55 3. Statistical interaction In Model 1 the Wald test for the coefficient of PRIORFRAC is significant, with $p < \ldots$
56 3. Statistical interaction There is a considerable change in the estimate of the coefficient for PRIORFRAC in Model 2, as $\Delta\hat\beta\% = 26.8$, which indicates that inclusion of AGE does statistically adjust the coefficient of PRIORFRAC. Hence AGE is a confounder of PRIORFRAC.
57 3. Statistical interaction The statistical interaction of prior fracture (PRIORFRAC) and AGE is added to Model 2 to obtain Model 3. The Wald statistic for the added product term has $p = 0.022$, and thus is significant. In these data AGE is an effect modifier of PRIORFRAC.
58 3. Statistical interaction Hence, the choice is Model 3.
59 3. Statistical interaction If a model contains an interaction term, as in Model 3, one must follow the four-step method to calculate the odds ratio: (1) Compare (PRIORFRAC=1, AGE=$a$) with (PRIORFRAC=0, AGE=$a$). (2) Substitute them into the estimated logits: $\hat g(1, a) = \hat\beta_0 + \hat\beta_1 \cdot 1 + \hat\beta_2 a + \hat\beta_3 \cdot 1 \cdot a$ and $\hat g(0, a) = \hat\beta_0 + \hat\beta_1 \cdot 0 + \hat\beta_2 a + \hat\beta_3 \cdot 0 \cdot a$. (3) The logit difference is $\hat g(1, a) - \hat g(0, a) = \hat\beta_1 + \hat\beta_3 a$. (4) $\widehat{OR}((1, a), (0, a)) = e^{\hat\beta_1 + \hat\beta_3 a}$. The results can now be either graphed or tabulated.
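Tabulating the age-specific odds ratio can be sketched as follows. The function names are hypothetical and the coefficients in the usage note are made-up values, not the Model 3 estimates.

```python
import math

def odds_ratio_at_age(beta1, beta3, age):
    """OR for the risk factor (1 vs 0) at a given age: exp(beta1 + beta3 * age)."""
    return math.exp(beta1 + beta3 * age)

def tabulate_odds_ratios(beta1, beta3, ages):
    """Return a list of (age, odds ratio) pairs for the requested ages."""
    return [(age, odds_ratio_at_age(beta1, beta3, age)) for age in ages]

# Made-up coefficients for illustration only.
for age, or_hat in tabulate_odds_ratios(4.0, -0.05, [55, 60, 65, 70]):
    print(age, round(or_hat, 2))
```

With a negative interaction coefficient, as in this made-up case, the odds ratio for the risk factor shrinks as age increases; with `beta3 = 0` it would be the same at every age, which is the no-interaction case.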
64 So far we have discussed using the logistic regression model coefficients to estimate odds ratios and construct confidence intervals. However, there are situations where the fitted values (i.e., the estimated probabilities) from the model are equally, if not more, important.
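66 The values plotted for INH_INJ = 1 (analogously for INH_INJ = 0), in reference to Section 2.5, are obtained as follows: (1) First calculate the fitted logit function $\hat g_1(a) = \hat g(\text{TBSA} = a, \text{INH\_INJ} = 1)$, which is linear in $a$. (2) Use values from the estimated covariance matrix to calculate $\widehat{Var}[\hat g_1(a)]$, which is quadratic in $a$. (3) Calculate the fitted probability $\hat\pi_1(a) = \frac{e^{\hat g_1(a)}}{1+e^{\hat g_1(a)}}$. (4) The confidence limits are then given as $\frac{e^{\hat g_1(a) \pm 1.96\,\widehat{SE}(\hat g_1(a))}}{1+e^{\hat g_1(a) \pm 1.96\,\widehat{SE}(\hat g_1(a))}}$, with $\widehat{SE}(\hat g) = \sqrt{\widehat{Var}(\hat g)}$.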
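Steps 3 and 4 can be sketched as below, assuming the fitted logit and its estimated variance at a given covariate value have already been computed. The function names are hypothetical.

```python
import math

def expit(g):
    """Inverse logit: e^g / (1 + e^g)."""
    return math.exp(g) / (1.0 + math.exp(g))

def probability_with_ci(g_hat, var_g_hat, z=1.96):
    """Return (pi_hat, lower, upper) from a fitted logit and its variance.

    The limits are formed on the logit scale and then transformed,
    so they always stay between 0 and 1.
    """
    se = math.sqrt(var_g_hat)
    return expit(g_hat), expit(g_hat - z * se), expit(g_hat + z * se)
```

Calling `probability_with_ci` over a grid of covariate values gives the fitted curve and its confidence band for plotting.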
68 We would like to plot the estimated probability of death as a function of burn area (as before) and inhalation injury, but now controlling for the other four covariates in the model. This is done either by choosing typical values (median age and 0 for the dichotomous variables) or by calculating a modified logit, which subtracts the contribution of burn area and inhalation injury from the logit and uses the median of the result as a way to control for the additional model covariates.
69 Calculating the modified logit: (1) Let $\hat g(x)$ be the logit for the fitted model, which includes TBSA, INH_INJ and the other covariates such as RACE. (2) The proposed modified logit is $\hat g_m(x) = \hat g(x) - (\hat\beta_{\text{TBSA}}\,\text{TBSA} + \hat\beta_{\text{INH}}\,\text{INH\_INJ})$, (3) with a median value over the 1000 subjects of $\hat g_{m50} = 5.349$.
70 With the two adjusted logits $\hat g_j(a) = \hat g_{m50} + \hat\beta_{\text{TBSA}}\,a + \hat\beta_{\text{INH}}\,j$ we obtain $\hat\pi_j(a) = \frac{e^{\hat g_j(a)}}{1+e^{\hat g_j(a)}}$, $j = 0, 1$.
71 In a stratified analysis of $2 \times 2$ tables to assess interaction and to control confounding, the essential objective is to determine whether the odds ratios are constant, or homogeneous, across the strata and, if so, to produce an adjusted odds ratio (the Mantel-Haenszel estimator or the weighted logit-based estimator).
73 Mantel-Haenszel Estimator The Mantel-Haenszel estimator is a weighted average of the stratum-specific odds ratios $\widehat{OR}_i = \frac{a_i d_i}{b_i c_i}$, where $a_i, b_i, c_i, d_i$ are the observed cell frequencies in the $i$th stratum: $\widehat{OR}_{MH} = \frac{\sum_i a_i d_i / N_i}{\sum_i b_i c_i / N_i} = 3.65$, with $N_i = a_i + b_i + c_i + d_i$. The logit-based summary estimator of the odds ratio is a weighted average where each weight $w_i$ is the inverse of the variance of the stratum-specific log-odds ratio: $\widehat{OR}_L = \exp\left[\frac{\sum_i w_i \ln(\widehat{OR}_i)}{\sum_i w_i}\right] = \ldots$
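Both summary estimators can be sketched for a list of 2 x 2 tables given as `(a, b, c, d)` tuples. The function names are hypothetical, and the sketch assumes the usual large-sample variance estimate `1/a + 1/b + 1/c + 1/d` for a stratum-specific log-odds ratio, which the slide does not spell out.

```python
import math

def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio over strata of (a, b, c, d) counts."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

def logit_based_or(tables):
    """Inverse-variance weighted average of the stratum log-odds ratios.

    Assumes Var[ln(OR_i)] is estimated by 1/a + 1/b + 1/c + 1/d, so every
    cell must be non-zero for this estimator.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in tables:
        weight = 1.0 / (1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
        weighted_sum += weight * math.log((a * d) / (b * c))
        weight_total += weight
    return math.exp(weighted_sum / weight_total)
```

The Mantel-Haenszel function only needs the denominator sum to be non-zero, whereas the logit-based function would fail on any stratum with an empty cell.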
74 In general, the estimators are similar when the data are not too sparse within the strata. One considerable advantage of the Mantel-Haenszel estimator is that it may be computed when some of the cell entries are 0.
75 These estimators provide a correct estimate of the effect of the risk factor only when the odds ratio is constant across the strata. Some test statistics: A weighted sum of the squared deviations of the stratum-specific log-odds ratios from their weighted mean, $X_H^2 = \sum_i \{ w_i [\ln(\widehat{OR}_i) - \ln(\widehat{OR}_L)]^2 \}$. Comparing the value of $a_i$ to an estimated expected value $\hat e_i$, calculated under the assumption that the odds ratio is constant in all strata (Breslow 1996), $X_{BD}^2 = \sum_i \frac{(a_i - \hat e_i)^2}{\hat v_i} - \frac{[\sum_i (a_i - \hat e_i)]^2}{\sum_i \hat v_i}$, where $\hat v_i$ is the estimate of the variance of $a_i$.
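The first of these statistics can be sketched as follows, again for strata given as `(a, b, c, d)` tuples. The function names are hypothetical, and the weights assume the variance estimate `1/a + 1/b + 1/c + 1/d` for each stratum-specific log-odds ratio. The second statistic is not sketched because it requires the fitted expected values and variances under a common odds ratio.

```python
import math

def log_or_and_weight(table):
    """Return (ln OR_i, w_i) for one stratum, with w_i the inverse variance."""
    a, b, c, d = table
    log_or = math.log((a * d) / (b * c))
    weight = 1.0 / (1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return log_or, weight

def homogeneity_statistic(tables):
    """Weighted sum of squared deviations of ln OR_i from ln OR_L."""
    pairs = [log_or_and_weight(t) for t in tables]
    total_weight = sum(w for _, w in pairs)
    pooled_log_or = sum(w * log_or for log_or, w in pairs) / total_weight
    return sum(w * (log_or - pooled_log_or) ** 2 for log_or, w in pairs)
```

The statistic is zero when every stratum has the same odds ratio and grows as the stratum-specific log-odds ratios move apart.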
76 The (odd) end.
More informationWhen to Use a Particular Statistical Test
When to Use a Particular Statistical Test Central Tendency Univariate Descriptive Mode the most commonly occurring value 6 people with ages 21, 22, 21, 23, 19, 21 - mode = 21 Median the center value the
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationAlgebra II End of Course Exam Answer Key Segment I. Scientific Calculator Only
Algebra II End of Course Exam Answer Key Segment I Scientific Calculator Only Question 1 Reporting Category: Algebraic Concepts & Procedures Common Core Standard: A-APR.3: Identify zeros of polynomials
More informationUnit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.)
Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.) Logistic regression generalizes methods for 2-way tables Adds capability studying several predictors, but Limited to
More informationElementary Statistics
Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,
More informationA Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
More informationRATIOS, PROPORTIONS, PERCENTAGES, AND RATES
RATIOS, PROPORTIOS, PERCETAGES, AD RATES 1. Ratios: ratios are one number expressed in relation to another by dividing the one number by the other. For example, the sex ratio of Delaware in 1990 was: 343,200
More informationIntroduction to Statistics and Quantitative Research Methods
Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.
More informationChapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationWEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6
WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent before-thefact, expected values. In particular, the beta coefficient used in
More informationCHAPTER FIVE. Solutions for Section 5.1. Skill Refresher. Exercises
CHAPTER FIVE 5.1 SOLUTIONS 265 Solutions for Section 5.1 Skill Refresher S1. Since 1,000,000 = 10 6, we have x = 6. S2. Since 0.01 = 10 2, we have t = 2. S3. Since e 3 = ( e 3) 1/2 = e 3/2, we have z =
More informationDESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.
DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,
More informationDeveloping Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@
Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,
More informationLecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization
Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization 2.1. Introduction Suppose that an economic relationship can be described by a real-valued
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationMULTIPLE REGRESSION WITH CATEGORICAL DATA
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting
More informationDelme John Pritchard
THE GENETICS OF ALZHEIMER S DISEASE, MODELLING DISABILITY AND ADVERSE SELECTION IN THE LONGTERM CARE INSURANCE MARKET By Delme John Pritchard Submitted for the Degree of Doctor of Philosophy at HeriotWatt
More informationT O P I C 1 2 Techniques and tools for data analysis Preview Introduction In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these
More informationWhat does the number m in y = mx + b measure? To find out, suppose (x 1, y 1 ) and (x 2, y 2 ) are two points on the graph of y = mx + b.
PRIMARY CONTENT MODULE Algebra - Linear Equations & Inequalities T-37/H-37 What does the number m in y = mx + b measure? To find out, suppose (x 1, y 1 ) and (x 2, y 2 ) are two points on the graph of
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of
More informationLet s explore the content and skills assessed by Heart of Algebra questions.
Chapter 9 Heart of Algebra Heart of Algebra focuses on the mastery of linear equations, systems of linear equations, and linear functions. The ability to analyze and create linear equations, inequalities,
More informationbusiness statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
More informationLogit Models for Binary Data
Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationTwo Correlated Proportions (McNemar Test)
Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with
More informationCHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression
Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the
More informationMEASURES OF VARIATION
NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are
More informationAlgebra I Vocabulary Cards
Algebra I Vocabulary Cards Table of Contents Expressions and Operations Natural Numbers Whole Numbers Integers Rational Numbers Irrational Numbers Real Numbers Absolute Value Order of Operations Expression
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
More informationGamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
More informationExample: Boats and Manatees
Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant
More informationLean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY
TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY Before we begin: Turn on the sound on your computer. There is audio to accompany this presentation. Audio will accompany most of the online
More informationMicroeconomic Theory: Basic Math Concepts
Microeconomic Theory: Basic Math Concepts Matt Van Essen University of Alabama Van Essen (U of A) Basic Math Concepts 1 / 66 Basic Math Concepts In this lecture we will review some basic mathematical concepts
More informationStatistics. Measurement. Scales of Measurement 7/18/2012
Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does
More informationa. all of the above b. none of the above c. B, C, D, and F d. C, D, F e. C only f. C and F
FINAL REVIEW WORKSHEET COLLEGE ALGEBRA Chapter 1. 1. Given the following equations, which are functions? (A) y 2 = 1 x 2 (B) y = 9 (C) y = x 3 5x (D) 5x + 2y = 10 (E) y = ± 1 2x (F) y = 3 x + 5 a. all
More informationCORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA
We Can Early Learning Curriculum PreK Grades 8 12 INSIDE ALGEBRA, GRADES 8 12 CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA April 2016 www.voyagersopris.com Mathematical
More informationAlgebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 2012-13 school year.
This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Algebra
More informationSTATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI
STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members
More informationPREDICTION OF INDIVIDUAL CELL FREQUENCIES IN THE COMBINED 2 2 TABLE UNDER NO CONFOUNDING IN STRATIFIED CASE-CONTROL STUDIES
International Journal of Mathematical Sciences Vol. 10, No. 3-4, July-December 2011, pp. 411-417 Serials Publications PREDICTION OF INDIVIDUAL CELL FREQUENCIES IN THE COMBINED 2 2 TABLE UNDER NO CONFOUNDING
More informationSection A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I
Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting
More informationChapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
More informationMeans, standard deviations and. and standard errors
CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard
More informationWeek 4: Standard Error and Confidence Intervals
Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.
More informationHow To Model The Fate Of An Animal
Models Where the Fate of Every Individual is Known This class of models is important because they provide a theory for estimation of survival probability and other parameters from radio-tagged animals.
More informationData exploration with Microsoft Excel: analysing more than one variable
Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationDirections for using SPSS
Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...
More informationhp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines
The STAT menu Trend Lines Practice predicting the future using trend lines The STAT menu The Statistics menu is accessed from the ORANGE shifted function of the 5 key by pressing Ù. When pressed, a CHOOSE
More information5 Systems of Equations
Systems of Equations Concepts: Solutions to Systems of Equations-Graphically and Algebraically Solving Systems - Substitution Method Solving Systems - Elimination Method Using -Dimensional Graphs to Approximate
More informationProbability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur
Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce
More information