Linear Mixed Models PGRM 15. Outline. Linear regression. Correlated measurements (eg repeated)

Size: px
Start display at page:

Download "Linear Mixed Models PGRM 15. Outline. Linear regression. Correlated measurements (eg repeated)"

Transcription

1 Lear ixed odels PGR 15 Outle Lear regression Correlated measurements (eg repeated) Random effects leadg to different components of variance & correlated measurements Different Correlation Structures Simple Analysis of Clustered Data Split Plot Analysis Repeated easures Analysis 1

2 Relationship between variables Yield and fertilizer level Temperature and bacterial growth Light and respiration rate Age and alcohol consumption Can the relationship be represented by a simple curve? Is there necessarily a causal relationship? Simple lear regression 8 6 Y X 2

3 Simple lear regression model Data: (x i, y i ) Probabilistic model: yi = 0 1 i β + β x + ε Random term with variance σ 2 Population Characteristic β 0 β 1 Sample Statistic $ β 0 $ β 1 To predict from the model: yˆ i =β ˆ ˆ 0+ β1x i σ 2 s 2 Simple lear regression Estimated equation of the le: ^ y = x 8 6 Y 4 2 Intercept = β 0 = 2 ^ Slope = β 1 = 0.75 ^ X 3

4 Simple lear regression Estimated equation of the le: ^ y = x 8 6 Y 4 2 Variance = s 2 = X Assumptions for a Simple Lear Regression model Note: If you are fittg a simple lear regression model to your own data, there are assumptions that must be satisfied. d details of how to test the assumptions for your fitted model any basic statistics text book. 4

5 Regression example Distance from fire ire damage station x (miles) y ($1000) Damage caused by fire ($1000) vs Distance from fire station (miles) 50 damage ($1000) Distance from fire station (miles) 5

6 Damage caused by fire ($1000) vs Distance from fire station (miles) damage ($1000) y ^ = x Distance from fire station (miles) Damage caused by fire ($1000) vs Distance from fire station (miles) damage ($1000) Distance from fire station (miles) 6

7 Correlated easurements Two values are correlated if knowledge of one provides formation about the other eg if knowg one value is high (relative to the appropriate mean) means the other is also likely to be high (+ve correlation), or low (-ve correlation) Correlation can be visualized by trends plots Independent values are uncorrelated Positive correlation often arises between values resultg from shared fluences When modellg, shared fluences must be taken to account (eg by cludg random effects, or specifyg a correlation structure) Pearson s correlation coefficient ρ (rho) easures the strength and direction (+ve or ve) of lear relationship between two variables measured on the same experimental unit Does not depend on the units of measurement Lies between 1 and +1 In fittg a regression le y = a + bx the estimate of ρ, r, gives: 100r 2 = % of variation y explaed by the lear regression Hence r = 0.5 dicates a negative relationship that explas 25% of the variation y. r = 0.8 explas approximately 2/3 of the variation. 7

8 Interpretg r A high value of r corresponds to pots closely scattered around a le Examples of correlated responses Clustered data easurements on piglets from several litters responses from piglets with a litter are related, sharg parental and environmental fluences Experiment at several sites responses with a site are related Split-plot designs (with ma and sub-plots see later) Responses from the same ma plot will be related, sharg the ma plot characteristics Repeated easures from experimental units responses from the same unit are related, sharg the fluential characteristics of the unit if repeated over time, responses from the same unit close time may be more closely related than responses further apart 8

9 Identifyg Correlation & Different Sources of Variation Components of variation & correlation So far background noise has been accounted for by a sgle source of variation the variation between experimental units sharg the same values of explanatory variables. With clustered data, some of the variation will be due to differences between clusters (eg litters, sites). Individuals with a litter are related genetically. Plots with a site are related as they share common environmental elements. Clusters may form a source of random variation (any particular litter could have been allocated to a different treatment group but all piglets the litter would then have the same treatment and would not be dependent. Variation between sites may contribute to the variation of the estimate of a treatment effect based on a number of sites. The treatment effect may also vary over sites. Observations with the same value of a random effect will be positively correlated 9

10 ore general correlation structures Introducg a random effect for clusters of measurements assumes a very restricted structure of correlation ore generally we identify clusters whose members are correlated specify the correlation structure Example of clustered data analysis 27 sites with 30 plots per site = 810 plots our species 2 grasses (G) & 2 legumes (L) Simplex design at two densities 10

11 Location of Sites 16 id European (E) 4 Northern European (NE) 3 oist editerranean () 2 Dry editerranean (D) 2 Other A typical E site Species - G 1 Lolium perenne - G 2 Dactylis glomerata - L 1 Trifolium pratense - L 2 Trifolium repens irst complete year s data was used. 11

12 Diversity effect is positive and significant at most sites Country Type Av mono (t ha -1 ) δˆ (t ha -1 ) Germany E Ireland E Lithuania E Lithuania E Lithuania E Netherlands E Norway E Norway E Norway E Poland E Spa E Sweden E Sweden E Switzerland E Wales E Wales E Average Site fixed vs site random Site ixed Proc glm data=yield; Class site monomix density; odel tot_yld=site density monomix ; Lsmeans monomix/stderr; Run; Diff 2.39 SE LSmeans ean SE ono ix Site Random Proc mixed data=yield; Class site monomix density; odel tot_yld= density monomix; Lsmeans monomix/stderr; Random site Run; Diff 2.39 SE LSmeans ean SE ono ix

13 onomix tested agast random site*monomix Estimate of diversity effect is the same (2.39) SE of diversity effect is now greater (0.254) Loss of precision is compensated for by a wider range of ference about the diversity effect Any new site predict a diversity effect of 2.39 but use the se = for settg bounds for the prediction. Unbalanced mixed model analysis The example data was balanced each treatment (combation of a level of V with a level of N) appeared the same number of times (once!) each block or unbalanced data every comparison has a different SE. It s important to use the ddfm = kenwardroger option on the model statement to drop teraction terms if non-significant See PGR

14 Cost data revisited meta analysis Consistency of treatment effects across multiple sites (meta analysis) If an experiment is repeated across a number of sites the question arises as to how well the average treatment effect estimated across all sites represents the difference between treatments. Is there variation the onomix effect across sites? If sites are regarded as random (drawn at random from some population of sites), how well does the average onomix effect over sites represent the typical monomix effect at a site drawn from the same population.. To use the average monomix effect this way its should conta variation due to how the effect varies from site to site (i.e. formation on the site x monomix teraction. This is achieved by cludg the SAS program the le random site site*monomix; Diversity effect is positive and significant at most sites eta analysis Country Type Av mono (t ha -1 ) δˆ (t ha -1 ) Germany E Ireland E Lithuania E Lithuania E Lithuania E Netherlands E Norway E Norway E Norway E Poland E Spa E Sweden E Sweden E Switzerland E Wales E Wales E Average

15 SAS programs Site random proc mixed data=yield; class site monomix density; model tot_yld = density monomix /ddfm=kenwardroger ; run; random site; lsmeans monomix; estimate 'diversity effect' monomix -1 1; SAS programs Site random proc mixed data=yield; class site monomix density; model tot_yld = density monomix /ddfm=kenwardroger ; run; random site; random site*monomix; lsmeans monomix; estimate 'diversity effect' monomix -1 1; 15

16 Site ixed Proc glm data=yield; Site fixed vs site random and Site*monomix cluded Class site monomix density; odel tot_yld=site density monomix site*monomix; Lsmeans monomix/stderr; Run; Diff 2.39 SE LSmeans ean SE ono ix Site Random Proc mixed data=yield; Class site monomix density; odel tot_yld= density monomix; Lsmeans monomix/stderr; Random site Random site*monomix Run; Diff 2.39 SE LSmeans ean SE ono ix Comparison of models Random effects SITE SITE SITE*ONOIX Variance Estimates Site Site * onomix 0.70 Residual log Likelihood ono SE ixture SE Diversity effect SE P value <0.001 <

17 Split Plot Experimental Design Split Plot: effect of fertiliser on 3 varieties of oats In each of 6 homogeneous blocks: Varieties allocated to ma plots at random In each sub plot varyg levels of fertiliser N (kg/ha) are allocated at random 17

18 Analysis of split plot design Interest is on effect of V (variety) and N (nitrogen level) on Y (yield) possible sources of background variation Initially regard N level as categorical odel terms would clude B (block), V, N and the teraction V*N In addition model must reflect the split plot structure: yields from the same ma plot will share similar (randomly allocated) characteristics, and consequently be correlated a plots are identified by specifyg B and V, ie B*V. SAS/IXED procedure caters for the random effects of B*V. SAS/IXED code proc mixed data = oats; class b v n; model y = b n v n*v / htype = 1; random b*v; estimate 'Variety 1v2' v 1-1 0; estimate 'N 1v2' n ; run; b*v each level of this corresponds to a ma plot which contributes its characteristics to the response 18

19 SAS/IXED OUTPUT Type 1 Tests of ixed Effects Effect Num D Den D Value Pr > b n <.0001 v v*n V*N teraction is far from significant (so the model could be re-run without this term) 2. N has a strong effect 3. Varieties do not appear to differ SAS/IXED OUTPUT Covariance Parameter Estimates Cov Parm Estimate b*v Residual There are 2 sources of background variation: 1. with ma plots (component of variance: ) 2. between ma plots (component of variance: ) Different comparisons will use appropriate combations of these 19

20 LSmeans & SEDs Code lsmeans v n v*n / diff; a effect means (ie for v (SED = 7.079), n (SED = 4.436) and SEDs are appropriate if v*n is non-significant Comparison of the 12 v*n means volves 2 different SEDs When comparg 2 levels of N with a particular variety (SED = 7.683) When makg comparisons where the variety changes (SED = 9.715) lsmeans / diff; OUTPUT fferences of Least Squares eans Effect Variety Nitrogen Variety Nitrogen Estimate Standard Error D t Value Pr > t v Golden Ra arvellous v Golden Ra Victory v arvellous Victory n <.0001 n <.0001 n <.0001 n n <.0001 n v*n Golden Ra 0 Golden Ra v*n Golden Ra 0 Golden Ra <.0001 v*n Golden Ra 0 Golden Ra <.0001 v*n v*n Golden Ra Golden Ra 0 0 arvellous arvellous

21 Is there a simple description of the N effect? Usg SAS/EANS procedure calculate the mean yield for each treatment (V, N combation) Plot ean Yield v Nitrogen Level for each variety it a le for each variety. Split plot design 21

22 SAS/IXED: Split Plot Analysis N seems to have equal effect on yield for the 3 varieties Plot suggests effect of N is lear Add variable nval = n to data set oats it & test for separate slopes and LO: proc mixed data = oats; class b v n; model y = b nval n v nval*v n*v / htype = 1; random b*v; run; n n*v terms measure LO nval*v checks equal slopes SAS OUTPUT Type 1 Tests of ixed Effects Num Den Effect D D Value Pr > b nval <.0001 n v nval*v v*n LO terms not significant (p =.27,.93) 2. Lear effect of N same for the 3 varieties (p =.63) 3. Lear effect of N highly significant (p <.0001) 4. Refit model without LO terms and teraction nval*v to get coefficient of nval: 0.59 (SE: ) 22

23 Practical exercise SAS/IXED for Split Plot Analysis Lab Session 6 Exercise 6.2 (a) (e) Simple Analyses of Correlated Data 23

24 A simple analysis combe correlated values Seedlgs are planted 20 each of 10 contaers and the dry mass of each plant measured after 60 days. Responses from plants the same contaer will be correlated. Use as response variable the total dry mass of all 20 plants the same contaer. This gives 10 dependent response values Responses from piglets the same litter could be combed, though adjustments would have to be made for varyg litter size. Averages from large litters could give more precise formation than from small, but may also be lower. A simple analysis combe correlated values easurg plant height every 14 days for 10 weeks gives 6 measurements for each plant (after 0, 2, 4, 6, 8, and 10 weeks) If daily growth is of terest a sgle value from each plant could be calculated by (fal height itial height)/140 the slope of the fitted le to the data from the plant 24

25 Repeated easures Analysis utiple measures on the same units at the same time Repeated measurements on the same units over time Example The weights of 27 male and female children were measured at ages 8, 10, 12 and 14 years. Child Gender To keep SAS happy replace 8, 10, etc with w_age8, w_age10, etc Child Gender

26 Issues Repeated easures Analysis 1. What questions are to be addressed by the analysis? Can these be answered by reducg the measurements to a sgle value? eg fal weight weight ga over a particular period (from 8 to 10 or 10 to 14 yr) the slope of a straight le Issues Repeated easures Analysis 2. Are the measurements taken at equally spaced time pots? When times are not equally spaced, and even more so when patterns of timg differ from unit to unit, the analysis is more difficult 26

27 Issues Repeated easures Analysis 3. What is a reasonable error/correlation structure for the data? Are measurements equally related, irrespective of whether they are close together time or otherwise? Does the correlation decle as time pots are farther apart? Can this decle be described simply? Child Weights Example simple analysis usg 2 possible sgle values data c; set childweights; wga1 = y4 - y1; run; proc glm data = c; class gender; model wga1 = gender; lsmeans gender; estimate 'diff -' gender -1 1; quit; 27

28 SAS OUTPUT Ga 8 10 yr Parameter Estimate Standard Error t Value Pr > t diff Ga yr Parameter Estimate Standard Error t Value Pr > t diff Differences between genders depend on age ie there is a gender year teraction The complete data set needs to be analysed Gettg the picture! Are the slopes equal? 28

29 Repeated easures Analysis of Weight Restructure data for SAS Obs Child Gender weight age Centre age usg cage = age - 11 This means that gender comparisons will be made at age 11 yr. Repeated easures Analysis of Weight Observations from the same child will be correlated A simple way to clude this is to troduce a random effect for the child However: this assumes that any two observations on the same child are equally correlated, dependent of whether far apart or not (COPOUND SYETRY) proc mixed data = cw; class gender cage child; model weight = gender cage cage*gender / ddfm = kenwardroger htype = 1; random child; run; Note: it s best to use model option: ddfm = kenwardroger 29

30 IXED selected OUTPUT Least Squares eans Effect Gender Estimate Standard Error D t Value Pr > t Gender <.0001 Gender <.0001 Effect Gender cage Type 1 Tests of ixed Effects Num D Den D Value Pr > <.0001 Covariance Parameter Estimates Cov Parm Child Residual Estimate cage*gender The common correlation between observations on the same subject (child) is /( ) = 0.63 IXED selected OUTPUT Gender Difference Differences of Least Squares eans Effect Gender Gender Estimate Standard Error D t Value Pr > t Gender LSmeans and difference refer to cage = 0 30

31 IXED selected OUTPUT Age Effect Solution for ixed Effects Effect Gender Estimate Standard Error D t Value Pr > t Intercept <.0001 Gender Gender cage <.0001 cage*gender cage*gender Slopes : = : These differ significantly (p = ) IXED selected OUTPUT rcorr option gives correlation between the 4 measurements on each subject (child) Estimated R Correlation atrix for Child 1 Row Col1 Col2 Col3 Col Note the CS structure and value of the correlation 31

32 Was compound symmetry a good choice for Child Weights example? What are the alternatives? Alternatives to Compound Symmetry Alternatively we can specify that measurements on the same subject are correlated with compound symmetry or another structure as follows: IXED syntax: replace random child; with Choice of correlation structure: cs ar(1) unstructured repeated / type = cs subject = child rcorr; With balanced data these are equivalent, but repeated permits more complex structures: AR(1) correlation gets less for measurement father apart beg ρ, ρ 2, ρ 3, for measurements 1, 2, 3, time units apart (eg.5,.25,.125, ) UNSTRUCTURED no particular structure assumed 32

33 Gettg the correlation structure correct Unstructured correlation Estimated R Correlation atrix for Child 1 Row Col1 Col2 Col3 Col Not very different from CS 2. Other results similar 3. Correlation does not decrease with separation don t use AR(1) correlation Gettg the correlation structure correct Tests for correlation structure Chi-squared tests can compare CS and AR(1) with unstructured (UN) (they are cluded UN) If both CS and AR(1) are not rejected when compared to UN there isn t a formal test; be guided by appropriateness and the prciple of maximisg likelihood or the formal test we need to note the number of estimated parameters the structures beg compared the difference is the df for the relevant chisquared test 33

34 Chi-squared test Need to re-run IXED with different structures. The default method used by IXED is REL which is best for determg random effects IXED OUTPUT (usg REL) cludes a table of it cludg -2 Res Log Likelihood The number of covariance parameters is OUTPUT the Dimensions Testg whether a simplified covariance structure will do the chi-squared df is the difference covariance parameters where p = 1 - cdf( chisq, d, df); d = change (-2 Res Log Likelihood) Child Weights Example Structure df -2 Res Log Likelihood Change from UN p UN CS (8 df) 0.32 AR(1) (8 df) 0.01 data _null_; p = 1-cdf( chisq, 9.3, 8); put p = ; run; Result - SAS LOG: p= CS ok AR(1) rejected 34

35 ore correlation structures See SAS Help! or example: suppose some subjects are more variable than others, but there is still a common correlation between different measurements on the same subject. Covariance Parameter Estimates Cov Parm Subject Estimate Var(1) Child Var(2) Child Var(3) Child Use Heterogeneous CS: Var(4) CSH Child Child type = csh Structure CSH CS df Res Log Likelihood Change from CSH 1.8 (3 df) p 0.61 CS not rejected when compared to CSH SAS/IXED code The Idea Summarised Values that are correlated must be identified - each set of correlated values beg a level of some factor Introduce a variable identifyg correlated values Include it the class statement Use a repeated statement & specify the correlation structure Alternatively: use a random statement if the correlation is simply due to units sharg a common fluences 35

36 Practical exercise SAS/IXED for Repeated easures Analysis Lab Session 6 Exercise 6.3 (a) (d) Random coefficients model Previous odel assumes a sgle regression le, with subjects (children) deviatg from this each year with correlated deviations. ore generally we could allow each subject to have its own regression le; this has the advantage beg possible when subjects are measured at different times Sce subjects are randomly allocated to treatments, the dividual tercepts and slopes are random variables We fit a fixed le (tercept and slope beg the estimated mean for all subjects) Each subject has its own le resultg from addg/subtractg from the mean tercept and slope. 36

37 Random coefficients model 6 (of 11) females 6 (of 16) males Plots show a le fit to each child s data (for the first 6 males and 6females) SAS/IXED: random coefficients model SAS/IXED code proc mixed data = cw; class gender child; model weight = gender cage gender*cage / ddfm = kenwardroger htype = 1 solution; random tercept cage / type = un subject = child gcorr; run; 37

38 IXED: selected OUTPUT Covariance Parameter Estimates Cov Parm Subject Estimate UN(1,1) Child UN(2,1) Child UN(2,2) Child Residual Estimated G Correlation atrix Row Effect Child Col1 Col2 1 Intercept age Type 1 Tests of ixed Effects Effect Num D Den D Value Pr > Gender cage <.0001 cage*gender Correlation between tercept and slope: Variances: tercept 5.79, slope 0.033, residual Conclusions: the pattern of growth differs between males and females (p = ), as with earlier analysis. Average slopes unchanged. The End but hopefully just the begng! 38

SAS Syntax and Output for Data Manipulation:

SAS Syntax and Output for Data Manipulation: Psyc 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (in preparation) chapter 5. We will be examining

More information

Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA

Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA Paper P-702 Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA ABSTRACT Individual growth models are designed for exploring longitudinal data on individuals

More information

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form. One-Degree-of-Freedom Tests Test for group occasion interactions has (number of groups 1) number of occasions 1) degrees of freedom. This can dilute the significance of a departure from the null hypothesis.

More information

Family economics data: total family income, expenditures, debt status for 50 families in two cohorts (A and B), annual records from 1990 1995.

Family economics data: total family income, expenditures, debt status for 50 families in two cohorts (A and B), annual records from 1990 1995. Lecture 18 1. Random intercepts and slopes 2. Notation for mixed effects models 3. Comparing nested models 4. Multilevel/Hierarchical models 5. SAS versions of R models in Gelman and Hill, chapter 12 1

More information

861 Example SPLH. 5 page 1. prefer to have. New data in. SPSS Syntax FILE HANDLE. VARSTOCASESS /MAKE rt. COMPUTE mean=2. COMPUTE sal=2. END IF.

861 Example SPLH. 5 page 1. prefer to have. New data in. SPSS Syntax FILE HANDLE. VARSTOCASESS /MAKE rt. COMPUTE mean=2. COMPUTE sal=2. END IF. SPLH 861 Example 5 page 1 Multivariate Models for Repeated Measures Response Times in Older and Younger Adults These data were collected as part of my masters thesis, and are unpublished in this form (to

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

Random effects and nested models with SAS

Random effects and nested models with SAS Random effects and nested models with SAS /************* classical2.sas ********************* Three levels of factor A, four levels of B Both fixed Both random A fixed, B random B nested within A ***************************************************/

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Introduction to Longitudinal Data Analysis

Introduction to Longitudinal Data Analysis Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

More information

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2 University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Linear Models in STATA and ANOVA

Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Assignments Analysis of Longitudinal data: a multilevel approach

Assignments Analysis of Longitudinal data: a multilevel approach Assignments Analysis of Longitudinal data: a multilevel approach Frans E.S. Tan Department of Methodology and Statistics University of Maastricht The Netherlands Maastricht, Jan 2007 Correspondence: Frans

More information

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares Linear Regression Chapter 5 Regression Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). We can then predict the average response for all subjects

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

Binary Logistic Regression

Binary Logistic Regression Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

Factors affecting online sales

Factors affecting online sales Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

4. Multiple Regression in Practice

4. Multiple Regression in Practice 30 Multiple Regression in Practice 4. Multiple Regression in Practice The preceding chapters have helped define the broad principles on which regression analysis is based. What features one should look

More information

SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Correlation key concepts:

Correlation key concepts: CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

More information

Geostatistics Exploratory Analysis

Geostatistics Exploratory Analysis Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

More information

Chapter 15. Mixed Models. 15.1 Overview. A flexible approach to correlated data.

Chapter 15. Mixed Models. 15.1 Overview. A flexible approach to correlated data. Chapter 15 Mixed Models A flexible approach to correlated data. 15.1 Overview Correlated data arise frequently in statistical analyses. This may be due to grouping of subjects, e.g., students within classrooms,

More information

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002

More information

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

More information

Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

An analysis method for a quantitative outcome and two categorical explanatory variables.

An analysis method for a quantitative outcome and two categorical explanatory variables. Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc.

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Paper 264-26 Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Abstract: There are several procedures in the SAS System for statistical modeling. Most statisticians who use the SAS

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

More information

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS. SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

More information

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

containing Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics.

containing Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics. Getting Correlations Using PROC CORR Correlation analysis provides a method to measure the strength of a linear relationship between two numeric variables. PROC CORR can be used to compute Pearson product-moment

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Regression and Correlation

Regression and Correlation Regression and Correlation Topics Covered: Dependent and independent variables. Scatter diagram. Correlation coefficient. Linear Regression line. by Dr.I.Namestnikova 1 Introduction Regression analysis

More information

Using Excel for inferential statistics

Using Excel for inferential statistics FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

More information

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

More information

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables. SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

More information

Pearson's Correlation Tests

Pearson's Correlation Tests Chapter 800 Pearson's Correlation Tests Introduction The correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. The correlation

More information

GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

More information

Technical report. in SPSS AN INTRODUCTION TO THE MIXED PROCEDURE

Technical report. in SPSS AN INTRODUCTION TO THE MIXED PROCEDURE Linear mixedeffects modeling in SPSS AN INTRODUCTION TO THE MIXED PROCEDURE Table of contents Introduction................................................................3 Data preparation for MIXED...................................................3

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Financial Risk Management Exam Sample Questions/Answers

Financial Risk Management Exam Sample Questions/Answers Financial Risk Management Exam Sample Questions/Answers Prepared by Daniel HERLEMONT 1 2 3 4 5 6 Chapter 3 Fundamentals of Statistics FRM-99, Question 4 Random walk assumes that returns from one time period

More information

Electronic Thesis and Dissertations UCLA

Electronic Thesis and Dissertations UCLA Electronic Thesis and Dissertations UCLA Peer Reviewed Title: A Multilevel Longitudinal Analysis of Teaching Effectiveness Across Five Years Author: Wang, Kairong Acceptance Date: 2013 Series: UCLA Electronic

More information

So, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1.

So, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1. Joint probabilit is the probabilit that the RVs & Y take values &. like the PDF of the two events, and. We will denote a joint probabilit function as P,Y (,) = P(= Y=) Marginal probabilit of is the probabilit

More information

ADVANCED FORECASTING MODELS USING SAS SOFTWARE

ADVANCED FORECASTING MODELS USING SAS SOFTWARE ADVANCED FORECASTING MODELS USING SAS SOFTWARE Girish Kumar Jha IARI, Pusa, New Delhi 110 012 gjha_eco@iari.res.in 1. Transfer Function Model Univariate ARIMA models are useful for analysis and forecasting

More information

Directions for using SPSS

Directions for using SPSS Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

More information

How To Model A Series With Sas

How To Model A Series With Sas Chapter 7 Chapter Table of Contents OVERVIEW...193 GETTING STARTED...194 TheThreeStagesofARIMAModeling...194 IdentificationStage...194 Estimation and Diagnostic Checking Stage...... 200 Forecasting Stage...205

More information

Name: Date: Use the following to answer questions 2-3:

Name: Date: Use the following to answer questions 2-3: Name: Date: 1. A study is conducted on students taking a statistics class. Several variables are recorded in the survey. Identify each variable as categorical or quantitative. A) Type of car the student

More information

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there is a relationship between variables, To find out the

More information

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2 Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

More information

Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

More information

Using SAS Proc Mixed for the Analysis of Clustered Longitudinal Data

Using SAS Proc Mixed for the Analysis of Clustered Longitudinal Data Using SAS Proc Mixed for the Analysis of Clustered Longitudinal Data Kathy Welch Center for Statistical Consultation and Research The University of Michigan 1 Background ProcMixed can be used to fit Linear

More information

Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

More information

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

More information

The Big Picture. Correlation. Scatter Plots. Data

The Big Picture. Correlation. Scatter Plots. Data The Big Picture Correlation Bret Hanlon and Bret Larget Department of Statistics Universit of Wisconsin Madison December 6, We have just completed a length series of lectures on ANOVA where we considered

More information

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

More information

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU PITFALLS IN TIME SERIES ANALYSIS Cliff Hurvich Stern School, NYU The t -Test If x 1,..., x n are independent and identically distributed with mean 0, and n is not too small, then t = x 0 s n has a standard

More information

The Chi-Square Test. STAT E-50 Introduction to Statistics

The Chi-Square Test. STAT E-50 Introduction to Statistics STAT -50 Introduction to Statistics The Chi-Square Test The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed

More information

Biostatistics Short Course Introduction to Longitudinal Studies

Biostatistics Short Course Introduction to Longitudinal Studies Biostatistics Short Course Introduction to Longitudinal Studies Zhangsheng Yu Division of Biostatistics Department of Medicine Indiana University School of Medicine Zhangsheng Yu (Indiana University) Longitudinal

More information

Data Mining and Predictive Modeling in Institutional Advancement: How Ten Schools Found Success

Data Mining and Predictive Modeling in Institutional Advancement: How Ten Schools Found Success Technical report Data Mg and Predictive Modelg Institutional Advancement: How Ten Schools Found Success Dan Luperchio, Campaign Admistrator Zanvyl Krieger School of Arts and Sciences The Johns Hopks University

More information

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

More information

ABSORBENCY OF PAPER TOWELS

ABSORBENCY OF PAPER TOWELS ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?

More information

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

More information

10. Analysis of Longitudinal Studies Repeat-measures analysis

10. Analysis of Longitudinal Studies Repeat-measures analysis Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

Multivariate Analysis of Variance (MANOVA)

Multivariate Analysis of Variance (MANOVA) Chapter 415 Multivariate Analysis of Variance (MANOVA) Introduction Multivariate analysis of variance (MANOVA) is an extension of common analysis of variance (ANOVA). In ANOVA, differences among various

More information

Examining a Fitted Logistic Model

Examining a Fitted Logistic Model STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic

More information