ANOVA. February 12, 2015


1 ANOVA models

Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix.

In [1]: %%R
    url = "..."   # URL not preserved in this transcription
    salary.table = read.table(url, header=T)
    salary.table$E = factor(salary.table$E)
    salary.table$M = factor(salary.table$M)
    salary.lm = lm(S ~ X + E + M, salary.table)
    head(model.matrix(salary.lm))

    (Intercept) X E2 E3 M
    ...

Often, especially in experimental settings, we record only categorical variables. Such models are often referred to as ANOVA (Analysis of Variance) models. These are generalizations of our favorite example, the two-sample t-test.

1.1 Example: recovery time

Suppose we want to understand the relationship between recovery time after surgery and a patient's prior fitness. We group patients into three fitness levels: below average, average, above average. If you are in better shape before surgery, does it take less time to recover?

In [2]: %%R
    url = "..."   # URL not preserved in this transcription
    rehab.table = read.table(url, header=T, sep=',')
    rehab.table$Fitness <- factor(rehab.table$Fitness)
    head(rehab.table)

    Fitness Time
    ...

In [3]: %%R -h 800 -w 800
    attach(rehab.table)
    boxplot(Time ~ Fitness, col=c('red','green','blue'))

[Figure: boxplots of Time for each Fitness level]
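To make the link to the two-sample t-test concrete, here is a minimal sketch on simulated data (the group names, sizes, and means are made up for illustration and are not the rehab data). With only two groups, the pooled-variance t statistic squared should equal the one-way ANOVA F statistic.

```r
# Simulated (hypothetical) recovery times for two fitness groups
set.seed(1)
time  <- c(rnorm(8, mean = 38), rnorm(10, mean = 32))
group <- factor(rep(c("below", "above"), times = c(8, 10)))

# Pooled-variance two-sample t-test
tt <- t.test(time ~ group, var.equal = TRUE)

# The same comparison written as a one-way ANOVA with two groups
fit    <- lm(time ~ group)
F_stat <- anova(fit)[1, "F value"]

# With two groups, t^2 and F coincide
t_sq <- unname(tt$statistic)^2
print(c(t_squared = t_sq, F = F_stat))
```

With three or more groups there is no single t statistic to square, which is why the F-test is the natural generalization.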

1.2 One-way ANOVA

First generalization of the two-sample t-test: more than two groups.

Observations are broken up into $r$ groups with $n_i$, $1 \leq i \leq r$, observations per group.

Model:
$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2).$$

Constraint: $\sum_{i=1}^r \alpha_i = 0$. This constraint is needed for identifiability. It is equivalent to only adding $r-1$ columns to the design matrix for this qualitative variable.

This is not the same parameterization we get when only adding $r-1$ indicator columns, but it gives the same model.

The estimates of $\alpha$ can be obtained from the estimates of $\beta$ using R's default parameters.

For a more detailed exploration into R's creation of design matrices, try reading the following tutorial on design matrices.

1.3 Remember, it's still a model (i.e. a plane)

1.4 Fitting the model

Model is easy to fit:
$$\widehat{Y}_{ij} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij} = \overline{Y}_{i\cdot}.$$

If an observation is in the $i$-th group, its predicted mean is just the sample mean of the observations in the $i$-th group.

Simplest question: is there any group (main) effect?
$$H_0: \alpha_1 = \dots = \alpha_r = 0?$$

Test is based on an $F$-test with full model vs. reduced model. The reduced model just has an intercept.

Other questions: is the effect the same in groups 1 and 2?
$$H_0: \alpha_1 = \alpha_2?$$

In [4]: %%R
    rehab.lm <- lm(Time ~ Fitness)
    summary(rehab.lm)

    Call:
    lm(formula = Time ~ Fitness)

    Residuals:
    Min 1Q Median 3Q Max
    ...

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) ...                         < 2e-16 ***
    Fitness2    ...                         ...     **
    Fitness3    ...                         ...e-06 ***

    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: ... on 21 degrees of freedom
    Multiple R-squared: ..., Adjusted R-squared: ...
    F-statistic: ... on 2 and 21 DF, p-value: 4.129e-05

In [5]: %%R
    print(predict(rehab.lm, list(Fitness=factor(c(1,2,3)))))
    c(mean(Time[Fitness == 1]), mean(Time[Fitness == 2]), mean(Time[Fitness == 3]))

    [1] ...

Recall that the rows of the Coefficients table above do not correspond to the $\alpha$ parameter. For one thing, we would see three $\alpha$'s and their sum would have to be equal to 0.

Also, the design matrix is the indicator coding we saw last time.

In [6]: %%R
    head(model.matrix(rehab.lm))

    (Intercept) Fitness2 Fitness3
    ...

There are ways to get different design matrices by using the contrasts argument. This is a bit above our pay grade at the moment.

Upon inspection of the design matrix above, we see that the (Intercept) coefficient corresponds to the mean in Fitness==1, while the Fitness2 coefficient corresponds to the difference between the groups Fitness==2 and Fitness==1.

1.5 ANOVA table

Much of the information in an ANOVA model is contained in the ANOVA table.

In [8]: make_table(anova_oneway)
    apply_theme('basic')

Out[8]: <ipy_table.IpyTable at 0x107d8c250>

In [9]: %%R
    anova(rehab.lm)

    Analysis of Variance Table

    Response: Time
              Df Sum Sq Mean Sq F value Pr(>F)
    Fitness   ...                       ...e-05 ***
    Residuals ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that $MSTR$ measures variability of the cell means. If there is a group effect we expect this to be large relative to $MSE$.

We see that under $H_0: \alpha_1 = \dots = \alpha_r = 0$, both $MSTR$ and $MSE$ have expected value $\sigma^2$. This tells us how to test $H_0$ using the ratio of mean squares, i.e. an $F$ test.

1.6 Testing for any main effect

Rows in the ANOVA table are, in general, independent.

Therefore, under $H_0$,
$$F = \frac{MSTR}{MSE} = \frac{SSTR / df_{TR}}{SSE / df_{E}} \sim F_{df_{TR},\, df_E};$$
the degrees of freedom come from the Df column in the previous table.

Reject $H_0$ at level $\alpha$ if $F > F_{1-\alpha,\, df_{TR},\, df_E}$.

In [10]: %%R
    F = ... / ...   # ratio of the two mean squares; the numeric values are missing from this transcription
    pval = 1 - pf(F, 2, 21)
    print(data.frame(F,pval))

    F pval
    ...

1.7 Inference for linear combinations

Suppose we want to infer something about
$$\sum_{i=1}^r a_i \mu_i$$
where $\mu_i = \mu + \alpha_i$ is the mean in the $i$-th group. For example:
$$H_0: \mu_1 - \mu_2 = 0 \quad (\text{same as } H_0: \alpha_1 - \alpha_2 = 0)?$$

For example: is there a difference between the below average and average groups in terms of rehab time?

We need to know
$$\text{Var}\left(\sum_{i=1}^r a_i \overline{Y}_{i\cdot}\right) = \sigma^2 \sum_{i=1}^r \frac{a_i^2}{n_i}.$$

After this, the usual confidence intervals and t-tests apply.

In [11]: %%R
    head(model.matrix(rehab.lm))

    (Intercept) Fitness2 Fitness3
    ...

This means that the coefficient Fitness2 is the estimated difference between the two groups.

In [12]: %%R
    detach(rehab.table)

1.8 Two-way ANOVA

Often, we will have more than one variable we are changing.

1.8.1 Example

After kidney failure, we suppose that the time of stay in hospital depends on weight gain between treatments and duration of treatment.

We will model the log number of days as a function of the other two factors.

In [14]: make_table(desc)
    apply_theme('basic')

Out[14]: <ipy_table.IpyTable at 0x107d8cd90>

In [15]: %%R
    url = "..."   # URL not preserved in this transcription
    kidney.table = read.table(url, header=T)
    kidney.table$D = factor(kidney.table$Duration)
    kidney.table$W = factor(kidney.table$Weight)
    kidney.table$logdays = log(kidney.table$Days + 1)
    attach(kidney.table)
    head(kidney.table)

    Days Duration Weight ID D W logdays
    ...

1.8.2 Two-way ANOVA model

Second generalization of the t-test: more than one grouping variable.

Two-way ANOVA model:
$r$ groups in the first factor;
$m$ groups in the second factor;
$n_{ij}$ observations in each combination of factor variables.

Model:
$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \qquad \varepsilon_{ijk} \sim N(0, \sigma^2).$$

In the kidney example, $r = 3$ (weight gain), $m = 2$ (duration of treatment), $n_{ij} = 10$ for all $(i, j)$.

1.8.3 Questions of interest

Two-way ANOVA: main questions of interest.

Are there main effects for the grouping variables?
$$H_0: \alpha_1 = \dots = \alpha_r = 0, \qquad H_0: \beta_1 = \dots = \beta_m = 0.$$

Are there interaction effects?
$$H_0: (\alpha\beta)_{ij} = 0, \quad 1 \leq i \leq r, \ 1 \leq j \leq m.$$

1.8.4 Interactions between factors

We've already seen these interactions in the IT salary example.

An additive model says that the effects of the two factors occur additively; such a model has no interactions.

An interaction is present whenever the additive model does not hold.

1.8.5 Interaction plot

In [16]: %%R -h 800 -w 800
    interaction.plot(W, D, logdays, type='b', col=c('red','blue'), lwd=2, pch=c(23,24))

[Figure: interaction plot of mean logdays against W, one broken line per level of D]

When these broken lines are not parallel, there is evidence of an interaction. The one thing missing from this plot is error bars. The above broken lines are clearly not parallel, but there is measurement error. If the error bars were large then we might consider there to be no interaction; otherwise we might.

1.8.6 Parameterization

Many constraints are needed, again for identifiability. Let's not worry too much about the details.

Constraints:
$$\sum_{i=1}^r \alpha_i = 0$$
$$\sum_{j=1}^m \beta_j = 0$$
$$\sum_{j=1}^m (\alpha\beta)_{ij} = 0, \quad 1 \leq i \leq r$$
$$\sum_{i=1}^r (\alpha\beta)_{ij} = 0, \quad 1 \leq j \leq m.$$

We should convince ourselves that we now have exactly $r \cdot m$ free parameters.
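One way to do the counting, as a sketch: the $r + m$ constraints on the interaction terms contain one redundancy (summing the row constraints over $i$ gives the same identity as summing the column constraints over $j$), so the $rm$ interaction terms keep $rm - (r + m - 1) = (r-1)(m-1)$ free values. Adding up the free parameters of each term:

```latex
\underbrace{1}_{\mu}
\;+\; \underbrace{(r-1)}_{\alpha_i}
\;+\; \underbrace{(m-1)}_{\beta_j}
\;+\; \underbrace{(r-1)(m-1)}_{(\alpha\beta)_{ij}}
\;=\; rm
```

This matches the number of cell means $\mu_{ij}$, one per combination of the two factors; in the kidney example $rm = 6$.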

1.8.7 Fitting the model

Easy to fit when $n_{ij} = n$ (balanced):
$$\widehat{Y}_{ijk} = \overline{Y}_{ij\cdot} = \frac{1}{n} \sum_{k=1}^n Y_{ijk}.$$

Inference for combinations:
$$\text{Var}\left(\sum_{i=1}^r \sum_{j=1}^m a_{ij} \overline{Y}_{ij\cdot}\right) = \frac{\sigma^2}{n} \sum_{i=1}^r \sum_{j=1}^m a_{ij}^2.$$

Usual t-tests, confidence intervals.

In [17]: %%R
    kidney.lm = lm(logdays ~ D*W)
    summary(kidney.lm)

    Call:
    lm(formula = logdays ~ D * W)

    Residuals:
    Min 1Q Median 3Q Max
    ...

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) ...                         ...e-05 ***
    D2          ...
    W2          ...                         ...     *
    W3          ...                         ...e-05 ***
    D2:W2       ...
    D2:W3       ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: ... on 54 degrees of freedom
    Multiple R-squared: ..., Adjusted R-squared: ...
    F-statistic: ... on 5 and 54 DF, p-value: 2.301e...

1.8.8 Example

Suppose we are interested in comparing the mean in the $(D = 1, W = 3)$ and $(D = 2, W = 2)$ groups. The difference is
$$E(\overline{Y}_{13\cdot} - \overline{Y}_{22\cdot}).$$

By independence, its variance is
$$\text{Var}(\overline{Y}_{13\cdot}) + \text{Var}(\overline{Y}_{22\cdot}) = \frac{2\sigma^2}{n}.$$
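Combining this variance with the usual $t$ quantile gives a confidence interval for the difference. As a sketch, writing $\widehat{\sigma}$ for the residual standard error and using the residual degrees of freedom $(n-1)mr$ (54 in the kidney example, with $n = 10$, $m = 2$, $r = 3$), and with $\alpha$ here denoting the level rather than a group effect:

```latex
\overline{Y}_{13\cdot} - \overline{Y}_{22\cdot}
\;\pm\; t_{1-\alpha/2,\,(n-1)mr} \cdot \widehat{\sigma} \sqrt{\frac{2}{n}}
```

For a 95% interval the quantile is `qt(0.975, 54)`, which is the quantity used in the computation that follows.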

In [18]: %%R
    estimates = predict(kidney.lm, list(D=factor(c(1,2)), W=factor(c(3,2))))
    print(estimates)
    sigma.hat = ...   # from table above; the numeric value is missing from this transcription
    n = 10            # ten observations per group
    fit = estimates[1] - estimates[2]
    upper = fit + qt(0.975, 54) * sqrt(2 * sigma.hat^2 / n)
    lower = fit - qt(0.975, 54) * sqrt(2 * sigma.hat^2 / n)
    data.frame(fit,lower,upper)

    fit lower upper
    ...

In [19]: %%R
    head(model.matrix(kidney.lm))

    (Intercept) D2 W2 W3 D2:W2 D2:W3
    ...

1.8.9 Finding predicted values

The most direct way to compute predicted values is using the predict function.

In [20]: %%R
    predict(kidney.lm, list(D=factor(1), W=factor(1)), interval='confidence')

    fit lwr upr
    ...

1.8.10 ANOVA table

In the balanced case, everything can again be summarized from the ANOVA table.

In [22]: make_table(anova_twoway)
    apply_theme('basic')

Out[22]: <ipy_table.IpyTable at 0x107d8c890>

1.8.11 Tests using the ANOVA table

Rows of the ANOVA table can be used to test several of the hypotheses we started out with.

For instance, we see that under $H_0: (\alpha\beta)_{ij} = 0, \ \forall i, j$, both $MSAB$ and $MSE$ have expected value $\sigma^2$; use these for an $F$-test testing for an interaction.

Under $H_0$,
$$F = \frac{MSAB}{MSE} = \frac{SSAB / ((m-1)(r-1))}{SSE / ((n-1)mr)} \sim F_{(m-1)(r-1),\, (n-1)mr}.$$

In [23]: %%R
    anova(kidney.lm)

    Analysis of Variance Table

    Response: logdays
              Df Sum Sq Mean Sq F value Pr(>F)
    D         ...                       ...     *
    W         ...                       ...e-06 ***
    D:W       ...
    Residuals ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can also test for interactions using our usual approach.

In [24]: %%R
    anova(lm(logdays ~ D + W, kidney.table), kidney.lm)

    Analysis of Variance Table

    Model 1: logdays ~ D + W
    Model 2: logdays ~ D * W
    Res.Df RSS Df Sum of Sq F Pr(>F)
    ...

1.8.12 Some caveats about R formulae

While we see that it is straightforward to form the interactions test using our usual anova function approach, we generally cannot test for main effects by this approach.

In [25]: %%R
    lm_no_main_weight = lm(logdays ~ D + W:D)
    anova(lm_no_main_weight, kidney.lm)

    Analysis of Variance Table

    Model 1: logdays ~ D + W:D
    Model 2: logdays ~ D * W
    Res.Df RSS Df Sum of Sq F Pr(>F)
    ... ...e-15

In fact, these models are identical in terms of their planes or their fitted values. What has happened is that R has formed a different design matrix using its rules for formula objects.
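The claim that the two formulas describe the same plane can be checked directly. The following is a minimal sketch on simulated data: the data frame `sim` and its response values are made up, with only the balanced 2 x 3 layout and 10 observations per cell borrowed from the kidney example.

```r
# Hypothetical balanced layout: 2 durations x 3 weight groups, 10 observations each
set.seed(2)
sim <- expand.grid(k = 1:10, D = factor(1:2), W = factor(1:3))
sim$logdays <- rnorm(nrow(sim))

full   <- lm(logdays ~ D * W,   data = sim)
nested <- lm(logdays ~ D + W:D, data = sim)

# R builds differently named columns for the two formulas ...
print(colnames(model.matrix(full)))
print(colnames(model.matrix(nested)))

# ... but both design matrices have the same rank and give identical fitted values
same_rank <- full$rank == nested$rank
same_fit  <- isTRUE(all.equal(fitted(full), fitted(nested)))
print(c(same_rank = same_rank, same_fit = same_fit))
```

Because the column spaces coincide, comparing the two fits with `anova` has zero degrees of freedom to test, which is why it cannot serve as a test of the main effect of W.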

In [26]: %%R
    lm1 = lm(logdays ~ D + W:D)
    lm2 = lm(logdays ~ D + W:D + W)
    anova(lm1, lm2)

    Analysis of Variance Table

    Model 1: logdays ~ D + W:D
    Model 2: logdays ~ D + W:D + W
    Res.Df RSS Df Sum of Sq F Pr(>F)
    ...

ANOVA tables in general

So far, we have used anova to compare two models. In this section, we produced tables for just one model. This also works for any regression model, though we have to be a little careful about interpretation.

Let's revisit the job aptitude test data from last section.

In [27]: %%R
    url = "..."   # URL not preserved in this transcription
    jobtest.table <- read.table(url, header=T)
    jobtest.table$ETHN <- factor(jobtest.table$ETHN)
    jobtest.lm = lm(JPERF ~ TEST * ETHN, jobtest.table)
    summary(jobtest.lm)

    Call:
    lm(formula = JPERF ~ TEST * ETHN, data = jobtest.table)

    Residuals:
    Min 1Q Median 3Q Max
    ...

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) ...
    TEST        ...
    ETHN        ...
    TEST:ETHN   ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: ... on 16 degrees of freedom
    Multiple R-squared: ..., Adjusted R-squared: ...
    F-statistic: ... on 3 and 16 DF, p-value: ...

Now, let's look at the anova output. We'll see the results don't match.

In [28]: %%R
    anova(jobtest.lm)

    Analysis of Variance Table

    Response: JPERF
              Df Sum Sq Mean Sq F value Pr(>F)
    TEST      ...                       ... ***
    ETHN      ...
    TEST:ETHN ...
    Residuals ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The difference is in how the Sum Sq column is created. In the anova output, terms in the model are added sequentially.

We can see this by comparing these two models directly. The F statistic doesn't agree because the MSE above is computed in the fullest model, but the Sum of Sq is correct.

In [29]: %%R
    anova(lm(JPERF ~ TEST, jobtest.table), lm(JPERF ~ TEST + ETHN, jobtest.table))

    Analysis of Variance Table

    Model 1: JPERF ~ TEST
    Model 2: JPERF ~ TEST + ETHN
    Res.Df RSS Df Sum of Sq F Pr(>F)
    ...

Similarly, the first Sum Sq in anova can be found by:

In [30]: %%R
    anova(lm(JPERF ~ 1, jobtest.table), lm(JPERF ~ TEST, jobtest.table))

    Analysis of Variance Table

    Model 1: JPERF ~ 1
    Model 2: JPERF ~ TEST
    Res.Df RSS Df Sum of Sq F Pr(>F)
    ... ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There are ways to produce an ANOVA table whose p-values agree with summary. This is done by an ANOVA table that uses Type-III sums of squares.

In [31]: %%R
    library(car)
    Anova(jobtest.lm, type=3)

    Anova Table (Type III tests)

    Response: JPERF
                Sum Sq Df F value Pr(>F)
    (Intercept) ...
    TEST        ...
    ETHN        ...
    TEST:ETHN   ...
    Residuals   ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In [32]: %%R
    summary(jobtest.lm)

    Call:
    lm(formula = JPERF ~ TEST * ETHN, data = jobtest.table)

    Residuals:
    Min 1Q Median 3Q Max
    ...

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) ...
    TEST        ...
    ETHN        ...
    TEST:ETHN   ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: ... on 16 degrees of freedom
    Multiple R-squared: ..., Adjusted R-squared: ...
    F-statistic: ... on 3 and 16 DF, p-value: ...

2 Fixed and random effects

In the kidney and rehab examples, the categorical variables are well-defined categories: below average fitness, long duration, etc.

In some designs, the categorical variable is "subject".

Simplest example: repeated measures, where more than one (identical) measurement is taken on the same individual.

In this case, the "group" effect $\alpha_i$ is best thought of as random because we only sample a subset of the entire population.

2.0.1 When to use random effects?

A group effect is random if we can think of the levels we observe in that group as samples from a larger population.

Example: if collecting data from different medical centers, "center" might be thought of as random.

Example: if surveying students on different campuses, "campus" may be a random effect.

2.0.2 Example: sodium content in beer

How much sodium is there in North American beer? How much does this vary by brand?

Observations: for 6 brands of beer, we recorded the sodium content of 8 12-ounce bottles.

Questions of interest: what is the "grand mean" sodium content? How much variability is there from brand to brand?

"Individuals" in this case are brands; repeated measures are the 8 bottles.

In [33]: %%R
    url = "..."   # URL not preserved in this transcription
    sodium.table = read.table(url, header=T)
    sodium.table$brand = factor(sodium.table$brand)
    sodium.lm = lm(sodium ~ brand, sodium.table)
    anova(sodium.lm)

    Analysis of Variance Table

    Response: sodium
              Df Sum Sq Mean Sq F value Pr(>F)
    brand     ...                       < 2.2e-16 ***
    Residuals ...
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

2.0.3 One-way random effects model

Assume that cell sizes are the same, i.e. equal observations for each "subject" (brand of beer).

Observations:
$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad 1 \leq i \leq r, \ 1 \leq j \leq n$$
$$\varepsilon_{ij} \sim N(0, \sigma^2_{\epsilon}), \quad 1 \leq i \leq r, \ 1 \leq j \leq n$$
$$\alpha_i \sim N(0, \sigma^2_{\alpha}), \quad 1 \leq i \leq r.$$

Parameters:
$\mu$ is the population mean;
$\sigma^2_{\epsilon}$ is the measurement variance (i.e. how variable are the readings from the machine that reads the sodium content?);
$\sigma^2_{\alpha}$ is the population variance (i.e. how variable is the sodium content of beer across brands).
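To see what data from this model look like, here is a small simulation sketch. Only $r = 6$ and $n = 8$ come from the beer example; the values of `mu`, `sigma_alpha`, and `sigma_eps` are hypothetical, chosen purely for illustration.

```r
set.seed(3)
r <- 6; n <- 8                                  # brands and bottles per brand
mu <- 17; sigma_alpha <- 2; sigma_eps <- 0.7    # hypothetical parameter values

alpha  <- rnorm(r, 0, sigma_alpha)              # one random effect per brand
brand  <- factor(rep(1:r, each = n))
sodium <- mu + alpha[as.integer(brand)] + rnorm(r * n, 0, sigma_eps)
sim    <- data.frame(brand, sodium)

# Spread of the brand means reflects sigma_alpha (plus a little measurement noise)
brand_means <- tapply(sim$sodium, sim$brand, mean)
print(round(brand_means, 2))

# The fixed-effects ANOVA table can still be computed on such data
print(anova(lm(sodium ~ brand, sim)))
```

All bottles of one brand share the same draw of $\alpha_i$, which is what makes observations within a brand correlated.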

2.0.4 Modelling the variance

In the random effects model, the observations are no longer independent (even if the $\varepsilon$'s are independent):
$$\text{Cov}(Y_{ij}, Y_{i'j'}) = \left(\sigma^2_{\alpha} + \sigma^2_{\epsilon} \delta_{j,j'}\right) \delta_{i,i'}.$$

In more complicated models, this makes "maximum likelihood estimation" more complicated: least squares is no longer the best solution.

It's no longer a plane!

This model has a very simple model for the mean; it just has a slightly more complex model for the variance.

Shortly we'll see other more complex models of the variance:
Weighted Least Squares
Correlated Errors

2.0.5 Fitting the model

The MLE (Maximum Likelihood Estimator) is found by minimizing
$$-2 \log \ell(\mu, \sigma^2_{\epsilon}, \sigma^2_{\alpha} \mid Y) = \sum_{i=1}^r \left[ (Y_i - \mu)^T (\sigma^2_{\epsilon} I_{n_i \times n_i} + \sigma^2_{\alpha} 11^T)^{-1} (Y_i - \mu) + \log\left(\det(\sigma^2_{\epsilon} I_{n_i \times n_i} + \sigma^2_{\alpha} 11^T)\right) \right].$$

The function $\ell(\mu, \sigma^2_{\epsilon}, \sigma^2_{\alpha})$ is called the likelihood function.

2.0.6 Fitting the model in balanced design

Only one parameter in the mean function: $\mu$.

When cell sizes are the same (balanced),
$$\widehat{\mu} = \overline{Y}_{\cdot\cdot} = \frac{1}{nr} \sum_{i,j} Y_{ij}.$$

Unbalanced models: use a numerical optimizer.

This also changes estimates of $\sigma^2_{\epsilon}$; see the ANOVA table. We might guess that $df = nr - 1$ and
$$\widehat{\sigma}^2 = \frac{1}{nr-1} \sum_{i,j} (Y_{ij} - \overline{Y}_{\cdot\cdot})^2.$$
This is not correct.

In [34]: %%R
    library(nlme)
    sodium.lme = lme(fixed=sodium~1, random=~1|brand, data=sodium.table)
    summary(sodium.lme)

    Linear mixed-effects model fit by REML
    Data: sodium.table
    AIC BIC logLik
    ...

    Random effects:

    Formula: ~1 | brand
            (Intercept) Residual
    StdDev: ...

    Fixed effects: sodium ~ 1
                Value Std.Error DF t-value p-value
    (Intercept) ...

    Standardized Within-Group Residuals:
    Min Q1 Med Q3 Max
    ...

    Number of Observations: 48
    Number of Groups: 6

For reasons I'm not sure of, the degrees of freedom don't agree with our ANOVA, though we do find the correct SE for our estimate of $\mu$:

In [35]: %%R
    MSTR = anova(sodium.lm)$Mean[1]
    sqrt(MSTR/48)

    [1] ...

The intervals formed by lme use the 42 degrees of freedom, but are otherwise the same:

In [36]: %%R
    intervals(sodium.lme)

    Approximate 95% confidence intervals

    Fixed effects:
                lower est. upper
    (Intercept) ...
    attr(,"label")
    [1] "Fixed effects:"

    Random Effects:
    Level: brand
                    lower est. upper
    sd((Intercept)) ...

    Within-group standard error:
    lower est. upper
    ...

In [37]: %%R
    center = mean(sodium.table$sodium)
    lwr = center - sqrt(MSTR / 48) * qt(0.975,42)
    upr = center + sqrt(MSTR / 48) * qt(0.975,42)
    data.frame(lwr, center, upr)

    lwr center upr
    ...

Using our degrees of freedom, $r - 1 = 5$, yields slightly wider intervals.

In [38]: %%R
    center = mean(sodium.table$sodium)
    lwr = center - sqrt(MSTR / 48) * qt(0.975,5)
    upr = center + sqrt(MSTR / 48) * qt(0.975,5)
    data.frame(lwr, center, upr)

    lwr center upr
    ...

2.0.7 ANOVA table

Again, the information needed can be summarized in an ANOVA table.

In [40]: make_table(anova_oneway)
    apply_theme('basic')

Out[40]: <ipy_table.IpyTable at 0x107d8c990>

The ANOVA table is still useful to set up tests: the same $F$ statistics for fixed or random will work here.

Test for random effect: $H_0: \sigma^2_{\alpha} = 0$, based on
$$F = \frac{MSTR}{MSE} \sim F_{r-1,\, (n-1)r} \quad \text{under } H_0.$$

2.0.8 Inference for $\mu$

Easy to check that
$$E(\overline{Y}_{\cdot\cdot}) = \mu, \qquad \text{Var}(\overline{Y}_{\cdot\cdot}) = \frac{\sigma^2_{\epsilon} + n\sigma^2_{\alpha}}{rn}.$$

To come up with a $t$ statistic that we can use for tests and CIs, we need to find an estimate of $\text{Var}(\overline{Y}_{\cdot\cdot})$.

The ANOVA table says $E(MSTR) = n\sigma^2_{\alpha} + \sigma^2_{\epsilon}$, which suggests
$$\frac{\overline{Y}_{\cdot\cdot} - \mu}{\sqrt{\frac{MSTR}{rn}}} \sim t_{r-1}.$$

2.0.9 Degrees of freedom

Why $r - 1$ degrees of freedom?

Imagine we could record an infinite number of observations for each individual, so that $\overline{Y}_{i\cdot} \to \mu + \alpha_i$.

To learn anything about $\mu$ we still only have $r$ observations $(\mu_1, \dots, \mu_r)$.

Sampling more within an individual cannot narrow the CI for $\mu$.
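The "only $r$ observations" argument can be illustrated numerically. The sketch below uses simulated balanced data (the parameter values 17, 2, and 0.7 are hypothetical): in the balanced case the standard error $\sqrt{MSTR/(rn)}$ is exactly the standard error of the mean of the $r$ brand means, so the interval for $\mu$ is a one-sample $t$ interval on those means with $r - 1$ degrees of freedom.

```r
set.seed(4)
r <- 6; n <- 8
brand  <- factor(rep(1:r, each = n))
# Hypothetical values: grand mean 17, brand sd 2, measurement sd 0.7
sodium <- 17 + rnorm(r, 0, 2)[as.integer(brand)] + rnorm(r * n, 0, 0.7)

# Standard error of the grand mean from the ANOVA table
MSTR  <- anova(lm(sodium ~ brand))[1, "Mean Sq"]
se_mu <- sqrt(MSTR / (r * n))

# The same standard error computed from the r brand means alone
brand_means   <- tapply(sodium, brand, mean)
se_from_means <- sd(brand_means) / sqrt(r)

# 95% interval for mu using r - 1 degrees of freedom
center <- mean(sodium)
ci     <- center + c(-1, 1) * qt(0.975, r - 1) * se_mu
print(c(se_mu = se_mu, se_from_means = se_from_means))
print(ci)
```

The interval agrees with `t.test(brand_means)$conf.int`, which treats the six brand means as the only observations.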

2.0.10 Estimating $\sigma^2_{\alpha}$

We have seen estimates of $\mu$ and $\sigma^2_{\epsilon}$. Only one parameter remains.

Based on the ANOVA table, we see that
$$\sigma^2_{\alpha} = \frac{1}{n} \left(E(MSTR) - E(MSE)\right).$$

This suggests the estimate
$$\widehat{\sigma}^2_{\alpha} = \frac{1}{n} (MSTR - MSE).$$

However, this estimate can be negative!

Many such computational difficulties arise in random (and mixed) effects models.

2.1 Mixed effects model

The one-way random effects ANOVA is a special case of a so-called mixed effects model:
$$Y_{n \times 1} = X_{n \times p} \beta_{p \times 1} + Z_{n \times q} \gamma_{q \times 1}, \qquad \gamma \sim N(0, \Sigma).$$

Various models also consider restrictions on $\Sigma$ (e.g. diagonal, unrestricted, block diagonal, etc.).

Our multiple linear regression model is a (very simple) mixed-effects model with $q = n$,
$$Z = I_{n \times n}, \qquad \Sigma = \sigma^2 I_{n \times n}.$$
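As a sketch of how the balanced one-way random effects model fits this template, read the display above literally (no separate noise term), stack the brand effects and the measurement errors into $\gamma$, and assume the observations are ordered brand by brand. The Kronecker-product notation is introduced here for illustration and does not appear in the original notes.

```latex
X = \mathbf{1}_{rn}, \quad \beta = \mu, \quad
Z = \begin{pmatrix} I_{r \times r} \otimes \mathbf{1}_n & I_{rn \times rn} \end{pmatrix}, \quad
\gamma = \begin{pmatrix} \alpha \\ \varepsilon \end{pmatrix}, \quad
\Sigma = \begin{pmatrix} \sigma^2_{\alpha} I_{r \times r} & 0 \\ 0 & \sigma^2_{\epsilon} I_{rn \times rn} \end{pmatrix}

\text{Cov}(Y) = Z \Sigma Z^T
= \sigma^2_{\alpha} \left( I_{r \times r} \otimes \mathbf{1}_n \mathbf{1}_n^T \right) + \sigma^2_{\epsilon} I_{rn \times rn}
```

Here $p = 1$ and $q = r + rn$, and $\Sigma$ is diagonal. The block structure of $\text{Cov}(Y)$ reproduces the covariance $(\sigma^2_{\alpha} + \sigma^2_{\epsilon} \delta_{j,j'}) \delta_{i,i'}$ of the one-way random effects model.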


More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

The F distribution and the basic principle behind ANOVAs. Situating ANOVAs in the world of statistical tests

The F distribution and the basic principle behind ANOVAs. Situating ANOVAs in the world of statistical tests Tutorial The F distribution and the basic principle behind ANOVAs Bodo Winter 1 Updates: September 21, 2011; January 23, 2014; April 24, 2014; March 2, 2015 This tutorial focuses on understanding rather

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information
