xtmixed & denominator degrees of freedom: myth or magic
1 xtmixed & denominator degrees of freedom: myth or magic
2011 Chicago Stata Conference
Phil Ender, UCLA Statistical Consulting Group
July 2011
2 Terminology
Here are two abbreviations I will be using:
- ddf: Denominator degrees of freedom.
- ddfm: Denominator degrees of freedom method.
3 Consider this Simple Randomized Block Example
Randomized block design with 16 subjects and 3 treatment levels.
. anova y trt id
Number of obs = 48    R-squared =
Root MSE =            Adj R-squared =
Source | Partial SS | df | MS | F | Prob > F
Model |
trt |
id |
Residual |
Total |
4 Computing the F-ratio
$$F = \frac{SS_n / ndf}{SS_d / ddf} = \frac{SS_n / 2}{313.5 / 30} = 3.02$$
The denominator degrees of freedom is 30.
5 Same data using xtmixed
. xtmixed y i.trt || id:, var
Mixed-effects REML regression    Number of obs = 48
Group variable: id               Number of groups = 16
Obs per group: min = 3, avg = 3.0, max = 3
Wald chi2(2) = 6.04
Log restricted-likelihood =      Prob > chi2 =
y | Coef. | Std. Err. | z | P>|z| | [95% CI]
2.trt |
3.trt |
_cons |
6 xtmixed, continued
Random-effects Parameters | Estimate | Std. Err. | [95% CI]
id: Identity
var(_cons) |
var(residual) |
LR test vs. linear regression: chibar2(01) =    Prob >= chibar2 =
7 Testing main effect of trt
Omnibus test for treatment.
. test 2.trt 3.trt
( 1) [y]2.trt = 0
( 2) [y]3.trt = 0
chi2( 2) = 6.04
Prob > chi2 =
Scale chi-square as F-ratio.
. display r(chi2)/r(df)
The F-ratio from xtmixed, 6.04/2 = 3.02, is the same as the F-ratio from anova.
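The rescaling rests on a standard relationship between the chi-square and F distributions (the topic of the Gould FAQ listed in the references). The block below is a short restatement added for convenience, not something taken from the slides; the symbols ndf and ddf follow the abbreviations used in this talk.

```latex
F = \frac{\chi^2}{\mathit{ndf}},
\qquad
\mathit{ndf}\cdot F_{\mathit{ndf},\,\mathit{ddf}}
\;\xrightarrow{\;d\;}\; \chi^2_{\mathit{ndf}}
\quad \text{as } \mathit{ddf}\to\infty .
```

Read this way, the chi-square p-value is the F p-value one would get with an infinite ddf, which is why it is the smaller of the two in small samples.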
8 Assuming...
Assuming that the ddf for this simple balanced model is
ddf = obs - df(trt) - df(id) - 1 = 48 - 2 - 15 - 1 = 30
then the p-value is given by Ftail(2, 30, 3.022).
9 Comparing p-values
The chi-square and the anova F-ratio give different p-values for the same test: the first comes from chi2(2) = 6.04, the second from Ftail(2, 30, 3.022).
Chi-square is a large-sample, normal-based statistic, so for small experimental designs we prefer the p-values obtained from the F-distribution.
If xtmixed provided denominator degrees of freedom this would be a very simple matter.
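The two p-values being compared can be recomputed from the reported statistics. The sketch below is an illustration in Python rather than Stata. It relies on the fact that with 2 numerator df both tail probabilities have closed forms, so no statistics library is needed. The function names are made up for this example, and the inputs 6.04 and 3.022 are the rounded values shown on the slides.

```python
import math

def chi2_tail_2df(x):
    """Upper-tail probability of a chi-square with 2 df: exp(-x/2)."""
    return math.exp(-x / 2.0)

def f_tail_2ndf(ddf, f):
    """Upper-tail probability of F(2, ddf): (1 + 2f/ddf)^(-ddf/2)."""
    return (1.0 + 2.0 * f / ddf) ** (-ddf / 2.0)

chi2_stat = 6.04   # Wald chi2(2) reported by xtmixed (rounded)
f_stat = 3.022     # rescaled F-ratio used on the slides

p_chi2 = chi2_tail_2df(chi2_stat)   # large-sample p-value
p_f = f_tail_2ndf(30, f_stat)       # p-value assuming ddf = 30

print(f"chi2(2) p-value:  {p_chi2:.4f}")
print(f"F(2, 30) p-value: {p_f:.4f}")
```

With these rounded inputs the chi-square p-value comes out near 0.049 and the F p-value near 0.064, so the two tests land on opposite sides of the conventional 0.05 cutoff.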
10 What's your problem? Just use anova. Stop complaining.
There are many situations that anova does not handle well. Here are three examples:
- Incomplete data within subject
- Unequally spaced time intervals
- Level 1 covariance structures other than compound symmetry
UCLA has many researchers working within traditional anova frameworks with relatively small experimental designs. Reviewers and editors of journals in these fields are familiar with experimental designs and with F-ratios. However, it is common for data to be unbalanced within subject, and alternative level 1 covariance structures are often needed. xtmixed would be ideal for these situations if it could produce probabilities adjusted for smaller samples.
11 Missing Observations Example
Consider a modification of our randomized block example with one missing observation for each of four subjects.
Same xtmixed command.
. xtmixed y i.trt || id:, var
12 xtmixed with missing observations
Mixed-effects REML regression    Number of obs = 44
Group variable: id               Number of groups = 16
Obs per group: min = 2, avg = 2.8, max = 3
Wald chi2(2) = 6.51
Log restricted-likelihood =      Prob > chi2 =
y | Coef. | Std. Err. | z | P>|z| | [95% CI]
2.trt |
3.trt |
_cons |
13 xtmixed with missing observations, continued
Random-effects Parameters | Estimate | Std. Err. | [95% CI]
id: Identity
var(_cons) |
var(residual) |
LR test vs. linear regression: chibar2(01) =    Prob >= chibar2 =
14 Testing trt
Omnibus test for main effect for treatment.
. test 2.trt 3.trt
( 1) [y]2.trt = 0
( 2) [y]3.trt = 0
chi2( 2) = 6.51
Prob > chi2 =
Scale chi-square as F-ratio.
. display r(chi2)/r(df)
15 Comparing p-values
The p-value for the chi-square comes from chi2(2) = 6.51.
The p-value for the F-ratio is Ftail(2, ?, 3.257) = ?
Even though the chi-square has been rescaled as an F-ratio, there is no p-value for the F-ratio because we don't know the denominator degrees of freedom.
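To see why the unknown ddf matters, the following sketch evaluates the same F-ratio under several candidate ddf values. It is written in Python rather than Stata and uses the closed-form tail of an F distribution with 2 numerator df. The candidate ddf values are hypothetical and chosen only to show the trend; none of them is claimed to be the right ddf for this design.

```python
import math

def f_tail_2ndf(ddf, f):
    """Upper-tail probability of F(2, ddf): (1 + 2f/ddf)^(-ddf/2)."""
    return (1.0 + 2.0 * f / ddf) ** (-ddf / 2.0)

f_stat = 3.257                           # rescaled F-ratio from the slide
candidate_ddf = [10, 20, 40, 80, 1000]   # hypothetical values, for illustration

# p-value of the same F-ratio under each assumed ddf
p_values = {ddf: f_tail_2ndf(ddf, f_stat) for ddf in candidate_ddf}

# Limit as ddf grows: the chi2(2) tail at 2 * F, which equals exp(-F)
p_limit = math.exp(-f_stat)

for ddf, p in p_values.items():
    print(f"ddf = {ddf:5d}   p = {p:.4f}")
print(f"chi2 limit    p = {p_limit:.4f}")
```

The p-value shrinks steadily toward the chi-square value as the assumed ddf grows, and for this F-ratio it crosses 0.05 somewhere between ddf = 20 and ddf = 40, so the choice of ddf can change the conclusion.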
16 So, why doesn't xtmixed provide the ddf?
The simple answer: xtmixed does not know the denominator degrees of freedom.
It does not have mean squares or numerators or denominators in the anova sense. And it does not compute F-ratios at all.
xtmixed performs statistical tests by dividing parameter estimates by their standard errors.
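A minimal sketch of that kind of test, in Python rather than Stata: the estimate is divided by its standard error and the ratio is referred to a standard normal distribution, so no ddf enters anywhere. The function name and the coefficient and standard error used below are hypothetical.

```python
import math

def wald_z_test(coef, std_err):
    """Return (z, two-sided p-value) for a Wald test of coef = 0."""
    z = coef / std_err
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical estimate and standard error, for illustration only
z, p = wald_z_test(coef=2.5, std_err=1.0)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Because the reference distribution is the normal rather than a t or F, the p-value is the same whether the estimate came from 10 observations or 10,000, which is the small-sample concern raised in this talk.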
17 What can be done?
Since there is no actual denominator degrees of freedom, we need an approximation of an F-distribution that has appropriate control over the Type I error and has adequate power.
This is not an easy task. There does not seem to be a single F-approximation that works for all possible mixed models.
It may be difficult, but that doesn't mean no one has tried.
18 Survey of Major Stat Packages
Package | Command | ddf method | Philosophy
Stata | xtmixed | none | Statistical Purity
R | lmer | none | Statistical Purity
R | lme | containment | Empirical Pragmatism
SPSS | mixed | Satterthwaite | Empirical Pragmatism
SAS | proc mixed | Satterthwaite, Kenward-Roger*, between-within, residual, containment |
* SAS current favorite.
19 Residual, Containment & Between-within ddf
Residual df = N - rank(X) = 44 - 3 = 41
Containment df = N - rank(X, Z) = 44 - 18 = 26
Between-within df = Residual df - rank(Z)
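The residual and containment figures can be reproduced by building the design matrices and counting ranks. The sketch below does this in plain Python. It assumes a particular missing-data pattern (subjects 1 to 4 each lack their third treatment), which is hypothetical since the slides do not say which observations are missing. The `matrix_rank` helper is written for this example and is not a library call.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # find a row at or below `rank` with a nonzero entry in this column
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical layout: 16 subjects, 3 treatments, subjects 1-4 missing trt 3
obs = [(s, t) for s in range(1, 17) for t in (1, 2, 3)
       if not (s <= 4 and t == 3)]

def x_row(t):
    """Fixed effects: intercept plus indicators for trt 2 and trt 3."""
    return [1, int(t == 2), int(t == 3)]

def z_row(s):
    """Random effects: one indicator column per subject."""
    return [int(s == k) for k in range(1, 17)]

X = [x_row(t) for s, t in obs]
XZ = [x_row(t) + z_row(s) for s, t in obs]

n = len(obs)
residual_ddf = n - matrix_rank(X)        # N - rank(X)
containment_ddf = n - matrix_rank(XZ)    # N - rank(X, Z)
print(n, residual_ddf, containment_ddf)
```

Under this assumed layout the script prints 44, 41 and 26. The combined matrix has rank 18 rather than 19 because the subject indicators sum to the intercept column.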
20 Satterthwaite Approximation
The Satterthwaite approximation is intended as an accurate F-test approximation, and hence accurate p-values for the F-test.
SAS does warn that the small-sample properties of the Satterthwaite approximation have not been thoroughly investigated for all models.
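For orientation, the classical form of the approximation from Satterthwaite (1946) is given below. It applies to a linear combination of independent mean squares, with weights a_i and degrees of freedom df_i. This is the textbook version, added here as background. Mixed-model software applies the same moment-matching idea to the estimated variance of a contrast, and the exact expressions used by any particular package are not reproduced here.

```latex
\nu \;\approx\;
\frac{\left(\sum_i a_i\, MS_i\right)^{2}}
     {\sum_i \dfrac{\left(a_i\, MS_i\right)^{2}}{df_i}}
```

The resulting degrees of freedom are generally not an integer, which is why fractional ddf values show up in output that uses this method.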
21 Kenward-Roger Approximation
The Kenward-Roger method is an attempt to make a further adjustment to the F-statistic, to take into account the fact that the REML estimates of the covariance parameters are estimates and not known quantities.
This method inflates the marginal variance-covariance matrix and then applies the Satterthwaite method on the resulting matrix.
22 Computational Issues
Residual, containment and between-within methods are fairly simple to compute.
However, Satterthwaite and Kenward-Roger are both computationally and resource intensive.
The computational overhead increases with the complexity of the design and with how unbalanced the data are.
23 The RB-3 example with missing observations
Various F-approximations for our RB-3 example with 4 missing observations, using SAS.
Statistic | Value | ddf | p-value | ddfm
F | | | | Satterthwaite
F | | | | Kenward-Roger
F | | | | between-within
F | | | | contain
F | | | | residual
chi2 | | | | from Stata
24 Exceptions in Stata
xtmixed does not provide adjusted ddfs; however, anova with the repeated option will adjust both the numerator and denominator degrees of freedom.
We will return to the original randomized block data, the one without any missing observations, and rerun anova using repeated(trt).
25 anova repeated option
. anova y trt id, repeated(trt)
...
Between-subjects error term: id
Levels: 16 (15 df)
Lowest b.s.e. variable: id
Repeated variable: trt
Huynh-Feldt epsilon =
*Huynh-Feldt epsilon reset to
Greenhouse-Geisser epsilon =
Box's conservative epsilon =
Source | df | F | Prob > F: Regular | H-F | G-G | Box
trt |
Residual |
26 ddf with repeated option
Ftail(2, 30, 3.022)    // Regular (1)
Ftail(2*1, 30*1, 3.022) = Ftail(2, 30, 3.022)    // Huynh-Feldt (2)
Ftail(2*.9505, 30*.9505, 3.022) = Ftail(1.901, 28.515, 3.022)    // Greenhouse-Geisser (3)
Ftail(2*.5, 30*.5, 3.022) = Ftail(1, 15, 3.022)    // Box's Conservative (4)
Use the three-step procedure to determine statistical significance.
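The epsilon adjustment multiplies both the numerator and the denominator df by the same factor before looking up the F tail. The sketch below reproduces that arithmetic in Python. Because the adjusted df are fractional, it includes a small stand-in for Stata's `Ftail()` built on the regularized incomplete beta function (continued-fraction evaluation). That helper is an assumption of this example, not Stata's implementation. The Greenhouse-Geisser epsilon of 0.9505 is inferred from the slide's 1.901 = 2 × 0.9505.

```python
import math

def _betacf(a, b, x, max_iter=200, eps=3e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta (modified Lentz method)."""
    def clamp(v):
        return tiny if abs(v) < tiny else v
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / clamp(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))          # even step
        d = 1.0 / clamp(1.0 + aa * d)
        c = clamp(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))  # odd step
        d = 1.0 / clamp(1.0 + aa * d)
        c = clamp(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def f_tail(ndf, ddf, f):
    """Upper-tail probability of F(ndf, ddf); df may be fractional."""
    return betai(ddf / 2.0, ndf / 2.0, ddf / (ddf + ndf * f))

f_stat, ndf, ddf = 3.022, 2, 30
epsilons = {"Regular": 1.0, "Huynh-Feldt (reset)": 1.0,
            "Greenhouse-Geisser": 0.9505, "Box": 0.5}

adjusted = {}
for name, e in epsilons.items():
    adj_ndf, adj_ddf = ndf * e, ddf * e   # scale both df by epsilon
    adjusted[name] = (adj_ndf, adj_ddf, f_tail(adj_ndf, adj_ddf, f_stat))
    print(f"{name:22s} Ftail({adj_ndf:.3f}, {adj_ddf:.3f}, {f_stat})"
          f" = {adjusted[name][2]:.4f}")
```

The smaller the epsilon, the fewer degrees of freedom remain on both sides, so Box's conservative correction gives the largest p-value of the four.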
27 And, of course, t-test with unequal variances
Both Satterthwaite and Welch degrees of freedom for t-tests with unequal variances produce latent ddf.
. ttest y, by(grp)
Two-sample t test with equal variances: t =    df = 38    p-value =
Two-sample t test with unequal variances using Satterthwaite's df: t =    df =    p-value =
Two-sample t test with unequal variances using Welch's df: t =    df =    p-value =
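The two unequal-variance df formulas are simple enough to compute by hand. The sketch below, in Python, uses the Satterthwaite formula and the Welch variant (the one with n + 1 in the denominators and 2 subtracted at the end). The group sizes and standard deviations are hypothetical. Two groups of 20 were chosen only so that the pooled df matches the 38 shown on the slide.

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Satterthwaite's approximate df for the unequal-variance t test."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

def welch_df(s1, n1, s2, n2):
    """Welch's approximate df for the unequal-variance t test."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return -2.0 + (v1 + v2) ** 2 / (v1 ** 2 / (n1 + 1) + v2 ** 2 / (n2 + 1))

# Hypothetical summary statistics: standard deviations and group sizes
s1, n1 = 2.0, 20
s2, n2 = 5.0, 20

pooled_df = n1 + n2 - 2
satt_df = satterthwaite_df(s1, n1, s2, n2)
wel_df = welch_df(s1, n1, s2, n2)

print(f"equal variances: df = {pooled_df}")
print(f"Satterthwaite:   df = {satt_df:.3f}")
print(f"Welch:           df = {wel_df:.3f}")
```

Both approximations give fractional df below the pooled value when the variances differ, which is the same kind of adjustment that the ddf methods for mixed models aim to provide.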
28 What can you do short of running SAS?
Consider a split-plot design with factor a between subjects and factor b within subjects, and with missing observations within subject:
. xtmixed y a##b || id:
Use the ddf from the following anova models with the chi-squares from xtmixed rescaled as F-ratios:
Between-within ddf (two error terms):
. anova y a / id|a b a#b /
Containment ddf (one error term):
. anova y a id|a b a#b
Residual ddf (one error term):
. anova y a b a#b
29 Conclusion
Myth or Magic?
Mostly myth.
30 References
Giesbrecht, F.G. and Burns, J.C. (1985). Two-stage analysis based on a mixed model: Large-sample asymptotic theory and small-sample simulation results. Biometrics, 41.
Gould, W. (2009). How are the chi-squared and F distributions related? Stata FAQ.
Kenward, M.G. and Roger, J.H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 53.
SAS Institute Inc. (2009). SAS/STAT 9.2 User's Guide, Second Edition. SAS Institute Inc., Cary, NC.
Satterthwaite, F.E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2.
