
Lecture 18: Review Lecture
Ani Manichaikul, 15 May

Types of Biostatistics
  1) Descriptive Statistics
     Exploratory Data Analysis (often not in the literature)
     Summaries: "Table 1" in a paper
     Goal: visualize relationships, generate hypotheses
  2) Inferential Statistics
     Confirmatory Data Analysis
     "Methods" section of a paper
     Goal: quantify relationships, test hypotheses

Approach to Modeling
A general approach for most statistical modeling is to:
  Define the population of interest
  State the scientific questions and underlying theories
  Describe and explore the observed data
  Define the model
    Probability part (models the randomness / noise)
    Systematic part (models the expectation / signal)

Approach to Modeling (continued)
  Estimate the parameters in the model
    Fit the model to the observed data
    Make inferences about covariates
  Check the validity of the model
    Verify the model assumptions
    Re-define, re-fit, and re-check the model if necessary
  Interpret the results of the analysis in terms of the scientific questions of interest

Grouping: Frequency Distribution Tables
  Shows the number of observations for each range of data
  Intervals can be chosen in ways similar to stem-and-leaf displays
  [Table: Age Interval, Frequency]

Stem-and-Leaf Plots
  Age in years (10 observations): 25, 26, 29, 32, 35, 36, 38, 44, 49, 51

Histograms
  Pictures of the frequency or relative frequency distribution
  [Table: Age Interval, Observations, Frequency]
  [Figure: Histogram of Age, by Age Category]
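The grouping ideas above can be sketched in a few lines of Python using the ten ages listed on the stem-and-leaf slide. The function names and the choice of 10-year intervals are assumptions made for illustration; the intervals used on the original slide are not preserved.

```python
from collections import defaultdict

# Ages from the stem-and-leaf slide (10 observations).
ages = [25, 26, 29, 32, 35, 36, 38, 44, 49, 51]

def stem_and_leaf(values):
    """Group values by tens digit (stem); the units digit is the leaf."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

def frequency_table(values, width=10):
    """Count observations in each interval of the given width (assumed: 10 years)."""
    counts = defaultdict(int)
    for v in values:
        lower = (v // width) * width
        counts[(lower, lower + width - 1)] += 1
    return dict(sorted(counts.items()))

plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

table = frequency_table(ages)
for (low, high), count in table.items():
    print(f"{low}-{high}: {count}")
```

A histogram is a picture of the same counts, with one bar per interval.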

Box-and-Whisker Plots
  [Figure: Box Plot of Age; axis: Age in Years]
  IQR = 44 - 29 = 15
  Upper Fence = 44 + 15*1.5 = 66.5
  Lower Fence = 29 - 15*1.5 = 6.5

Continuous Variables: Scatterplot
  [Figure: Age by Height in cm; axes: Age in Years, Height in Centimeters]
  Scatterplots visually display the relationship between two continuous variables

Why is the power of a test important?
  Power indicates the chance of finding a significant difference when there really is one
  Low power: likely to obtain non-significant results even when significant differences exist
  High power is desirable!
  Low power is usually caused by a small sample size
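The box-plot fences can be checked numerically. This is a minimal sketch that assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data, which reproduces the IQR of 15 and the upper fence of 66.5 shown on the slide for the ten ages given earlier. Other quartile conventions give slightly different values.

```python
from statistics import median

ages = [25, 26, 29, 32, 35, 36, 38, 44, 49, 51]

def quartiles_by_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    upper_half = data[-half:]  # for odd n this leaves out the middle value
    return median(lower_half), median(upper_half)

def fences(values, k=1.5):
    """Return (IQR, lower fence, upper fence) using the usual 1.5*IQR rule."""
    q1, q3 = quartiles_by_halves(values)
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr

iqr, lower_fence, upper_fence = fences(ages)
print(iqr, lower_fence, upper_fence)
```

Observations outside the fences would be drawn as individual points; none of the ten ages falls outside them.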

We're not always right

Errors in Hypothesis Testing: $\alpha$
  Aim: to keep Type I error small by specifying a small rejection region
  $\alpha$ is set before performing a test, usually at 0.05

Errors in Hypothesis Testing: $\beta$
  Aim: to keep Type II error small and thus power high

$\beta$: Probability of Type II Error
  The value of $\beta$ is usually unknown since it depends on a specified alternative value
  $\beta$ depends on sample size and $\alpha$
  Before data collection, scientists decide
    the test they will perform
    $\alpha$
    the desired $\beta$
  They will use this information to choose the sample size
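To make the link between alpha, beta, and sample size concrete, here is a hedged sketch for one simple case: a two-sided one-sample z-test with known standard deviation. The function names, the effect size, and the 80% power target are illustrative assumptions, not values from the lecture.

```python
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-sided one-sample z-test.

    delta: true difference from the null mean (the specified alternative)
    sigma: known standard deviation; n: sample size.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * n ** 0.5 / sigma
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail

def smallest_n_for_power(delta, sigma, target_power=0.80, alpha=0.05):
    """Smallest n reaching the target power (delta must be non-zero)."""
    n = 1
    while z_test_power(delta, sigma, n, alpha) < target_power:
        n += 1
    return n

# Hypothetical planning values: detect a difference of 0.5 SD with 80% power.
n_needed = smallest_n_for_power(delta=0.5, sigma=1.0)
print(n_needed, round(z_test_power(0.5, 1.0, n_needed), 3))
```

When delta is zero the "power" equals alpha, which is the Type I error rate; power then rises with both the effect size and the sample size.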

P-Values
  Definition: The p-value for a hypothesis test is the probability of obtaining, by chance alone when $H_0$ is true, a value of the test statistic as extreme or more extreme (in the appropriate direction) than the one actually observed.

Steps of Hypothesis Testing
  Define the null hypothesis, $H_0$.
  Define the alternative hypothesis, $H_a$, where $H_a$ is usually of the form "not $H_0$".
  Define the Type I error, $\alpha$, usually 0.05.
  Calculate the test statistic.
  Calculate the p-value.
  If the p-value is less than $\alpha$, reject $H_0$. Otherwise fail to reject $H_0$.

Why use linear regression?
  Linear regression is very powerful. It can be used for many things:
    Binary X
    Continuous X
    Categorical X
    Adjustment for confounding
    Interaction
    Curved relationships between X and Y

SLR: $Y = \beta_0 + \beta_1 X_1 + \varepsilon$
  Linear regression is used for continuous outcome variables
  $\beta_0$: mean outcome when X = 0 (Center!)
  Binary X = dummy variable for group
    $\beta_1$: mean difference in outcome between groups
  Continuous X
    $\beta_1$: mean difference in outcome corresponding to a 1-unit increase in X
    Center X to give meaning to $\beta_0$
  Test $\beta_1 = 0$ in the population
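The interpretation of the simple linear regression coefficients with a binary X can be verified with a small least-squares sketch. The data below are made up purely for illustration, and `fit_slr` is a hypothetical helper name.

```python
from statistics import mean

def fit_slr(x, y):
    """Ordinary least squares for Y = b0 + b1*X; returns (b0, b1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data: X is a dummy variable (0 = group A, 1 = group B).
group = [0, 0, 0, 1, 1, 1]
outcome = [10.0, 12.0, 11.0, 15.0, 14.0, 16.0]

b0, b1 = fit_slr(group, outcome)
mean_a = mean(y for g, y in zip(group, outcome) if g == 0)
mean_b = mean(y for g, y in zip(group, outcome) if g == 1)
print(b0, b1, mean_a, mean_b - mean_a)
```

With a dummy variable, the intercept is the mean outcome in the reference group and the slope is the difference in group means, matching the slide's interpretation.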

Regression Methods: Assumptions of Linear Regression
  L  Linear relationship
  I  Independent observations
  N  Normally distributed around the line
  E  Equal variance across X's

Regression Methods
  In simple linear regression (SLR):
    One predictor / covariate / explanatory variable: X
  In multiple linear regression (MLR):
    Same assumptions as SLR (i.e. L.I.N.E.), but
    more than one covariate: $X_1, X_2, X_3, \ldots, X_p$
  Model: $Y \sim N(\mu, \sigma^2)$
    $\mu = E(Y \mid X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_p X_p$

Nested models
  One model is nested within another if the parent model contains one set of variables and the extended model contains all of the original variables plus one or more additional variables.

Difference in assessing variables: nested models
  Other predictor(s)
    assess with a t test if a single variable defines the predictor
    assess with an F test (today) if two or more variables are needed to define the predictor
  Potential confounder(s)
    compare the CI of the primary predictor to see whether the new parameter is significantly different

The F test
  $H_0$: all new $\beta$'s = 0 in the population
  $H_A$: at least one new $\beta$ is not 0 in the population
  $F_{obs} = \dfrac{(RSS_{parent} - RSS_{nested}) / (\text{number of new variables added})}{RSS_{nested} / (\text{residual df}_{nested})}$
  Example: $F_{obs} = \dfrac{(69.6 - 49.8)/2}{49.8 / (\text{residual df}_{nested})}$
  What is $F_{cr}$?

The F test: notes
  The F test can be used to compare any two nested models
  If only one variable is added, it's easier to compare the models using the t test for that variable
  $t^2 = F$ if one variable is added
  For any regression, the estimated variance of the residuals is RSS / (residual df)
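The F statistic for comparing nested models can be written as a small function. In this sketch `rss_extended` plays the role of the RSS that the slide labels "nested" (the model with the added variables), and the residual degrees of freedom in the example call is a hypothetical value, since the slide's own figure is not preserved.

```python
def f_statistic(rss_parent, rss_extended, n_new_vars, resid_df_extended):
    """F statistic for testing whether the added variables improve the fit.

    rss_parent: residual sum of squares of the smaller (parent) model
    rss_extended: residual sum of squares of the model with the new variables
    n_new_vars: number of variables added
    resid_df_extended: residual degrees of freedom of the extended model
    """
    numerator = (rss_parent - rss_extended) / n_new_vars
    denominator = rss_extended / resid_df_extended  # estimated residual variance
    return numerator / denominator

# RSS values and number of new variables from the slide; residual df = 100 is assumed.
f_obs = f_statistic(rss_parent=69.6, rss_extended=49.8,
                    n_new_vars=2, resid_df_extended=100)
print(round(f_obs, 2))
```

The observed value is then compared with the critical value of the F distribution with (number of new variables, residual df) degrees of freedom.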

Nested Models
  Comparing nested models
    1 new variable: use the t test for that variable
    2+ new variables: use the F test
  Categorical predictor
    set one group as reference
    create a dummy variable for each of the other groups
    include/exclude all dummy variables together
    evaluate the categorical predictor with the F test

Splines and Quadratic Terms
  Splines are used to allow the regression line to bend
    the breakpoint is arbitrary and decided graphically or by hypothesis
    the actual slope above and below the breakpoint is usually of more interest than the coefficient for the spline (i.e. the change in slope)
  A quadratic term allows for curvature in the model

Effect Modification
  In linear regression, effect modification is a way of allowing the association between the primary predictor and the outcome to change with the level of another predictor.
  If the third predictor is binary, that results in a graph in which the two lines (for the two groups) are no longer parallel.

Logistic regression
  For binary outcomes
  Model the log odds of the probability, which we also call the logit
  Baseline term interpreted as log odds
  Other coefficients are log odds ratios

Logistic regression model
  $\log[\text{odds}(\text{relief} \mid Tx)] = \log\left(\dfrac{P(\text{relief} \mid Tx)}{P(\text{no relief} \mid Tx)}\right) = \beta_0 + \beta_1 Tx$
  where Tx = 0 if Placebo, 1 if Drug

Then
  $\log(\text{odds}(\text{relief} \mid \text{Drug})) = \beta_0 + \beta_1$
  $\log(\text{odds}(\text{relief} \mid \text{Placebo})) = \beta_0$
  $\log(\text{odds}(R \mid D)) - \log(\text{odds}(R \mid P)) = \beta_1$

And thus
  $\log\left(\dfrac{\text{odds}(R \mid D)}{\text{odds}(R \mid P)}\right) = \beta_1$
  $OR = \exp(\beta_1) = e^{\beta_1}$
  So $\exp(\beta_1)$ is the odds ratio of relief for patients taking the Drug vs. patients taking the Placebo.

Logistic Regression
  [Stata output: Logit estimates; Number of obs = 70; LR chi2(1) = 2.83; Prob > chi2, Log likelihood, Pseudo R2; coefficient table for y with columns Coef., Std. Err., z, P>|z|, 95% Conf. Interval and rows drug, _cons]
  Estimates: $\log(\text{odds}(\text{relief})) = \hat\beta_0 + \hat\beta_1 \text{Drug}$, with $\hat\beta_1 = 0.814$
  Therefore: OR = exp(0.814) = 2.26

Types of interpretation
  $\beta_0 + \beta_1$ = ln(odds) (for X = 1)
  $\beta_1$ = difference in log odds
  $e^{\beta_0 + \beta_1}$ = odds (for X = 1)
  $e^{\beta_1}$ = odds ratio
  But we started with P(Y = 1). Can we find that?

More useful math
  odds = probability / (1 - probability)
  probability = odds / (1 + odds)
  so the probability for (X = 1) is $\dfrac{e^{\beta_0 + \beta_1}}{1 + e^{\beta_0 + \beta_1}}$

Adding other variables
  What if Pr(relief) is a function of Drug or Placebo AND Age?
  We could easily include age in a model such as:
    $\log(\text{odds}(\text{relief})) = \beta_0 + \beta_1 \text{Drug} + \beta_2 \text{Age}$

Logistic Regression
  As in MLR, we can include many additional covariates.
  For a logistic regression model with p predictors:
    $\log(\text{odds}(Y = 1)) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$
  where $\text{odds}(Y = 1) = \dfrac{\Pr(Y = 1)}{1 - \Pr(Y = 1)} = \dfrac{\Pr(Y = 1)}{\Pr(Y = 0)}$
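The conversions between probability, odds, and log odds are easy to sketch in Python. The coefficient 0.814 is the drug estimate from the earlier slide; the intercept value used below is a made-up placeholder, because the fitted intercept is not preserved in the transcript.

```python
import math

def odds_from_prob(p):
    """odds = probability / (1 - probability)"""
    return p / (1 - p)

def prob_from_odds(odds):
    """probability = odds / (1 + odds)"""
    return odds / (1 + odds)

def prob_from_log_odds(log_odds):
    """Inverse logit: e^x / (1 + e^x)."""
    return prob_from_odds(math.exp(log_odds))

beta1 = 0.814                   # log odds ratio for Drug vs. Placebo (from the slide)
odds_ratio = math.exp(beta1)    # about 2.26

beta0 = -0.5                    # hypothetical intercept, for illustration only
p_placebo = prob_from_log_odds(beta0)         # X = 0
p_drug = prob_from_log_odds(beta0 + beta1)    # X = 1
print(round(odds_ratio, 2), round(p_placebo, 3), round(p_drug, 3))
```

Whatever intercept is used, the ratio of the two odds recovered from these probabilities equals exp(beta1).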

Nested models
  Adding a single new variable to the model
  null model: $\ln\left(\dfrac{p}{1 - p}\right) = \beta_0 + \beta_1 (\text{Age} - 30)$
  full model: $\ln\left(\dfrac{p}{1 - p}\right) = \beta_0 + \beta_1 (\text{Age} - 30) + \beta_2 (\text{Multivitamin})$

Comparing nested models that differ by one variable
  Compare models with the p-value or CI
  What method is this? The Wald test, a test that applies the CLT
    like the Z test comparing proportions in a 2x2 table
    analogous to the t test for linear regression
  $H_0$: the new variable is not needed, or $H_0$: $\beta_{new} = 0$ in the population

Conclusion from the Wald test
  The p-value for multivitamin is less than 0.05 and the CI for the multivitamin coefficient does not include 0 (the CI for the OR doesn't include 1)
  Reject $H_0$
  Conclude that the larger model is better: after adjusting for age, multivitamin use is still an important predictor of physician visits in the population

Interpretation: log odds
  $\beta_0$: the log odds of not visiting a physician for a 30-year-old person who reports not regularly taking multivitamins
  $\beta_1$: the log odds ratio of not visiting a physician for a one-year increase in age, controlling for multivitamin use
  $\beta_2$: the log odds ratio of not visiting a physician for those who take multivitamins compared with those who do not, adjusting for age

Interpretation: odds and odds ratio
  $\exp\{\beta_0\}$: the odds of not visiting a physician for a 30-year-old person who reports not regularly taking multivitamins

Interpretation: odds and odds ratio
  $\exp\{\beta_1\}$: after adjusting for multivitamin use, the odds of not visiting a physician change by a factor of $\exp\{\beta_1\} = 1.001$ for each additional year of age
  additional age is associated with lower frequency of physician visits in these students, but the association is not statistically significant (p > 0.05)

Interpretation: odds and odds ratio
  $\exp\{\beta_2\}$: the odds ratio of not visiting a physician for those who take multivitamins compared with those who do not is $\exp\{\beta_2\} = 0.46$, adjusting for age
  taking multivitamins is associated with regular physician visits (p = 0.007)

Interpretation: In General
  $\log\left(\dfrac{\text{odds}(Y = 1 \mid X_1 + 1, X_2)}{\text{odds}(Y = 1 \mid X_1, X_2)}\right) = \beta_1$
  $OR = \exp(\beta_1)$
  $\exp(\beta_1)$ is the multiplicative change in odds for a 1-unit increase in $X_1$, provided $X_2$ is held constant.
  The result is similar for $X_2$.

CHD by smoking and coffee
  $Y_i$ = 1 if CHD case, 0 if control
  $COF_i$ = 1 if coffee drinker, 0 if not
  $SMK_i$ = 1 if smoker, 0 if not
  $p_i = \Pr(Y_i = 1)$
  $n_i$ = number observed at pattern i of the X's
  [Table: pattern i, $X_{i1}$, $X_{i2}$, $X_{i1} X_{i2}$]

Logistic Regression Model
  $\log\left(\dfrac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 COF_i + \beta_2 SMK_i + \beta_3 COF_i \cdot SMK_i$
  which implies that $\Pr(Y_i = 1)$ is the logistic function
  $p_i = \dfrac{e^{\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1} X_{i2}}}{1 + e^{\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1} X_{i2}}}$

Logistic Regression Model
  $Y_i$ are from a Binomial($n_i$, $p_i$) distribution
  $Y_i$ are independent
  The log odds of $Y_i = 1$ (or logit of $\Pr(Y_i = 1)$) is a function of
    coffee
    smoking
    coffee x smoking interaction

Interpretations
  $\exp\{\beta_1\}$: odds ratio of being a CHD case for coffee drinkers vs. non-drinkers among non-smokers
  $\exp\{\beta_1 + \beta_3\}$: odds ratio of being a CHD case for coffee drinkers vs. non-drinkers among smokers

Interpretations
  $\exp\{\beta_2\}$: odds ratio of being a CHD case for smokers vs. non-smokers among non-coffee drinkers
  $\exp\{\beta_2 + \beta_3\}$: odds ratio of being a CHD case for smokers vs. non-smokers among coffee drinkers

Interpretations
  $\exp\{\beta_3\}$: factor by which the odds ratio of being a CHD case for coffee drinkers vs. non-drinkers is multiplied for smokers as compared to non-smokers
  or
  $\exp\{\beta_3\}$: factor by which the odds ratio of being a CHD case for smokers vs. non-smokers is multiplied for coffee drinkers as compared to non-coffee drinkers

Interpretations
  $\dfrac{e^{\beta_0}}{1 + e^{\beta_0}}$: fraction of cases among non-smoking, non-coffee-drinking individuals in the sample (determined by the sampling plan)
  $\exp\{\beta_3\}$: ratio of odds ratios

Some Special Cases
  Given $\log\left(\dfrac{\Pr(Y = 1)}{\Pr(Y = 0)}\right) = \beta_0 + \beta_1 COF + \beta_2 SMK + \beta_3 COF \cdot SMK$
  If $\beta_1 = \beta_2 = \beta_3 = 0$:
    Neither smoking nor coffee drinking is associated with increased risk of CHD
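The interpretations of the interaction model can be collected in one small function. The coefficient values in the example call are hypothetical, chosen only to show how the stratum-specific odds ratios relate to one another; they are not the fitted values from the CHD study.

```python
import math

def chd_odds_ratios(b1, b2, b3):
    """Odds ratios implied by logit(p) = b0 + b1*COF + b2*SMK + b3*COF*SMK."""
    return {
        "coffee_among_nonsmokers": math.exp(b1),
        "coffee_among_smokers": math.exp(b1 + b3),
        "smoking_among_noncoffee": math.exp(b2),
        "smoking_among_coffee": math.exp(b2 + b3),
        "ratio_of_odds_ratios": math.exp(b3),
    }

# Hypothetical coefficients for illustration only.
ors = chd_odds_ratios(b1=0.7, b2=1.2, b3=-0.4)
for name, value in ors.items():
    print(f"{name}: {value:.2f}")

# With no interaction (b3 = 0) the coffee odds ratio is the same in both smoking strata.
no_interaction = chd_odds_ratios(b1=0.7, b2=1.2, b3=0.0)
```

Dividing the coffee odds ratio among smokers by the one among non-smokers gives exp(b3), which is why exp(b3) is described as a ratio of odds ratios.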

Some Special Cases
  Given $\log\left(\dfrac{\Pr(Y = 1)}{\Pr(Y = 0)}\right) = \beta_0 + \beta_1 COF + \beta_2 SMK + \beta_3 COF \cdot SMK$
  If $\beta_1 = \beta_3 = 0$:
    Smoking, but not coffee drinking, is associated with increased risk of CHD

Some Special Cases
  If $\beta_3 = 0$:
    Smoking and coffee drinking are both associated with risk of CHD, but the CHD-smoking odds ratio is the same at all levels of coffee
    Smoking and coffee drinking are both associated with risk of CHD, but the CHD-coffee odds ratio is the same at all levels of smoking

Confounding
  In epidemiological terms, Z is a confounder of the relationship of Y with X if Z is related to both X and Y and Z is not in the causal pathway between X and Y
  In statistical terms, Z is a confounder of the relationship of Y with X if the X coefficient changes when Z is added to a regression of Y on X

Confounding
  For example, consider the two models
    $Y = \beta_0 + \beta_1 X + \varepsilon_1$
    $Y = \alpha_0 + \alpha_1 X + \alpha_2 Z + \varepsilon_2$
  then Z is a confounder of the X, Y relationship if $\alpha_1 \neq \beta_1$

Look at Confidence Intervals
  Without smoking
    OR = $e^{0.79}$ = 2.2
    95% CI for log(OR): 0.79 ± 1.96(0.33) = (0.13, 1.44)
    95% CI for OR: ($e^{0.13}$, $e^{1.44}$) = (1.14, 4.24)

Look at Confidence Intervals
  With smoking (adjusting for smoking)
    OR = $e^{0.53}$ = 1.7
    95% CI for log(OR): 0.53 ± 1.96(0.35) = (-0.17, 1.22)
    95% CI for OR: ($e^{-0.17}$, $e^{1.22}$) = (0.85, 3.39)

Conclusion
  So, ignoring smoking, the CHD and coffee OR is 2.2 (95% CI as computed above)
  Adjusting for smoking gives more modest evidence for a coffee effect
  In this case-control study, smoking is a weak-to-moderate confounder of the coffee-CHD association

Interaction Model
  [Table: Model 3; columns Variable, Est, se, z; rows Intercept, Coffee, Smoking, Coffee*Smoking]
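The confidence-interval arithmetic above can be reproduced with a short sketch. The estimates and standard errors are the rounded values shown on the slides, so the endpoints computed here can differ from the slide's in the second decimal place; `odds_ratio_ci` is an illustrative helper name.

```python
import math

def odds_ratio_ci(log_or, se, z=1.96):
    """Odds ratio and its 95% CI from a log odds ratio and its standard error.

    The interval is built on the log scale and then exponentiated.
    """
    lower = log_or - z * se
    upper = log_or + z * se
    return math.exp(log_or), (math.exp(lower), math.exp(upper))

# Coffee effect ignoring smoking (estimate 0.79, se 0.33).
or_unadj, ci_unadj = odds_ratio_ci(0.79, 0.33)
# Coffee effect adjusting for smoking (estimate 0.53, se 0.35).
or_adj, ci_adj = odds_ratio_ci(0.53, 0.35)

print(round(or_unadj, 1), tuple(round(v, 2) for v in ci_unadj))
print(round(or_adj, 1), tuple(round(v, 2) for v in ci_adj))
```

The unadjusted interval excludes 1 while the adjusted interval includes it, which is the basis for the slide's conclusion about more modest evidence after adjustment.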

Testing Interaction Term
  Z = -0.59, which is not significant at the 0.05 level
  95% confidence interval for $\beta_1 + \beta_3$: (0.42, 3.99)
  Both of the above suggest that there is little evidence that smoking is an effect modifier!

Likelihood Ratio Test
  The Likelihood Ratio Test (LRT) will help decide whether or not additional term(s) significantly improve the model fit
  The LRT statistic for comparing nested models is -2 times the difference between the log likelihoods (LLs) for the null vs. extended models
  The value obtained is identical to the value from an analysis of variance test for linear regression models

Likelihood Ratio Test: LRT Example
  Deviance is a term used for the difference in -2*log likelihood relative to the best possible value from a perfectly predicting model.
  Change in deviance is the same as change in -2LL.

Summary: Unadjusted ORs
  The odds of CHD were estimated to be 3.4 times higher among smokers compared to non-smokers, 95% CI: (1.7, 7.9)
  The odds of CHD were estimated to be 2.2 times higher among coffee drinkers compared to non-coffee drinkers, 95% CI: (1.1, 4.3)

Summary: Adjusted ORs
  Controlling for the potential confounding of smoking, the coffee odds ratio was estimated to be 1.7 with 95% CI: (0.85, 3.4).
  Hence, the evidence in these data is insufficient to conclude that coffee has an independent effect on CHD beyond that of smoking.

Model comparisons using likelihood ratio test

Comparing the models
  Models C and F are both nested in Model A
  Models C and F cannot be directly compared to one another, but we can see which has a smaller p-value when compared to Model A
    C vs. A: chi-square = 26.5 with 2 df
    F vs. A: chi-square = 21.7 with 3 df
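As a rough check on the model comparison above, the p-values for the two likelihood ratio statistics can be computed with the standard library. The sketch below uses a closed-form series for the chi-square upper tail that is valid for whole-number degrees of freedom; the function names are illustrative, and the example log likelihoods are made-up values.

```python
import math

def lrt_statistic(ll_null, ll_extended):
    """-2 times the difference in log likelihoods (null minus extended)."""
    return -2 * (ll_null - ll_extended)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite series in half-integer powers of x.
    total = 0.0
    term = math.sqrt(2 * x / math.pi)
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

# Statistics reported on the slide for the comparisons against Model A.
p_c_vs_a = chi2_sf(26.5, 2)
p_f_vs_a = chi2_sf(21.7, 3)
print(f"C vs. A: p = {p_c_vs_a:.2e}")
print(f"F vs. A: p = {p_f_vs_a:.2e}")

# Hypothetical log likelihoods, only to show how the statistic is formed.
example_stat = lrt_statistic(ll_null=-120.0, ll_extended=-115.0)
```

Both p-values are very small, and the one for Model C is the smaller of the two.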

What next?
  Model C improves prediction beyond gender alone (Model A) more than Model F does.
  Model C should be the next parent model, and we should test the new variables in Model F to see if they continue to improve prediction within the context of Model C.
  When a tentative final model is identified, the assumptions of logistic regression should be checked.

Flexibility in linear models
  A spline allows the slope for a continuous predictor to change at a given point; the coefficient is for the difference in log odds ratio
  An interaction term allows the odds ratio for one variable to differ by the value of a second variable; the coefficient is for the difference in log odds ratio

Poisson regression model
  Log-linear model for the mean rate, where p is the number of predictors in the model
  [Equations: log-linear model; random component]

Exponentiating Poisson regression models
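The equations on the Poisson regression slide are not preserved in this transcript. For orientation, the usual textbook form of a log-linear Poisson model with p predictors is sketched below; the symbols $\lambda_i$, $N_i$, and $X_{ij}$ are assumptions of this sketch and may not match the notation of the original slide.

```latex
% Systematic part: log-linear model for the mean rate
\log(\lambda_i) = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_p X_{ip}

% Random component: counts observed over exposure N_i
Y_i \sim \text{Poisson}(N_i \lambda_i)

% Exponentiating: rate ratio for a 1-unit increase in X_j,
% holding the other predictors fixed
\frac{\lambda(X_j + 1)}{\lambda(X_j)} = e^{\beta_j}
```

Under this form, exponentiated coefficients are interpreted as rate ratios, in the same way that exponentiated logistic regression coefficients are interpreted as odds ratios.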

Interpreting Poisson regression parameters

Modelling rates
  Of key interest in Poisson regression models is to make inference about rates of events
  We are often interested in whether the rate of cancer, or some other disease, varies by population subgroups such as gender, race, or age

Person-years
  In defining rates, it is crucial to state what denominator we have in mind
  For disease, we are usually interested in the disease rate per person, per year
  If the HIV incidence rate is 5 per 1 million person-years, that means we expect to see 5 new cases of HIV per 1 million persons per year

Modelling Danish cancer cases with an offset
  We observed Danish cancer cases in 6 age groups over a period of 4 years
  The model predicts log rates per 10,000 person-years

Interpretation of coefficients

More about offsets
  The purpose of an offset is to specify the denominator of the predicted rates
  We should always try to use an offset if we suspect the underlying population sizes vary for the observed counts
  Typically, we'll use log(N) as the offset, where N is the sample size or number of person-years generating each count

Poisson regression for cohort studies
  Log-linear regression can be used to estimate relative risks for cohort studies (but not case-control studies)
  Relative risks are like relative rates, but we are comparing risks (probability of disease) instead of rates (expected cases per person-year) across groups
  Could also estimate relative risk by transforming results from logistic regression

Grand summary
  Exploratory analysis includes graphs and tables; good for getting a feel for the data
  Confirmatory analysis is useful for making definitive conclusions
  Linear models provide us with a framework in which to perform confirmatory analysis in many settings
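The role of the offset can be illustrated with a small sketch. It assumes the common form log(expected count) = log(N) + linear predictor, where N is the number of person-years; the function names are hypothetical, and the only number taken from the lecture is the example rate of 5 cases per 1 million person-years.

```python
import math

def expected_count(person_years, linear_predictor):
    """Expected number of events: exp(offset + linear predictor), offset = log(N)."""
    return math.exp(math.log(person_years) + linear_predictor)

def rate_per(unit_person_years, linear_predictor):
    """Predicted rate expressed per the chosen number of person-years."""
    return unit_person_years * math.exp(linear_predictor)

# A log rate corresponding to 5 cases per 1 million person-years.
log_rate = math.log(5 / 1_000_000)

# Hypothetical cohort: 200,000 people followed for one year each.
cases = expected_count(person_years=200_000, linear_predictor=log_rate)
per_10k = rate_per(10_000, log_rate)
print(round(cases, 3), round(per_10k, 3))
```

Changing the denominator only rescales the reported rate; comparisons between groups, expressed as rate ratios, are unaffected by the choice of unit.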

Grand summary: linear models
  Linear regression: for continuous (normal) outcomes
  Logistic regression: for binary outcomes
  Poisson regression: for counts

Grand summary: testing
  We can test the significance of a single predictor using a z-test (or t-test for linear regression)
  Test the significance of several covariates using a pair of nested models and a likelihood ratio test
  Know how to interpret p-values and confidence intervals!

Grand summary: modelling
  In all generalized linear models, we can use the following tools to make models more flexible:
    Adjust for confounders using additive covariates
    Allow effect modification through interaction terms
    Fit curved and bent lines through polynomials and splines

Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Ordinal Regression. Chapter

Ordinal Regression. Chapter Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

More information

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY ABSTRACT: This project attempted to determine the relationship

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

More information

Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

More information

Statistics 305: Introduction to Biostatistical Methods for Health Sciences

Statistics 305: Introduction to Biostatistical Methods for Health Sciences Statistics 305: Introduction to Biostatistical Methods for Health Sciences Modelling the Log Odds Logistic Regression (Chap 20) Instructor: Liangliang Wang Statistics and Actuarial Science, Simon Fraser

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

13. Poisson Regression Analysis

13. Poisson Regression Analysis 136 Poisson Regression Analysis 13. Poisson Regression Analysis We have so far considered situations where the outcome variable is numeric and Normally distributed, or binary. In clinical work one often

More information

III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis

III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis III. INTRODUCTION TO LOGISTIC REGRESSION 1. Simple Logistic Regression a) Example: APACHE II Score and Mortality in Sepsis The following figure shows 30 day mortality in a sample of septic patients as

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

More information

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

More information

Handling missing data in Stata a whirlwind tour

Handling missing data in Stata a whirlwind tour Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Nonlinear Regression Functions. SW Ch 8 1/54/

Nonlinear Regression Functions. SW Ch 8 1/54/ Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

Logit Models for Binary Data

Logit Models for Binary Data Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response

More information

Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

More information

VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Logit and Probit. Brad Jones 1. April 21, 2009. University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science

Logit and Probit. Brad Jones 1. April 21, 2009. University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science Logit and Probit Brad 1 1 Department of Political Science University of California, Davis April 21, 2009 Logit, redux Logit resolves the functional form problem (in terms of the response function in the

More information

Some Essential Statistics The Lure of Statistics

Some Essential Statistics The Lure of Statistics Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Categorical Data Analysis

Categorical Data Analysis Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods

More information

Biostatistics Short Course Introduction to Longitudinal Studies

Biostatistics Short Course Introduction to Longitudinal Studies Biostatistics Short Course Introduction to Longitudinal Studies Zhangsheng Yu Division of Biostatistics Department of Medicine Indiana University School of Medicine Zhangsheng Yu (Indiana University) Longitudinal

More information
