Interaction between quantitative predictors


In a first-order model like the ones we have discussed, the association between $E(y)$ and a predictor $x_j$ does not depend on the values of the other predictors in the model. See Fig. 4.1: the relation between $E(y)$ and $x_1$ is the same regardless of the value of $x_2$, so all the prediction lines are parallel.

If, however, the association between the response and one of the predictors depends on the value of other predictors, then a first-order model is no longer appropriate. We say that there is an interaction among predictors.

Stat Fall

Interaction (cont'd)

Example: a company wishes to estimate the association between sales of a beauty product ($y$) and two potential predictors of sales in each of $n$ markets: \$ spent on daytime TV ads in the $i$th market ($x_1$) and average number of years of education of females in the $i$th market ($x_2$).

Intuitively, this is what we would expect:

- Advertisement expenses will tend to increase sales (up to a point).
- In cities where women are highly educated (on average), fewer of them will be watching TV during the day.
- The effect of \$ in ads on sales may then also depend on the education of potential consumers.

Interaction (cont'd)

A figure to represent the association between ads and sales for different levels of education will be drawn in class.

How do we include an interaction term in the model? With $k = 2$ predictors:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \epsilon_i,$$

where the assumptions about the model are the same as before. An interaction between two predictors is a second-order term in the model.
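
As a minimal sketch of how such a model might be fitted, the interaction term can be treated as one more column of the design matrix, equal to the product $x_1 x_2$. The data below are synthetic and noise-free, and the coefficient values are assumptions chosen only for the demonstration; they are not results from the lecture.

```python
import numpy as np

# Synthetic, noise-free data (illustrative only, not from the lecture).
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, size=50)    # stands in for ad spending
x2 = rng.uniform(8, 16, size=50)    # stands in for years of education
true_beta = np.array([5.0, 3.0, 0.5, -0.2])  # assumed values for the demo

# Design matrix: intercept, x1, x2, and the interaction column x1 * x2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
y = X @ true_beta

# Ordinary least squares estimates of (beta0, beta1, beta2, beta3).
b, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the response here has no error term, the least squares estimates in `b` reproduce the assumed coefficients up to rounding; with real data they would only estimate them.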

Interaction (cont'd)

In the sales example, we would expect that $\beta_3 < 0$: as education increases (and more women are out working), the strength of the association between daytime TV ads and sales decreases. In other words, daytime ads are expected to be more effective in markets where more women are at home watching TV during the day than in markets where most women are not watching TV.

In general, with $k$ predictors, we can include pairwise interactions between any two, as appropriate. Higher-order interactions (e.g. $x_j x_l x_t$, denoting the three-way interaction between the $j$th, $l$th and $t$th predictors) can also be included in the model, but are much harder to interpret from a subject-matter point of view.

Interaction (cont'd)

When predictors interact, the interpretation of all the $\beta$s changes. If the model is

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \epsilon_i,$$

- $\beta_0$ is still interpreted as before.
- $(\beta_1 + \beta_3 x_2)$ is the change in $E(y)$ when $x_1$ increases by one unit and $x_2$ is held fixed.
- $(\beta_2 + \beta_3 x_1)$ is the change in $E(y)$ when $x_2$ increases by one unit and $x_1$ is held fixed.

The association between $E(y)$ and $x_1$ depends on the level of $x_2$, unless $\beta_3 = 0$, in which case the interaction does not exist.

Interaction (cont'd)

In the sales example, suppose we find that $b_0 = 5$, $b_1 = 3$, $b_2 = 0.5$, $b_3 = -0.2$. Interpretation?

- The number of units sold can be expected to change by $3 - 0.2x_2$ when ad expenses increase by \$1, given education.
- The number of units sold can be expected to change by $b_2 + b_3 x_1 = 0.5 - 0.2x_1$ when the education of potential customers increases by one year, given ad expenditures.
- In a market with 12 years of average education, we expect that sales will increase by $3 - 0.2(12) = 0.6$ units if ad expenditures increase by \$1.
- In a market with average education equal to 8 years, an additional \$1 spent on daytime ads would be associated with an increase of about 1.4 units in expected sales.
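
The arithmetic on this slide can be checked with a short sketch. It assumes $b_3 = -0.2$ (the sign implied by the calculation $3 - 0.2(12) = 0.6$); the function name `slope_in_x1` is a hypothetical helper, not something from the course software.

```python
def slope_in_x1(b1, b3, x2):
    """Change in expected sales per extra $1 of ads, holding education x2 fixed."""
    return b1 + b3 * x2

b1, b3 = 3.0, -0.2   # estimates from the slide; sign of b3 assumed negative

slope_12 = slope_in_x1(b1, b3, 12)  # market with 12 years of average education
slope_8 = slope_in_x1(b1, b3, 8)    # market with 8 years of average education
```

The two slopes differ only because of the interaction term: with $b_3 = 0$ both markets would show the same effect of ad spending.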

Interaction (cont'd)

How do we draw inferences in models with interaction terms? The steps are the same as in any multiple regression model:

1. Do a global F test of the utility of the model. The null hypothesis in this case is $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$, tested against the alternative that at least one of the $\beta$s is different from 0.
2. If the F test leads to rejection of $H_0$, then do a t test on each of the $\beta$s associated with interaction terms.
3. If the interaction between $x_j$ and $x_k$ is significant, do not test hypotheses for $\beta_j$ and $\beta_k$; if the interaction is important, the individual $x$s must be important too (some statisticians would argue differently here).

Second-order model with quadratic predictors

Sometimes, the association between $E(y)$ and $x_j$ is not linear but quadratic. A second-order model with one predictor is:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{1i}^2 + \epsilon_i.$$

- If $\beta_2 > 0$: the association is concave upwards (bowl shape).
- If $\beta_2 < 0$: concave downwards (mound shape).

$\beta_2$ is known as a rate of curvature parameter.

Quadratic predictors - Example

Example 4.6, page 198. Data: $y$ is immunoglobin in blood (indicator of immunity, in mg) and $x$ is maximum oxygen uptake (indicator of fitness, in ml/kg), measured on 30 individuals. Range of $x$: (32, 70). See scatter plot of data.

Model:

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \epsilon_i,$$

with the usual assumptions.

Quadratic predictors - Example

Results: $b_0 = -1{,}464$, $b_1 = 88.3$ and $b_2 = -0.54$, so that the prediction equation is

$$\hat y = -1{,}464 + 88.3x - 0.54x^2.$$

$R^2_a = 0.93$, so about 93% of the variability observed in immunoglobin can be associated with fitness.

Interpretation of coefficients:

- The intercept is meaningless. We cannot have negative immunoglobin.
- $b_1$ no longer has a simple interpretation. It is NOT the expected change in $y$ when $x$ increases by one.
- The quadratic term $b_2$ is negative: the response curves downwards as $x$ increases.

Quadratic predictors - Example

Be cautious with extrapolations! See the figure. Concavity of the response implies that for large enough $x$, $E(y)$ will begin to decrease. This makes no sense from a physiological point of view. Nonsensical predictions may occur if the model is used outside of the range of the data!
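
To see where the fitted curve starts to bend downwards, one can locate the turning point of the parabola, $x^* = -b_1 / (2 b_2)$. The sketch below uses the rounded estimates from the example and assumes that $b_0$ and $b_2$ are negative, as the interpretation of the coefficients implies; the function name `predict` is a hypothetical helper.

```python
# Rounded estimates from the example; the negative signs on b0 and b2
# are assumptions based on the slide's interpretation of the coefficients.
b0, b1, b2 = -1464.0, 88.3, -0.54

def predict(x):
    """Fitted immunoglobin level at maximal oxygen uptake x (ml/kg)."""
    return b0 + b1 * x + b2 * x ** 2

x_min, x_max = 32, 70        # observed range of x in the data
x_turn = -b1 / (2 * b2)      # x at which the fitted parabola peaks

beyond_data = x_turn > x_max # True if the peak lies outside the data range
```

With these values the peak falls at roughly 82 ml/kg, beyond the largest observed $x$, so the decreasing branch of the curve is entirely an extrapolation.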

Quadratic predictors - Example

The first test of hypotheses is the F test for the entire model. We test $H_0: \beta_1 = \beta_2 = 0$ against $H_a$: at least one of the two $\neq 0$.

In this example, the F statistic is so large that we know it exceeds the critical value even without looking at the table. We reject $H_0$: maximal oxygen uptake contributes information about immunoglobin levels in the blood.

The next step is to decide whether curvature is important or not.

Quadratic predictors - Example

We now test for significance of the quadratic effect: $H_0: \beta_2 = 0$ against $H_a: \beta_2 \neq 0$ (or we can do a one-tailed test too).

The t statistic is

$$t = \frac{b_2}{\hat\sigma_{b_2}} = \frac{-0.536}{0.158} = -3.39,$$

which we compare to a table value at level $\alpha/2$ with $n - 3$ degrees of freedom. We reject $H_0$.

Interpretation: There is strong evidence that immunoglobin levels increase more slowly per unit increase in maximal oxygen uptake in individuals with high aerobic fitness than in those with low aerobic fitness.

If we had failed to reject $H_0: \beta_2 = 0$, we would conclude that the association between $y$ and $x$ is linear.
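
The test can be reproduced in a few lines. This sketch assumes $b_2$ is negative (the slide describes a mound-shaped curve), takes $n = 30$ from the data description, and hard-codes the two-sided 5% critical value for 27 degrees of freedom from a standard t table rather than computing it; the 5% level is itself an assumption.

```python
b2_hat = -0.536   # estimated quadratic coefficient (sign assumed negative)
se_b2 = 0.158     # its estimated standard error
n = 30            # number of individuals in the example

t_stat = b2_hat / se_b2   # test statistic for H0: beta2 = 0
df = n - 3                # three estimated coefficients: b0, b1, b2

t_crit = 2.052            # tabulated t value for alpha/2 = 0.025, 27 df (assumed level)
reject_h0 = abs(t_stat) > t_crit
```

The statistic is about $-3.39$, well beyond the critical value in absolute terms, which agrees with the conclusion that curvature is important.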

Estimation and prediction

Same concepts as before. With the model we might wish to:

1. Estimate the expected mean value of the response at a certain value of the predictor(s).
2. Predict a single response for some value of the predictor.

In both cases, the point estimator (predictor) is

$$\hat y = b_0 + b_1 x_p + b_2 x_p^2 \quad \text{for } x = x_p.$$

The standard error of $\hat y$ depends on whether we predict a mean or a single value. As before, $\hat\sigma_{(y - \hat y)} > \hat\sigma_{\hat y}$. Calculations are complex, so we use the computer to get these standard errors and CIs.

Estimation and prediction

In the example, suppose we wish to obtain

1. The expected mean immunoglobin level for people with oxygen uptake of $x_p = 40$ ml/kg.
2. The predicted immunoglobin level for a single person with $x_p = 40$ ml/kg.

In both cases, the point estimator is $\hat y = b_0 + b_1(40) + b_2(40)^2$, with $b_2 = -0.536$.

JMP and SAS will give the $100(1 - \alpha)\%$ CI for the mean or for a single prediction.

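Estimation and prediction

From the CI we can derive $\hat\sigma_{\hat y}$ or $\hat\sigma_{(y - \hat y)}$, recalling that

$$\text{Lower bound of the } 100(1 - \alpha)\% \text{ CI} = \hat y - t_{\alpha/2,\, n-k-1} \times \text{std error}.$$

Then

$$\text{Std error} = \frac{\hat y - \text{Lower bound}}{t_{\alpha/2,\, n-k-1}}.$$

We can also derive the std errors using the upper bound of the CI as follows:

$$\text{Std error} = \frac{\text{Upper bound} - \hat y}{t_{\alpha/2,\, n-k-1}}.$$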

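Estimation and prediction

In the example, the 95% CI for the mean immunoglobin at $x = 40$ ml/kg is (1,156.2, 1,263.6). Then, with $n - k - 1 = 27$ degrees of freedom:

$$\hat\sigma_{\hat y} = \frac{\hat y - 1{,}156.2}{t_{0.025,\, 27}}.$$

Also, since the 95% CI for a single response is (985, 1,434.8):

$$\hat\sigma_{(y - \hat y)} = \frac{\hat y - 985}{t_{0.025,\, 27}}.$$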
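
The two standard errors can be recovered numerically from the reported intervals. This is a sketch under two assumptions: the intervals are symmetric t intervals, so $\hat y$ is their midpoint, and the critical value for 27 degrees of freedom is taken from a standard t table. The helper name `se_from_ci` is hypothetical.

```python
t_crit = 2.052  # tabulated t value for alpha/2 = 0.025, 27 df (assumption)

def se_from_ci(lower, upper, t):
    """Standard error implied by a symmetric t interval (half-width / t)."""
    return (upper - lower) / (2 * t)

mean_ci = (1156.2, 1263.6)   # 95% CI for the mean response at x = 40
pred_ci = (985.0, 1434.8)    # 95% CI for a single response at x = 40

y_hat = sum(mean_ci) / 2                 # point estimate = interval midpoint
se_mean = se_from_ci(*mean_ci, t_crit)   # std error for the mean response
se_pred = se_from_ci(*pred_ci, t_crit)   # std error for a single prediction
```

Both intervals share the same midpoint, and the standard error for a single prediction comes out several times larger than the one for the mean, as the earlier inequality requires.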

More complex models: interaction + curvature

Consider the following complete second-order model with two predictors (see the figure):

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2 + \epsilon.$$

A complete second-order model with three predictors includes 3 first-order terms, 3 squared terms and 3 two-way interactions; adding the three-way interaction $x_1 x_2 x_3$, which is a third-order term, brings in 1 more.

The number of terms in complete models gets out of hand fast. Samples are often not large enough to fit all possible terms. Use subject-matter knowledge to decide which terms to include.
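
To make the growth in the number of terms concrete, here is a minimal sketch that builds the design matrix of a complete second-order model for any number of predictors. The function name `second_order_design` and the demo values are assumptions for illustration; columns are ordered as intercept, first-order terms, two-way interactions, then squared terms, matching the equation for two predictors.

```python
from itertools import combinations

import numpy as np

def second_order_design(X):
    """Design matrix with intercept, linear, two-way interaction and squared terms."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    cols = [np.ones(n)]                                   # intercept
    cols += [X[:, j] for j in range(k)]                   # first-order terms
    cols += [X[:, i] * X[:, j]
             for i, j in combinations(range(k), 2)]       # two-way interactions
    cols += [X[:, j] ** 2 for j in range(k)]              # squared terms
    return np.column_stack(cols)

demo = np.array([[1.0, 2.0], [3.0, 4.0]])   # two observations, two predictors
D2 = second_order_design(demo)              # 6 columns for k = 2
D3 = second_order_design(np.ones((4, 3)))   # 10 columns for k = 3
```

With $k$ predictors the matrix has $1 + 2k + k(k-1)/2$ columns, so the count grows quadratically: 6 for $k = 2$, 10 for $k = 3$, 15 for $k = 4$.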

More complex models: Example

Example 4.7, page 213: Study to determine whether weight of package ($x_1$) and distance delivered ($x_2$) are associated with shipping costs ($y$) in a small regional express delivery service.

See scatter plots. Complete second-order model fitted with JMP. Data Express on class web site.

Results: See output.

More complex models: Example

Interpretation of results:

- Since RMSE = 0.44, about 95% of shipping costs will fall within \$0.89 of their predicted values.
- $R^2_a = 0.99$: almost all of the variability in shipping costs can be explained by the model.
- The F statistic, on 5 and 14 df, is highly significant: the model is useful.
- Weight is associated with cost both linearly and quadratically. Distance only linearly.
- The interaction between weight and distance is positive: the effect of weight on cost is not independent of distance.
