Bivariate Analysis: Correlation and Pearson's Correlation Coefficient




Bivariate Analysis

Choosing the test by the measurement level of the two variables:

                          Variable 2: 2 levels   Variable 2: >2 levels   Variable 2: continuous
Variable 1: 2 levels      X2 (chi-square)        X2                      t-test
Variable 1: >2 levels     X2                     X2                      ANOVA (F-test)
Variable 1: continuous    t-test                 ANOVA (F-test)          Correlation / Simple linear regression

Correlation is used when you measure two continuous variables. Examples: association between ___ and ___; association between ___ and blood pressure.

Correlation

Example: paired measurements of Weight (kg) and Height (cm) for ten subjects. [Data table and scatterplot of Height (cm) against Weight (kg), showing a positive linear trend.]

Correlation is measured by Pearson's correlation coefficient: a measure of the linear association between two variables that have been measured on a continuous scale. Pearson's correlation coefficient is denoted by r. A correlation coefficient is a number that ranges between -1 and +1.
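To make the calculation concrete, here is a minimal sketch in Python (not part of the original slides; the weight/height values are invented stand-ins for the slide's data table, which did not survive transcription):

    # Hypothetical sketch: Pearson's r with scipy.
    # pearsonr returns the coefficient and the two-sided p-value for H0: rho = 0.
    from scipy import stats

    weight_kg = [55, 93, 90, 60, 112, 45, 85, 104, 68, 87]        # invented values
    height_cm = [170, 180, 168, 156, 178, 161, 181, 192, 176, 186]

    r, p = stats.pearsonr(weight_kg, height_cm)
    print(f"r = {r:.3f}, p = {p:.4f}")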

If r = 1: perfect positive linear relationship between the two variables. If r = -1: perfect negative linear relationship between the two variables. If r = 0: no linear relationship between the two variables.

[Scatterplots illustrating r = +1, r = -1, r = 0, and intermediate values such as -0.9, 0.8, 0.2 and -0.5: http://noppa5.pc.helsinki.fi/koe/corr/cor7.html]

[Strength scale from -1 to +1: correlations near -1 or +1 are strong, those in between (roughly ±0.4 to ±0.8) are moderate, and those near 0 are weak.]

Example 1: Research question: Is there a linear relationship between the height and weight of students?

H0: there is no linear relationship between height & weight of students in the population (ρ = 0)
Ha: there is a linear relationship between height & weight of students in the population (ρ ≠ 0)

Statistical test: Pearson correlation coefficient (r)

Example 1: SPSS Output
[Correlation table: r = .651**, pairwise N = 1954 (N = 1975 and 1971 for the individual variables). ** Correlation is significant at the 0.01 level.]
Value of statistical test (r coefficient): 0.651
P-value: .000
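Where that p-value comes from: SPSS tests H0: ρ = 0 by converting r and the sample size n into a t statistic with n − 2 degrees of freedom (a standard formula, stated here for reference rather than taken from the slides):

    t = r · √(n − 2) / √(1 − r²)

With the reconstructed values r = 0.651 and n = 1954, t ≈ 0.651 × √1952 / √(1 − 0.424) ≈ 37.9, which matches the t for the slope in the regression output later in this handout; the corresponding two-sided p-value is far below 0.01.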

Example 1: SPSS Output
[r = .651**, pairwise N = 1954; significant at the 0.01 level.]
Conclusion: At a significance level of 0.05, we reject the null hypothesis and conclude that in the population there is a significant linear relationship between the height and weight of students.

Example 2: SPSS Output
[Correlation table: r = .551**, pairwise N = 1841 (N = 1975 and 1846 for the individual variables). ** Correlation is significant at the 0.01 level.]
Research question: Is there a linear relationship between the ___ and ___ of students?
H0: ρ = 0; no linear relationship between ___ & ___ in the population
Ha: ρ ≠ 0; there is a linear relationship between ___ & ___ in the population
Value of statistical test: 0.551
P-value: .000

Example 2: SPSS Output
[r = .551**, pairwise N = 1841; significant at the 0.01 level.]
Conclusion: At a significance level of 0.05, we reject the null hypothesis and conclude that in the population there is a significant linear relationship between the ___ and ___ of students.

Example 3: SPSS Output
[Correlation table: r = .084**, pairwise N = 1821 (N = 1846 and 1971 for the individual variables). ** Correlation is significant at the 0.01 level.]
Research question: Is there a linear relationship between the ___ and ___ of students?
H0: ρ = 0; no linear relationship between ___ & ___ in the population
Ha: ρ ≠ 0; there is a linear relationship between ___ & ___ in the population
Value of statistical test: 0.084
P-value: .000

Example 3: SPSS Output
[r = .084**, pairwise N = 1821; significant at the 0.01 level.]
Conclusion: At a significance level of 0.05, we reject the null hypothesis and conclude that in the population there is a significant linear relationship between the ___ and ___ of students.

SPSS command for r. Example: Analyze > Correlate > Bivariate; select ___ and ___ and put them in the Variables box.

T (True) or F (False): In studying whether there is an association between gender and ___, the investigator found that r = 0.90 and p-value < 0.001 and concluded that there is a strong significant correlation between gender and ___.

T (True) or F (False): The correlation between obesity and number of cigarettes smoked was r = 0.02 and the p-value = 0.856. Based on these results we conclude that there isn't any association between obesity and number of cigarettes smoked.

Simple linear regression is used to explain observed variation in the data. For example, we measure blood pressure in a sample of patients and observe:

i = Pt#:   1    2    3    4    5    6    7
Y = BP:   85  105   90   85  110   70  115

In order to explain why the BP values of individual patients differ, we try to associate the differences in BP with differences in other relevant patient characteristics (variables).

Example: Can variation in blood pressure be explained by ___?

Questions: 1) What is the most appropriate mathematical model to use? A straight line, a parabola, etc. 2) Given a specific model, how do we determine the best-fitting model?

Mathematical properties of a straight line: Y = B0 + B1X, where
Y = dependent variable
X = independent variable
B0 = Y intercept
B1 = slope
The intercept B0 is the value of Y when X = 0. The slope B1 is the amount of change in Y for each 1-unit change in X.
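How the best-fitting line is found (a standard result, included here for reference rather than taken from the slides): minimizing the sum of squared vertical distances between the observed points and the line yields the least squares estimates

    B1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
    B0 = ȳ − B1 · x̄

where x̄ and ȳ are the sample means of the independent and dependent variables.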

Estimation of a simple linear regression: the optimal regression line is Ŷ = B0 + B1X.

Example 1: Research question: Does Height help to predict Weight using a straight-line model? Is there a linear relationship between Height and Weight? Does Height explain a significant portion of the variation in the observed values of Weight?

Model: Weight = B0 + B1 × Height

SPSS output: Example 1
Variables Entered/Removed: all requested variables entered (method: Enter); dependent variable: Weight.
Model Summary: R = .651, R Square = .424, Adjusted R Square = .423, Std. Error of the Estimate = 10.878. Predictors: (Constant), Height.

SPSS output (continued): Example 1
ANOVA: Regression Sum of Squares = 169820.3, df = 1, Mean Square = 169820.297, F = 1435.301, Sig. = .000; Residual Sum of Squares = 230982.0, df = 1952, Mean Square = 118.33; Total Sum of Squares = 400802.3, df = 1953. Dependent variable: Weight.
Coefficients: (Constant) B = -95.246, Std. Error = 4.226, t = -22.539, Sig. = .000; Height B = .940, Std. Error = .025, Beta = .651, t = 37.883, Sig. = .000.

SPSS output (continued): Example 1
R Square = 0.424: Height explains 42.4% of the variation seen in Weight.

From the coefficients table, Weight = B0 + B1 × Height with B0 = -95.246 and B1 = 0.940:
Weight = -95.246 + 0.94 × Height
Increasing Height by 1 unit (1 cm) increases Weight by 0.94 kg.

H0: B1 = 0
Ha: B1 ≠ 0
Because the p-value for B1 is < 0.05, we reject H0 and conclude that Height provides significant information for predicting Weight.
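For comparison outside SPSS, a minimal sketch of the same kind of fit in Python with statsmodels (the DataFrame df and its column names weight and height are assumptions for illustration, not part of the handout):

    # Hypothetical sketch: simple linear regression, reproducing the
    # R-squared, coefficient, t and p-value columns of the SPSS tables.
    import statsmodels.formula.api as smf

    model = smf.ols("weight ~ height", data=df).fit()
    print(model.summary())   # model summary, ANOVA F, coefficients
    print(model.params)      # B0 (Intercept) and B1 (height slope)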

Question 1: In a simple linear regression model the predicted straight line was as follows:
Weight (kg) = 13.5 − 1.32 × (weekly hours of PA)
R² = 0.22; p-value for the slope = 0.04

What is the dependent/independent variable?
Dependent variable: Weight. Independent variable: weekly hours of PA.

Interpret the value of R²: the number of weekly hours of PA explains 22% of the variation observed in Weight.

What is the null hypothesis? The alternative?
H0: B(weekly hours of PA) = 0
Ha: B(weekly hours of PA) ≠ 0

Is the association between Weight and weekly hours of PA positive or negative? Negative.

What is the magnitude of this association? 1.32: a one-hour increase of weekly PA decreases Weight by 1.32 kg.

Is the association significant at a level of 0.05? Because the p-value for the slope (0.04) is < 0.05, we reject H0 and conclude that weekly hours of PA provide significant information for predicting Weight.

Question 2: SPSS output:
Model Summary: R = .407, R Square = .166, Adjusted R Square = .164, Std. Error of the Estimate = 10.396. Predictors: (Constant), ISS - injury severity measure.
Coefficients: (Constant) B = .443, Std. Error = .747, t = .593, Sig. = .554; ISS - injury severity measure B = .166, Std. Error = .017, Beta = .407, t = 9.945, Sig. = .000. Dependent variable: Length of hospital stay.

What is the dependent/independent variable?
Dependent variable: Length of hospital stay. Independent variable: ISS - injury severity score.

Interpret the value of R²: ISS explains 16.6% of the variation observed in length of hospital stay (R² = .166; note that .407 is R, not R²).

What is the null hypothesis? The alternative?
H0: B(ISS) = 0
Ha: B(ISS) ≠ 0

Is there a significant association between the dependent and the independent variable? Because the p-value for B(ISS) is < 0.05, we reject H0 and conclude that ISS provides significant information for predicting length of hospital stay.

What is the magnitude of this association? 0.166: increasing ISS by 1 unit increases length of hospital stay by 0.166 days.

Biases: selection bias, information bias, confounding bias. Bias is an error in an epidemiologic study that results in an incorrect estimation of the association between exposure and outcome.

Confounding bias: Definition. Confounding is present when the association between an exposure and an outcome is distorted by an extraneous third variable (referred to as a confounding variable).

Confounding bias: Example 1. Study the association between coffee drinking and lung cancer (LC):

                LC: Yes   LC: No
Coffee: Yes        80       15
Coffee: No         20       85

OR = (80 × 85) / (15 × 20) ≈ 22.7. What would you conclude?

Confounding bias: minimizing it.
Research design: use of a randomized clinical trial; restriction.
Data analysis: multivariate statistical techniques.
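Returning to the 2×2 table above, a small sketch of the odds-ratio arithmetic in Python (a generic helper written for illustration, not taken from the slides):

    # Hypothetical sketch: odds ratio for a 2x2 exposure/outcome table.
    def odds_ratio(a: int, b: int, c: int, d: int) -> float:
        """a = exposed cases, b = exposed non-cases,
        c = unexposed cases, d = unexposed non-cases."""
        return (a * d) / (b * c)

    # Coffee / lung cancer example: (80 * 85) / (15 * 20)
    print(odds_ratio(80, 15, 20, 85))   # ~22.7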

Bivariate Analysis (summary):

                          Variable 2: 2 levels   Variable 2: >2 levels   Variable 2: continuous
Variable 1: 2 levels      X2 (chi-square)        X2                      t-test
Variable 1: >2 levels     X2                     X2                      ANOVA (F-test)
Variable 1: continuous    t-test                 ANOVA (F-test)          Correlation / Simple linear regression

Multivariate analyses:
Logistic regression (if the outcome has 2 levels)
Multiple linear regression (if the outcome is continuous)

Multivariate analysis: WHY?
To investigate the effect of more than one independent variable.
To predict the outcome using various independent variables.
To adjust for confounding variables.
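To close the loop with the confounding example, a minimal sketch of adjustment via multivariate models in Python with statsmodels (df and all variable names are invented for illustration; the binary outcome must be coded 0/1 for logit):

    # Hypothetical sketch: adjusting for a confounder.
    import statsmodels.formula.api as smf

    # Continuous outcome -> multiple linear regression.
    linear = smf.ols("blood_pressure ~ coffee + smoking", data=df).fit()

    # Two-level (0/1) outcome -> logistic regression.
    logistic = smf.logit("lung_cancer ~ coffee + smoking", data=df).fit()

    # The 'coffee' coefficient is now adjusted for 'smoking'.
    print(linear.params["coffee"], logistic.params["coffee"])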