# The correlation coefficient

Save this PDF as:

Size: px
Start display at page:

## Transcription

2 Table 1. Height, quadriceps muscle, and age in 41 male alcoholics (data of Hickish et al., 1989) Height (cm) Quadriceps muscle (N) Age (years) Height (cm) Quadriceps muscle (N) Age (years) Muscle (newtons) Height (cm) Figure 1. Scatter diagram showing muscle and height for 41 male alcoholics

3 Quadriceps (newtons) Mean Mean height Height (cm) Figure 2. Scatter diagram showing muscle and height for 41 male alcoholics, with lines through the mean height and mean Quadriceps (newtons) Age (years) Figure 3. Scatter diagram showing muscle and age for 41 male alcoholics Quadriceps (newtons) 600 Mean Mean age Age (years) Figure 4. Scatter diagram showing muscle and age for 41 male alcoholics, with lines through the mean.

4 Figure 3 shows the relationship between and age in Table 1. Strength tends to be less for older men than for younger men. Figure 4 shows lines through the means, as in Figure 2. Now there are more observations in the top left and bottom right quadrants, where products are negative, than in the top left and bottom right quadrants, where products are positive. The sum of products will be negative. When large values of one variable are associated with small values of the other, we say we have negative correlation. The sum of products will depend on the number of observations and the units in which they are measured. We can show that the maximum possible value it can have is the square root of the sum of squares for height multiplied by the square root of the sum of squares for. Hence we divide the sum of products by the square roots of the two sums of squares. This gives the correlation coefficient, usually denoted by r. Using the abbreviation r looks very odd. Why r and not c for correlation? This is for historical reasons and it is so ingrained in statistical practice that we are stuck with it. If you see an unexplained r = in a paper, it means the correlation coefficient. Originally, r stood for regression. Because of the way r is calculated, its maximum value = 1.00 and its minimum value = We shall look at what these mean later. The correlation coefficient is also known as Pearson s correlation coefficient and the product moment correlation coefficient. There are other correlation coefficients as well, such as Spearman s and Kendall s, but if it is described simply as the correlation coefficient or just the correlation, the one based on the sum of products about the mean is the one intended. For the example of muscle and height in 41 alcoholic men, r = This a positive correlation of fairly low. For and age, r = This is a negative correlation of fairly low. Figure 5 shows the correlations between several simulated variables. Each pair of variables was generated to have the correlation shown above it. The first panel in Figure 5 shows a perfect correlation. The points lie exactly on a straight line and we could calculate Y exactly from X. In fact, Y = X; they could not be more closely related. r = when large values of one variable are associated with large values of the other and the points lie exactly on a straight line. The second panel shows a strong but not perfect positive relationship. The third panel also shows a positive relationship, but less strong. The size of the correlation coefficient clearly reflects the degree of closeness on the scatter diagram. The correlation coefficient is positive when large values of one variable are associated with large values of the other.

5 r = 1.0 r = 0.9 r = 0.5 r = 0.0 r = 0.0 r = 0.92 r = -0.3 r = r = -1.0 Figure 5. Simulated data from populations with different relationships between the two variables, and the population correlation coefficient

6 The fourth panel In Figure 5 shows what happens when there is no relationship at all, r = This is the not only way r can be equal to zero, however. The fifth panel shows data where there is a relationship, because large values of Y are associated with small values of X and with large values of X, whereas small values of Y are associated with values of X in the middle of the range. The products about the mean will be positive in the upper left and upper right quadrants and negative in the lower left and lower right quadrants, giving a sum which is zero. It is possible for r to be equal to 0.00 when there is a relationship which is not linear. A correlation r = 0.00 means that there is no linear relationship, i.e. that there is no relationship where large values of one variable are consistently associated either with large or with small values of the other, but not both. The sixth panel shows another perfect relationship, but not a straight line. The correlation coefficient is less than r will not equal 1.00 or when there is a perfect relationship unless the points lie on a straight line. Correlation measures closeness to a linear relationship, not to any perfect relationship. The correlation coefficient is negative when large values of one variable are associated with small values of the other. The seventh panel in Figure 5 shows a rather weak negative relationship, the eighth a strong one, and the ninth panel a perfect negative relationship. r = 1.00 when large values of one variable are associated with small values of the other and the points lie on a straight line. Test of significance and confidence interval for r We can test the null hypothesis that the correlation coefficient in the population is zero. This is done by a simple t test. The distribution of r if the null hypothesis is true, i.e. in the absence of any relationship in the population, depends only on the number of observations. This is often described in the terms of the degrees of freedom for the t test, which is the number of observations minus 2. Because of this, it is possible to tabulate the critical value for the test for different sample sizes. Bland (0) gives a table. For the test of significance to be valid, we must assume that: at least one of the variables is from a Normal distribution, the observations are independent. Large deviations from the assumptions make the P value for this test very unreliable. For the muscle and height data of Figure 1, r = 0.42, P = Computer programmes almost always print this when they calculate a correlation coefficient. As a result you will rarely see a correlation coefficient reported without it, even when the null hypothesis that the correlation in the population is equal to zero is absurd. We can find a confidence interval for the correlation coefficient in the population, too. The distribution of the sample correlation coefficient when the null hypothesis is not true, i.e. when there is a relationship, is very awkward. It does not become approximately Normal until the sample size is in the thousands. We use a very clever but rather intimidating mathematical function called Fisher s z transformation. This produces a very close approximation to a Normal distribution with a fairly simple expression for its mean and variance (see Bland 0 if you really want to know). This can be used to calculate a 95% confidence interval on the transformed scale, which can then be transformed back to the correlation coefficient scale. For the and height data, r = 0.42, and the approximate 95% confidence interval is

7 0.13 to As is usual for confidence intervals which are back transformed, this is not symmetrical about the point estimate, r. For Fisher s z transformation to be valid, we must make a much stronger assumption about the distributions than for the test of significance. We must assume that both of the variables are from Normal distributions. Large deviations from this assumption can make the confidence interval very unreliable. The use of Fisher s z is tricky without a computer, approximate, and requires a strong assumption. Computer programmes rarely print this confidence interval and so you rarely see it, which is a pity. References Bland M. (0) An Introduction to Medical Statistics. Oxford University Press. Hickish T, Colston K, Bland JM, Maxwell JD. (1989) Vitamin D deficiency and muscle in male alcoholics. Clinical Science 77,

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

BIOSTATISTICS QUIZ ANSWERS 1. When you read scientific literature, do you know whether the statistical tests that were used were appropriate and why they were used? a. Always b. Mostly c. Rarely d. Never

### Week 4: Standard Error and Confidence Intervals

Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

### Pearson s correlation

Pearson s correlation Introduction Often several quantitative variables are measured on each member of a sample. If we consider a pair of such variables, it is frequently of interest to establish if there

### Lecture 18 Linear Regression

Lecture 18 Statistics Unit Andrew Nunekpeku / Charles Jackson Fall 2011 Outline 1 1 Situation - used to model quantitative dependent variable using linear function of quantitative predictor(s). Situation

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### 4. Describing Bivariate Data

4. Describing Bivariate Data A. Introduction to Bivariate Data B. Values of the Pearson Correlation C. Properties of Pearson's r D. Computing Pearson's r E. Variance Sum Law II F. Exercises A dataset with

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### Regression. In this class we will:

AMS 5 REGRESSION Regression The idea behind the calculation of the coefficient of correlation is that the scatter plot of the data corresponds to a cloud that follows a straight line. This idea can be

### AMS7: WEEK 8. CLASS 1. Correlation Monday May 18th, 2015

AMS7: WEEK 8. CLASS 1 Correlation Monday May 18th, 2015 Type of Data and objectives of the analysis Paired sample data (Bivariate data) Determine whether there is an association between two variables This

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

### Spearman s correlation

Spearman s correlation Introduction Before learning about Spearman s correllation it is important to understand Pearson s correlation which is a statistical measure of the strength of a linear relationship

### OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance

Lecture 5: Hypothesis Testing What we know now: OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance (if the Gauss-Markov

### MEASURES OF VARIATION

NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

### Association Between Variables

Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### Homework 11. Part 1. Name: Score: / null

Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is

### Correlation key concepts:

CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

### Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

### A correlation exists between two variables when one of them is related to the other in some way.

Lecture #10 Chapter 10 Correlation and Regression The main focus of this chapter is to form inferences based on sample data that come in pairs. Given such paired sample data, we want to determine whether

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### Lesson Lesson Outline Outline

Lesson 15 Linear Regression Lesson 15 Outline Review correlation analysis Dependent and Independent variables Least Squares Regression line Calculating l the slope Calculating the Intercept Residuals and

### not to be republished NCERT Measures of Central Tendency

You have learnt in previous chapter that organising and presenting data makes them comprehensible. It facilitates data processing. A number of statistical techniques are used to analyse the data. In this

### Hypothesis Testing - Relationships

- Relationships Session 3 AHX43 (28) 1 Lecture Outline Correlational Research. The Correlation Coefficient. An example. Considerations. One and Two-tailed Tests. Errors. Power. for Relationships AHX43

### Research Variables. Measurement. Scales of Measurement. Chapter 4: Data & the Nature of Measurement

Chapter 4: Data & the Nature of Graziano, Raulin. Research Methods, a Process of Inquiry Presented by Dustin Adams Research Variables Variable Any characteristic that can take more than one form or value.

### Square roots by subtraction

Square roots by subtraction Frazer Jarvis When I was at school, my mathematics teacher showed me the following very strange method to work out square roots, using only subtraction, which is apparently

### Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

### Introduction to Quantitative Methods

Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

### 17.0 Linear Regression

17.0 Linear Regression 1 Answer Questions Lines Correlation Regression 17.1 Lines The algebraic equation for a line is Y = β 0 + β 1 X 2 The use of coordinate axes to show functional relationships was

### Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

### CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there is a relationship between variables, To find out the

### where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### E205 Final: Version B

Name: Class: Date: E205 Final: Version B Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The owner of a local nightclub has recently surveyed a random

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Display and Summarize Correlation for Direction and Strength Properties of Correlation Regression Line Cengage

### Simple Regression and Correlation

Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas

### Module 2 Project Maths Development Team Draft (Version 2)

5 Week Modular Course in Statistics & Probability Strand 1 Module 2 Analysing Data Numerically Measures of Central Tendency Mean Median Mode Measures of Spread Range Standard Deviation Inter-Quartile Range

### 7. Tests of association and Linear Regression

7. Tests of association and Linear Regression In this chapter we consider 1. Tests of Association for 2 qualitative variables. 2. Measures of the strength of linear association between 2 quantitative variables.

### Statistical tests for SPSS

Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Outline. Correlation & Regression, III. Review. Relationship between r and regression

Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### Content DESCRIPTIVE STATISTICS. Data & Statistic. Statistics. Example: DATA VS. STATISTIC VS. STATISTICS

Content DESCRIPTIVE STATISTICS Dr Najib Majdi bin Yaacob MD, MPH, DrPH (Epidemiology) USM Unit of Biostatistics & Research Methodology School of Medical Sciences Universiti Sains Malaysia. Introduction

### STRAND B: Number Theory. UNIT B2 Number Classification and Bases: Text * * * * * Contents. Section. B2.1 Number Classification. B2.

STRAND B: Number Theory B2 Number Classification and Bases Text Contents * * * * * Section B2. Number Classification B2.2 Binary Numbers B2.3 Adding and Subtracting Binary Numbers B2.4 Multiplying Binary

### Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

### The scatterplot indicates a positive linear relationship between waist size and body fat percentage:

STAT E-150 Statistical Methods Multiple Regression Three percent of a man's body is essential fat, which is necessary for a healthy body. However, too much body fat can be dangerous. For men between the

### Means, standard deviations and. and standard errors

CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

### Introduction to Statistics for Computer Science Projects

Introduction Introduction to Statistics for Computer Science Projects Peter Coxhead Whole modules are devoted to statistics and related topics in many degree programmes, so in this short session all I

### the number of organisms in the squares of a haemocytometer? the number of goals scored by a football team in a match?

Poisson Random Variables (Rees: 6.8 6.14) Examples: What is the distribution of: the number of organisms in the squares of a haemocytometer? the number of hits on a web site in one hour? the number of

### SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### Statistics GCSE Higher Revision Sheet

Statistics GCSE Higher Revision Sheet This document attempts to sum up the contents of the Higher Tier Statistics GCSE. There is one exam, two hours long. A calculator is allowed. It is worth 75% of the

### COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared jn2@ecs.soton.ac.uk Relationships between variables So far we have looked at ways of characterizing the distribution

### Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

### Regression Analysis: Basic Concepts

The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance

### Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

### Lesson 4 Part 1. Relationships between. two numerical variables. Correlation Coefficient. Relationship between two

Lesson Part Relationships between two numerical variables Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear between two numerical variables Relationship

### LEARNING OBJECTIVES SCALES OF MEASUREMENT: A REVIEW SCALES OF MEASUREMENT: A REVIEW DESCRIBING RESULTS DESCRIBING RESULTS 8/14/2016

UNDERSTANDING RESEARCH RESULTS: DESCRIPTION AND CORRELATION LEARNING OBJECTIVES Contrast three ways of describing results: Comparing group percentages Correlating scores Comparing group means Describe

### Hence, multiplying by 12, the 95% interval for the hourly rate is (965, 1435)

Confidence Intervals for Poisson data For an observation from a Poisson distribution, we have σ 2 = λ. If we observe r events, then our estimate ˆλ = r : N(λ, λ) If r is bigger than 20, we can use this

### REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

### T adult = 96 T child = 114.

Homework Solutions Do all tests at the 5% level and quote p-values when possible. When answering each question uses sentences and include the relevant JMP output and plots (do not include the data in your

### Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers

Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables

### Study Resources For Algebra I. Unit 1C Analyzing Data Sets for Two Quantitative Variables

Study Resources For Algebra I Unit 1C Analyzing Data Sets for Two Quantitative Variables This unit explores linear functions as they apply to data analysis of scatter plots. Information compiled and written

### STATISTICS 151 SECTION 1 FINAL EXAM MAY

STATISTICS 151 SECTION 1 FINAL EXAM MAY 2 2009 This is an open book exam. Course text, personal notes and calculator are permitted. You have 3 hours to complete the test. Personal computers and cellphones

### How do I analyse observer variation studies?

How do I analyse observer variation studies? This is based on work done for a long term project by Doug Altman and Martin Bland. I recommend that you first read our three Statistics Notes on measurement

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### ID X Y

Dale Berger SPSS Step-by-Step Regression Introduction: MRC01 This step-by-step example shows how to enter data into SPSS and conduct a simple regression analysis to develop an equation to predict from.

### 5. Linear regression and correlation

Statistics for Engineers 5-1 5. Linear regression and correlation If we measure a response variable at various values of a controlled variable, linear regression is the process of fitting a straight line

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

### Elementary Statistics. Scatter Plot, Regression Line, Linear Correlation Coefficient, and Coefficient of Determination

Scatter Plot, Regression Line, Linear Correlation Coefficient, and Coefficient of Determination What is a Scatter Plot? A Scatter Plot is a plot of ordered pairs (x, y) where the horizontal axis is used

### Risk Engines in Diabetes

Risk Engines in Diabetes Martin Bland Prof. of Health Statistics University of York http://www-users.york.ac.uk/~mb55/msc/ What is a risk engine? A fancy name for a prediction of risk from the characteristics

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### The Big Picture. Correlation. Scatter Plots. Data

The Big Picture Correlation Bret Hanlon and Bret Larget Department of Statistics Universit of Wisconsin Madison December 6, We have just completed a length series of lectures on ANOVA where we considered

### AP Statistics 1998 Scoring Guidelines

AP Statistics 1998 Scoring Guidelines These materials are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought from the Advanced Placement

### In Chapter 27 we tried to predict the percent body fat of male subjects from

29 C H A P T E R Multiple Regression WHO WHAT UNITS WHEN WHERE WHY 25 Male subjects Body fat and waist size %Body fat and inches 199s United States Scientific research In Chapter 27 we tried to predict

### Unit 21 Student s t Distribution in Hypotheses Testing

Unit 21 Student s t Distribution in Hypotheses Testing Objectives: To understand the difference between the standard normal distribution and the Student's t distributions To understand the difference between

### . 58 58 60 62 64 66 68 70 72 74 76 78 Father s height (inches)

PEARSON S FATHER-SON DATA The following scatter diagram shows the heights of 1,0 fathers and their full-grown sons, in England, circa 1900 There is one dot for each father-son pair Heights of fathers and

### EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY (formerly the Examinations of the Institute of Statisticians) HIGHER CERTIFICATE IN STATISTICS, 2007

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY (formerly the Examinations of the Institute of Statisticians) HIGHER CERTIFICATE IN STATISTICS, 2007 Paper II : Statistical Methods Time Allowed: Three Hours

### 11/20/2014. Correlational research is used to describe the relationship between two or more naturally occurring variables.

Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Introduction to Stata

Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

### 17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

### GLOSSARY FOR DESIGN OF EXPERIMENTS

GLOSSARY FOR DESIGN OF EXPERIMENTS Adaptation provenant du site Internet http://www.mathoptions.com ANOVA -- An acronym for "ANalysis Of VAriance," a statistical technique that separates the variation

### Chapter 3: Data Description Numerical Methods

Chapter 3: Data Description Numerical Methods Learning Objectives Upon successful completion of Chapter 3, you will be able to: Summarize data using measures of central tendency, such as the mean, median,