Regression. In this class we will:
- Gyles Merritt
- 9 years ago
Transcription
1 AMS 5 REGRESSION
2 Regression The idea behind the calculation of the coefficient of correlation is that the scatter plot of the data corresponds to a cloud that follows a straight line. This idea can be formalized by regression methods. In this class we will: consider the definition of simple linear regression; find a method to predict an individual value; use the normal curve to estimate percentile ranks; describe the regression effect; compute the regression errors and their RMS; study the behavior of regression errors.
3 Regression The regression method describes how one variable depends on another. The Northern California temperature data have an average altitude of 3,524 feet with an SD of 1,839 feet, and an average temperature of 70.3 degrees with an SD of 6.5 degrees. The correlation between temperature and altitude is -0.76.
5 Regression The cloud of points shows a mild negative association between the two variables, as does the value of r. Can we use the values of altitude to estimate the average values of temperature?
6 Regression How does the regression line work? Associated with an increase of one SD in x there is an increase of r SDs in y, on average. If the correlation coefficient is negative, the average value of y decreases as x increases. In the temperature and altitude example, r = -0.76, so an increase in altitude of 1,839 feet (one SD) is associated with a change of -0.76 × 6.5 = -4.94 degrees in the average temperature, that is, a decrease of about 4.9 degrees.
7 Regression How do we use the method to predict an individual value? If we consider two variables x and y and we want to predict the value of y for a specific value of x, we use the average value of y that corresponds to the value of x according to the regression method. Example: The first-year GPAs and the Math SAT scores for the students of a university produce the following data: average SAT score = 550, SD = 80; average 1st-year GPA = 2.6, SD = 0.6; r = 0.40. We want to predict the 1st-year GPA of a student with a SAT score of 650.
8 Regression The student's SAT score in standard units is (650 - 550)/80 = 1.25, so the score is 1.25 SDs above average. An increase of one SD above the average SAT score is associated with an increase of 0.4 × 0.6 = 0.24 GPA points. This implies that our student is predicted to be 1.25 × 0.24 = 0.3 GPA points above average. Since the average GPA is 2.6, the predicted GPA is 2.6 + 0.3 = 2.9. This is the average GPA that we expect for students with SAT scores around 650.
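The arithmetic of this prediction can be sketched in a few lines of Python. The function name `regression_prediction` and its argument names are illustrative choices, not part of the course material; the numbers are the SAT/GPA summary statistics from the example.

```python
def regression_prediction(x, avg_x, sd_x, avg_y, sd_y, r):
    """Predict y for a given x with the regression method."""
    z_x = (x - avg_x) / sd_x      # x in standard units
    z_y = r * z_x                 # predicted y in standard units
    return avg_y + z_y * sd_y     # back to the original units of y


# SAT/GPA example: SAT 650, averages 550 and 2.6, SDs 80 and 0.6, r = 0.40
predicted_gpa = regression_prediction(650, 550, 80, 2.6, 0.6, 0.40)
print(round(predicted_gpa, 2))
```

A student with exactly the average SAT score would get the average GPA as prediction, since the standard-units value of x is then zero.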
9 Regression WARNING: You can use the regression method on new subjects provided that they are similar to the ones that were used to produce the averages, SDs and r used in the regression method. In the previous example the method will not be valid for students of a different institution.
10 Regression We can use the regression method and the normal curve to produce estimates of the percentile ranks. Example: In the previous example suppose a student has a percentile rank of 90% for the SAT scores. That is, only 10% of the scores are higher than his. What is the predicted percentile rank for the 1st-year GPA of this student? Using the normal curve, a 90% probability corresponds to a z score of about 1.3. This means that the student's SAT score is 1.3 SDs above average. This corresponds to being r × 1.3 = 0.40 × 1.3 ≈ 0.5 SDs above the average GPA, and this corresponds to an accumulated probability, under the normal curve, of approximately 69%.
11 Regression So the percentile rank on 1st-year GPA of a student with a percentile rank on SAT score of 90% is predicted to be 69%. Notice that the student with a SAT percentile rank of 90% was 'pulled down' to only 69% by the regression method. Why is that? Suppose the correlation were perfect, r = 1; then 90% would convert to 90%. The other extreme is that there is no correlation, so, in the absence of any information, the best guess is the median, or 50th percentile. The regression method produces a rank that is somewhere between these two extremes.
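The percentile calculation can be sketched with Python's standard `statistics.NormalDist`. The function name `predicted_percentile` is an illustrative choice, and the sketch assumes both variables follow the normal curve. Because it uses the unrounded z score (about 1.28 rather than 1.3), the result comes out close to 70% rather than the rounded 69% quoted in the example.

```python
from statistics import NormalDist


def predicted_percentile(percentile_x, r):
    """Predicted percentile rank of y from the percentile rank of x.

    Assumes both variables follow the normal curve.
    """
    std_normal = NormalDist()                  # mean 0, SD 1
    z_x = std_normal.inv_cdf(percentile_x)     # x in standard units
    z_y = r * z_x                              # regression estimate for y
    return std_normal.cdf(z_y)                 # area to the left of z_y


print(round(predicted_percentile(0.90, 0.40), 2))   # SAT/GPA example
print(round(predicted_percentile(0.90, 1.0), 2))    # perfect correlation
print(round(predicted_percentile(0.90, 0.0), 2))    # no correlation
```

The last two calls reproduce the two extremes described in the text: with r = 1 the rank is unchanged, and with r = 0 the prediction falls back to the median.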
12 Example The shoe size and the heights of 14 men are recorded. The shoe size average is with a SD of 1.21. The average height is inches with a SD of 2.45 inches. The correlation is 0.93. What is the average height of men who wear shoes of size 11.5? We convert 11.5 to standard units = so the shoe size is units above average. This means that the height will be = 1.95 inches above average. So the average height of a man with shoe size 11.5 will be = inches.
13 Regression effect Galton, a British statistician, studied the relationship between the height of the fathers and the sons in 1,078 families. He noticed that tall fathers tended to have sons who were, on average, shorter than themselves, and short fathers tended to have sons who were, on average, taller than themselves. He termed this fact regression to mediocrity. This is where the term regression comes from. Example: Children are tested for IQ before and after taking a preschool program. In both cases the scores average 100 and the SD is 15. So, on average, there seems to be no effect. Nevertheless, children below average in the first test had an average gain of 5 IQ points and those above average had an average loss of 5 IQ points. This is the regression effect.
14 Regression effect A model for the test-retest situation is: observed test score = true score + chance error. Suppose that the chance error can be either positive or negative. Suppose that the true scores in the population follow the normal curve with an average of 100 and a SD of 15. Consider the children who scored 140 on the first test. There are two possibilities: (a) true score below 140, with a positive chance error; (b) true score above 140, with a negative chance error. Which one is more likely? According to the normal curve, the first possibility is more likely, since the mean is 100 and so the interval above 140 has less probability than the one below 140. Under this scenario, the second test is more likely to produce a value below 140.
15 Regression effect A symmetric situation is valid for those scoring, say, 80 IQ. It is likely that the true score is above 80 with a negative chance error, and so the second score is likely to be above the first. In other words, if a student scores above average in the first test, it is likely that the true score is lower than the observed one. If the student takes the test again, chances are that the second score will be lower than the first. A symmetric situation is true for a person scoring below average in the first test. This explains the regression effect.
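The test-retest model can be explored with a small simulation. This is only a sketch: the chance-error SD of 5 points, the cutoffs of 130 and 70, the sample size and the random seed are all assumptions made for illustration, not values from the class. No treatment effect is built in, so any average change between the two tests is purely the regression effect.

```python
import random
from statistics import mean

rng = random.Random(0)          # fixed seed so the run is reproducible
ERROR_SD = 5                    # assumed size of the chance error

first, second = [], []
for _ in range(100_000):
    true_score = rng.gauss(100, 15)                      # true score
    first.append(true_score + rng.gauss(0, ERROR_SD))    # first test
    second.append(true_score + rng.gauss(0, ERROR_SD))   # retest


def average_change(keep):
    """Average (second - first) among children whose first score passes `keep`."""
    changes = [s - f for f, s in zip(first, second) if keep(f)]
    return mean(changes)


high_change = average_change(lambda score: score >= 130)
low_change = average_change(lambda score: score <= 70)
print(f"scored 130 or more at first: average change {high_change:+.1f}")
print(f"scored 70 or less at first:  average change {low_change:+.1f}")
```

Children selected for a high first score drop on average at the retest, and children selected for a low first score gain, even though nothing happened between the two tests.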
16 Regression errors The regression method can be used to predict y from x, but actual values differ from predictions. These differences are the regression errors: error = actual value of y - predicted value of y. Some of the errors defined in this way are positive and some are negative, reflecting the fact that some observations are above the regression line and some are below it. How do we measure the error in a regression? The overall size of the errors is measured using the root-mean-square (RMS), as we did to obtain the SD. This is equal to $\sqrt{(e_1^2 + e_2^2 + \cdots + e_N^2)/N}$, where $e_1, \ldots, e_N$ are the errors and N is the number of points in the scatter diagram.
17 Regression errors What if we ignore the values of x? Then our prediction for y is the average of y. In this case the RMS error coincides with the SD of y.
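A minimal sketch of the RMS calculation, and of the claim that ignoring x gives an RMS error equal to the SD of y. The data values are hypothetical, made up only for illustration, and `rms` is an illustrative helper name. The SD is computed in its population form (dividing by N), which matches the RMS definition.

```python
from math import sqrt
from statistics import mean, pstdev


def rms(values):
    """Root-mean-square: square, average, then take the square root."""
    return sqrt(mean([v * v for v in values]))


# Hypothetical y values, made up for illustration only.
y = [61.0, 64.5, 66.0, 70.0, 72.5, 75.0]

# Ignoring x, every prediction is the average of y.
errors = [value - mean(y) for value in y]

print(rms(errors))    # RMS error of the "predict the average" rule
print(pstdev(y))      # SD of y (population form, dividing by N)
```

The two printed numbers agree up to floating-point rounding.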
18 Computing the RMS error We saw that the error that corresponds to a prediction where the values of x are ignored corresponds to the SD of y. The overall size of the error for a regression using x cannot be larger than the SD of y. How much smaller is it? RMS error = $\sqrt{1 - r^2}$ × SD of y. We observe the following features: the units of the RMS error are the same as the units of the variable being predicted; perfect correlation (r = 1 or r = -1) corresponds to zero RMS error; zero correlation corresponds to the maximum RMS error (equal to the SD of y).
19 Computing the RMS error Example 1: In the California temperature example the SD of y is 6.5 degrees and the correlation is -0.76, so the RMS error is $\sqrt{1 - (-0.76)^2}$ × 6.5 degrees ≈ 4.22 degrees. So, in this case, knowing the altitude reduces the typical prediction error from 6.5 to 4.22 degrees. Example 2: In the shoe size example the SD of y is 2.45 inches and the correlation is 0.93, so the RMS error is $\sqrt{1 - 0.93^2}$ × 2.45 inches ≈ 0.90 inches. So we observe that knowing the shoe size produces a dramatic reduction of the typical prediction error, from 2.45 to 0.90 inches.
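Both examples can be checked with a one-line function. The name `regression_rms_error` is an illustrative choice; the inputs are the SDs and correlations quoted in the two examples.

```python
from math import sqrt


def regression_rms_error(r, sd_y):
    """RMS error of the regression line: sqrt(1 - r^2) times the SD of y."""
    return sqrt(1 - r * r) * sd_y


print(round(regression_rms_error(-0.76, 6.5), 2))   # temperature example
print(round(regression_rms_error(0.93, 2.45), 2))   # shoe size example
```

Only the size of r matters here, not its sign, because r enters the formula squared.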
20 Plotting the residuals Prediction errors are usually called residuals. It is important to explore the graphical properties of residuals to find out about the goodness of the fit by the regression line. In a residual plot the x coordinates are the same as for the original data. The y coordinates correspond to the values of the residuals. So there is one point for each point in the original scatter diagram.
21 Plotting the residuals Thus, if everything is OK with the regression line, we expect to see a cloud of points around the zero line in the y axis. We expect to see no trends or clusters in the residuals. There should be about the same number of positive as negative residuals. A histogram of the residuals should look symmetric around zero.
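As a rough sketch of how the residuals behind such a plot are obtained, the following computes the regression line from the averages, SDs and r, and then lists one residual per data point. The (x, y) values are hypothetical, made up only for illustration; a plotting library could then draw the residuals against x.

```python
from statistics import mean, pstdev

# Hypothetical (x, y) data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8]

avg_x, avg_y = mean(x), mean(y)
sd_x, sd_y = pstdev(x), pstdev(y)

# Correlation: the average product of x and y in standard units.
r = mean([((a - avg_x) / sd_x) * ((b - avg_y) / sd_y) for a, b in zip(x, y)])

slope = r * sd_y / sd_x
intercept = avg_y - slope * avg_x

# Residual = actual value of y - predicted value of y, one per data point.
residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]

for a, e in zip(x, residuals):
    print(f"x = {a:4.1f}   residual = {e:+.3f}")

print("average residual:", round(mean(residuals), 6))
print("positive:", sum(e > 0 for e in residuals),
      "negative:", sum(e < 0 for e in residuals))
```

For the regression line the residuals always average out to zero, so the informative part of the check is whether they show a trend or cluster when plotted against x.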
22 Problem The following results are taken from a study of about 1,000 families: average height of husband 68 inches, SD 2.7 inches; average height of wife 63 inches, SD 2.5 inches; r = 0.25. Predict the height of a wife when the height of her husband is 72 inches. The husband is 4 inches above average height. This is 4/2.7 ≈ 1.5 SDs above the average. So the wife is predicted to be r × 1.5 = 0.25 × 1.5 ≈ 0.38 SDs above average; this corresponds to 0.38 × 2.5 ≈ 1 inch, for a predicted height of 63 + 1 = 64 inches. Predict the height of a wife when the height of her husband is 68 inches. This husband is right on the average, so the wife is predicted to be right on the average as well, at 63 inches.
23 Prediction for data in a vertical strip Example: A law school finds the following relationship between LSAT scores and first-year scores: average LSAT score = 162, SD = 6; average first-year score = 68, SD = 10; r = 0.60. Q: About what percentage of the students had first-year scores over 75? A: We use the normal curve approximation. Converting to standard units, (75 - 68)/10 = 0.7; this corresponds to a right-hand tail of about 24% under the normal curve.
24 Prediction for data in a vertical strip Q: Of the students who scored 165 on the LSAT, about what percentage had first-year scores over 75? A: We first convert to standard units for the x variable: (165 - 162)/6 = 0.5. Then we obtain the prediction in standard units for the y variable: r × 0.5 = 0.60 × 0.5 = 0.3, which corresponds to 0.3 × 10 = 3 points above average, or 68 + 3 = 71. Since the data corresponding to a strip are a smaller and more homogeneous sample, the corresponding SD will be smaller. How much smaller?
26 Prediction for data in a vertical strip We expect the dispersion in the y variable to be about the same for each vertical strip. This is given by the RMS error, thus the new SD is $\sqrt{1 - r^2}$ × SD of y = $\sqrt{1 - 0.60^2}$ × 10 = 8 points. This new SD can be used to convert to standard units, (75 - 71)/8 = 0.5, and, using the normal curve, we obtain an area of 31% above 0.5. This is the percentage of students scoring more than 75 in the first year among those who scored 165 in the LSAT. Notice that this percentage is higher than the 24% we obtained before. This is because we have focused on students with above-average LSAT scores: their predicted first-year average, 71, is closer to 75 than the overall average of 68, which outweighs the smaller SD within the strip.
27 Prediction for data in a vertical strip In summary, when considering data for a vertical strip: (1) convert to standard units in the x variable; (2) obtain the predicted value of the y variable; (3) calculate the SD for the y variable in the strip using the RMS error; (4) convert to standard units in the y variable and use the normal curve.
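The four steps can be sketched as a single Python function. The name `percent_above_in_strip` and its argument names are illustrative choices; the sketch assumes the values of y inside the strip follow the normal curve. With the law school numbers it gives about 31%.

```python
from math import sqrt
from statistics import NormalDist


def percent_above_in_strip(x, y_cutoff, avg_x, sd_x, avg_y, sd_y, r):
    """Estimated fraction of cases with y above y_cutoff among those with the given x."""
    z_x = (x - avg_x) / sd_x                    # step 1: x in standard units
    strip_avg = avg_y + r * z_x * sd_y          # step 2: predicted y in the strip
    strip_sd = sqrt(1 - r * r) * sd_y           # step 3: RMS error as the strip SD
    z_y = (y_cutoff - strip_avg) / strip_sd     # step 4: cutoff in standard units
    return 1 - NormalDist().cdf(z_y)            # right-hand tail of the normal curve


# Law school example: LSAT 165, first-year score over 75
share = percent_above_in_strip(165, 75, 162, 6, 68, 10, 0.60)
print(f"{share:.0%}")
```

Calling the function with x equal to the average LSAT score gives the strip for average students, whose predicted first-year score is the overall average of 68.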
28 Slope and intercept All lines can be determined by a slope and an intercept. The intercept is the height of the line when x = 0. The slope is the rate at which y increases, per unit increase in x. If the slope is negative then y decreases as x increases.
29 Slope and intercept How do you get the slope of a regression line? Example: A sample of 555 California men age in 1993 was surveyed to find out about education and income. The data are summarized by: average education 12.5 years, SD 4 years; average income $21,500, SD $16,000; r = 0.35. This means that, for every increase of one SD in education, there is an increase of r SDs in income, on average. Thus, 4 extra years of education are worth an extra 0.35 × $16,000 = $5,600 of income. So each extra year is worth $5,600 / 4 = $1,400; this is the slope of the regression line. In general, slope = r × SD of y / SD of x.
30 Slope and intercept The intercept of the regression line is given by the value of y when x = 0. In the education and income example, x = 0 is 12.5 years below average in education. Since each year is worth $1,400, a man with no education should have an income which is below average by 12.5 years × $1,400 per year = $17,500. Since the average income is $21,500, the predicted income of a man with no education is $21,500 - $17,500 = $4,000. This is the intercept of the regression line. (The slope, by contrast, corresponds to the change in y associated with a one-unit increase in x.)
31 Slope and intercept In general, the intercept is given by: intercept = average of y - slope × average of x. The equation for the regression line is called the regression equation and can be written as y = slope × x + intercept. So, for our example, we have that predicted income = $1,400 per year × education + $4,000.
32 Slope and intercept Q: What is the predicted income of a man with an education of 15 years? A: Using the regression equation we have y = $1,400 × 15 + $4,000 = $25,000. We can plug in any value of education and obtain the expected income for that level of education. Warning: It is usually a bad idea to use the regression line for extrapolations.
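The slope, intercept and prediction for the education and income example can be reproduced with a short sketch. The function name `regression_line` is an illustrative choice; the inputs are the summary statistics quoted in the example.

```python
def regression_line(avg_x, sd_x, avg_y, sd_y, r):
    """Return (slope, intercept) of the regression line for predicting y from x."""
    slope = r * sd_y / sd_x
    intercept = avg_y - slope * avg_x
    return slope, intercept


# Education/income example: x = years of education, y = income in dollars
slope, intercept = regression_line(12.5, 4, 21_500, 16_000, 0.35)
predicted_income = slope * 15 + intercept

print(round(slope), round(intercept), round(predicted_income))
```

The same function applies to any pair of variables for which the two averages, the two SDs and r are known.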
33 Example Back to our shoe size example. The shoe size and the heights of 14 men are recorded. The shoe size average is with a SD of 1.21. The average height is inches with a SD of 2.45 inches. The correlation is 0.93. The slope of the regression line is r × SD of height / SD of shoe size = 0.93 × 2.45 / 1.21 = 1.88 inches per shoe size unit. To obtain the intercept we consider a shoe size of zero. This is units below average and so will correspond to a height that is = inches below average. So it corresponds to a height of = inches. The regression line is height = 1.88 shoe size inches Q: What is the predicted height of a man with a shoe size of 9? A: Using the regression equation we have inches = inches
34 Least Squares Consider a cloud of points produced by obtaining the scatter diagram of observations corresponding to two variables x and y. There are many lines that we can draw through the cloud. Which is the straight line that fits the points best? The regression line is a possible solution to this problem: among all straight lines, it is the one that makes the RMS error in predicting y from x as small as possible. This is the reason why the regression line is called the least squares line.
35 Least Squares Example: Let b be the length of a spring with no load. If a load x is attached to the spring, the stretch is proportional to x. Thus the length of the spring is y = mx + b, where m and b are constants that depend on the spring. An experiment is run to determine the constants for a given spring; the data are shown in the table. The correlation coefficient is r = 0.999, so the points are very close to a straight line. But they are not exactly on a straight line. This is probably due to measurement error. The regression line for these data produces estimates of b and m, given, respectively, by the intercept and the slope of the line. The values are m 0.5c per kg, and b cm. These are the least squares estimates of m and b.
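The least squares property can be illustrated numerically. In the sketch below the spring measurements are hypothetical, made up only for illustration (they are not the table from the class), and `rms_error` is an illustrative helper. The regression line is computed from the averages, SDs and r, and its RMS error is then compared with that of a few slightly different lines.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical spring data, made up for illustration: load (kg), length (cm).
load = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
length = [50.02, 51.05, 51.96, 53.03, 53.98, 55.01]


def rms_error(m, b):
    """RMS of the errors made by the line y = m * x + b on the data."""
    return sqrt(mean([(y - (m * x + b)) ** 2 for x, y in zip(load, length)]))


avg_x, avg_y = mean(load), mean(length)
sd_x, sd_y = pstdev(load), pstdev(length)
r = mean([((x - avg_x) / sd_x) * ((y - avg_y) / sd_y)
          for x, y in zip(load, length)])

m = r * sd_y / sd_x          # slope of the regression line
b = avg_y - m * avg_x        # intercept of the regression line

print(f"m = {m:.4f} cm per kg, b = {b:.3f} cm, RMS error = {rms_error(m, b):.4f}")

# Any other line does worse: nudge the slope or the intercept and compare.
for dm, db in [(0.01, 0.0), (-0.01, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    print(f"slope {m + dm:.4f}, intercept {b + db:.3f}: "
          f"RMS error = {rms_error(m + dm, b + db):.4f}")
```

Every nudged line has a larger RMS error than the regression line, which is what "least squares" refers to.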
36 Problem Find the regression equation for predicting final score from midterm score, based on the following information: average midterm score = 70, SD = 10; average final score = 55, SD = 20; r = 0.60. The slope of the line can be obtained as r × SD of final / SD of midterm = 0.60 × 20 / 10 = 1.2. A score of 0 in the midterm is 70 points below average, and so will correspond to a final score that is 70 × 1.2 = 84 units below average. So the intercept is 55 - 84 = -29 units of the final score. Thus, the regression equation is final score = 1.2 × midterm score - 29.
. 58 58 60 62 64 66 68 70 72 74 76 78 Father s height (inches)
PEARSON S FATHER-SON DATA The following scatter diagram shows the heights of 1,0 fathers and their full-grown sons, in England, circa 1900 There is one dot for each father-son pair Heights of fathers and
Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade
Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements
Descriptive statistics; Correlation and regression
Descriptive statistics; and regression Patrick Breheny September 16 Patrick Breheny STA 580: Biostatistics I 1/59 Tables and figures Descriptive statistics Histograms Numerical summaries Percentiles Human
Exercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation
Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Display and Summarize Correlation for Direction and Strength Properties of Correlation Regression Line Cengage
Chapter 11: r.m.s. error for regression
Chapter 11: r.m.s. error for regression Context................................................................... 2 Prediction error 3 r.m.s. error for the regression line...............................................
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
c. Construct a boxplot for the data. Write a one sentence interpretation of your graph.
MBA/MIB 5315 Sample Test Problems Page 1 of 1 1. An English survey of 3000 medical records showed that smokers are more inclined to get depressed than non-smokers. Does this imply that smoking causes depression?
The Correlation Coefficient
The Correlation Coefficient Lelys Bravo de Guenni April 22nd, 2015 Outline The Correlation coefficient Positive Correlation Negative Correlation Properties of the Correlation Coefficient Non-linear association
The correlation coefficient
The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative
Correlation key concepts:
CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)
Session 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.
DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,
Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.
Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of
Simple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
Applied Data Analysis. Fall 2015
Applied Data Analysis Fall 2015 Course information: Labs Anna Walsdorff [email protected] Tues. 9-11 AM Mary Clare Roche [email protected] Mon. 2-4 PM Lecture outline 1. Practice
MEASURES OF VARIATION
NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are
AP Statistics Solutions to Packet 2
AP Statistics Solutions to Packet 2 The Normal Distributions Density Curves and the Normal Distribution Standard Normal Calculations HW #9 1, 2, 4, 6-8 2.1 DENSITY CURVES (a) Sketch a density curve that
CURVE FITTING LEAST SQUARES APPROXIMATION
CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship
Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots
Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship
Unit 7: Normal Curves
Unit 7: Normal Curves Summary of Video Histograms of completely unrelated data often exhibit similar shapes. To focus on the overall shape of a distribution and to avoid being distracted by the irregularities
Name: Date: Use the following to answer questions 2-3:
Name: Date: 1. A study is conducted on students taking a statistics class. Several variables are recorded in the survey. Identify each variable as categorical or quantitative. A) Type of car the student
Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
DATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
Lecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables
Lecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables Scatterplot; Roles of Variables 3 Features of Relationship Correlation Regression Definition Scatterplot displays relationship
Diagrams and Graphs of Statistical Data
Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in
Homework 8 Solutions
Math 17, Section 2 Spring 2011 Homework 8 Solutions Assignment Chapter 7: 7.36, 7.40 Chapter 8: 8.14, 8.16, 8.28, 8.36 (a-d), 8.38, 8.62 Chapter 9: 9.4, 9.14 Chapter 7 7.36] a) A scatterplot is given below.
Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:
Density Curve A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: 1. The total area under the curve must equal 1. 2. Every point on the curve
2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.
Math 1530-017 Exam 1 February 19, 2009 Name Student Number E There are five possible responses to each of the following multiple choice questions. There is only on BEST answer. Be sure to read all possible
Chapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
Module 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
Statistics E100 Fall 2013 Practice Midterm I - A Solutions
STATISTICS E100 FALL 2013 PRACTICE MIDTERM I - A SOLUTIONS PAGE 1 OF 5 Statistics E100 Fall 2013 Practice Midterm I - A Solutions 1. (16 points total) Below is the histogram for the number of medals won
CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there
CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there is a relationship between variables, To find out the
Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares
Linear Regression Chapter 5 Regression Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). We can then predict the average response for all subjects
CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression
Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the
Statistics 151 Practice Midterm 1 Mike Kowalski
Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Multiple Choice (50 minutes) Instructions: 1. This is a closed book exam. 2. You may use the STAT 151 formula sheets and
Descriptive Statistics and Measurement Scales
Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers
Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables
Worksheet A5: Slope Intercept Form
Name Date Worksheet A5: Slope Intercept Form Find the Slope of each line below 1 3 Y - - - - - - - - - - Graph the lines containing the point below, then find their slopes from counting on the graph!.
2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
Chapter 13 Introduction to Linear Regression and Correlation Analysis
Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing
Section 14 Simple Linear Regression: Introduction to Least Squares Regression
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship
Logo Symmetry Learning Task. Unit 5
Logo Symmetry Learning Task Unit 5 Course Mathematics I: Algebra, Geometry, Statistics Overview The Logo Symmetry Learning Task explores graph symmetry and odd and even functions. Students are asked to
Describing Relationships between Two Variables
Describing Relationships between Two Variables Up until now, we have dealt, for the most part, with just one variable at a time. This variable, when measured on many different subjects or objects, took
II. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
The Normal Distribution
Chapter 6 The Normal Distribution 6.1 The Normal Distribution 1 6.1.1 Student Learning Objectives By the end of this chapter, the student should be able to: Recognize the normal probability distribution
Scatter Plots with Error Bars
Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each
Lesson 4 Measures of Central Tendency
Outline Measures of a distribution s shape -modality and skewness -the normal distribution Measures of central tendency -mean, median, and mode Skewness and Central Tendency Lesson 4 Measures of Central
Section 3 Part 1. Relationships between two numerical variables
Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.
X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
with functions, expressions and equations which follow in units 3 and 4.
Grade 8 Overview View unit yearlong overview here The unit design was created in line with the areas of focus for grade 8 Mathematics as identified by the Common Core State Standards and the PARCC Model
Chapter 4: Average and standard deviation
Chapter 4: Average and standard deviation Context................................................................... 2 Average vs. median 3 Average.................................................................
Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.
SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. [email protected] Course Objective This course is designed
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 [email protected] 1. Descriptive Statistics Statistics
Statistics. Measurement. Scales of Measurement 7/18/2012
Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does
6 3 The Standard Normal Distribution
290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since
Regression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
WEB APPENDIX. Calculating Beta Coefficients. b = Beta = Rise/Run = ΔY/ΔX = (7.1 + 8.9)/(10.0 − 0.0) = 16.0/10.0 = 1.6
WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent before-thefact, expected values. In particular, the beta coefficient used in
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
Univariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
Pennsylvania System of School Assessment
Pennsylvania System of School Assessment The Assessment Anchors, as defined by the Eligible Content, are organized into cohesive blueprints, each structured with a common labeling system that can be read
Section 1.3 Exercises (Solutions)
Section 1.3 Exercises (Solutions) 1.109, 1.110, 1.111, 1.114*, 1.115, 1.119*, 1.122, 1.125, 1.127*, 1.128*, 1.131*, 1.133*, 1.135*, 1.137*, 1.139*, 1.145*, 1.146-148. 1.109 Sketch some normal curves. (a) Sketch
Elementary Statistics
Elementary Statistics, Chapter 1. M. Ghamsary, Ph.D. Statistics: Statistics is the science of collecting,
Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables
Lesson 4 Part 1: Relationships between two numerical variables. Correlation Coefficient: The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables
TOPIC 12 Techniques and tools for data analysis. Preview. Introduction: In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these
A Determination of g, the Acceleration Due to Gravity, from Newton's Laws of Motion
A Determination of g, the Acceleration Due to Gravity, from Newton's Laws of Motion Objective In the experiment you will determine the cart acceleration, a, and the friction force, f, experimentally for
Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but
Test Bias As we have seen, psychological tests can be well-conceived and well-constructed, but none are perfect. The reliability of test scores can be compromised by random measurement error (unsystematic
Descriptive Statistics
Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web
1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability).
Examples of Questions on Regression Analysis: 1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability). Then,. When
Solving Quadratic Equations
9.3 Solving Quadratic Equations by Using the Quadratic Formula 9.3 OBJECTIVES 1. Solve a quadratic equation by using the quadratic formula 2. Determine the nature of the solutions of a quadratic equation
What Does the Normal Distribution Sound Like?
What Does the Normal Distribution Sound Like? Ananda Jayawardhana Pittsburg State University [email protected] Published: June 2013 Overview of Lesson In this activity, students conduct an investigation
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
Descriptive Statistics
Descriptive Statistics Descriptive statistics consist of methods for organizing and summarizing data. They include the construction of graphs, charts and tables, as well as various descriptive measures such
" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs
Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical) variables: take numerical values for which arithmetic operations make sense (addition/averaging)
Mathematics (Project Maths Phase 1)
2012. M128 S Coimisiún na Scrúduithe Stáit State Examinations Commission Leaving Certificate Examination, 2012 Sample Paper Mathematics (Project Maths Phase 1) Paper 2 Ordinary Level Time: 2 hours, 30
First Midterm Exam (MATH1070 Spring 2012)
First Midterm Exam (MATH1070 Spring 2012) Instructions: This is a one hour exam. You can use a notecard. Calculators are allowed, but other electronics are prohibited. 1. [40pts] Multiple Choice Problems
Scatter Plot, Correlation, and Regression on the TI-83/84
Scatter Plot, Correlation, and Regression on the TI-83/84 Summary: When you have a set of (x,y) data points and want to find the best equation to describe them, you are performing a regression. This page
ACTIVITY 6: Falling Objects
UNIT FM Developing Ideas ACTIVITY 6: Falling Objects Purpose and Key Question You developed your ideas about how the motion of an object is related to the forces acting on it using objects that move horizontally.
Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1
Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce
DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability
DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability RIT Score Range: Below 171 Below 171 Data Analysis and Statistics Solves simple problems based on data from tables* Compares
Chapter 9 Descriptive Statistics for Bivariate Data
Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to explore the distribution of the values of a single variable)
LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE
LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre-
Linear Models in STATA and ANOVA
Session 4: Linear Models in STATA and ANOVA. Contents: Strengths of Linear Relationships; A Note on Non-Linear Relationships; Multiple Linear Regression; Removal of Variables; Independent Samples
Elasticity. I. What is Elasticity?
Elasticity I. What is Elasticity? The purpose of this section is to develop some general rules about elasticity, which may then be applied to the four different specific types of elasticity discussed in
The importance of graphing the data: Anscombe's regression examples
The importance of graphing the data: Anscombe's regression examples. Bruce Weaver, Northern Health Research Conference, Nipissing University, North Bay, May 30-31, 2008. The Objective
Formula for linear models. Prediction, extrapolation, significance test against zero slope.
Formula for linear models. Prediction, extrapolation, significance test against zero slope. Last time, we looked at the linear regression formula. It's the line that fits the data best. The Pearson correlation
Chapter 27: Taxation. 27.1: Introduction. 27.2: The Two Prices with a Tax. 27.2: The Pre-Tax Position
Chapter 27: Taxation 27.1: Introduction We consider the effect of taxation on some good on the market for that good. We ask the questions: who pays the tax? what effect does it have on the equilibrium
This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.
Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course
Premaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
Introduction to Linear Regression
14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression
Association Between Variables
Contents: 11 Association Between Variables; 11.1 Introduction; 11.1.1 Measure of Association; 11.1.2 Chapter Summary; 11.2 Chi
Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1.
Lecture 6: Chapter 6: Normal Probability Distributions A normal distribution is a continuous probability distribution for a random variable x. The graph of a normal distribution is called the normal curve.
The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces
The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces Or: How I Learned to Stop Worrying and Love the Ball Comment [DP1]: Titles, headings, and figure/table captions
Describing, Exploring, and Comparing Data
24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter
COMMON CORE STATE STANDARDS FOR
COMMON CORE STATE STANDARDS FOR Mathematics (CCSSM) High School Statistics and Probability Mathematics High School Statistics and Probability Decisions or predictions are often based on data numbers in
Slope-Intercept Equation. Example
1.4 Equations of Lines and Modeling Find the slope and the y-intercept of a line given the equation y = mx + b, or f(x) = mx + b. Graph a linear equation using the slope and the y-intercept. Determine
PLOTTING DATA AND INTERPRETING GRAPHS
PLOTTING DATA AND INTERPRETING GRAPHS Fundamentals of Graphing One of the most important sets of skills in science and mathematics is the ability to construct graphs and to interpret the information they
