Regression. In this class we will:


 Gyles Merritt
Transcription
1 AMS 5 REGRESSION
2 Regression The idea behind the calculation of the coefficient of correlation is that the scatter plot of the data corresponds to a cloud that follows a straight line. This idea can be formalized by regression methods. In this class we will:
- Consider the definition of simple linear regression
- Find a method to predict an individual value
- Use the normal curve to estimate percentile ranks
- Describe the regression effect
- Compute the regression errors and their RMS
- Study the behavior of regression errors
3 Regression The regression method describes how one variable depends on another. The Northern California temperature data have an average altitude of 3,524 feet with an SD of 1,839 feet, and an average temperature of 70.3 degrees with an SD of 6.5 degrees. The correlation between temperature and altitude is r = -0.76.
5 Regression The cloud of points shows a mild negative association between the two variables, as does the value of r. Can we use the values of altitude to estimate the average values of temperature?
6 Regression How does the regression line work? Associated with an increase of one SD in x there is an increase of r SDs in y, on average. Clearly, if the correlation coefficient is negative, then the average value of y decreases as x increases. In the temperature and altitude example, an increase in altitude of 1,839 feet (one SD) is associated with a change of r × 6.5 = -0.76 × 6.5 ≈ -4.9 degrees in the average temperature, that is, a decrease of about 4.9 degrees.
7 Regression How do we use the method to predict an individual value? If we consider two variables x and y and we want to predict the value of y for a specific value of x, we use the average value of y that corresponds to that value of x according to the regression method. Example: The first-year GPAs and the Math SAT scores for the students of a university produce the following data:
- average SAT score = 550, SD = 80
- average first-year GPA = 2.6, SD = 0.6
- r = 0.40
We want to predict the first-year GPA of a student with an SAT score of 650.
8 Regression The student's SAT score in standard units is (650 - 550) / 80 = 1.25, so the score is 1.25 SDs above average. An increase of one SD in SAT score is associated with an average increase of 0.4 × 0.6 = 0.24 GPA points. This implies that our student is predicted to be 1.25 × 0.24 = 0.3 GPA points above average. Since the average GPA is 2.6, the predicted GPA is 2.6 + 0.3 = 2.9. This is the average GPA that we expect for students with SAT scores around 650.
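The three steps of the prediction (convert x to standard units, multiply by r, convert back to the units of y) can be sketched in Python. This is only an illustration of the arithmetic in the example; the function name `regression_predict` and its parameter names are assumptions made for this sketch.

```python
def regression_predict(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict the average y for a given x using the regression method."""
    z_x = (x - mean_x) / sd_x   # x in standard units
    z_y = r * z_x               # predicted y in standard units
    return mean_y + z_y * sd_y  # back to the original units of y


# SAT / first-year GPA example from the slides
predicted_gpa = regression_predict(
    650, mean_x=550, sd_x=80, mean_y=2.6, sd_y=0.6, r=0.40
)
print(round(predicted_gpa, 2))
```

With r = 0 the function returns the average of y for every x, and with r = 1 it moves y by exactly as many SDs as x.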
9 Regression WARNING: You can use the regression method on new subjects provided that they are similar to the ones that were used to produce the averages, SDs and r used in the regression method. In the previous example the method will not be valid for students of a different institution.
10 Regression We can use the regression method and the normal curve to produce estimates of the percentile ranks. Example: In the previous example suppose a student has a percentile rank of 90% for the SAT scores. That is, only 10% of the scores are higher than his. What is the predicted percentile rank for the first-year GPA of this student? Using the normal curve, a percentile rank of 90% corresponds to a z score of about 1.3. This means that the student's SAT score is 1.3 SDs above average. This corresponds to being 0.4 × 1.3 = 0.52 SDs above the average GPA, which corresponds to an accumulated probability, under the normal curve, of approximately 69%.
11 Regression So the percentile rank on first-year GPA of a student with a percentile rank on SAT score of 90% is predicted to be 69%. Notice that the student with an SAT percentile rank of 90% was "pulled down" to only 69% by the regression method. Why is that? Suppose the correlation was perfect, r = 1; then 90% would convert to 90%. The other extreme is that there is no correlation, so, in the absence of any information, the best guess is the median, or 50th percentile. The regression method produces a rank that is somewhere between these two extremes.
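The percentile calculation can be checked with Python's standard library. The sketch below assumes both variables follow the normal curve; the helper name `predicted_percentile` is an assumption for this illustration. It uses the exact z score for 90% (about 1.28) rather than the rounded 1.3, so the result is roughly 70% instead of exactly 69%.

```python
from statistics import NormalDist


def predicted_percentile(percentile_x, r):
    """Predicted percentile rank of y, given the percentile rank of x."""
    std_normal = NormalDist()                      # mean 0, SD 1
    z_x = std_normal.inv_cdf(percentile_x / 100)   # x in standard units
    z_y = r * z_x                                  # regression method
    return 100 * std_normal.cdf(z_y)               # back to a percentile


rank = predicted_percentile(90, r=0.40)
print(round(rank, 1))

# The two extremes discussed in the text
print(round(predicted_percentile(90, r=1.0), 1))   # perfect correlation
print(round(predicted_percentile(90, r=0.0), 1))   # no correlation
```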
12 Example The shoe sizes and heights of 14 men are recorded. The SD of shoe size is 1.21 and the SD of height is 2.45 inches; the correlation is r = 0.93. What is the average height of a man who wears size 11.5 shoes? We convert 11.5 to standard units, (11.5 - average shoe size) / 1.21, to find how many SDs the shoe size is above average. Multiplying by r × SD of height = 0.93 × 2.45 gives a height that is 1.95 inches above average. So the average height of a man with shoe size 11.5 is the average height + 1.95 inches.
13 Regression effect Galton, a British statistician, studied the relationship between the heights of fathers and sons in 1,078 families. He noticed that tall fathers tended to have sons who were shorter than themselves, and short fathers tended to have sons who were taller than themselves. He termed this fact regression to mediocrity. This is where the term regression comes from. Example: Children are tested for IQ before and after taking a preschool program. In both cases the scores average 100 and the SD is 15. So, on average, there seems to be no effect. Nevertheless, children below average on the first test had an average gain of 5 IQ points and those above average had an average loss of 5 IQ points. This is the regression effect.
14 Regression effect A model for the test-retest situation is
observed test score = true score + chance error
Suppose that the chance error can be either positive or negative. Suppose that the true scores in the population follow the normal curve with an average of 100 and an SD of 15. Consider the children who scored 140 on the first test. There are two possibilities:
- true score below 140, with a positive chance error
- true score above 140, with a negative chance error
Which one is more likely? According to the normal curve, the first possibility is more likely: since the mean is 100, true scores above 140 are rarer than true scores below 140. Under this scenario, the second test is more likely to produce a value below 140.
15 Regression effect A symmetric situation is valid for those scoring, say, 80 IQ. It is likely that the true score is above 80 with a negative chance error, and so the second score is likely to be above the first. In other words, if a student scores above average on the first test, it is likely that the true score is lower than the observed one. If the student takes the test again, chances are that the second score will be lower than the first. A symmetric situation is true for a person scoring below average on the first test. This explains the regression effect.
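The test-retest model can be simulated to see the regression effect appear without any real change in the children. The sketch below assumes normally distributed chance errors with an SD of 5 points; that value, the cut-offs of 130 and 70, and the sample size are hypothetical choices for illustration and do not come from the slides.

```python
import random

rng = random.Random(0)   # fixed seed so the run is reproducible
N = 100_000              # hypothetical number of simulated children
ERROR_SD = 5             # hypothetical SD of the chance error

first, second = [], []
for _ in range(N):
    true_score = rng.gauss(100, 15)                     # true score
    first.append(true_score + rng.gauss(0, ERROR_SD))   # first test
    second.append(true_score + rng.gauss(0, ERROR_SD))  # second test


def mean(values):
    return sum(values) / len(values)


def average_change(low, high):
    """Average (second - first) for children whose first score is in [low, high)."""
    changes = [s - f for f, s in zip(first, second) if low <= f < high]
    return mean(changes)


high_group_change = average_change(130, float("inf"))   # scored high at first
low_group_change = average_change(float("-inf"), 70)    # scored low at first

print("overall change:", round(mean(second) - mean(first), 2))
print("change for first score >= 130:", round(high_group_change, 2))
print("change for first score < 70:", round(low_group_change, 2))
```

The overall average barely moves, while the high group drops and the low group gains on the second test, even though the true scores never change.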
16 Regression errors The regression method can be used to predict y from x. But actual values differ from predictions. These differences are the regression errors:
error = actual value of y - predicted value of y
Some of the errors defined in this way are positive and some are negative, reflecting the fact that some observations are above and some are below the regression line. How do we measure the error in a regression? The overall size of the error is measured using the root-mean-square (RMS), as we did to obtain the SD. This is equal to $\sqrt{(e_1^2 + e_2^2 + \cdots + e_N^2)/N}$, where $e_1, \ldots, e_N$ are the errors and N is the number of points in the scatter diagram.
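As a small illustration of the definition, the RMS error can be computed directly from a list of actual and predicted values. The numbers below are made up for this sketch and the function name `rms_error` is an assumption.

```python
from math import sqrt


def rms_error(actual, predicted):
    """Root-mean-square of the errors (actual minus predicted)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical values: the errors are -1, 1, -2 and 1
actual = [3.0, 5.0, 4.0, 8.0]
predicted = [4.0, 4.0, 6.0, 7.0]

print(round(rms_error(actual, predicted), 3))
```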
17 Regression errors What if we ignore the values of x? Then our prediction for y is the average of y. In this case the RMS error coincides with the SD of y.
18 Computing the RMS error We saw that the error that corresponds to a prediction where the values of x are ignored is the SD of y. The overall size of the error for a regression using x cannot be larger than the SD of y, and it is smaller whenever r is not zero. How much smaller?
RMS error = $\sqrt{1 - r^2}$ × SD of y
We observe the following features:
- The units of the RMS error are the same as the units of the variable being predicted.
- Perfect correlation corresponds to zero RMS error.
- Zero correlation corresponds to maximum RMS error (equal to the SD of y).
19 Computing the RMS error Example 1: In the California temperature example the SD of y is 6.5 degrees and the correlation is -0.76, so the RMS error is $\sqrt{1 - 0.76^2}$ × 6.5 degrees ≈ 4.22 degrees. So, in this case, knowing the altitude reduces the typical size of the prediction error from 6.5 to 4.22 degrees. Example 2: In the shoe size example the SD of y is 2.45 inches and the correlation is 0.93, so the RMS error is $\sqrt{1 - 0.93^2}$ × 2.45 inches ≈ 0.90 inches. So we observe that knowing the shoe size produces a dramatic reduction of the typical prediction error, from 2.45 to 0.90 inches.
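The two examples can be reproduced with a one-line function. This is a minimal sketch of the formula RMS error = sqrt(1 - r^2) × SD of y; the function name `rms_error_from_r` is an assumption. Only r squared enters the formula, so the sign of the correlation does not matter.

```python
from math import sqrt


def rms_error_from_r(r, sd_y):
    """RMS error of the regression line from r and the SD of y."""
    return sqrt(1 - r ** 2) * sd_y


temperature_rms = rms_error_from_r(0.76, 6.5)   # temperature example
height_rms = rms_error_from_r(0.93, 2.45)       # shoe size example

print(round(temperature_rms, 2), round(height_rms, 2))
```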
20 Plotting the residuals Prediction errors are usually called residuals. It is important to explore the graphical properties of residuals to find out about the goodness of the fit by the regression line. In a residual plot the x coordinates are the same as for the original data. The y coordinates correspond to the values of the residuals. So there is one point for each point in the original scatter diagram.
21 Plotting the residuals Thus, if everything is OK with the regression line, we expect to see a cloud of points around the zero line on the y axis.
- We expect to see no trends or clusters in the residuals.
- There should be about the same number of positive as negative residuals.
- A histogram of the residuals should look symmetric around zero.
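Some of these checks can be done numerically before drawing any plot. The sketch below fits the regression line to a small made-up data set (the x and y values are hypothetical) and verifies that the residuals average zero, show no linear trend with x, and split evenly between positive and negative. The helper names are assumptions for this illustration.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def corr(x, y):
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sd(x) * sd(y))


# Hypothetical data, roughly following a straight line
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

slope = corr(x, y) * sd(y) / sd(x)          # r × SD of y / SD of x
intercept = mean(y) - slope * mean(x)

# One residual per point: actual y minus predicted y
residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]

positive = sum(1 for e in residuals if e > 0)
negative = sum(1 for e in residuals if e < 0)

print("mean residual:", round(mean(residuals), 6))
print("correlation of residuals with x:", round(corr(x, residuals), 6))
print("positive / negative:", positive, negative)
```

Residuals from the regression line always average zero and are uncorrelated with x, so these two checks confirm the fit was computed correctly; curved patterns or clusters still have to be judged from the plot itself.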
22 Problem The following results are taken from a study of about 1,000 families:
- average height of husband 68 inches, SD 2.7 inches
- average height of wife 63 inches, SD 2.5 inches
- r = 0.25
Predict the height of a wife when the height of her husband is 72 inches. The husband is 4 inches above average height. This is 4 / 2.7 ≈ 1.5 SDs above the average. So the wife is predicted to be r × 1.5 = 0.25 × 1.5 ≈ 0.375 SDs above average; this corresponds to 0.375 × 2.5 ≈ 1 inch, for a predicted height of 63 + 1 = 64 inches. When the husband is 68 inches tall, he is right on the average, so the wife is predicted to be right on the average as well, at 63 inches.
23 Prediction for data in a vertical strip Example: A law school finds the following relationship between LSAT scores and first-year scores:
- average LSAT score = 162, SD = 6
- average first-year score = 68, SD = 10
- r = 0.60
Q: About what percentage of the students had first-year scores over 75? A: We use the normal curve approximation. Converting to standard units, (75 - 68) / 10 = 0.7; this corresponds to a right-hand tail of about 24% under the normal curve.
24 Prediction for data in a vertical strip Q: Of the students who scored 165 on the LSAT, about what percentage had first-year scores over 75? A: We first convert to standard units for the x variable: (165 - 162) / 6 = 0.5. Then we convert to standard units for the y variable: r × 0.5 = 0.6 × 0.5 = 0.3, which corresponds to 0.3 × 10 = 3 points above average, or 68 + 3 = 71. Since the data corresponding to a strip are a smaller and more homogeneous sample, the corresponding SD will be smaller. How much smaller?
26 Prediction for data in a vertical strip We expect the dispersion in the y variable to be about the same for each vertical strip. This is given by the RMS error, thus the new SD is $\sqrt{1 - r^2}$ × SD of y = $\sqrt{1 - 0.6^2}$ × 10 = 8 points. This new SD can be used to convert to standard units, (75 - 71) / 8 = 0.5, and, using the normal curve, we obtain an area of 31% above 0.5. This is the percentage of students scoring more than 75 in the first year among those who scored 165 on the LSAT. Notice that this percentage is higher than the one obtained for all students. This is because the students in this strip have a higher predicted average (71 rather than 68), even though the SD within the strip is smaller.
27 Prediction for data in a vertical strip In summary, when considering data for a vertical strip:
- Convert to standard units in the x variable.
- Obtain the predicted value of the y variable.
- Calculate the SD for the y variable in the strip using the RMS error.
- Convert to standard units in the y variable and use the normal curve.
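The four steps in the summary can be sketched as a single Python function, applied here to the law school example. The sketch assumes the scores within a strip follow the normal curve; the function name `percent_above_in_strip` and its parameters are assumptions for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def percent_above_in_strip(x, threshold, mean_x, sd_x, mean_y, sd_y, r):
    """Percentage of y values above `threshold` among points with the given x."""
    z_x = (x - mean_x) / sd_x                  # step 1: x in standard units
    strip_mean = mean_y + r * z_x * sd_y       # step 2: predicted y
    strip_sd = sqrt(1 - r ** 2) * sd_y         # step 3: RMS error as the new SD
    z = (threshold - strip_mean) / strip_sd    # step 4: standard units in the strip
    return 100 * (1 - NormalDist().cdf(z))     # area to the right under the normal curve


share = percent_above_in_strip(
    x=165, threshold=75, mean_x=162, sd_x=6, mean_y=68, sd_y=10, r=0.60
)
print(round(share, 1))
```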
28 Slope and intercept All lines can be determined by a slope and an intercept. The intercept is the height of the line when x = 0. The slope is the rate at which y increases, per unit increase in x. If the slope is negative then y decreases as x increases.
29 Slope and intercept How do you get the slope of a regression line? Example: A sample of 555 California men in 1993 was surveyed to find out about education and income. The data are summarized by:
- average education 12.5 years; SD 4 years
- average income $21,500; SD $16,000
- r = 0.35
This means that, for every increase of one SD in education, there is an increase of r SDs in income. Thus, 4 extra years of education are worth an extra 0.35 × $16,000 = $5,600 of income. So each extra year is worth $5,600 / 4 = $1,400; this is the slope of the regression line.
30 Slope and intercept The intercept of the regression line is given by the value of y when x = 0. This is 12.5 years below average in education. Since each year is worth $1,400, a man with no education should have an income which is below average by 12.5 years × $1,400 per year = $17,500. Since the average income is $21,500, the predicted income of a man with no education is $21,500 - $17,500 = $4,000. This is the intercept of the regression line. The slope, in contrast, corresponds to the change in y associated with a one-unit increase in x.
31 Slope and intercept In general, the intercept is given by
intercept = average of y - slope × average of x
The equation for the regression line is called the regression equation and can be written as
y = slope × x + intercept
So, for our example, we have that
predicted income = $1,400 per year × education + $4,000
32 Slope and intercept Q: What is the predicted income of a man with an education of 15 years? A: Using the regression equation we have y = $1,400 × 15 + $4,000 = $25,000. We can plug in any value of education and obtain the expected income for that level of education. Warning: It is usually a bad idea to use the regression line for extrapolations.
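The slope and intercept can be computed from the five summary statistics alone. The following is a minimal sketch using the education and income example; the function name `regression_line` is an assumption for this illustration.

```python
def regression_line(mean_x, sd_x, mean_y, sd_y, r):
    """Return (slope, intercept) of the regression line for predicting y from x."""
    slope = r * sd_y / sd_x                # change in y per unit of x
    intercept = mean_y - slope * mean_x    # predicted y when x = 0
    return slope, intercept


# Education (years) and income (dollars) example
slope, intercept = regression_line(
    mean_x=12.5, sd_x=4, mean_y=21_500, sd_y=16_000, r=0.35
)
predicted_income = slope * 15 + intercept   # 15 years of education

print(round(slope), round(intercept), round(predicted_income))
```

Note that the line always passes through the point of averages: plugging in 12.5 years returns the average income of $21,500.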
33 Example Back to our shoe size example. The shoe sizes and heights of 14 men are recorded. The SD of shoe size is 1.21, the SD of height is 2.45 inches, and the correlation is r = 0.93. The slope of the regression line is r × SD of height / SD of shoe size = 0.93 × 2.45 / 1.21 = 1.88 inches per shoe size. To obtain the intercept we consider a shoe size of zero. This is below average by the average shoe size, and so corresponds to a height that is 1.88 × average shoe size inches below average. So the intercept is average height - 1.88 × average shoe size inches. The regression line is height = 1.88 × shoe size + intercept. Q: What is the predicted height of a man with a shoe size of 9? A: Using the regression equation, the predicted height is 1.88 × 9 + intercept inches.
34 Least Squares Consider a cloud of points produced by obtaining the scatter diagram of observations corresponding to two variables x and y. There are many lines that we can draw through the cloud. Which is the straight line that fits the points best? The regression line is a solution to this problem: among all lines, it makes the RMS error of the predictions, and hence the sum of the squared errors, as small as possible. This is the reason why the regression line is called the least squares line.
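The least squares property can be checked numerically on a small example. The sketch below uses made-up data (the x and y values are hypothetical), computes the regression line from r and the SDs, and compares its RMS error with that of several nearby lines obtained by nudging the slope and the intercept. The helper names are assumptions for this illustration.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def corr(x, y):
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sd(x) * sd(y))


def rms_error_of_line(x, y, slope, intercept):
    """RMS of (actual y - predicted y) for the line y = slope * x + intercept."""
    errors = [b - (slope * a + intercept) for a, b in zip(x, y)]
    return sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical data
x = [0, 2, 4, 6, 8]
y = [1.0, 2.2, 2.9, 4.1, 4.8]

# Regression line from the summary statistics
best_slope = corr(x, y) * sd(y) / sd(x)
best_intercept = mean(y) - best_slope * mean(x)
best_rms = rms_error_of_line(x, y, best_slope, best_intercept)

# RMS errors of nearby lines (every combination except the regression line itself)
other_rms = [
    rms_error_of_line(x, y, best_slope + ds, best_intercept + di)
    for ds in (-0.2, 0.0, 0.2)
    for di in (-0.5, 0.0, 0.5)
    if (ds, di) != (0.0, 0.0)
]

print("regression line RMS error:", round(best_rms, 4))
print("smallest RMS error among the other lines:", round(min(other_rms), 4))
```

Trying a handful of alternatives is not a proof, but every perturbed line has a larger RMS error than the regression line, as the least squares property requires.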
35 Least Squares Example: Let b be the length of a spring with no load. If a load x is attached to the spring, the stretch is proportional to x. Thus the length of the spring is y = mx + b, where m and b are constants that depend on the spring. An experiment is run to determine the constants for a given spring; the data are shown in the table. The correlation coefficient is r = 0.999, so the points are very close to a straight line. But they are not exactly on a straight line. This is probably due to measurement error. The regression line for these data produces estimates of b and m, given, respectively, by the intercept and the slope of the line. The values are m ≈ 0.5 cm per kg and b ≈ … cm. These are the least squares estimates of m and b.
36 Problem Find the regression equation for predicting final score from midterm score, based on the following information:
- average midterm score = 70, SD = 10
- average final score = 55, SD = 20
- r = 0.60
The slope of the line can be obtained as r × SD of final / SD of midterm = 0.60 × 20 / 10 = 1.2. A score of 0 on the midterm will correspond to a final score that is 70 × 1.2 = 84 units below average. So the intercept is 55 - 84 = -29 units of the final score. Thus, the regression equation is
final score = 1.2 × midterm score - 29
More informationUnit 11: Fitting Lines to Data
Unit 11: Fitting Lines to Data Summary of Video Scatterplots are a great way to visualize the relationship between two quantitative variables. For example, the scatterplot of temperatures and coral reef
More informationChapter 8 Graphs and Functions:
Chapter 8 Graphs and Functions: Cartesian axes, coordinates and points 8.1 Pictorially we plot points and graphs in a plane (flat space) using a set of Cartesian axes traditionally called the x and y axes
More informationLogo Symmetry Learning Task. Unit 5
Logo Symmetry Learning Task Unit 5 Course Mathematics I: Algebra, Geometry, Statistics Overview The Logo Symmetry Learning Task explores graph symmetry and odd and even functions. Students are asked to
More informatione = random error, assumed to be normally distributed with mean 0 and standard deviation σ
1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.
More informationThe Normal Distribution
Chapter 6 The Normal Distribution 6.1 The Normal Distribution 1 6.1.1 Student Learning Objectives By the end of this chapter, the student should be able to: Recognize the normal probability distribution
More informationThe aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree
PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and
More informationSimple Regression Theory I 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY I 1 Simple Regression Theory I 2010 Samuel L. Baker Regression analysis lets you use data to explain and predict. A simple regression line drawn through data points In Assignment
More informationInfinite Algebra 1 supports the teaching of the Common Core State Standards listed below.
Infinite Algebra 1 Kuta Software LLC Common Core Alignment Software version 2.05 Last revised July 2015 Infinite Algebra 1 supports the teaching of the Common Core State Standards listed below. High School
More informationThe Cartesian Plane The Cartesian Plane. Performance Criteria 3. PreTest 5. Coordinates 7. Graphs of linear functions 9. The gradient of a line 13
6 The Cartesian Plane The Cartesian Plane Performance Criteria 3 PreTest 5 Coordinates 7 Graphs of linear functions 9 The gradient of a line 13 Linear equations 19 Empirical Data 24 Lines of best fit
More informationSection 3 Part 1. Relationships between two numerical variables
Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.
More information! x sum of the entries
3.1 Measures of Central Tendency (Page 1 of 16) 3.1 Measures of Central Tendency Mean, Median and Mode! x sum of the entries a. mean, x = = n number of entries Example 1 Find the mean of 26, 18, 12, 31,
More informationCourse Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.
SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed
More informationDescribing Relationships between Two Variables
Describing Relationships between Two Variables Up until now, we have dealt, for the most part, with just one variable at a time. This variable, when measured on many different subjects or objects, took
More informationDESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi  110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics
More information6 3 The Standard Normal Distribution
290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since
More informationII. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
More informationOutline. Correlation & Regression, III. Review. Relationship between r and regression
Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation
More informationChapter 4: Average and standard deviation
Chapter 4: Average and standard deviation Context................................................................... 2 Average vs. median 3 Average.................................................................
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationScatter Plots with Error Bars
Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each
More informationWEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6
WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent beforethefact, expected values. In particular, the beta coefficient used in
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationHints for Success on the AP Statistics Exam. (Compiled by Zack Bigner)
Hints for Success on the AP Statistics Exam. (Compiled by Zack Bigner) The Exam The AP Stat exam has 2 sections that take 90 minutes each. The first section is 40 multiple choice questions, and the second
More informationLesson 4 Measures of Central Tendency
Outline Measures of a distribution s shape modality and skewness the normal distribution Measures of central tendency mean, median, and mode Skewness and Central Tendency Lesson 4 Measures of Central
More informationAP Statistics 2001 Solutions and Scoring Guidelines
AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for noncommercial use by AP teachers for course and exam preparation; permission for any other use
More informationStatistics. Measurement. Scales of Measurement 7/18/2012
Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More informationElementary Statistics
Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,
More informationNumerical Summarization of Data OPRE 6301
Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting
More informationDescriptive Statistics
Descriptive Statistics Descriptive statistics consist of methods for organizing and summarizing data. It includes the construction of graphs, charts and tables, as well various descriptive measures such
More informationTeaching & Learning Plans. The Correlation Coefficient. Leaving Certificate Syllabus
Teaching & Learning Plans The Correlation Coefficient Leaving Certificate Syllabus The Teaching & Learning Plans are structured as follows: Aims outline what the lesson, or series of lessons, hopes to
More informationT O P I C 1 2 Techniques and tools for data analysis Preview Introduction In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these
More informationChapter 15 Multiple Choice Questions (The answers are provided after the last question.)
Chapter 15 Multiple Choice Questions (The answers are provided after the last question.) 1. What is the median of the following set of scores? 18, 6, 12, 10, 14? a. 10 b. 14 c. 18 d. 12 2. Approximately
More informationwith functions, expressions and equations which follow in units 3 and 4.
Grade 8 Overview View unit yearlong overview here The unit design was created in line with the areas of focus for grade 8 Mathematics as identified by the Common Core State Standards and the PARCC Model
More informationCorrelation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2
Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More information1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability).
Examples of Questions on Regression Analysis: 1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability). Then,. When
More information