Simple Regression and Correlation


Simple Regression and Correlation

We are going to study the relation between two variables. Let us label the first as X and the second as Y. We will observe values for these two variables in the form of a two-column table of paired (X, Y) values.

We are interested in finding the relation between these two variables. To determine the relation between them we have to know the meaning of each of them. The following are some examples of X and Y, and the theoretical relation between them:

    X: Weight of a person                     Y: Height of a person            Positive relation
    X: Higher temperature                     Y: Electricity consumption       Positive relation
    X: KD exchange rate against the Dollar    Y: Temperature in Japan          No relation
    X: Price of a commodity                   Y: Quantity of items purchased   Negative relation

Simple Regression and Correlation

For these variables we could be interested in answers to the following questions:

a) Is there any relation between the two variables?
b) Does the relation (if any) take a linear form?
c) What is the direction of the relation (positive or negative)?
d) What is the magnitude of this relation (weak, medium or strong)?

These questions can be answered in two ways.

1) Partially, by plotting the scatter plot of X and Y. The following are the general forms of scatter plot one could obtain:

[Scatter plots: a complete positive linear relation (r = 1) and a complete negative linear relation (r = -1).]

[Scatter plots: a positive incomplete linear relation (0 < r < 1), a negative incomplete linear relation (-1 < r < 0), and no linear relation between X and Y (r = 0).]

Scatter Plot in MINITAB

Assuming that the variable X is in column C1 and Y is in column C2 of the MINITAB worksheet, we can get a high-resolution form of the scatter plot using the command:

Mtb> plot C2*C1

The general form of the plot command is:

Mtb> plot [Col. of Y] * [Col. of X]

Linear Correlation Coefficient (r)

The second and more precise method of answering the previous four questions is to use a numerical measure. Karl Pearson developed a measure of the strength of a linear relation between two variables. This measure is still widely used and is known as Pearson's linear correlation coefficient, or simply the linear correlation coefficient, denoted by "r".

The coefficient "r" is a scale that runs from -1 to +1. The value of "r" indicates the strength of the linear relation between the two variables X and Y, and its sign indicates the direction (negative or positive) of the relation: r = -1 is a perfect negative relation, r = 0 is no linear relation, and r = +1 is a perfect positive relation; values in between are read as strong, medium or weak according to how close they lie to -1 or +1.

Linear Correlation Coefficient

As an example, the following values of the correlation coefficient "r" are interpreted as follows:

    If r = -0.81   Strong negative linear relation
    If r =  0.34   Weak positive linear relation
    If r =  0.69   Medium positive linear relation
    If r =  1.00   Perfect positive linear relation
    If r = -1.00   Perfect negative linear relation
    If r = -0.71   Medium negative linear relation
    If r =  0      No linear relation

The formula for computing Pearson's coefficient is as follows:

    r = Σ(X - X̄)(Y - Ȳ) / sqrt[ Σ(X - X̄)² * Σ(Y - Ȳ)² ]    (definition form)

    r = SSxy / sqrt( SSxx * SSyy )                           (computation form)

where

    SSxy = ΣXY - (ΣX)(ΣY)/n
    SSxx = ΣX² - (ΣX)²/n
    SSyy = ΣY² - (ΣY)²/n

So, to compute "r" we need the values of the following sums: n, ΣX, ΣY, ΣX², ΣY² and ΣXY.
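As a cross-check on the hand computation, the computation form above can be sketched in a few lines of Python (Python is not part of the original MINITAB-based course; this is shown only to illustrate the formula):

```python
import math

def pearson_r(x, y):
    """Pearson's linear correlation coefficient via the computation form:
    r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sx * sy / n
    ss_xx = sum(xi ** 2 for xi in x) - sx ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sy ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# A perfectly linear decreasing relation gives r = -1.
print(pearson_r([1, 2, 3, 4], [10, 8, 6, 4]))  # -1.0
```

The function name `pearson_r` is of course an assumption, not a library call; the point is that only the six sums listed above are needed.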

Linear Correlation Coefficient Example

Example: for the two variables X and Y we have the following 5 observations. Determine the strength and direction of the linear relation (if any) between them.

[The table of the 5 observed (X, Y) pairs was lost in transcription.]

To compute Pearson's coefficient "r" we need to find the required sums, so we add columns for X², Y² and XY and total each column:

    n = 5
    ΣX  = 205
    ΣY  = 176
    ΣX² = 10425
    ΣY² = 7490
    ΣXY = 5655

    SSxx = 10425 - (205)²/5      = 2020
    SSyy = 7490  - (176)²/5      = 1294.8
    SSxy = 5655  - (205)(176)/5  = -1561

Linear Correlation Coefficient Example

With n = 5, ΣX = 205, ΣY = 176, ΣX² = 10425, ΣY² = 7490 and ΣXY = 5655:

    r = SSxy / sqrt( SSxx * SSyy )
      = -1561 / sqrt( 2020 * 1294.8 )
      = -0.965

which means that there is a strong negative linear relation between the two variables X and Y.

Computing the Correlation Coefficient in MINITAB: assuming that the data of X is in column C1 and the data of Y is in column C2, we can use the following MINITAB command to calculate "r":

Mtb> corr C1 C2
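The arithmetic above can be checked numerically; this short Python sketch (not part of the original MINITAB session) recomputes r from the column totals:

```python
import math

# Column totals from the example (n, ΣX, ΣY, ΣX², ΣY², ΣXY).
n, sx, sy, sxx, syy, sxy = 5, 205, 176, 10425, 7490, 5655

ss_xx = sxx - sx ** 2 / n    # 2020.0
ss_yy = syy - sy ** 2 / n    # 1294.8
ss_xy = sxy - sx * sy / n    # -1561.0

r = ss_xy / math.sqrt(ss_xx * ss_yy)
print(round(r, 3))  # -0.965
```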

Linear Correlation Coefficient MINITAB Example

MTB > print c1 c2
  Row  X  Y
  [the five data rows were lost in transcription]
MTB > let c3=c1*c1
MTB > let c4=c2*c2
MTB > let c5=c1*c2
MTB > print c1-c5
  Row  X  Y  X^2  Y^2  XY
  [rows lost in transcription]
MTB > count c1
  Total number of observations in X = 5
MTB > sum c1
  Sum of X = 205
MTB > sum c2
  Sum of Y = 176
MTB > sum c3
  Sum of X^2 = 10425
MTB > sum c4
  Sum of Y^2 = 7490
MTB > sum c5
  Sum of XY = 5655

MTB > gstd
MTB > plot c2 c1
  [Character scatter plot of Y versus X, showing a clear decreasing pattern.]
MTB > corr c1 c2
  Correlations: X, Y
  Pearson correlation of X and Y = -0.965
  P-Value = [lost in transcription]

Linear Correlation Coefficient Hypothesis Testing

To test whether the population correlation coefficient differs from zero (H0: ρ = 0 versus H1: ρ ≠ 0), the test statistic is

    t = r * sqrt(n - 2) / sqrt(1 - r²)

which follows a t distribution with n - 2 degrees of freedom. For the example above:

    t = -6.39, compared against t with 3 df.

So for α = 0.05 and t with 3 df, the critical value is 3.18. Since |t| = 6.39 > 3.18, the decision is: reject H0 with 95% confidence.
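A sketch of this test in Python (illustrative only; the critical value 3.18 is the tabulated t(0.025, 3) used in the slide):

```python
import math

# Sums of squares from the correlation example.
n, ss_xx, ss_yy, ss_xy = 5, 2020, 1294.8, -1561

r = ss_xy / math.sqrt(ss_xx * ss_yy)              # about -0.965
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(round(t, 2))                                # about -6.39

t_crit = 3.18                                     # tabulated t(0.025, df = n - 2 = 3)
print(abs(t) > t_crit)                            # True -> reject H0: rho = 0
```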

Regression

We have learned from the previous section how to examine the strength and direction of a linear relation that could link two variables X and Y. In linear regression, we are interested in forming or estimating the best linear function that ties the two variables together. The general form of a linear equation is:

    Y = a + b X      or      Y = b0 + b1 X

An example of such a linear equation is:

    Y = 4 + 3 X

Here a = 4 and b = 3. What is the interpretation of "a" and "b" in the general form of the linear equation?

To understand the interpretation, let us take the equation above as an example and find the values of Y for different values of X:

    X:  1   2   3   4
    Y:  7  10  13  16

    Changes in X:           Changes in Y:
    ΔX = 2 - 1 = 1          ΔY = 10 - 7  = 3
    ΔX = 3 - 2 = 1          ΔY = 13 - 10 = 3
    ΔX = 4 - 3 = 1          ΔY = 16 - 13 = 3

From the above example we see that whenever X changes by 1 unit, Y changes by 3 (the value of b), and when X equals 0 the value of Y equals 4 (the value of a in the general form of the linear equation).

Regression

    Y = a + b X

a = the starting or initial value of Y (i.e. the value of Y when X = 0).

b = (1) the rate of change in Y when X changes by 1 unit; or (2) the rate of change in Y divided by the rate of change in X (b = ΔY/ΔX); (3) b is also interpreted as the slope of the line Y = a + b X, i.e. the tangent (tan) of the angle between that line and the horizontal axis.

For this general form of linear equation, X is known as the independent variable, whereas Y is the dependent variable, as its value is determined by the values of X.
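The two interpretations of a and b can be illustrated with a tiny Python sketch (illustrative only, using the example line Y = 4 + 3X):

```python
def line(x, a=4, b=3):
    """The example line Y = a + b*X."""
    return a + b * x

# a is the value of Y at X = 0; b is the change in Y per unit change in X.
print(line(0))             # 4  (the intercept a)
print(line(5) - line(4))   # 3  (the slope b)
```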

Regression

    Y = a + b X

How do we estimate the best linear equation for Y on X? We usually start with values for the two variables X and Y. Here we should specify which of them we are going to consider the independent variable and which the dependent variable (the one we need to explain by the other). We start with a data layout similar to the one used in the correlation coefficient case: a two-column table of paired (X, Y) values.

Estimating the linear equation with Y as the dependent variable and X as the independent variable, from a set of data, is known as estimating the linear regression line of Y on X.

One way to help us estimate the linear regression of Y on X is to draw the scatter plot of X and Y and compute the correlation coefficient, as discussed before. If the scatter plot of Y and X takes exactly a linear form (with positive or negative slope), then estimating the best line that fits the data is easy and straightforward. But if the scatter plot does not perfectly follow a linear pattern, there are a number of ways to define and estimate a linear equation that would represent the data. The Least Squares Method is one of these methods, and it is widely used.

The idea of the least squares method is to fit a line for the linear relation Y = a + b X that passes through the middle of the data set. To achieve this, the method searches for the line that has the least squared error, that is, the estimated line with the lowest possible value of Σ(Y - Ŷ)², where:

    Y is the observed value of the dependent variable, and
    Ŷ is the estimated value of Y using the estimated regression line.

The quantity (Y - Ŷ) is known as the error term, residual, or deviation.

Regression

[Scatter plot of X and Y with the fitted line, showing an observed point Y, its fitted value Ŷ, and the deviation (Y - Ŷ).]

Since the data do not exactly follow a linear form, we can say the linear form fits the data with some error. That is:

    Y = a + b X + e

where "a" and "b" are unknown constants and "e" is an unobservable error term (the deviation from the line). We can then write the error term as (Y - Ŷ), or (Y - (a + b X)); its square, (Y - (a + b X))², is the squared deviation.

Regression

Summing the squared deviations over all values of Y gives the sum of squared deviations:

    Σ (Y - a - b X)²

To obtain the values of "a" and "b" that minimize this sum of squared errors, we partially differentiate the sum of squared deviations, once with respect to "a" and once with respect to "b", and then equate both resulting functions to zero, to obtain what are known as the normal equations:

    a n  + b ΣX  = ΣY
    a ΣX + b ΣX² = ΣXY

Solving these two equations for "a" and "b" we have:

    b̂ = [ ΣXY - (ΣX)(ΣY)/n ] / [ ΣX² - (ΣX)²/n ] = SSxy / SSxx

    â = Ȳ - b̂ X̄

so the estimated equation is then:

    Ŷ = â + b̂ X
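A minimal Python sketch of these least-squares formulas (illustrative, not the course's MINITAB workflow; the function name is an assumption):

```python
def least_squares(x, y):
    """Estimate a and b in Y = a + b*X by the least squares formulas
    b = SSxy / SSxx and a = Ybar - b * Xbar."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sx * sy / n
    ss_xx = sum(xi ** 2 for xi in x) - sx ** 2 / n
    b = ss_xy / ss_xx
    a = sy / n - b * sx / n
    return a, b

# Data generated from the exact line Y = 4 + 3X is recovered exactly.
print(least_squares([1, 2, 3, 4], [7, 10, 13, 16]))  # (4.0, 3.0)
```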

Regression Example

From the previous correlation example we have:

    n = 5, ΣX = 205, ΣY = 176, ΣX² = 10425, ΣY² = 7490, ΣXY = 5655
    SSxx = 2020, SSyy = 1294.8, SSxy = -1561

    b̂ = SSxy / SSxx = -1561 / 2020 = -0.773

    â = Ȳ - b̂ X̄ = 176/5 - (-0.773)(205/5) = 35.2 + (0.773)(41) = 66.88

The final least-squares estimated linear regression of Y on X is:

    Ŷ = 66.88 - 0.773 X

Note: "b" and "r" always have the same sign (+ or -), which is determined by the value of the shared numerator SSxy.

Regression Example

    Ŷ = 66.88 - 0.773 X

Usage of the estimated linear equation:

1) To further explain the relation between X and Y.
2) To predict values of Y (the dependent variable) for given values of X (the independent variable).

Explaining the relation in the previous example: the estimated slope is b̂ = ΔY/ΔX = -0.773, which indicates that when X increases by 1 unit, the value of Y decreases by 0.773. The initial value of Y (the value of Y when X = 0) is â = 66.88.

Prediction: what is the estimated (predicted) value of Y when X = 30?

    Ŷ(X=30) = 66.88 - 0.773 * 30 = 43.69

Similarly, the predicted value of Y when X = 80 is:

    Ŷ(X=80) = 66.88 - 0.773 * 80 = 5.04

The difference between an observed Y and its predicted value Ŷ is the error or deviation, Y - Ŷ.
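The example's estimates and predictions can be reproduced from the column totals with this short Python sketch (illustrative only):

```python
# Column totals from the regression example.
n, sx, sy, sxx, sxy_total = 5, 205, 176, 10425, 5655

ss_xy = sxy_total - sx * sy / n   # -1561.0
ss_xx = sxx - sx ** 2 / n         # 2020.0

b = ss_xy / ss_xx                 # about -0.773
a = sy / n - b * sx / n           # about 66.88

print(round(b, 3), round(a, 2))   # -0.773 66.88
print(round(a + b * 30, 1))       # predicted Y at X = 30, about 43.7
```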

Regression MINITAB Example

Assuming that the data of X is in column C1 and the data of Y is in column C2, we can use the following MINITAB command to estimate the values of "a" and "b":

Mtb> regr C2 1 C1

The general form of the Regression command is:

Mtb> regr [Col. of Y] 1 [Col. of X] [optional columns for the standard errors and Ŷ]

Examples:

Mtb> regr C2 1 C1
Mtb> regr C2 1 C1 C5 C6

MTB > print c1-c2
  [data rows lost in transcription]
MTB > gstd
MTB > plot c2 c1
  [Character scatter plot of Y versus X, showing a decreasing pattern.]
MTB > corr c1 c2
  Correlations: X, Y
  Pearson correlation of X and Y = -0.965

Regression MINITAB Example

MTB > regr c2 1 c1

Regression Analysis: Y versus X

The regression equation is
  Y = 66.9 - 0.773 X

Predictor  Coef     SE Coef  T      P
Constant   66.884   5.518    12.12  [lost]
X          -0.7728  0.1208   -6.39  [lost]

S = 5.431   R-Sq = 93.2%   R-Sq(adj) = 90.9%

Analysis of Variance
Source          DF  SS      MS      F      P
Regression      1   1206.3  1206.3  40.89  [lost]
Residual Error  3     88.5    29.5
Total           4   1294.8

MTB > regr c2 1 c1 c5 c6
MTB > print c1 c2 c5 c6

Data Display
  Row  X  Y  st.error  Yhat
  [numeric rows lost in transcription]

Regression: Assumptions of the Regression Model

Consider the population regression model:

    Y = a + b X + ε

We make four assumptions when estimating this model from a sample:

Assumption 1: The random error term ε has a mean equal to zero for each x.
Assumption 2: The errors associated with different observations are independent.
Assumption 3: For any given x, the distribution of errors is Normal.
Assumption 4: The distribution of population errors for each x has the same constant standard deviation, denoted σ. This assumption indicates that the spread of points around the regression line is similar for all x values.

If we estimate a regression of Y on X using the least squares method, how do we know whether this estimate is good, and how do we know whether it is a reliable estimate? We need to test the validity of the estimated model for:

    Y = a + b X + ε

where:
    X is the independent variable;
    ε is a random variable that is distributed normally with mean 0 and standard deviation σ, i.e. ε ~ N(0, σ);
    Y is the dependent random variable, Y ~ N(a + b X, σ).

Regression: Steps for Estimated Model Validation

(1) Coefficient of Determination (r²):

r² is a numerical measure that takes a value between 0 and 1 (i.e. 0 ≤ r² ≤ 1) and represents the percentage of the total variation in the dependent variable (Y) that is explained by the estimated linear model Ŷ = â + b̂ X. It is computed by the formula:

    r² = Explained variation / Total variation in Y
       = Regression sum of squares / Total sum of squares
       = RSS / TSS

The larger the value of r², the stronger the estimated model. So r² is one indication or measure of the strength of the estimated model.

Example: if we compute r² and find that it equals 0.83, this is interpreted as: the estimated model (using the least squares method) has succeeded in explaining 83% of the total variation in the dependent variable Y. The remaining 17% is not explained by the estimated model; it could be pure error or some lack or deficiency in the estimated model.
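In Python, the computation behind r² can be sketched as follows (illustrative only; r² is taken here as the regression sum of squares over the total sum of squares, as in the formula above):

```python
def r_squared(y, y_hat):
    """Coefficient of determination: regression SS / total SS."""
    y_bar = sum(y) / len(y)
    rss = sum((yh - y_bar) ** 2 for yh in y_hat)   # regression sum of squares
    tss = sum((yi - y_bar) ** 2 for yi in y)       # total sum of squares
    return rss / tss

# Fitted values that reproduce the data exactly explain all variation.
y = [7, 10, 13, 16]
print(r_squared(y, y))  # 1.0
```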

Regression Notes:

1) TSS = Σ(Y - Ȳ)²  (Total Sum of Squares)
2) RSS = Σ(Ŷ - Ȳ)²  (Regression Sum of Squares)
3) ESS = Σ(Y - Ŷ)²  (Error Sum of Squares)
4) TSS = RSS + ESS, i.e. the total variation in Y splits into an explained part and an unexplained part.

Regression

(2) Test for the overall validity of the estimated model:

Given the assumptions of normality of the error term and the dependent variable Y, we can test the adequacy of the estimated model in representing the linear relation between Y and X. The procedure for this test follows the same steps we have used in all previous hypothesis-testing procedures. We start by formulating the two hypotheses, namely H0 and H1. In our case:

    H0: The estimated model does not fit the data (or: the model is not adequate).
    H1: The estimated model fits the data (or: the model is adequate).

Regression

To test the above hypotheses we compute the value of the test statistic through the analysis of variance (ANOVA) table:

    SOURCE      DF         SS          MS                       F (test stat.)
    Regression  k          Σ(Ŷ - Ȳ)²   Σ(Ŷ - Ȳ)² / k            Reg. MS / Error MS
    Error       n - k - 1  Σ(Y - Ŷ)²   Σ(Y - Ŷ)² / (n - k - 1)
    Total       n - 1      Σ(Y - Ȳ)²

The test statistic:

    Regression MS / Error MS  ~  F(k, n - k - 1)

where:
    n is the number of observations used in estimating the model,
    k is the number of independent variables (X) used in the model,
    Ŷ is the predicted value of Y for each value of X, and
    Ȳ is the mean of the dependent variable Y.

(3) Test for the model parameters:

The estimated model Ŷ = â + b̂ X has two parameters: â and b̂. We will test the hypothesis that b (the true value estimated by b̂) equals 0. The general form of the hypothesis in this case is:

    H0: b = b0   versus   H1: b ≠ b0

Similarly, we will test the hypothesis that a (the true value estimated by â) equals 0. The general form of the hypothesis in this case is:

    H0: a = a0   versus   H1: a ≠ a0

If b0 equals zero and the test shows that the true value of b could be zero, then X does not contribute to the equation and is not needed in the model; in that case, we can drop X from the model. Similarly, if a0 = 0 and the result of the test shows that this could be accepted, then a is dropped from the estimated model.
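The ANOVA table can be built directly from the observed and least-squares fitted values; a Python sketch (illustrative only, on a small made-up data set):

```python
def least_squares_fit(x, y):
    """Fitted values from the least squares line of Y on X."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    b = (sum(xi * yi for xi, yi in zip(x, y)) - sx * sy / n) / \
        (sum(xi * xi for xi in x) - sx * sx / n)
    a = sy / n - b * sx / n
    return [a + b * xi for xi in x]

def anova_table(y, y_hat, k=1):
    """Regression ANOVA sums of squares and F = Reg. MS / Error MS."""
    n = len(y)
    y_bar = sum(y) / n
    rss = sum((yh - y_bar) ** 2 for yh in y_hat)            # regression SS, df = k
    ess = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # error SS, df = n - k - 1
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total SS, df = n - 1
    f = (rss / k) / (ess / (n - k - 1))
    return rss, ess, tss, f

x, y = [1, 2, 3, 4], [3, 5, 6, 10]    # small made-up data set
rss, ess, tss, f = anova_table(y, least_squares_fit(x, y))
print(round(rss, 1), round(ess, 1), round(tss, 1))  # 24.2 1.8 26.0
```

Note that RSS + ESS = TSS holds here because the fitted values come from the least squares line.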

Regression

The test statistics are:

    t = (b̂ - b0) / SE(b̂)       and       t = (â - a0) / SE(â)

Both statistics have a t distribution with (n - 2) df, the same DF as the error in the ANOVA. The standard errors of â and b̂ are not easy to compute by hand, and we are going to rely on the output of MINITAB to estimate them. For b̂:

    SE(b̂) = s / sqrt(SSxx),   where   s = sqrt(Error MS)

We can then estimate confidence intervals for a and b from these standard errors.
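For the earlier worked example these quantities can be recomputed with a Python sketch (illustrative only; s is the square root of the error mean square):

```python
import math

# Sums of squares from the worked example.
n, ss_xx, ss_yy, ss_xy = 5, 2020, 1294.8, -1561

b = ss_xy / ss_xx                  # about -0.773
sse = ss_yy - ss_xy ** 2 / ss_xx   # error sum of squares, about 88.5
s = math.sqrt(sse / (n - 2))       # sqrt of Error MS
se_b = s / math.sqrt(ss_xx)        # standard error of b-hat

t = b / se_b                       # test statistic for H0: b = 0
print(round(t, 2))                 # about -6.39
```

Reassuringly, this t equals the t obtained earlier from the correlation test, as it should in simple regression.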

Regression

Example: suppose we have the following values for the variables X and Y:

    n = 10
    Σx = 498, Σy = 226, Σxy = 11782, Σx² = 26290, Σy² = 5370

    SSxx = 26290 - (498)²/10      = 1489.6
    SSyy = 5370  - (226)²/10      = 262.4
    SSxy = 11782 - (498)(226)/10  = 527.2

[The table of the 10 observed (X, Y) pairs was lost in transcription.]

If we input the values of these variables in columns 1 and 2 of the MINITAB worksheet and use the commands print and desc, we have:

MTB > print c1 c2
  [data rows lost in transcription]
MTB > desc c1 c2

Descriptive Statistics: X, Y

Variable  N   Mean   StDev  SE Mean
X         10  49.80  12.87   4.07
Y         10  22.60   5.40   1.71

[The Median, TrMean, Minimum, Maximum, Q1 and Q3 columns were lost in transcription.]

To see the relation between the two variables X and Y we plot the scatter plot between them using the MINITAB Plot command:

MTB > GStd.
MTB > Plot 'Y' 'X';
SUBC> Symbol '*'.

[Character scatter plot of Y versus X showing an increasing, roughly linear pattern.]

The scatter plot indicates that there is a non-perfect but positive correlation between the two variables. This can be further confirmed by computing Pearson's linear correlation coefficient (r) using the MINITAB corr command.

MTB > corr c1 c2
  Correlations: X, Y
  Pearson correlation of X and Y = 0.843
  P-Value = 0.002

The value of the correlation coefficient confirms that there is a strong positive linear correlation between X and Y. To estimate the linear regression line of Y on X (in which Y is the dependent variable and X is the independent one) we will use the MINITAB command Regress c2 1 c1.

To further store the values of the standardized errors in column C3 and the predicted values of Y in column C4, we use the command Regress c2 1 c1 c3 c4. The following is the output of that command:

MTB > regr c2 1 c1

Regression Analysis: Y versus X

The regression equation is
  Y = 4.97 + 0.354 X

Predictor  Coef    SE Coef  T     P
Constant   4.975   4.090    1.22  0.26
X          0.3539  0.0798   4.44  0.002

S = 3.078   R-Sq = 71.1%   R-Sq(adj) = 67.5%

Analysis of Variance
Source          DF  SS     MS     F      P
Regression      1   186.6  186.6  19.69  0.002
Residual Error  8    75.8    9.5
Total           9   262.4

Regression

MTB > print c1 c2 c3 c4
  Row  X  Y  SE  yhat
  [numeric rows lost in transcription]

From the above results, the best linear estimated regression line using the least squares method is:

    Ŷ = 4.97 + 0.354 X

To test the validity of the model we will use the previous output from MINITAB.

(1) Coefficient of Determination (r²): the value of r² is presented in the above result: r² = 0.711, or 71.1%. To show how this value is computed, we can use either of the formulas:

    r² = Σ(Ŷ - Ȳ)² / Σ(Y - Ȳ)² = RSS / TSS

or, from the sums of squares:

    r² = b̂ * SSxy / SSyy = (0.354)(527.2) / 262.4 = 0.711

or

    r² = SSxy² / (SSxx * SSyy) = (527.2)² / (1489.6 * 262.4) = 0.711

Regression

MTB > let c5=c4-mean(c2)
MTB > name c5 'YhYB'
MTB > let c6=c5**2
MTB > name c6 '(YhYB)2'
MTB > let c7 = (c2-mean(c2))**2
MTB > name c7 '(YYB)2'
MTB > print c1 c2 c4 c5-c7
  [numeric rows lost in transcription]
MTB > let k1=sum(c6)
MTB > let k2=sum(c7)
MTB > let k3=k1/k2
MTB > print k1-k3
  K1   186.6    (regression sum of squares)
  K2   262.4    (total sum of squares)
  K3   0.711    (r²)

r² = 0.711, or 71.1%, means that the estimated regression model of Y on X (i.e. Ŷ = 4.97 + 0.354 X) managed to explain 71.1% of the total variation in the dependent variable Y. The remaining 28.9% is not explained by the model, either because the model used is not adequate, or because that part is a pure error term that cannot be explained.

We can also compute the same value directly from the results of the ANOVA table presented above, by dividing the regression sum of squares by the total sum of squares:

    r² = 186.6 / 262.4 = 0.711

Regression

The same value is presented in the above output line:

    S = 3.078   R-Sq = 71.1%   R-Sq(adj) = 67.5%

It is worth mentioning here that for the simple regression model the coefficient of determination (r²) equals the square of Pearson's coefficient (r):

    (r)² = (0.843)² = 0.711

Conclusion: 71.1% is a good and acceptable ratio, so the model is considered adequate enough.

(2) Testing the validity of the overall estimated model: we will test the following two hypotheses using the results from the ANOVA presented above:

    H0: The estimated model does not fit the data (the estimated model is not good or accepted).
    H1: The estimated model fits the data (the estimated model is good and accepted).

The number of independent variables in the model is k = 1. The ANOVA results presented in the above MINITAB output are:

    Analysis of Variance
    Source          DF  SS     MS     F      P
    Regression      1   186.6  186.6  19.69  0.002
    Residual Error  8    75.8    9.5
    Total           9   262.4

Regression

So, the test statistic for testing the above hypotheses equals:

    F(calculated) = 19.69, which follows the F distribution with 1 and 8 DF. The P-value = 0.002.

The tabulated F value for α = 0.05 with 1 and 8 DF equals 5.32. Since F(calculated) = 19.69 > 5.32, the conclusion is: reject H0 (that the estimated model does not fit the data) with 95% confidence. This means that we have statistical evidence that the estimated model is accepted and good enough to fit and represent the linear relation between the two variables.

(3) Testing each component of the estimated model: if the validity of the model as a whole is accepted in the second step above, we can go ahead and test the importance, and the necessity of keeping, each of the elements that form the estimated model. If the true model we are estimating is:

    Y = a + b X + e

and the estimated model is:

    Ŷ = â + b̂ X

then we will test whether a = 0 (i.e. whether we can eliminate a from the estimated model, leaving Ŷ = b̂ X).

Regression

Similarly, we will test whether b = 0 (i.e. whether we can eliminate b and the contribution of X, and estimate the model with the contribution of a alone; a would then be the mean of the dependent variable Y, and the estimated model becomes Ŷ = â, or simply Ŷ = Ȳ).

For that, we will test the hypotheses:

    H0: b = 0          H0: a = 0
    H1: b ≠ 0   and    H1: a ≠ 0

using the sums of squares SSxx = 1489.6, SSyy = 262.4, SSxy = 527.2 and the MINITAB parameter table.

The regression equation is
  Y = 4.97 + 0.354 X

Predictor  Coef    SE Coef  T     P
Constant   4.975   4.090    1.22  0.26
X          0.3539  0.0798   4.44  0.002

Regression

Based on the above results, we can conclude the following:

- For b: since the P-value 0.002 is below 0.05, we reject H0 that b = 0. This means that the contribution of X is not negligible and it should be included in the model.
- For a: the P-value is above 0.05, so the effect of a is not important and a can be eliminated from the model. Even if we leave it in the model it will not harm the model, as its role and contribution are not of significant importance.

Regression: Confidence Intervals for Model Parameters

As we have shown before, one can estimate the confidence interval for b using the equation:

    P[ b̂ - t(α/2, n-k-1) * SE(b̂)  ≤  b  ≤  b̂ + t(α/2, n-k-1) * SE(b̂) ] = 1 - α

and the confidence interval for a:

    P[ â - t(α/2, n-k-1) * SE(â)  ≤  a  ≤  â + t(α/2, n-k-1) * SE(â) ] = 1 - α

To estimate these two confidence intervals, we will use the MINITAB output, mainly the estimated values of the two parameters and their standard errors shown in the output below:

    Predictor  Coef    SE Coef
    Constant   4.975   4.090
    X          0.3539  0.0798

For α = 0.05 the tabulated value of t(0.05/2, 8) = 2.306, and the estimated confidence interval for b is:

    P[ 0.3539 - 2.306 * 0.0798  ≤  b  ≤  0.3539 + 2.306 * 0.0798 ] = 0.95
    P[ 0.170  ≤  b  ≤  0.538 ] = 0.95

and the estimated confidence interval for a is:

    P[ 4.975 - 2.306 * 4.090  ≤  a  ≤  4.975 + 2.306 * 4.090 ] = 0.95
    P[ -4.457  ≤  a  ≤  14.407 ] = 0.95
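These intervals can be recomputed with a short Python sketch (illustrative; the critical value 2.306 is the tabulated t(0.025, 8), and the estimates and standard errors are the ones reported by MINITAB):

```python
def conf_int(estimate, se, t_crit):
    """Two-sided confidence interval: estimate +/- t_crit * SE."""
    half = t_crit * se
    return estimate - half, estimate + half

t_crit = 2.306                                  # tabulated t(0.025, df = 8)

lo_b, hi_b = conf_int(0.3539, 0.0798, t_crit)   # interval for the slope b
lo_a, hi_a = conf_int(4.975, 4.090, t_crit)     # interval for the intercept a

print(round(lo_b, 3), round(hi_b, 3))   # about 0.170 and 0.538
print(lo_b > 0)                         # True: the interval for b excludes 0
print(lo_a < 0 < hi_a)                  # True: the interval for a contains 0
```

The two prints at the end restate the earlier conclusions: b is significantly different from 0, while a is not.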


More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

More information

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

Notes on Applied Linear Regression

Notes on Applied Linear Regression Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

More information

INTRODUCTION TO MULTIPLE CORRELATION

INTRODUCTION TO MULTIPLE CORRELATION CHAPTER 13 INTRODUCTION TO MULTIPLE CORRELATION Chapter 12 introduced you to the concept of partialling and how partialling could assist you in better interpreting the relationship between two primary

More information

General Regression Formulae ) (N-2) (1 - r 2 YX

General Regression Formulae ) (N-2) (1 - r 2 YX General Regression Formulae Single Predictor Standardized Parameter Model: Z Yi = β Z Xi + ε i Single Predictor Standardized Statistical Model: Z Yi = β Z Xi Estimate of Beta (Beta-hat: β = r YX (1 Standard

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Solution Let us regress percentage of games versus total payroll.

Solution Let us regress percentage of games versus total payroll. Assignment 3, MATH 2560, Due November 16th Question 1: all graphs and calculations have to be done using the computer The following table gives the 1999 payroll (rounded to the nearest million dolars)

More information

The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Title: Modeling for Prediction Linear Regression with Excel, Minitab, Fathom and the TI-83

Title: Modeling for Prediction Linear Regression with Excel, Minitab, Fathom and the TI-83 Title: Modeling for Prediction Linear Regression with Excel, Minitab, Fathom and the TI-83 Brief Overview: In this lesson section, the class is going to be exploring data through linear regression while

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996) MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

More information

Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

More information

1 Simple Linear Regression I Least Squares Estimation

1 Simple Linear Regression I Least Squares Estimation Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002

More information

Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1. Relationships between two numerical variables Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

More information

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480 1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500

More information

SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

More information

Using Excel for Statistical Analysis

Using Excel for Statistical Analysis Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

c 2015, Jeffrey S. Simonoff 1

c 2015, Jeffrey S. Simonoff 1 Modeling Lowe s sales Forecasting sales is obviously of crucial importance to businesses. Revenue streams are random, of course, but in some industries general economic factors would be expected to have

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate

More information

Scatter Plot, Correlation, and Regression on the TI-83/84

Scatter Plot, Correlation, and Regression on the TI-83/84 Scatter Plot, Correlation, and Regression on the TI-83/84 Summary: When you have a set of (x,y) data points and want to find the best equation to describe them, you are performing a regression. This page

More information

WEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6

WEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6 WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent before-thefact, expected values. In particular, the beta coefficient used in

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

August 2012 EXAMINATIONS Solution Part I

August 2012 EXAMINATIONS Solution Part I August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

More information

12: Analysis of Variance. Introduction

12: Analysis of Variance. Introduction 1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider

More information

AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

Using Excel for inferential statistics

Using Excel for inferential statistics FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

More information

The Volatility Index Stefan Iacono University System of Maryland Foundation

The Volatility Index Stefan Iacono University System of Maryland Foundation 1 The Volatility Index Stefan Iacono University System of Maryland Foundation 28 May, 2014 Mr. Joe Rinaldi 2 The Volatility Index Introduction The CBOE s VIX, often called the market fear gauge, measures

More information

Chapter 7. One-way ANOVA

Chapter 7. One-way ANOVA Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

More information

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2 Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

More information

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal MGT 267 PROJECT Forecasting the United States Retail Sales of the Pharmacies and Drug Stores Done by: Shunwei Wang & Mohammad Zainal Dec. 2002 The retail sale (Million) ABSTRACT The present study aims

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

2 Sample t-test (unequal sample sizes and unequal variances)

2 Sample t-test (unequal sample sizes and unequal variances) Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

More information

Lin s Concordance Correlation Coefficient

Lin s Concordance Correlation Coefficient NSS Statistical Software NSS.com hapter 30 Lin s oncordance orrelation oefficient Introduction This procedure calculates Lin s concordance correlation coefficient ( ) from a set of bivariate data. The

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

More information

MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

More information

Notes on logarithms Ron Michener Revised January 2003

Notes on logarithms Ron Michener Revised January 2003 Notes on logarithms Ron Michener Revised January 2003 In applied work in economics, it is often the case that statistical work is done using the logarithms of variables, rather than the raw variables themselves.

More information

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Simple Linear Regression, Scatterplots, and Bivariate Correlation 1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

More information

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices: Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:

More information

Homework 11. Part 1. Name: Score: / null

Homework 11. Part 1. Name: Score: / null Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

(More Practice With Trend Forecasts)

(More Practice With Trend Forecasts) Stats for Strategy HOMEWORK 11 (Topic 11 Part 2) (revised Jan. 2016) DIRECTIONS/SUGGESTIONS You may conveniently write answers to Problems A and B within these directions. Some exercises include special

More information

An analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression

An analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression Chapter 9 Simple Linear Regression An analysis appropriate for a quantitative outcome and a single quantitative explanatory variable. 9.1 The model behind linear regression When we are examining the relationship

More information

MULTIPLE REGRESSION WITH CATEGORICAL DATA

MULTIPLE REGRESSION WITH CATEGORICAL DATA DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

More information

Module 5: Statistical Analysis

Module 5: Statistical Analysis Module 5: Statistical Analysis To answer more complex questions using your data, or in statistical terms, to test your hypothesis, you need to use more advanced statistical tests. This module reviews the

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

More information

GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

More information

Example G Cost of construction of nuclear power plants

Example G Cost of construction of nuclear power plants 1 Example G Cost of construction of nuclear power plants Description of data Table G.1 gives data, reproduced by permission of the Rand Corporation, from a report (Mooz, 1978) on 32 light water reactor

More information