General Regression Formulae

Single Predictor Standardized Parameter Model: Z_{Y_i} = \beta Z_{X_i} + \varepsilon_i
Single Predictor Standardized Statistical Model: \hat{Z}_{Y_i} = \hat{\beta} Z_{X_i}

Estimate of Beta (Beta-hat):

\hat{\beta} = r_{YX}    (1)

Standard error of estimate:

s_{Z_Y \cdot Z_X} = \sqrt{1 - r_{YX}^2}    (2)

Standard error of Beta-hat:

Se_{\hat{\beta}} = \sqrt{\frac{1 - r_{YX}^2}{N - 2}}    (3)

There are two identical null hypotheses, H_0: \beta = 0 and H_0: \rho = 0. Both are tested with a t-statistic with (N - 2) degrees of freedom (df), which can be computed two ways:

t_{(N-2)} = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}}    (4)

and

t_{(N-2)} = \frac{\hat{\beta} - 0}{Se_{\hat{\beta}}}    (5)

Single Predictor Raw Score Parameter Model: Y_i = \alpha + \beta X_i + \varepsilon_i
Single Predictor Raw Score Statistical Model: \hat{Y} = a + bX

Estimate of Beta (b):

b = \hat{\beta} \frac{s_Y}{s_X}    (6)

Since Beta-hat and r are identical in the single-predictor model, r can be substituted for \hat{\beta}.

Estimate of the Y-intercept, or regression constant (a):

a = \bar{Y} - b\bar{X}    (7)

Standard error of estimate:

s_{Y \cdot X} = s_Y \sqrt{1 - r_{YX}^2}    (8)

Standard error of b:

Se_b = \frac{s_{Y \cdot X}}{s_X \sqrt{N - 1}}    (9)

There are two identical null hypotheses, H_0: \beta = 0 and H_0: \rho = 0. Both are tested with a t-statistic with (N - 2) df, either with formula (4) or with:

t_{(N-2)} = \frac{b - 0}{Se_b}    (10)
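As a numerical illustration of formulas (1) through (10), the following Python sketch computes Beta-hat, the raw-score slope and intercept, the standard errors, and the t-statistic. The paired scores and variable names are invented for illustration; the sketch assumes NumPy and sample (N - 1) standard deviations.

    import numpy as np

    # Invented paired scores for illustration (N = 6).
    x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])
    y = np.array([3.0, 5.0, 4.0, 8.0, 9.0, 10.0])
    N = len(x)

    r = np.corrcoef(x, y)[0, 1]                  # r_YX = Beta-hat, formula (1)
    se_beta = np.sqrt((1 - r**2) / (N - 2))      # formula (3)
    t4 = r * np.sqrt(N - 2) / np.sqrt(1 - r**2)  # formula (4)

    s_x, s_y = x.std(ddof=1), y.std(ddof=1)      # sample standard deviations
    b = r * s_y / s_x                            # formula (6), r substituted for Beta-hat
    a = y.mean() - b * x.mean()                  # formula (7)
    s_yx = s_y * np.sqrt(1 - r**2)               # standard error of estimate, formula (8)
    se_b = s_yx / (s_x * np.sqrt(N - 1))         # formula (9)

    print(f"beta-hat = r = {r:.3f}, b = {b:.3f}, a = {a:.3f}")
    print(f"t({N - 2}) = {t4:.3f} by formula (4), {b / se_b:.3f} by formula (10)")

Note that with s_{Y \cdot X} defined as in formula (8), the two t values agree up to a factor of \sqrt{(N-1)/(N-2)}; computing s_{Y \cdot X} from the residuals with N - 2 df makes them match exactly.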
Two Predictor Standardized Parameter Model: Z_{Y_i} = \beta_1 Z_{X_1 i} + \beta_2 Z_{X_2 i} + \varepsilon_i
Two Predictor Standardized Statistical Model: \hat{Z}_{Y_i} = \hat{\beta}_1 Z_{X_1 i} + \hat{\beta}_2 Z_{X_2 i}

To calculate Beta-hat, the correlation between the predictor variables must be taken into consideration:

\hat{\beta}_1 = \frac{r_{Y1} - r_{Y2} r_{12}}{1 - r_{12}^2}    (11)

and

\hat{\beta}_2 = \frac{r_{Y2} - r_{Y1} r_{12}}{1 - r_{12}^2}    (12)

Similar to formula (2), the standard error of estimate is:

s_{Z_Y \cdot Z_{12}} = \sqrt{1 - R_{Y \cdot 12}^2}    (13)

In the two-predictor case, the standard error of Beta-hat is the same for both variables:

Se_{\hat{\beta}} = \sqrt{\frac{1 - R_{Y \cdot 12}^2}{(N - 3)(1 - r_{12}^2)}}    (14)

However, there is more than one null hypothesis that can be tested. First, one can test whether the overall model significantly improves prediction over the mean, H_0: \beta_1 = \beta_2 = 0. This is tested with an F-statistic with 2 (the number of predictors) and (N - 3) df:

F_{(2,\,N-3)} = \frac{(R^2 - 0)/2}{(1 - R^2)/(N - 3)}    (15)

Multiple R^2 has a general formula:

R_Y^2 = \sum_{j=1}^{k} \hat{\beta}_j r_{Yj} = \hat{\beta}_1 r_{Y1} + \hat{\beta}_2 r_{Y2} + \dots + \hat{\beta}_k r_{Yk}    (16)

One may also test whether each predictor makes a significant improvement in prediction over the other predictor(s). This is tested with a t-test with (N - k - 1) degrees of freedom, where k equals the number of predictors (in this case k = 2). For any variable j:

t_{(N-k-1)} = \frac{\hat{\beta}_j - 0}{Se_{\hat{\beta}_j}}    (17)

where

Se_{\hat{\beta}_j} = \sqrt{\frac{1 - R_{Y \cdot 1 \dots k}^2}{(N - k - 1)(1 - R_{j \cdot 1 \dots k}^2)}}    (18)

This can also be tested with a more flexible F-statistic that compares a full model (F) with k_F predictors to a reduced model (R) with k_R predictors:

F_{(k_F - k_R,\; N - k_F - 1)} = \frac{(R_F^2 - R_R^2)/(k_F - k_R)}{(1 - R_F^2)/(N - k_F - 1)}    (19)
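The two-predictor formulas can be exercised directly from a correlation matrix. The sketch below uses made-up correlations and a made-up sample size (r_Y1 = .50, r_Y2 = .40, r_12 = .30, N = 50) to compute Beta-hat via formulas (11) and (12), R^2 via (16), the overall F via (15), and the per-predictor t via (14) and (17).

    import numpy as np

    # Made-up correlations and sample size for illustration.
    r_y1, r_y2, r_12 = 0.50, 0.40, 0.30
    N = 50

    beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12**2)   # formula (11)
    beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12**2)   # formula (12)
    R2 = beta1 * r_y1 + beta2 * r_y2               # formula (16) with k = 2

    F = (R2 / 2) / ((1 - R2) / (N - 3))            # formula (15), df = 2 and N - 3
    se_beta = np.sqrt((1 - R2) / ((N - 3) * (1 - r_12**2)))   # formula (14)
    t1, t2 = beta1 / se_beta, beta2 / se_beta      # formula (17), df = N - k - 1

    print(f"beta1-hat = {beta1:.3f}, beta2-hat = {beta2:.3f}, R^2 = {R2:.3f}")
    print(f"F(2, {N - 3}) = {F:.2f}; t({N - 3}) = {t1:.2f} and {t2:.2f}")

With k = 2, formula (18) reduces to formula (14), since R^2_{j \cdot 1 \dots k} for one predictor given the other is just r_{12}^2.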
Two Predictor Raw Score Parameter Model: Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i
Two Predictor Raw Score Statistical Model: \hat{Y} = a + b_1 X_1 + b_2 X_2

For any variable j, the estimate of Beta (b_j):

b_j = \hat{\beta}_j \frac{s_Y}{s_{X_j}}    (20)

Estimate of the Y-intercept, or regression constant (a):

a = \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2    (21)

Similar to formula (8), the standard error of estimate is:

s_{Y \cdot 12} = s_Y \sqrt{1 - R^2}    (22)

Because of possible differences in variance across variables, each predictor variable has a different standard error of b. For either of the two variables, denoted j:

Se_{b_j} = \frac{s_{Y \cdot 12}}{s_j \sqrt{(N - 1)(1 - r_{12}^2)}}    (23)

Again, one can test whether the overall model significantly improves prediction over the mean, H_0: \beta_1 = \beta_2 = 0, which is tested with the F-statistic in formula (15). Also similar to the standardized model, one may test whether each predictor makes a significant improvement in prediction over the other predictor(s). This is tested with a t-test with (N - k - 1) degrees of freedom, where k equals the number of predictors (in this case k = 2). For any variable j:

t_{(N-k-1)} = \frac{b_j - 0}{Se_{b_j}}    (24)

Partial Correlations are used to statistically "control" for the effects of all other predictors. Partial correlations remove the effect of the control variables from the variables of interest, including the dependent variable. Some researchers use them instead of Beta-hat to interpret variable "importance." With one dependent variable (Y) and two predictors, the general formula is:

r_{Y1 \cdot 2} = \frac{r_{Y1} - r_{Y2} r_{12}}{\sqrt{1 - r_{12}^2}\,\sqrt{1 - r_{Y2}^2}}    (25)

Semi-Partial correlations (sometimes referred to as Part correlations) are an index of the "unique" correlation between variables. Semi-partial correlations remove the effect of a variable from the other predictors but not from the dependent variable. With one dependent variable (Y) and two predictors, the general formula is:

r_{Y(1 \cdot 2)} = \frac{r_{Y1} - r_{Y2} r_{12}}{\sqrt{1 - r_{12}^2}}    (26)

Squared semi-partial correlations are useful because they give the "unique" contribution a variable makes to the R^2 of a multiple regression model. For example, with two predictors R^2 can be decomposed as follows:
R_{Y \cdot 12}^2 = r_{Y2}^2 + r_{Y(1 \cdot 2)}^2, and conversely, R_{Y \cdot 12}^2 = r_{Y1}^2 + r_{Y(2 \cdot 1)}^2.
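Continuing with the same made-up correlations (r_Y1 = .50, r_Y2 = .40, r_12 = .30), the sketch below converts the standardized weights to raw-score b_j and a via formulas (20) and (21) (using invented means and standard deviations), computes the partial and semi-partial correlations via (25) and (26), and verifies that both semi-partial decompositions of R^2 agree.

    import numpy as np

    # Made-up correlations, means, and standard deviations for illustration.
    r_y1, r_y2, r_12 = 0.50, 0.40, 0.30
    mean_y, mean_x1, mean_x2 = 100.0, 20.0, 30.0
    s_y, s_x1, s_x2 = 15.0, 4.0, 6.0

    beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12**2)   # formula (11)
    beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12**2)   # formula (12)

    b1 = beta1 * s_y / s_x1                        # formula (20)
    b2 = beta2 * s_y / s_x2
    a = mean_y - b1 * mean_x1 - b2 * mean_x2       # formula (21)

    # Partial correlation of Y and X1 controlling for X2, formula (25)
    r_y1_2 = (r_y1 - r_y2 * r_12) / (np.sqrt(1 - r_12**2) * np.sqrt(1 - r_y2**2))

    # Semi-partial (part) correlations, formula (26) and its mirror image
    sp1 = (r_y1 - r_y2 * r_12) / np.sqrt(1 - r_12**2)   # r_Y(1.2)
    sp2 = (r_y2 - r_y1 * r_12) / np.sqrt(1 - r_12**2)   # r_Y(2.1)

    # Both decompositions of R^2 should give the same value
    print(f"b1 = {b1:.3f}, b2 = {b2:.3f}, a = {a:.2f}")
    print(f"partial r_Y1.2 = {r_y1_2:.3f}, semi-partial r_Y(1.2) = {sp1:.3f}")
    print(f"R^2 = {r_y2**2 + sp1**2:.4f} = {r_y1**2 + sp2**2:.4f}")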
Source Table for Multiple Regression

Although this process would be laborious, this is the conceptual derivation of the F-ratio in multiple regression.

Source                 Sum of Squares                 df         Mean Squares           F
Regression             \sum (\hat{Y}_i - \bar{Y})^2   k          SS_R / k               MS_R / MS_e
(Explained Variance)
Residual               \sum (Y_i - \hat{Y}_i)^2       N - k - 1  SS_e / df_e
Total Variance         \sum (Y_i - \bar{Y})^2         N - 1      s^2 = SS_T / (N - 1)

where N = total number of cases, k = number of predictors, \bar{Y} = the mean of Y, Y_i = each individual score on Y, and \hat{Y}_i = each individual predicted Y.

Given R^2 = SS_R / SS_T, the same table can be written in terms of R^2:

Source                 Sum of Squares           df         MS                     F
Regression             R^2 SS_T                 k          SS_R / k               (R^2 / k) / [(1 - R^2)/(N - k - 1)]
(Explained Variance)
Residual               (1 - R^2) SS_T           N - k - 1  SS_e / (N - k - 1)
Total Variance         \sum (Y_i - \bar{Y})^2   N - 1      s^2 = SS_T / (N - 1)

One-Way ANOVA Source Table

When we extend least squares regression methodology to a continuous dependent variable Y and categorical independent variables, it is often referred to as the ANalysis Of VAriance (ANOVA). In the ANOVA, the predicted score \hat{Y}_i for each individual in the jth group is equal to their (jth) group mean: \hat{Y}_i = \bar{Y}_j. Knowing this, the previous source tables simplify greatly. For the one-way (one categorical independent variable) ANOVA, the source table is as follows:

Source                 Sum of Squares                       df     Mean Squares         F
Between Groups         \sum n_j (\bar{Y}_j - \bar{Y}_*)^2   J - 1  SS_B / (J - 1)       MS_B / MS_W
(Explained Variance)
Within Groups          \sum (Y_i - \bar{Y}_j)^2             N - J  SS_W / df_W
Total Variance         \sum (Y_i - \bar{Y}_*)^2             N - 1  s^2 = SS_T / (N - 1)

where N = total number of cases, J = number of groups, \bar{Y}_* = the grand mean of Y across all groups, Y_i = each individual score on Y, \bar{Y}_j = the mean for group j, and n_j = the number of cases in group j. Note that R^2 = \eta^2 = SS_B / SS_T.
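As a concrete check on the one-way ANOVA source table, the sketch below builds SS_B, SS_W, SS_T, the F-ratio, and \eta^2 from three invented groups of scores, using the group means as the predicted scores exactly as described above.

    import numpy as np

    # Invented scores for J = 3 groups (n_j = 3 each), for illustration only.
    groups = [np.array([4.0, 5.0, 6.0]),
              np.array([6.0, 7.0, 9.0]),
              np.array([8.0, 9.0, 10.0])]

    all_y = np.concatenate(groups)
    N, J = len(all_y), len(groups)
    grand_mean = all_y.mean()                                # Y-bar-star

    # Between groups: each group mean around the grand mean, weighted by n_j
    ss_b = sum(len(g) * (g.mean() - grand_mean)**2 for g in groups)
    # Within groups: each score around its own group mean (the predicted score)
    ss_w = sum(((g - g.mean())**2).sum() for g in groups)
    ss_t = ((all_y - grand_mean)**2).sum()                   # SS_B + SS_W = SS_T

    ms_b = ss_b / (J - 1)                                    # Mean Squares
    ms_w = ss_w / (N - J)
    F = ms_b / ms_w
    eta2 = ss_b / ss_t                                       # R^2 = eta^2

    print(f"SS_B = {ss_b:.2f}, SS_W = {ss_w:.2f}, SS_T = {ss_t:.2f}")
    print(f"F({J - 1}, {N - J}) = {F:.2f}, eta^2 = {eta2:.3f}")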