13 Standardized Multiple Regression
(we are skipping standardized multiple regression)
Historically, the standardized multiple regression model has generally been used for two reasons:
- To control round-off errors in coefficient estimate calculations (usually most problematic with multicollinearity).
- To directly compare the estimated regression coefficients in common units.
- More recently, standardizing variables has become important for penalized methods like Ridge Regression and LASSO (see Section 14) so that the penalty does not shrink coefficient estimates differently simply because the predictors are measured in different units.
Consider the following example: \[ \hat{Y} = 200 + 20,000 X_1 + 0.2 X_2\] At first glance, it seems like \(X_1\) is the much more important factor. But it’s all about units! Suppose the units are: \[\begin{eqnarray*} Y & &\mbox{in dollars}\\ X_1 && \mbox{in thousand dollars}\\ X_2 & &\mbox{in cents} \end{eqnarray*}\] The effect on the mean response of a $1,000 increase in \(X_1\) (that is, a one-unit increase) when \(X_2\) is held constant is an increase of $20,000. This is exactly the same as the effect of a $1,000 increase in \(X_2\) (i.e., a 100,000-unit increase) when \(X_1\) is held constant.
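As a quick check of the arithmetic, here is a short Python sketch; the fitted equation and the values plugged in are just the hypothetical numbers from the example above.

```python
def y_hat(x1, x2):
    # Hypothetical fitted equation from the example:
    # Y in dollars, X1 in thousands of dollars, X2 in cents.
    return 200 + 20_000 * x1 + 0.2 * x2

# A $1,000 increase in X1 is a one-unit increase (X1 is in thousands of dollars).
effect_x1 = y_hat(6, 500) - y_hat(5, 500)

# A $1,000 increase in X2 is a 100,000-unit increase (X2 is in cents).
effect_x2 = y_hat(5, 500 + 100_000) - y_hat(5, 500)

print(effect_x1, effect_x2)  # 20000.0 20000.0 -- identical effects on the mean response
```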
Consider the following transformations: \[\begin{eqnarray*} Y_i^* &=& \bigg( \frac{Y_i - \overline{Y}}{s_Y} \bigg)\\ X_{ik}^* &=& \bigg( \frac{X_{ik} - \overline{X}_k}{s_k} \bigg) \ \ \ \ \ k=1, \ldots, p-1\\ \mbox{and }&& \\ s_Y &=& \sqrt{\frac{\sum_i (Y_i - \overline{Y})^2}{n-1}}\\ s_k &=& \sqrt{\frac{\sum_i (X_{ik} - \overline{X}_k)^2}{n-1}} \ \ \ \ \ k=1, \ldots, p-1\\ \end{eqnarray*}\]
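A minimal Python sketch of these transformations, using made-up data and the sample standard deviation (the \(n-1\) denominator above):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
# Made-up data: two predictors on very different scales.
X = np.column_stack([rng.normal(10, 3, n), rng.normal(200, 40, n)])
Y = 5 + 2 * X[:, 0] - 0.1 * X[:, 1] + rng.normal(0, 1, n)

# Sample standard deviations (n - 1 in the denominator, matching s_Y and s_k above).
s_Y = Y.std(ddof=1)
s_k = X.std(axis=0, ddof=1)

# Standardized response and predictors.
Y_star = (Y - Y.mean()) / s_Y
X_star = (X - X.mean(axis=0)) / s_k
```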
Using the standardized variables, we get the standardized regression model: \[\begin{eqnarray*} Y_i^* &=& \beta_1^*X_{i1}^* + \cdots + \beta_{p-1}^*X_{i,p-1}^* + \epsilon_i^*\\ \end{eqnarray*}\] There is a direct algebraic connection between the standardized parameters \(\beta_1^*, \ldots, \beta_{p-1}^*\) and the original parameters from the ordinary multiple regression model \(\beta_1, \ldots, \beta_{p-1}\):
\[\begin{eqnarray*} \beta_k &=& \bigg(\frac{s_Y}{s_k} \bigg) \beta_k^* \ \ \ \ \ k=1, \ldots, p-1\\ \beta_0 &=& \overline{Y} - \beta_1\overline{X}_1 - \cdots - \beta_{p-1}\overline{X}_{p-1} \end{eqnarray*}\]
Proof of the relationship between standardized and unstandardized coefficients
\[\begin{eqnarray*} Y_i &=& b_0 + b_1 X_{i1} + b_2 X_{i2} + e_i \\ Y_i - \overline{Y} &=& b_0 + b_1 X_{i1} + b_2 X_{i2} + e_i - \overline{Y} \\ &=& \overline{Y} - b_1 \overline{X}_1 - b_2 \overline{X}_2 + b_1 X_{i1} + b_2 X_{i2} + e_i - \overline{Y} \ \ \ \ \ \mbox{(since } b_0 = \overline{Y} - b_1 \overline{X}_1 - b_2 \overline{X}_2 \mbox{)}\\ &=& b_1 (X_{i1} - \overline{X}_1) + b_2 (X_{i2} - \overline{X}_2) + e_i\\ &=& b_1 s_1 (X_{i1} - \overline{X}_1) / s_1 + b_2 s_2 (X_{i2} - \overline{X}_2)/s_2 + e_i\\ &=& b_1 s_1 X^*_{i1} + b_2 s_2 X^*_{i2} + e_i\\ (Y_i - \overline{Y} )/s_Y &=& b_1 (s_1/s_Y) X^*_{i1} + b_2 (s_2/s_Y) X^*_{i2} + e_i/s_Y\\ Y_i^* &=& b_1^* X_{i1}^* + b_2^* X_{i2}^* + e_i^* \end{eqnarray*}\] Matching terms in the last line shows \(b_k^* = (s_k/s_Y)\, b_k\), or equivalently \(b_k = (s_Y/s_k)\, b_k^*\), as claimed above.
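The relationship can also be checked numerically. The sketch below (simulated data with arbitrarily chosen coefficients) fits the model on the original scale with an intercept and on the standardized scale without one, then confirms \(b_k = (s_Y/s_k)\, b_k^*\) and \(b_0 = \overline{Y} - b_1\overline{X}_1 - b_2\overline{X}_2\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
# Simulated data with arbitrary "true" coefficients, just for illustration.
X = np.column_stack([rng.normal(10, 3, n), rng.normal(200, 40, n)])
Y = 5 + 2 * X[:, 0] - 0.1 * X[:, 1] + rng.normal(0, 1, n)

s_Y = Y.std(ddof=1)
s_k = X.std(axis=0, ddof=1)
Y_star = (Y - Y.mean()) / s_Y
X_star = (X - X.mean(axis=0)) / s_k

# Least squares on the original scale (with an intercept column).
b = np.linalg.lstsq(np.column_stack([np.ones(n), X]), Y, rcond=None)[0]
b0, bk = b[0], b[1:]

# Least squares on the standardized scale (no intercept column).
bk_star = np.linalg.lstsq(X_star, Y_star, rcond=None)[0]

print(np.allclose(bk, (s_Y / s_k) * bk_star))           # True: b_k = (s_Y / s_k) b_k*
print(np.allclose(b0, Y.mean() - bk @ X.mean(axis=0)))  # True: b_0 = Ybar - sum_k b_k Xbar_k
```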
Why is there no intercept in the standardized model?
Note that there is no intercept in the standardized regression model. That is because the variables are all centered at zero, so when we plug in the average \(X\) values (zero), the least squares fit predicts the average \(Y\) value (zero). Recall that to solve for the coefficients, we take the derivative of the sum of squares and set it equal to zero. That process forces the OLS model to go through the point determined by the averages of all the variables (explanatory and response).
\[\begin{eqnarray*} \frac{\partial \sum_i (Y_i^* - b_0^* - b_1^*X_i^*)^2}{\partial b_0^*} &=& 0\\ \sum_i (Y_i^* - b_0^* - b_1^*X_i^*) &=& 0\\ \overline{Y}^* &=& b_0^* + b_1^* \overline{X}^*\\ \end{eqnarray*}\]
Because the standardized variables are all centered at zero, their averages are all zero (by definition). That means that if the model goes through the point of averages \((\overline{X}^*, \overline{Y}^*) = (0, 0)\), it goes through the origin, and so \(b_0^* = \beta_0^* = 0\).
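As a sanity check, the sketch below (again with simulated, made-up data) fits the standardized model with an intercept column included anyway; the estimated intercept comes back as zero up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
# Simulated data, purely for illustration.
X = np.column_stack([rng.normal(10, 3, n), rng.normal(200, 40, n)])
Y = 5 + 2 * X[:, 0] - 0.1 * X[:, 1] + rng.normal(0, 1, n)

X_star = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Y_star = (Y - Y.mean()) / Y.std(ddof=1)

# Fit the standardized model *with* an intercept column anyway.
coef = np.linalg.lstsq(np.column_stack([np.ones(n), X_star]), Y_star, rcond=None)[0]
print(coef[0])  # ~0 (floating-point noise): the fit passes through the origin
```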