Goals

The goals for today are

  • Show that OLS estimates can be rewritten in terms of the response.
  • Show that the expected values of \(\hat{\beta}_0\) and \(\hat{\beta}_1\) are \(\beta_0\) and \(\beta_1\), respectively.
  • Calculate the expected value of the response when \(X = \bar{x}\).

Properties of Least Squares Estimates

  • OLS estimates depend only on the data (recall Anscombe’s quartet): we can obtain the same estimates even when a straight-line model is inappropriate.
  • We now will show how both \(\hat{\beta}_0\) and \(\hat{\beta}_1\) can be written as linear combinations of the \(y_i\).
  • For \(\hat{\beta}_1\), recall \[ \hat{\beta}_1 = \frac{\texttt{SXY}}{\texttt{SXX}} \]

Rewriting SXY and SXX

Show that \[ \texttt{SXX} = \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \]

and \[ \texttt{SXY} = \sum_{i=1}^n (x_i-\bar{x}) y_i \]
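These identities are easy to check numerically. A minimal sketch in Python (the toy data below is illustrative, not from the notes):

```python
import numpy as np

# illustrative toy data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.5, 3.5, 3.0, 5.0])
xbar, ybar = x.mean(), y.mean()

# definitions of SXX and SXY
SXX = np.sum((x - xbar) ** 2)
SXY = np.sum((x - xbar) * (y - ybar))

# the rewritten forms above
SXX_alt = np.sum(x ** 2) - len(x) * xbar ** 2
SXY_alt = np.sum((x - xbar) * y)

print(np.isclose(SXX, SXX_alt), np.isclose(SXY, SXY_alt))  # True True
```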

Rewriting \(\;\hat{\beta}_1\)

We can rewrite \(\hat{\beta}_1\) in terms of the \(y_i\) as \[ \begin{align*} \hat{\beta}_1 &= \frac{\texttt{SXY}}{\texttt{SXX}}\\ &= \sum c_i y_i, \end{align*} \] where \(c_i = \frac{x_i - \bar{x}}{\texttt{SXX}}\). How?
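One way to see this is to substitute the rewritten form of SXY from the previous slide and pull the constant \(\texttt{SXX}\) inside the sum:

\[ \hat{\beta}_1 = \frac{\texttt{SXY}}{\texttt{SXX}} = \frac{\sum_{i=1}^n (x_i - \bar{x})\, y_i}{\texttt{SXX}} = \sum_{i=1}^n \frac{x_i - \bar{x}}{\texttt{SXX}}\, y_i = \sum c_i y_i. \]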

Rewriting \(\;\hat{\beta}_0\)

Similarly, we can rewrite \(\hat{\beta}_0\) in terms of the \(y_i\) as \[ \begin{align*} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \\ &= \sum d_i y_i, \end{align*} \] where \(d_i = \left( \frac{1}{n} - \bar{x}\, c_i \right)\).

Class Activity: Show the above.

Expected value of \(\; \hat{\beta}\)s

Given the previous information, we will show \[ \text{E}\left( \hat{\beta}_1 | X \right) = \beta_1 \]

and \[ \text{E}\left( \hat{\beta}_0 | X \right) = \beta_0. \]
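For the first expression, two facts about the \(c_i\) do the work: \(\sum c_i = 0\) (since \(\sum (x_i - \bar{x}) = 0\)) and \(\sum c_i x_i = 1\) (since \(\sum (x_i - \bar{x}) x_i = \texttt{SXX}\)). Then, using \(\text{E}(y_i \mid X) = \beta_0 + \beta_1 x_i\),

\[ \begin{align*} \text{E}\left( \hat{\beta}_1 \mid X \right) &= \text{E}\left( \sum c_i y_i \,\Big|\, X \right) = \sum c_i \left( \beta_0 + \beta_1 x_i \right) \\ &= \beta_0 \sum c_i + \beta_1 \sum c_i x_i = \beta_1. \end{align*} \]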

Class Activity: Show the second expression is true.

Expected value at \(X = \bar{x}\)

Since \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\), the fitted value at \(X = \bar{x}\) is \[ \begin{align*} \widehat{\text{E}}\left( Y \mid X = \bar{x} \right) &= \hat{\beta}_0 + \hat{\beta}_1 \bar{x} \\ &= \bar{y} - \hat{\beta}_1 \bar{x} + \hat{\beta}_1 \bar{x}\\ & = \bar{y}, \end{align*} \] meaning that the OLS line passes through the point of sample means \((\bar{x}, \bar{y})\)!
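This fact can be verified numerically; a minimal sketch using `np.polyfit` (which returns the slope and intercept of the least-squares line; the toy data is illustrative):

```python
import numpy as np

# illustrative toy data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.5, 3.5, 3.0, 5.0])

# degree-1 least-squares fit; coefficients come back highest degree first
b1, b0 = np.polyfit(x, y, 1)

# the fitted value at x-bar equals y-bar
print(np.isclose(b0 + b1 * x.mean(), y.mean()))  # True
```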

Variance of \(\;\hat{\beta}\)s

Recall that the variance for random variable \(u_i\) is given by \[ \text{Var}(u_i) = \text{E}\left[ (u_i - \text{E}(u_i))^2\right], \]

and, for a constant \(a_0\) and uncorrelated variables \(u_i\), \[ \begin{align*} \text{Var}\left(a_0 + \sum a_i u_i \right) &= \sum \text{Var}\left(a_i u_i \right)\\ &= \sum \text{E}\left[\left(a_i u_i - \text{E}(a_i u_i) \right)^2\right] \\ &= \sum \text{E} \left[ a_i^2 (u_i - \text{E}(u_i))^2 \right]\\ &= \sum a_i^2 \text{Var}(u_i). \end{align*} \]

Then \[ \text{Var}\left( \hat{\beta}_1 | X \right) = \sigma^2 \frac{1}{\texttt{SXX}} \] and \[ \text{Var}\left( \hat{\beta}_0 | X \right) = \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{\texttt{SXX}}\right) \]
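The first result follows directly from the linear-combination form: since \(\hat{\beta}_1 = \sum c_i y_i\) with the \(y_i\) uncorrelated and \(\text{Var}(y_i \mid X) = \sigma^2\),

\[ \begin{align*} \text{Var}\left( \hat{\beta}_1 \mid X \right) &= \sum c_i^2 \,\text{Var}(y_i \mid X) = \sigma^2 \sum \frac{(x_i - \bar{x})^2}{\texttt{SXX}^2} \\ &= \sigma^2 \frac{\texttt{SXX}}{\texttt{SXX}^2} = \frac{\sigma^2}{\texttt{SXX}}. \end{align*} \]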