Last Time in Math 106

We discussed

  • Properties of variance
  • Variance of the \(\hat{\beta}\)s
  • Estimated variance

Estimated Variance

Since we will not necessarily know the population variance \(\sigma^2\), we substitute our estimate \(\hat{\sigma}^2\), yielding

\[ \widehat{\text{Var}} \left( \hat{\beta}_1 | X \right) = \hat{\sigma}^2 \frac{1}{\texttt{SXX}}, \qquad \widehat{\text{Var}} \left( \hat{\beta}_0 | X \right) = \hat{\sigma}^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{\texttt{SXX}}\right) \]

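and the standard error, se, is then the square root of the estimated variance, \[ \text{se}\left( \hat{\beta}_0 | X \right) = \sqrt{ \widehat{\text{Var}} \left( \hat{\beta}_0 | X \right) }, \] with \(\text{se}\left( \hat{\beta}_1 | X \right)\) defined analogously.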
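The formulas above can be checked numerically. The following is a minimal Python sketch, not course code: the data are made up, the helper name `ols_summary` is hypothetical, and it assumes the usual estimate \(\hat{\sigma}^2 = \text{RSS}/(n-2)\) for simple linear regression.

```python
import math

def ols_summary(x, y):
    """Fit y = b0 + b1*x by OLS and return estimates with their
    estimated variances and standard errors (hypothetical helper)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # SXX and SXY: corrected sums of squares and cross products
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # Assumption: sigma^2 is estimated by RSS / (n - 2)
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)
    var_b1 = sigma2_hat / sxx
    var_b0 = sigma2_hat * (1 / n + xbar ** 2 / sxx)
    return {
        "b0": b0, "b1": b1, "sigma2_hat": sigma2_hat,
        "var_b0": var_b0, "var_b1": var_b1,
        "se_b0": math.sqrt(var_b0), "se_b1": math.sqrt(var_b1),
    }

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = ols_summary(x, y)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

Because \(\bar{x}^2/\texttt{SXX}\) enters only the intercept formula, the intercept's standard error grows when the data lie far from \(x = 0\), while the slope's does not.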

Confidence Intervals and \(t\)-Tests

Q: How do you create a \((1-\alpha) \times 100 \%\) confidence interval?

We use the \(t\)-distribution with \(df = n-2\) degrees of freedom, so the critical value is \(t (\alpha/2 , df ) = t(\alpha/2, n-2)\), the value that cuts off probability \(\alpha/2\) in the upper tail. Hence, we have \[ \hat{\beta}_0 - t\left(\tfrac{\alpha}{2}, n-2\right) \text{se}\left(\hat{\beta}_0 | X \right) \le \beta_0 \le \hat{\beta}_0 + t\left(\tfrac{\alpha}{2}, n-2\right) \text{se}\left(\hat{\beta}_0 | X \right). \]

This will be the same for the slope!
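As a rough illustration of the interval above, here is a short Python sketch. The function name `confidence_interval` and the numbers are made up; the critical value is passed in by hand (3.182 is the tabulated upper 2.5% point of the \(t\)-distribution with 3 degrees of freedom), although it could instead be computed with `scipy.stats.t.ppf(1 - alpha / 2, df)` if SciPy is available.

```python
def confidence_interval(estimate, se, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * se (hypothetical helper)."""
    margin = t_crit * se
    return estimate - margin, estimate + margin

# Made-up values for illustration: n = 5 observations, so df = n - 2 = 3
beta0_hat = 2.2
se_beta0 = 0.938
t_crit = 3.182  # t(0.025, 3) from a t-table, i.e. a 95% interval

lower, upper = confidence_interval(beta0_hat, se_beta0, t_crit)
print(f"95% CI for beta_0: ({lower:.3f}, {upper:.3f})")
```

The same function applies to the slope: pass \(\hat{\beta}_1\) and its standard error instead.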

Hypothesis Testing

For the hypothesis test,

\[ \begin{align*} H_0: \quad \beta_0 &= \beta_0^*,\; \beta_1 \text{ arbitrary}\\ H_a: \quad \beta_0 &\neq \beta_0^*,\; \beta_1 \text{ arbitrary}, \end{align*} \]

the \(t\)-statistic is calculated much as before:

\[ t = \frac{\hat{\beta}_0 - \beta_0^*}{\text{se}\left(\hat{\beta}_0 | X \right)}. \]
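A small Python sketch of this calculation follows. It is only illustrative: the function name `t_statistic` and the numbers are made up, and the decision rule shown (reject \(H_0\) when \(|t|\) exceeds the critical value \(t(\alpha/2, n-2)\)) is the standard two-sided rule, assumed here rather than taken from the slides.

```python
def t_statistic(estimate, null_value, se):
    """t = (estimate - hypothesized value) / standard error (hypothetical helper)."""
    return (estimate - null_value) / se

# Made-up values for illustration: n = 5, so df = n - 2 = 3
beta0_hat = 2.2
beta0_null = 0.0   # the hypothesized value beta_0^*
se_beta0 = 0.938
t_crit = 3.182     # t(0.025, 3) from a t-table, for a test at alpha = 0.05

t = t_statistic(beta0_hat, beta0_null, se_beta0)
# Assumed two-sided rule: reject H0 when |t| exceeds the critical value
reject = abs(t) > t_crit
print(f"t = {t:.3f}, reject H0: {reject}")
```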

Q: How would you set up (and interpret) the hypothesis for the slope?

Prediction

Q: When can you use your OLS model to predict future data?

Assume we have new data (not seen in constructing the model) \((x_*, y_*)\). Then, a point prediction (with our model) for \(y_*\) would be

\[ \tilde{y}_* = \hat{\beta}_0 + \hat{\beta}_1 x_* \]

\(\tilde{y}_*\) predicts the as yet unobserved \({y}_*\). Assuming the model is correct, the true value of \({y}_*\) is

\[ {y}_* = \beta_0 + \beta_1 x_* + e_*, \]

where \(e_*\) is the random error associated with \({y}_*\).

Prediction and Confidence Intervals

For prediction intervals, the standard error is given by \[ \text{sepred}(y_* | x_*) = \hat{\sigma} \left( 1 + \frac{1}{n} + \frac{ (x_* - \bar{x} )^2}{\texttt{SXX}} \right)^{1/2}. \]

For confidence intervals, which describe \(E(Y|X = x_*)\), the standard error is \[ \text{sefit}(y_* | x_*) = \hat{\sigma} \left(\frac{1}{n} + \frac{ (x_* - \bar{x} )^2}{\texttt{SXX}} \right)^{1/2}. \]

Q: What do you think accounts for this difference?
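To compare the two standard errors numerically, here is a minimal Python sketch. The function name `se_fit_and_pred` and all input values are made up for illustration, and `sigma_hat` stands for an estimate of \(\sigma\) obtained from a fitted model.

```python
import math

def se_fit_and_pred(sigma_hat, n, xbar, sxx, x_star):
    """Return (sefit, sepred) at a new point x_star (hypothetical helper)."""
    # Term shared by both formulas: 1/n + (x_* - xbar)^2 / SXX
    shared = 1 / n + (x_star - xbar) ** 2 / sxx
    sefit = sigma_hat * math.sqrt(shared)
    sepred = sigma_hat * math.sqrt(1 + shared)
    return sefit, sepred

# Made-up summary values for illustration
sigma_hat = math.sqrt(0.8)
n, xbar, sxx = 5, 3.0, 10.0

for x_star in (3.0, 6.0):
    sefit, sepred = se_fit_and_pred(sigma_hat, n, xbar, sxx, x_star)
    print(f"x* = {x_star}: sefit = {sefit:.3f}, sepred = {sepred:.3f}")
```

Trying several values of `x_star` shows how both quantities change as the new point moves away from \(\bar{x}\).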

Ch. 2 Reading

Article on confidence and prediction intervals

Read Sections 2.7 and 2.8 and come with questions to discuss on Friday.

Reminder: Quiz on Friday!