MATH 106 - Applied Linear Statistical Models

Last Time in Math 106

We discussed

Properties of variance
Variance of the \(\hat{\beta}\)s
Estimated variance

Estimated Variance

Since we will not necessarily know the population variance \(\sigma^2\), we substitute our estimate \(\hat{\sigma}^2\), yielding

\[ \widehat{\text{Var}} \left( \hat{\beta}_1 | X \right) = \hat{\sigma}^2 \frac{1}{\texttt{SXX}}, \qquad \widehat{\text{Var}} \left( \hat{\beta}_0 | X \right) = \hat{\sigma}^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{\texttt{SXX}}\right) \]

and the standard error, se, is the square root of the estimated variance. For the intercept, \[ \text{se}\left( \hat{\beta}_0 | X \right) = \sqrt{ \widehat{\text{Var}} \left( \hat{\beta}_0 | X \right) }, \] and likewise for \(\hat{\beta}_1\).
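As a rough numerical check of these formulas, here is a minimal sketch in plain Python. The five data points are invented for illustration, and the sketch assumes the usual estimate \(\hat{\sigma}^2 = \text{RSS}/(n-2)\), which these slides do not restate.

```python
import math

# Toy data, invented for illustration only (not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# OLS estimates of slope and intercept.
beta1_hat = sxy / sxx
beta0_hat = y_bar - beta1_hat * x_bar

# Assumption: sigma^2 is estimated by RSS / (n - 2).
rss = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
sigma2_hat = rss / (n - 2)

# Estimated variances, following the two formulas above.
var_beta1 = sigma2_hat / sxx
var_beta0 = sigma2_hat * (1 / n + x_bar ** 2 / sxx)

# Standard errors are the square roots of the estimated variances.
se_beta1 = math.sqrt(var_beta1)
se_beta0 = math.sqrt(var_beta0)

print(f"beta0_hat = {beta0_hat:.4f}, se = {se_beta0:.4f}")
print(f"beta1_hat = {beta1_hat:.4f}, se = {se_beta1:.4f}")
```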

Confidence Intervals and \(t\)-Tests

Q: How do you create a \((1-\alpha) \times 100 \%\) confidence interval?

Using the \(t\)-distribution with \(df = n-2\) degrees of freedom, the critical value is \(t (\alpha/2 , df ) = t(\alpha/2, n-2)\), the value that cuts off probability \(\alpha/2\) in the upper tail. Hence, we have \[ \hat{\beta}_0 - t\left(\tfrac{\alpha}{2}, n-2\right) \text{se}\left(\hat{\beta}_0 | X \right) \le \beta_0 \le \hat{\beta}_0 + t\left(\tfrac{\alpha}{2}, n-2\right) \text{se}\left(\hat{\beta}_0 | X \right) \]

This will be the same for the slope!
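The interval above can be sketched in a few lines of Python. The helper names `fit_simple_ols` and `confidence_interval` are hypothetical, the data are invented, and the critical value \(t(0.025, 3) \approx 3.182\) is copied from a standard \(t\)-table rather than computed.

```python
import math

def fit_simple_ols(x, y):
    """Estimates and standard errors for y = b0 + b1 * x + e (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)  # assumption: usual RSS / (n - 2) estimate
    se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / sxx))
    se_b1 = math.sqrt(sigma2_hat / sxx)
    return {"b0": b0, "b1": b1, "se_b0": se_b0, "se_b1": se_b1, "df": n - 2}

def confidence_interval(estimate, se, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * se."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = fit_simple_ols(x, y)

# Assumption: alpha = 0.05 and df = n - 2 = 3, so t(0.025, 3) is about 3.182 (table value).
T_CRIT = 3.182

ci_b0 = confidence_interval(fit["b0"], fit["se_b0"], T_CRIT)
ci_b1 = confidence_interval(fit["b1"], fit["se_b1"], T_CRIT)
print("95% CI for intercept:", ci_b0)
print("95% CI for slope:    ", ci_b1)
```

With a different sample size or confidence level, `T_CRIT` would need to be looked up again for the new \(\alpha/2\) and \(n-2\).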

Hypothesis Testing

For the hypothesis test,

\[ \begin{align*} H_0: \quad \beta_0 &= \beta_0^*,\; \beta_1 \text{ arbitrary}\\ H_a: \quad \beta_0 &\neq \beta_0^*,\; \beta_1 \text{ arbitrary}, \end{align*} \]

the \(t\)-statistic is calculated much as before,

\[ t = \frac{\hat{\beta}_0 - \beta_0^*}{\text{se}\left(\hat{\beta}_0 | X \right)}. \]
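A small sketch of this test for the intercept, in plain Python. The function name `intercept_t_statistic`, the toy data, and the null value \(\beta_0^* = 0\) are illustrative assumptions; the two-sided critical value 3.182 is the table value of \(t(0.025, 3)\).

```python
import math

def intercept_t_statistic(x, y, beta0_null):
    """Return (t, df) for H0: beta0 = beta0_null in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)  # assumption: usual RSS / (n - 2) estimate
    se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / sxx))
    return (b0 - beta0_null) / se_b0, n - 2

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t_stat, df = intercept_t_statistic(x, y, beta0_null=0.0)

# Two-sided test at alpha = 0.05: reject H0 when |t| exceeds t(0.025, df).
T_CRIT = 3.182  # table value for df = 3
reject_h0 = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.4f} on {df} df, reject H0: {reject_h0}")
```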

Q: How would you set up (and interpret) the hypothesis for the slope?

Prediction

Q: When can you use your OLS model to predict future data?

Assume we have new data (not seen in constructing the model) \((x_*, y_*)\). Then, a point prediction (with our model) for \(y_*\) would be

\[ \tilde{y}_* = \hat{\beta}_0 + \hat{\beta}_1 x_* \]

\(\tilde{y}_*\) predicts the as yet unobserved \({y}_*\). Assuming the model is correct, the true value of \({y}_*\) is

\[ {y}_* = \beta_0 + \beta_1 x_* + e_*, \]

where \(e_*\) is the random error associated with \({y}_*\).
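To make the point prediction concrete, here is a minimal sketch in plain Python. The data and the new value \(x_* = 3.5\) are invented, and `predict_point` is a hypothetical helper name.

```python
def predict_point(x, y, x_new):
    """Fit simple OLS to (x, y) and return the point prediction at x_new."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # The prediction uses only the fitted coefficients; the error e_* is unobserved.
    return b0 + b1 * x_new

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_star = 3.5  # assumed new predictor value, inside the observed range of x
y_tilde = predict_point(x, y, x_star)
print(f"point prediction at x* = {x_star}: {y_tilde:.4f}")
```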

Prediction and Confidence Intervals

For prediction intervals, the standard error is given by \[ \text{sepred}(y_* | x_*) = \hat{\sigma} \left( 1 + \frac{1}{n} + \frac{ (x_* - \bar{x} )^2}{\texttt{SXX}} \right)^{1/2}. \]

For confidence intervals, which describe \(E(Y|X = x_*)\), the standard error is \[ \text{sefit}(y_* | x_*) = \hat{\sigma} \left(\frac{1}{n} + \frac{ (x_* - \bar{x} )^2}{\texttt{SXX}} \right)^{1/2}. \]

Q: What do you think accounts for this difference?
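The two standard errors can be compared numerically with a short Python sketch. It uses the estimate \(\hat{\sigma}\) for the error standard deviation (assumed to be \(\sqrt{\text{RSS}/(n-2)}\)), invented data, an assumed new value \(x_* = 3.5\), and the table value \(t(0.025, 3) \approx 3.182\); the function name `interval_standard_errors` is hypothetical.

```python
import math

def interval_standard_errors(x, y, x_new):
    """Return (prediction, sefit, sepred, sigma_hat) at x_new for simple OLS."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))  # assumption: usual RSS / (n - 2) estimate
    leverage = 1 / n + (x_new - x_bar) ** 2 / sxx
    sefit = sigma_hat * math.sqrt(leverage)       # for E(Y | X = x_new)
    sepred = sigma_hat * math.sqrt(1 + leverage)  # for a new observation y_*
    return b0 + b1 * x_new, sefit, sepred, sigma_hat

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

y_tilde, sefit, sepred, sigma_hat = interval_standard_errors(x, y, x_new=3.5)

T_CRIT = 3.182  # table value of t(0.025, 3), i.e. alpha = 0.05 and df = n - 2 = 3
conf_interval = (y_tilde - T_CRIT * sefit, y_tilde + T_CRIT * sefit)
pred_interval = (y_tilde - T_CRIT * sepred, y_tilde + T_CRIT * sepred)

print(f"sefit  = {sefit:.4f}, 95% confidence interval: {conf_interval}")
print(f"sepred = {sepred:.4f}, 95% prediction interval: {pred_interval}")
```

On this toy data the prediction interval comes out noticeably wider than the confidence interval at the same \(x_*\), which is worth keeping in mind when thinking about the question above.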

Ch. 2 Reading

Article on confidence and prediction intervals

Read Sections 2.7 and 2.8 and come with questions to discuss on Friday.

Reminder: Quiz on Friday!