MATH 106 - Applied Linear Statistical Models

04 - Ordinary Least Squares & Estimating Variance

Mario Banuelos

Simple Linear Regression

  • The simple linear regression model consists of the mean and variance functions, $E(Y \mid X = x) = \beta_0 + \beta_1 x$ and $\operatorname{Var}(Y \mid X = x) = \sigma^2$.
  • Parameters are unknown quantities that characterize a model.
  • Estimates of parameters are computable functions of data and are therefore statistics.

Least Squares Estimates

Recall that the least squares estimates, which minimize the RSS, are
$$\hat{\beta}_1 = \frac{S_{XY}}{S_{XX}} = r_{xy}\,\frac{SD_y}{SD_x} = r_{xy}\left(\frac{S_{YY}}{S_{XX}}\right)^{1/2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.$$
In R, we will use lm( response ~ predictor ) to estimate $\beta_0$ and $\beta_1$.
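As a sketch (using simulated data, not a dataset from the course), the formulas above can be computed by hand and checked against lm():

```r
# Simulated example: least squares estimates by hand vs. lm()
set.seed(1)
x <- runif(20, 0, 10)
y <- 2 + 3 * x + rnorm(20)

sxx <- sum((x - mean(x))^2)                 # S_XX
sxy <- sum((x - mean(x)) * (y - mean(y)))   # S_XY

b1 <- sxy / sxx                # slope: S_XY / S_XX
b0 <- mean(y) - b1 * mean(x)   # intercept: ybar - b1 * xbar

fit <- lm(y ~ x)
coef(fit)   # agrees with c(b0, b1)
```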

Estimating the Variance

  • $\sigma^2$ is approximately the average squared size of the residuals $e_i$.
  • To estimate the variance σ2, we divide the RSS by the degrees of freedom, df (number of cases/observations minus number of parameters).
  • For simple linear regression, we have $df = n - 2$, and $\hat{\sigma}^2 = \frac{RSS}{n - 2}$, known as the residual mean square.
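A quick sketch with simulated data (not a course dataset): computing $RSS/(n-2)$ directly and comparing its square root with the residual standard error that lm() reports.

```r
# Simulated example: estimate sigma^2 as RSS / (n - 2)
set.seed(2)
x <- runif(30, 0, 10)
y <- 1 + 0.5 * x + rnorm(30, sd = 2)

fit <- lm(y ~ x)
rss <- sum(resid(fit)^2)                 # residual sum of squares
sigma2_hat <- rss / (length(y) - 2)      # residual mean square, df = n - 2

sqrt(sigma2_hat)   # matches summary(fit)$sigma
```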

The Standard Error of Regression

If we assume $Y = \beta_0 + \beta_1 X + \epsilon$, where the error term $\epsilon$ has mean zero, then the regression standard error is $\sigma$, and it has the same units as the response variable.

Q: What part of the output of lm tells you this information?

Variance (continued)

If we assume the $e_i$ are drawn from a normal distribution, then $\hat{\sigma}^2 \sim \frac{\sigma^2}{n - 2}\,\chi^2(n - 2)$.

Q: What is the mean of a χ2 random variable with n degrees of freedom?

Then, $E(\hat{\sigma}^2 \mid X) = \frac{\sigma^2}{n - 2}\,E\left[\chi^2(n - 2)\right] = \frac{\sigma^2}{n - 2}(n - 2) = \sigma^2.$
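The unbiasedness result above can be checked by simulation (hypothetical parameters, chosen only for illustration): repeatedly generating data with a known $\sigma$ and averaging $RSS/(n-2)$.

```r
# Simulation sketch: E(sigma2_hat) = sigma^2 (here sigma^2 = 4)
set.seed(3)
n <- 25
sigma <- 2
x <- runif(n, 0, 10)

sigma2_hats <- replicate(5000, {
  y <- 1 + 0.5 * x + rnorm(n, sd = sigma)
  fit <- lm(y ~ x)
  sum(resid(fit)^2) / (n - 2)   # residual mean square
})

mean(sigma2_hats)   # close to sigma^2 = 4
```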