Instructor: Prof. Mario Banuelos
Lecture: MWF 1:00 – 1:50 pm, Science I, Rm 242
Office Hours: MW 2:30 – 4:00 pm, Tu 4:30 – 5:30 pm, and by appointment (PB 337)
Students: Introduction
A linear model is an equation that attempts to describe the response variable as a linear function of the predictor variable, using the following form:
Example 1: \[ Y \approx mX + b, \] where \(m\) is the slope and \(b\) is the intercept.
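To make the "approximately equal" in Example 1 concrete, here is a minimal sketch that evaluates \(mX + b\) and compares it with observed responses. The slope, intercept, and observations are hypothetical values chosen only for illustration; they do not come from the course data.

```python
def linear_model(x, m, b):
    """Return the value m*x + b predicted by a linear model."""
    return m * x + b


# Hypothetical slope and intercept, chosen only for illustration.
m, b = 2.0, 1.0

# Hypothetical (x, y) observations; each y is close to, but not exactly, m*x + b.
observations = [(0.0, 1.5), (1.0, 2.5), (2.0, 5.5)]

# The gap between each observed y and the model's prediction is why the
# model is written with "approximately equal" rather than "equal".
errors = [y - linear_model(x, m, b) for x, y in observations]

for (x, y), e in zip(observations, errors):
    print(f"x={x}, observed y={y}, predicted={linear_model(x, m, b)}, error={e}")
```

None of the errors is zero, yet the line still captures the overall trend in these made-up points.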
Linear models help us understand the world around us:
Linear regression is the backbone of data science and many more advanced machine learning methods.
This class: Model real-world phenomena using linear regression and analyze the resulting models.
Textbooks: Applied Linear Regression (ALR4), 4th Edition, S. Weisberg, 2014.
An Introduction to Statistical Learning with Applications in R (ISLR), G. James, D. Witten, T. Hastie, and R. Tibshirani, 2013.
Both books are free and available online.
Quizzes: There will be a total of five quizzes. The lowest quiz will be dropped.
Midterm Exams: There will be two exams,
Final Project Presentation: Monday, May 11, from 1:15 – 3:15 pm in Science I, Rm 242. The final project is mandatory; more details will be provided later.
Your grade will be based on the following
In this course, you are expected to
You should also be very familiar with the following:
HW1 will be assigned today after class. HW0 (already assigned) is an online survey graded credit/no credit.
Heights data
In small groups, consider the following:
Forbes data
Q:
Given paired observations of one predictor \(X\) and one response \(Y\), a scatterplot visually represents potential relationships between \(X\) and \(Y\).
Our interest centers on how the distribution of \(Y\) changes as \(X\) is varied, which is described by the mean function,
\[ \text{E}(Y \;|\; X = x) = \text{ a function that depends on the value of }x \]
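As a rough illustration of what the mean function describes, the sketch below estimates \(\text{E}(Y \;|\; X = x)\) by averaging the responses observed at each distinct value of \(x\). The function name `empirical_mean_function` and the data are hypothetical, and this grouping approach assumes several observations share the same \(x\) value.

```python
from collections import defaultdict
from statistics import mean


def empirical_mean_function(xs, ys):
    """Estimate E(Y | X = x) by averaging the y values seen at each distinct x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return {x: mean(group) for x, group in sorted(groups.items())}


# Hypothetical predictor and response values, made up for illustration.
xs = [60, 60, 62, 62, 62, 64]
ys = [61.0, 63.0, 62.0, 64.0, 66.0, 65.0]

mean_by_x = empirical_mean_function(xs, ys)
for x, y_bar in mean_by_x.items():
    print(f"average response at x={x}: {y_bar}")
```

Plotting these group averages against \(x\) gives a data-based picture of how the center of the response distribution moves as the predictor changes.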
In the case of the mother and daughter height, we have
\[ \text{E}( \texttt{dheight}\; |\; \texttt{ mheight}=x)=\beta_0 + \beta_1 x \]
This particular mean function has 2 parameters, an intercept \(\beta_0\) and a slope \(\beta_1\). If we knew the values of the \(\beta\)s, then the mean function would be completely specified, but usually the \(\beta\)s need to be estimated from data.
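One common way to estimate the \(\beta\)s from data is ordinary least squares. The notes have not introduced an estimation method at this point, so treat the following as a sketch under that assumption. The helper name `fit_simple_linear_regression` and the data are hypothetical.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return least-squares estimates (beta0_hat, beta1_hat) for E(Y | X = x) = beta0 + beta1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1_hat = sxy / sxx                  # estimated slope
    beta0_hat = y_bar - beta1_hat * x_bar  # estimated intercept
    return beta0_hat, beta1_hat


# Hypothetical noise-free data generated from y = 1 + 2x, so the fit
# should recover an intercept of 1 and a slope of 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

beta0_hat, beta1_hat = fit_simple_linear_regression(xs, ys)
print(f"intercept estimate: {beta0_hat}, slope estimate: {beta1_hat}")
```

With real data the responses scatter around the line, so the estimates only approximate the unknown \(\beta\)s.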
Another part of the distribution of \(Y\) is described by the variance function,
\[ \text{Var}(Y \;|\; X = x) \]
A frequent assumption in fitting linear regression models is that the variance function is the same for every value of \(x\):
\[ \text{Var}(Y \;|\; X = x) = \sigma^2 \]
This assumption is usually made for convenience, but we will discuss general variance models in Ch. 7.
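A quick informal check of the constant-variance assumption is to compute the sample variance of the responses at each distinct \(x\) and see whether the values are roughly equal. The sketch below does this with hypothetical data in which the spread grows with \(x\). The name `empirical_variance_function` is illustrative, and the approach assumes repeated observations at each \(x\).

```python
from collections import defaultdict
from statistics import variance


def empirical_variance_function(xs, ys):
    """Estimate Var(Y | X = x) by the sample variance of y at each distinct x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    # The sample variance needs at least two observations, so skip smaller groups.
    return {x: variance(g) for x, g in sorted(groups.items()) if len(g) >= 2}


# Hypothetical data, made up so that the spread of y is larger at x = 2.
xs = [1, 1, 1, 2, 2, 2]
ys = [9.0, 10.0, 11.0, 18.0, 20.0, 22.0]

var_by_x = empirical_variance_function(xs, ys)
for x, v in var_by_x.items():
    print(f"sample variance of the response at x={x}: {v}")
```

Here the two variances differ by a factor of four, which would cast doubt on a single \(\sigma^2\) for every \(x\) if the pattern held in a larger sample.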