Multiple Regression
1. **Problem Statement:** Fit a multiple regression model relating the number of games won ($y$) to passing yardage ($x_2$), percent rushing plays ($x_7$), and opponents' rushing yards ($x_8$).
2. **Model Form:** The multiple linear regression model is:
$$y = \beta_0 + \beta_2 x_2 + \beta_7 x_7 + \beta_8 x_8 + \epsilon$$
where $\beta_0$ is the intercept, $\beta_2$, $\beta_7$, and $\beta_8$ are the regression coefficients, and $\epsilon$ is the error term, assumed to have mean zero and constant variance $\sigma^2$.
3. **Fitting the Model:** Use least squares estimation to find $\hat{\beta}_0, \hat{\beta}_2, \hat{\beta}_7, \hat{\beta}_8$ by minimizing the sum of squared residuals:
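$$\min_{\hat{\beta}_0, \hat{\beta}_2, \hat{\beta}_7, \hat{\beta}_8} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_2 x_{2i} + \hat{\beta}_7 x_{7i} + \hat{\beta}_8 x_{8i}$ and $n$ is the number of observations.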
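4. **Standard Errors:** Obtain the standard errors of the coefficients from the variance-covariance matrix:
$$\text{Var}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$$
where $X$ is the design matrix, including a column of ones for the intercept. Because $\sigma^2$ is unknown, estimate it by $\hat{\sigma}^2 = \text{Residual SS} / (n-p-1)$ with $p=3$ predictors; $SE(\hat{\beta}_j)$ is then the square root of the corresponding diagonal element of $\hat{\sigma}^2 (X^T X)^{-1}$.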
5. **Prediction:** For $x_2=2000$, $x_7=60$, $x_8=1800$, predict games won:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_2 \times 2000 + \hat{\beta}_7 \times 60 + \hat{\beta}_8 \times 1800$$
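6. **Significance Test:** Use the $F$-test for overall significance of the regression, testing $H_0: \beta_2 = \beta_7 = \beta_8 = 0$ against the alternative that at least one coefficient is nonzero, at $\alpha=0.05$:
$$F = \frac{\text{Regression SS} / p}{\text{Residual SS} / (n-p-1)}$$
where $p=3$ is the number of predictors and $n$ is the number of observations. Reject $H_0$ if $F > F_{\alpha, p, n-p-1}$.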
7. **t-tests for Coefficients:** For each $\beta_j$, test $H_0: \beta_j=0$ vs $H_a: \beta_j \neq 0$ using:
$$t = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$$
Reject $H_0$ at $\alpha=0.05$ if $|t| > t_{\alpha/2, n-p-1}$.
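8. **Regression Sum of Squares Increase by $x_8$:** Fit the model with and without $x_8$ and take the difference in regression sums of squares as the contribution of $x_8$ given that $x_2$ and $x_7$ are already in the model:
$$\text{Regression SS}(x_8 \mid x_2, x_7) = \text{Regression SS}(x_2, x_7, x_8) - \text{Regression SS}(x_2, x_7)$$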
9. **95% Confidence Interval for $\beta_8$:**
$$\hat{\beta}_8 \pm t_{\alpha/2, n-p-1} \times SE(\hat{\beta}_8)$$
where $\alpha = 0.05$ and $n-p-1$ is the residual degrees of freedom.
10. **Proportion of Variability Explained:** Calculate $R^2$:
$$R^2 = 1 - \frac{\text{Residual SS}}{\text{Total SS}}$$
11. **Normal Probability Plot of Residuals:** Plot the ordered residuals against the corresponding normal quantiles; points falling close to a straight line support the normality assumption.
12. **Influential Observations:** Use the leverage values (the diagonal elements $h_{ii}$ of the hat matrix $H = X (X^T X)^{-1} X^T$) and Cook's distance to identify influential points.
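
Steps 3 through 10 can be sketched in code. The block below is a minimal sketch in Python with NumPy, run on synthetic placeholder data because the actual data set is not reproduced here. The sample size `n = 30`, the generating coefficients, the helper `fit_ols`, and all variable names are assumptions made for illustration, so the printed numbers are not answers to the problem.

```python
import numpy as np

# Synthetic placeholder data (hypothetical values, NOT the real data set).
rng = np.random.default_rng(0)
n = 30                                 # arbitrary sample size (assumption)
x2 = rng.uniform(1400, 2900, n)        # passing yardage
x7 = rng.uniform(45, 75, n)            # percent rushing plays
x8 = rng.uniform(1400, 2600, n)        # opponents' rushing yards
y = -2.0 + 0.004 * x2 + 0.2 * x7 - 0.005 * x8 + rng.normal(0, 1.5, n)


def fit_ols(X, y):
    """Least squares fit; X must already contain the column of ones."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)                  # Residual SS
    ss_tot = float(((y - y.mean()) ** 2).sum())    # Total SS
    return beta, ss_res, ss_tot


# Full model: intercept, x2, x7, x8.
X = np.column_stack([np.ones(n), x2, x7, x8])
p = X.shape[1] - 1                                 # number of predictors
beta, ss_res, ss_tot = fit_ols(X, y)
ss_reg = ss_tot - ss_res                           # Regression SS

# Estimated error variance and coefficient standard errors.
sigma2_hat = ss_res / (n - p - 1)
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov_beta))

t_stats = beta / se                                # t statistic per coefficient
f_stat = (ss_reg / p) / sigma2_hat                 # overall F statistic
r2 = 1 - ss_res / ss_tot                           # proportion of variability explained

# Increase in Regression SS due to x8: compare with the model lacking x8.
_, ss_res_reduced, _ = fit_ols(X[:, :3], y)
extra_ss_x8 = ss_res_reduced - ss_res
partial_f_x8 = extra_ss_x8 / sigma2_hat            # partial F on 1 and n-p-1 df

# Point prediction at x2 = 2000, x7 = 60, x8 = 1800.
x0 = np.array([1.0, 2000.0, 60.0, 1800.0])
y_hat0 = float(x0 @ beta)

print("coefficients:", beta)
print("standard errors:", se)
print("t statistics:", t_stats)
print("F:", f_stat, "R^2:", r2)
print("extra SS for x8:", extra_ss_x8, "prediction:", y_hat0)
```

The critical values $t_{\alpha/2, n-p-1}$ and $F_{\alpha, p, n-p-1}$ are not computed here; they could be read from a table or obtained from a statistics library such as SciPy (for example `scipy.stats.t.ppf(0.975, n - p - 1)`). As a consistency check, the partial $F$ statistic for $x_8$ equals the square of its $t$ statistic.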
**Note:** Actual numerical answers require the data matrix and the corresponding computations, which are not provided here.
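
The diagnostics in steps 11 and 12 can be sketched in the same way. The block below is a minimal Python sketch on synthetic placeholder data (again hypothetical, not the real data set). It computes the coordinates of a normal probability plot, the leverage values, and Cook's distance. The plotting positions $(i - 0.5)/n$ and the cutoffs $2(p+1)/n$ for leverage and $1$ for Cook's distance are common rules of thumb assumed here, not values taken from the problem.

```python
import numpy as np
from statistics import NormalDist

# Synthetic placeholder data (hypothetical values, NOT the real data set).
rng = np.random.default_rng(1)
n = 30                                 # arbitrary sample size (assumption)
x2 = rng.uniform(1400, 2900, n)        # passing yardage
x7 = rng.uniform(45, 75, n)            # percent rushing plays
x8 = rng.uniform(1400, 2600, n)        # opponents' rushing yards
y = -2.0 + 0.004 * x2 + 0.2 * x7 - 0.005 * x8 + rng.normal(0, 1.5, n)

X = np.column_stack([np.ones(n), x2, x7, x8])
p = X.shape[1] - 1                     # number of predictors

# Least squares fit and residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
ms_res = float(e @ e) / (n - p - 1)

# Leverage: diagonal of the hat matrix H = X (X'X)^{-1} X'.
# With X = QR (reduced QR), H = Q Q', so h_ii is the squared norm of row i of Q.
Q, _ = np.linalg.qr(X)
h = (Q ** 2).sum(axis=1)

# Internally studentized residuals and Cook's distance.
r = e / np.sqrt(ms_res * (1 - h))
cooks_d = (r ** 2 / (p + 1)) * h / (1 - h)

# Rule-of-thumb flags (assumed cutoffs).
high_leverage = np.where(h > 2 * (p + 1) / n)[0]
influential = np.where(cooks_d > 1)[0]

# Normal probability plot coordinates: ordered residuals vs. normal quantiles.
probs = (np.arange(1, n + 1) - 0.5) / n
z = np.array([NormalDist().inv_cdf(float(q)) for q in probs])
sorted_resid = np.sort(e)

print("high-leverage observations:", high_leverage)
print("observations with Cook's D > 1:", influential)
```

Plotting `sorted_resid` against `z` with any plotting library gives the normal probability plot; the points should fall roughly on a straight line if the residuals are approximately normal.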