R Squared Explanation
1. **Problem Statement:**
We have a linear regression model for Trips: $$Trips = a + b \times Households + c \times Employment + \varepsilon$$
Given data for 10 zones with Households, Employment, and Observed Trips, we want to understand how to calculate the coefficient of determination, $R^2$, which measures the goodness of fit of the model.
2. **Formula for $R^2$:**
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
where:
- $SS_{res} = \sum (y_i - \hat{y}_i)^2$ is the residual sum of squares,
- $SS_{tot} = \sum (y_i - \bar{y})^2$ is the total sum of squares,
- $y_i$ are observed values,
- $\hat{y}_i$ are predicted values from the model,
- $\bar{y}$ is the mean of observed values.
3. **Steps to calculate $R^2$:**
- Calculate the mean of observed trips: $$\bar{y} = \frac{1}{n} \sum_{i=1}^n y_i$$
- Fit the regression model to find coefficients $a$, $b$, and $c$ using least squares.
- Use the model to predict trips $\hat{y}_i$ for each zone.
- Compute $SS_{res} = \sum (y_i - \hat{y}_i)^2$.
- Compute $SS_{tot} = \sum (y_i - \bar{y})^2$.
- Calculate $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$.
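The steps above can be sketched in Python with NumPy. The zone values below are hypothetical numbers made up purely for illustration (the original data are not reproduced here), and the helper name `r_squared` is likewise an assumption rather than part of the problem.

```python
import numpy as np

# HYPOTHETICAL data for 10 zones, invented for illustration only.
households = np.array([120, 200, 150, 300, 250, 180, 220, 270, 160, 310], dtype=float)
employment = np.array([80, 150, 60, 220, 130, 90, 170, 200, 110, 240], dtype=float)
trips = np.array([410, 690, 450, 1020, 780, 560, 760, 920, 540, 1080], dtype=float)


def r_squared(y, y_hat):
    """Return 1 - SS_res / SS_tot for observed y and predicted y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot


# Design matrix with a column of ones so the intercept a is estimated.
X = np.column_stack([np.ones_like(households), households, employment])

# Least-squares estimates of (a, b, c).
coef, *_ = np.linalg.lstsq(X, trips, rcond=None)
a, b, c = coef

# Predicted trips for each zone, then R^2.
trips_hat = X @ coef
r2 = r_squared(trips, trips_hat)

print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")
print(f"R^2 = {r2:.4f}")
```

Replacing the three arrays with the actual zone data gives the coefficients and $R^2$ for the problem at hand.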
4. **Explanation:**
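$R^2$ is the proportion of the variance in the observed trips that the model explains. For a least-squares fit that includes the intercept $a$, $R^2$ lies between 0 and 1: a value close to 1 means the predictions track the observed trips closely, while a value close to 0 means the model does little better than predicting $\bar{y}$ for every zone.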
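A short sketch of why the "proportion explained" reading holds, assuming the model is fitted by ordinary least squares with the intercept $a$ included. The label $SS_{reg}$ (explained sum of squares) is introduced here for illustration:
$$SS_{tot} = SS_{reg} + SS_{res}, \qquad SS_{reg} = \sum (\hat{y}_i - \bar{y})^2$$
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = \frac{SS_{reg}}{SS_{tot}}$$
Under that assumption, $R^2$ is the share of the total variation in trips that the fitted values account for.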
5. **Note:**
To find $a$, $b$, and $c$, you would typically fit a multiple linear regression to the data provided, either with matrix algebra (solving the least-squares normal equations) or with statistical software.
Because the problem neither provides the coefficients nor asks for a numerical $R^2$, this explanation describes only the procedure for computing it from the data and the fitted model.
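For reference, the matrix-algebra route is usually written as follows. The symbols $X$, $\mathbf{y}$, $\hat{\boldsymbol{\beta}}$, $H_i$ (households in zone $i$), and $E_i$ (employment in zone $i$) are notation introduced here, not part of the original problem, and the closed form assumes $X^\top X$ is invertible:
$$X = \begin{bmatrix} 1 & H_1 & E_1 \\ \vdots & \vdots & \vdots \\ 1 & H_{10} & E_{10} \end{bmatrix}, \qquad \hat{\boldsymbol{\beta}} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} = (X^\top X)^{-1} X^\top \mathbf{y}, \qquad \hat{\mathbf{y}} = X \hat{\boldsymbol{\beta}}$$
Here $\mathbf{y}$ is the vector of observed trips, and the entries of $\hat{\mathbf{y}}$ are the predictions $\hat{y}_i$ used in $SS_{res}$.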