Linear Regression Temperature
1. **Problem statement:** We have water temperature data $T$ for months $m=1$ to $8$ (January to August) with May ($m=5$) missing. We assume a linear model $T=am+b$ and want to find the regression line using the 7 known points.
2. **Formula and rules:** The regression line $T=am+b$ minimizes the sum of squared residuals. The slope $a$ and intercept $b$ are given by:
$$a=\frac{n\sum mT - \sum m \sum T}{n\sum m^2 - (\sum m)^2}$$
$$b=\frac{\sum T - a \sum m}{n}$$
where $n=7$ is the number of data points and every sum runs over the seven known months and their temperatures.
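The two formulas above translate almost directly into code. The following is a minimal sketch, not a definitive implementation; the function name `fit_line` and its argument names are illustrative assumptions, not part of the original problem.

```python
def fit_line(ms, Ts):
    """Return (a, b) for the least-squares line T = a*m + b.

    `fit_line` is a hypothetical helper name chosen for this sketch.
    """
    n = len(ms)
    if n != len(Ts):
        raise ValueError("ms and Ts must have the same length")
    sum_m = sum(ms)
    sum_T = sum(Ts)
    sum_mm = sum(m * m for m in ms)
    sum_mT = sum(m * T for m, T in zip(ms, Ts))
    denom = n * sum_mm - sum_m ** 2
    if denom == 0:
        raise ValueError("all m values are equal; the slope is undefined")
    a = (n * sum_mT - sum_m * sum_T) / denom   # slope formula
    b = (sum_T - a * sum_m) / n                # intercept formula
    return a, b


# Sanity check on points that lie exactly on T = 2m + 1.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # prints: 2.0 1.0
```

Checking the function on points that lie exactly on a known line is a quick way to catch a sign or index slip before applying it to real data.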
3. **Data:**
Months $m$: 1, 2, 3, 4, 6, 7, 8
Temperatures $T$: 5.2, 8.0, 7.2, 8.9, 12.6, 15.5, 15.4
Calculate sums:
$$\sum m = 1+2+3+4+6+7+8 = 31$$
$$\sum T = 5.2+8.0+7.2+8.9+12.6+15.5+15.4 = 72.8$$
$$\sum m^2 = 1^2+2^2+3^2+4^2+6^2+7^2+8^2 = 1+4+9+16+36+49+64 = 179$$
$$\sum mT = 1\times5.2 + 2\times8.0 + 3\times7.2 + 4\times8.9 + 6\times12.6 + 7\times15.5 + 8\times15.4$$
$$= 5.2 + 16 + 21.6 + 35.6 + 75.6 + 108.5 + 123.2 = 385.7$$
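As a cross-check on the hand arithmetic, the four sums can be reproduced in a few lines. This is a sketch under the assumption that the data are stored in two parallel lists; the variable names are illustrative.

```python
months = [1, 2, 3, 4, 6, 7, 8]                    # May (m = 5) is missing
temps = [5.2, 8.0, 7.2, 8.9, 12.6, 15.5, 15.4]

n = len(months)
sum_m = sum(months)
sum_T = sum(temps)
sum_mm = sum(m * m for m in months)
sum_mT = sum(m * T for m, T in zip(months, temps))

# Round the floating-point sums to one decimal for display only.
print(n, sum_m, round(sum_T, 1), sum_mm, round(sum_mT, 1))
# prints: 7 31 72.8 179 385.7
```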
4. **Calculate slope $a$:**
$$a = \frac{7 \times 385.7 - 31 \times 72.8}{7 \times 179 - 31^2} = \frac{2699.9 - 2256.8}{1253 - 961} = \frac{443.1}{292} \approx 1.5175$$
5. **Calculate intercept $b$:**
Carrying one extra digit of the slope ($a \approx 1.51747$) to limit rounding error:
$$b = \frac{72.8 - 1.51747 \times 31}{7} = \frac{72.8 - 47.0416}{7} = \frac{25.7584}{7} \approx 3.6798$$
6. **Regression line:**
$$T = 1.5175 m + 3.6798$$
7. **Estimate temperature for May ($m=5$):**
$$T = 1.5175 \times 5 + 3.6798 = 7.5875 + 3.6798 = 11.2673 \approx 11.27$$
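Steps 4 to 7 can also be run end to end at full floating-point precision, which avoids rounding intermediate values by hand. The sketch below assumes the same seven data points; the variable names are illustrative.

```python
months = [1, 2, 3, 4, 6, 7, 8]                    # May (m = 5) is missing
temps = [5.2, 8.0, 7.2, 8.9, 12.6, 15.5, 15.4]

n = len(months)
sum_m = sum(months)
sum_T = sum(temps)
sum_mm = sum(m * m for m in months)
sum_mT = sum(m * T for m, T in zip(months, temps))

# Slope and intercept of the least-squares line T = a*m + b.
a = (n * sum_mT - sum_m * sum_T) / (n * sum_mm - sum_m ** 2)
b = (sum_T - a * sum_m) / n

# Interpolate the missing month, which lies inside the fitted range 1..8.
T_may = a * 5 + b

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"T(5) = {T_may:.2f}")
```

Because $m=5$ lies between the observed months, this is an interpolation, which is the situation in which the fitted line is most trustworthy.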
8. **Why not use the line to estimate $m$ when $T=10.0$?**
Because the regression treats $m$ as the explanatory variable and $T$ as the response: it minimizes the squared residuals in $T$, not in $m$. Estimating $m$ from $T$ calls for a regression of $m$ on $T$, which is in general a different line, so simply solving $T=am+b$ for $m$ can give an inaccurate estimate.
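The difference between the two approaches can be seen numerically. The sketch below fits both lines to the same seven points and evaluates each at $T=10.0$; the helper name `least_squares` is an assumption made for this illustration.

```python
months = [1, 2, 3, 4, 6, 7, 8]
temps = [5.2, 8.0, 7.2, 8.9, 12.6, 15.5, 15.4]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return slope, (sy - slope * sx) / n


target_T = 10.0

# Option 1: fit T on m, then solve T = a*m + b for m.
a, b = least_squares(months, temps)
m_inverted = (target_T - b) / a

# Option 2: fit m on T directly, which minimizes squared errors in m.
c, d = least_squares(temps, months)
m_direct = c * target_T + d

print(f"inverting the T-on-m line: m = {m_inverted:.3f}")
print(f"m-on-T regression:         m = {m_direct:.3f}")
```

The two estimates are close here because the data are strongly linear, but they do not coincide, and the gap widens as the scatter around the line grows.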
9. **Why not predict December ($m=12$)?**
The linear model is based on data from months 1 to 8 only. Temperature patterns may change outside this range (seasonal effects), so predictions for $m=12$ are unreliable.
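To make the extrapolation problem concrete, the sketch below evaluates the fitted line at $m=12$ and compares the result with the observed range. It assumes the same seven data points; the variable names are illustrative.

```python
months = [1, 2, 3, 4, 6, 7, 8]
temps = [5.2, 8.0, 7.2, 8.9, 12.6, 15.5, 15.4]

n = len(months)
sum_m, sum_T = sum(months), sum(temps)
a = (n * sum(m * T for m, T in zip(months, temps)) - sum_m * sum_T) / (
    n * sum(m * m for m in months) - sum_m ** 2
)
b = (sum_T - a * sum_m) / n

# The line keeps rising, so it knows nothing about autumn cooling.
T_december = a * 12 + b

print(f"extrapolated T(12) = {T_december:.1f}")
print(f"observed temperatures: {min(temps)} to {max(temps)}")
print("m = 12 inside fitted range:", min(months) <= 12 <= max(months))
```

The extrapolated value exceeds every temperature in the data set, which is the opposite of what a seasonal cycle would suggest for December.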
10. **More appropriate model for extended period:**
A periodic (sinusoidal) model that captures seasonal temperature variation over the year would be more appropriate, e.g., $T = A \sin(Bm + C) + D$, where $B = 2\pi/12$ corresponds to a 12-month cycle.
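One way such a model could be fitted is sketched below. It assumes a fixed 12-month period, so that $T = p\sin(\omega m) + q\cos(\omega m) + D$ with $\omega = 2\pi/12$ is linear in $p$, $q$, $D$ and can be solved by ordinary least squares; $A$ and $C$ then follow from $p = A\cos C$, $q = A\sin C$. The function name `fit_sinusoid` is hypothetical, and with only seven points covering two thirds of a cycle the resulting parameters are weakly constrained.

```python
import math


def fit_sinusoid(ms, Ts, period=12.0):
    """Fit T = A*sin(w*m + C) + D with w = 2*pi/period; return (A, w, C, D).

    The period is assumed known (12 months by default), not estimated.
    """
    w = 2 * math.pi / period
    rows = [(math.sin(w * m), math.cos(w * m), 1.0) for m in ms]
    # Normal equations (X^T X) beta = X^T T for beta = (p, q, D).
    M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    y = [sum(r[i] * T for r, T in zip(rows, Ts)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        y[col], y[pivot] = y[pivot], y[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 3):
                M[r][c] -= f * M[col][c]
            y[r] -= f * y[col]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(M[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (y[i] - s) / M[i][i]
    p, q, D = beta
    # p = A*cos(C), q = A*sin(C)
    return math.hypot(p, q), w, math.atan2(q, p), D


months = [1, 2, 3, 4, 6, 7, 8]
temps = [5.2, 8.0, 7.2, 8.9, 12.6, 15.5, 15.4]
A, w, C, D = fit_sinusoid(months, temps)
print(f"A = {A:.2f}, B = {w:.4f}, C = {C:.2f}, D = {D:.2f}")
print(f"model T(12) = {A * math.sin(w * 12 + C) + D:.1f}")
```

Fixing the period keeps the problem linear; estimating $B$ as well would require a nonlinear solver and more data than the seven points available here.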