QR Decomposition
1. **Problem Statement:**
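We have data from 6 projects: number of developers $x_1$, complexity $x_2$, and time $y$. The model, fitted by least squares, is $y_{pred} = b_0 + b_1 x_1 + b_2 x_2$.
We center the response, $c = y - \bar{y}$, and the predictors, so that the residual becomes $y - y_{pred} = c - A x$, where $A$ is the matrix of centered predictors and $x = (b_1, b_2)^T$ is the vector of slopes (not to be confused with the predictors $x_1$, $x_2$). The intercept $b_0$ is recovered from the means afterwards.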
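The reason centering removes the intercept can be written out in one line. This is a sketch that assumes the intercept takes its least-squares value $b_0 = \bar{y} - b_1 \bar{x}_1 - b_2 \bar{x}_2$, the same formula used in step 7:
$$y - y_{pred} = y - b_0 - b_1 x_1 - b_2 x_2 = (y - \bar{y}) - b_1 (x_1 - \bar{x}_1) - b_2 (x_2 - \bar{x}_2) = c - A x$$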
2. **Calculate means:**
$$\bar{y} = \frac{20+22+26+19+18+23}{6} = \frac{128}{6} \approx 21.33$$
$$\bar{x}_1 = \frac{2+3+1+4+2+3}{6} = \frac{15}{6} = 2.5$$
$$\bar{x}_2 = \frac{4+6+5+7+3+8}{6} = \frac{33}{6} = 5.5$$
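The three means can be checked with a few lines of plain Python. This is a minimal sketch; the data values are read off the sums above, and the names `x1`, `x2`, `y` and `mean` are chosen here for illustration.

```python
# Raw data for the six projects, read off the sums in step 2.
x1 = [2, 3, 1, 4, 2, 3]        # developers
x2 = [4, 6, 5, 7, 3, 8]        # complexity
y = [20, 22, 26, 19, 18, 23]   # time


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


y_bar, x1_bar, x2_bar = mean(y), mean(x1), mean(x2)
print(round(y_bar, 2), x1_bar, x2_bar)  # 21.33 2.5 5.5
```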
3. **Center the data:**
Define $a_1 = x_1 - \bar{x}_1$, $a_2 = x_2 - \bar{x}_2$, and $c = y - \bar{y}$.
| Project | $a_1$ | $a_2$ | $c$ |
|---------|-------|-------|-------|
| A | 2-2.5 = -0.5 | 4-5.5 = -1.5 | 20-21.33 = -1.33 |
| B | 3-2.5 = 0.5 | 6-5.5 = 0.5 | 22-21.33 = 0.67 |
| C | 1-2.5 = -1.5 | 5-5.5 = -0.5 | 26-21.33 = 4.67 |
| D | 4-2.5 = 1.5 | 7-5.5 = 1.5 | 19-21.33 = -2.33 |
| E | 2-2.5 = -0.5 | 3-5.5 = -2.5 | 18-21.33 = -3.33 |
| F | 3-2.5 = 0.5 | 8-5.5 = 2.5 | 23-21.33 = 1.67 |
4. **Matrix $A$ and vector $c$:**
$$A = \begin{bmatrix}-0.5 & -1.5 \\ 0.5 & 0.5 \\ -1.5 & -0.5 \\ 1.5 & 1.5 \\ -0.5 & -2.5 \\ 0.5 & 2.5 \end{bmatrix}, \quad c = \begin{bmatrix}-1.33 \\ 0.67 \\ 4.67 \\ -2.33 \\ -3.33 \\ 1.67 \end{bmatrix}$$
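The centered table and the matrix above can be reproduced numerically. The following is a minimal sketch, assuming NumPy is available; the variable names (`A`, `c`, `gram`, `rhs`) are chosen here for illustration. It also computes $A^T A$ and $A^T c$, which are handy for checking any QR factorization of $A$ by hand.

```python
import numpy as np

# Data for the six projects, as listed in step 2.
x1 = np.array([2, 3, 1, 4, 2, 3], dtype=float)       # developers
x2 = np.array([4, 6, 5, 7, 3, 8], dtype=float)       # complexity
y = np.array([20, 22, 26, 19, 18, 23], dtype=float)  # time

# Subtract each column's mean to get the centered predictors and response.
A = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
c = y - y.mean()

# Every centered column should sum to zero (up to rounding error).
print(np.allclose(A.sum(axis=0), 0.0), np.isclose(c.sum(), 0.0))

gram = A.T @ A   # approximately [[5.5, 6.5], [6.5, 17.5]]
rhs = A.T @ c    # approximately [-7, 9]
print(gram)
print(rhs)
```

Unlike the table, this keeps $c$ at full precision instead of rounding it to two decimals, so later results may differ from hand calculations in the last digit.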
5. **(4a) Find $R$ and $d$ using QR decomposition:**
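Applying `numpy.linalg.qr` to $A$ gives the reduced factorization $A = QR$, where $Q$ ($6 \times 2$) has orthonormal columns and $R$ ($2 \times 2$) is upper triangular. NumPy does not guarantee the sign of the diagonal of $R$; if a diagonal entry is negative, flip the sign of that row of $R$ and of the matching column of $Q$ so that the diagonal is positive.
The entries can also be checked by hand from $A^T A = R^T R$ and $A^T c = R^T d$. Here $a_1^T a_1 = 5.5$, $a_1^T a_2 = 6.5$, $a_2^T a_2 = 17.5$, $a_1^T c = -7$ and $a_2^T c = 9$, so $r_{11} = \sqrt{5.5}$, $r_{12} = 6.5 / r_{11}$, $r_{22} = \sqrt{17.5 - r_{12}^2}$, $d_1 = -7 / r_{11}$ and $d_2 = (9 - r_{12} d_1) / r_{22}$.
The computation (code not shown in this step) yields, to four decimals:
$$R = \begin{bmatrix}2.3452 & 2.7716 \\ 0 & 3.1334 \end{bmatrix}, \quad d = Q^T c = \begin{bmatrix}-2.9848 \\ 5.5125 \end{bmatrix}$$
6. **(4b) Solve $R x = d$ by back substitution:**
With $x = (b_1, b_2)^T$, the system $R x = d$ reads:
$$2.3452\, b_1 + 2.7716\, b_2 = -2.9848$$
$$3.1334\, b_2 = 5.5125$$
Solve for $b_2$:
$$b_2 = \frac{5.5125}{3.1334} \approx 1.7593$$
Substitute into the first equation:
$$2.3452\, b_1 + 2.7716 (1.7593) = -2.9848$$
$$2.3452\, b_1 + 4.8761 = -2.9848$$
$$2.3452\, b_1 = -2.9848 - 4.8761 = -7.8609$$
$$b_1 = \frac{-7.8609}{2.3452} \approx -3.3519$$
7. **Determine $b_0$:**
Recall $b_0 = \bar{y} - b_1 \bar{x}_1 - b_2 \bar{x}_2$:
$$b_0 = 21.333 - (-3.3519)(2.5) - (1.7593)(5.5)$$
$$b_0 = 21.333 + 8.380 - 9.676 = 20.037 \approx 20.04$$
8. **Final prediction model:**
$$\boxed{y_{pred} = 20.04 - 3.35 x_1 + 1.76 x_2}$$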
The signs of the slopes mean that, holding the other predictor fixed, predicted time decreases as developers are added and increases with complexity.
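The steps above (QR factorization, back substitution, recovering the intercept) can be sketched end to end as follows. This is an illustrative sketch rather than the original snippet: it assumes NumPy, flips signs so that the diagonal of $R$ is positive (which `numpy.linalg.qr` does not promise), and cross-checks the coefficients against `numpy.linalg.lstsq` applied to the uncentered design matrix. All variable names are assumptions made for this example.

```python
import numpy as np

# Data for the six projects, as listed in step 2.
x1 = np.array([2, 3, 1, 4, 2, 3], dtype=float)       # developers
x2 = np.array([4, 6, 5, 7, 3, 8], dtype=float)       # complexity
y = np.array([20, 22, 26, 19, 18, 23], dtype=float)  # time

# Centered predictors and response.
A = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
c = y - y.mean()

# Reduced QR: Q is 6x2 with orthonormal columns, R is 2x2 upper triangular.
Q, R = np.linalg.qr(A)

# Make diag(R) positive: flip row i of R and column i of Q where R[i, i] < 0.
signs = np.sign(np.diag(R))
Q = Q * signs
R = signs[:, None] * R

d = Q.T @ c

# Back substitution for R x = d, solving for the last unknown first.
n = R.shape[0]
slopes = np.zeros(n)
for i in range(n - 1, -1, -1):
    slopes[i] = (d[i] - R[i, i + 1:] @ slopes[i + 1:]) / R[i, i]

b1, b2 = slopes
b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()

# Independent cross-check: ordinary least squares on [1, x1, x2].
X = np.column_stack([np.ones_like(y), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print("R =\n", np.round(R, 4))
print("d =", np.round(d, 4))
print("b0, b1, b2 =", np.round([b0, b1, b2], 4))
print("matches lstsq:", np.allclose(coef, [b0, b1, b2]))
```

Running it is a quick way to check the hand-rounded values in steps 5 to 8. The explicit loop mirrors the back substitution in step 6; for larger systems a library triangular solver would be the more usual choice.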