Negative Gradient
1. **Problem statement:**
We are given the loss function $$L(w) = (w - 3)^2 + 2$$ and the weight update rule in gradient descent: $$w_{k+1} = w_k - \eta \nabla L(w_k)$$.
Part (d) asks: explain why moving in the direction of the negative gradient minimizes the loss function.
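To make the setup concrete, the gradient of the given loss and the resulting update step work out to:
$$
\nabla L(w) = 2(w - 3), \qquad w_{k+1} = w_k - 2\eta\,(w_k - 3).
$$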
2. **Recall the gradient and its meaning:**
The gradient $$\nabla L(w)$$ points in the direction of the steepest increase of the function $$L(w)$$.
3. **Why move in the negative gradient direction?**
Since the gradient points uphill (increasing $$L$$), moving in the opposite direction (negative gradient) moves downhill, decreasing $$L$$.
4. **Mathematical intuition:**
For a small step size $$\eta$$, a first-order Taylor expansion of $$L$$ around $$w_k$$ gives the change in loss when moving from $$w_k$$ to $$w_{k+1} = w_k - \eta \nabla L(w_k)$$:
$$
L(w_{k+1}) \approx L(w_k) + \nabla L(w_k)^\top (w_{k+1} - w_k) = L(w_k) - \eta \|\nabla L(w_k)\|^2
$$
Because $$\|\nabla L(w_k)\|^2$$ is nonnegative, and strictly positive whenever the gradient is nonzero, this correction term can only reduce the loss: for a sufficiently small $$\eta$$, $$L(w_{k+1}) < L(w_k)$$ unless $$w_k$$ is already a stationary point.
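Because the loss in this problem is quadratic, the effect of one update can also be computed exactly, which shows why the step size must be small enough (here $$0 < \eta < 1$$):
$$
w_{k+1} - 3 = (1 - 2\eta)(w_k - 3) \;\Longrightarrow\; L(w_{k+1}) - 2 = (1 - 2\eta)^2 \bigl(L(w_k) - 2\bigr),
$$
so the excess loss above its minimum value of $$2$$ shrinks by the factor $$(1 - 2\eta)^2 < 1$$ whenever $$0 < \eta < 1$$.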
5. **Conclusion:**
Moving in the negative gradient direction decreases the loss at each step (provided the step size $$\eta$$ is small enough), which is why gradient descent converges to a minimum; for the convex loss $$L(w) = (w - 3)^2 + 2$$ in this problem, the iterates approach the global minimizer $$w = 3$$, where $$L = 2$$. A short numerical sketch of this behavior follows below.
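As a minimal numerical sketch (the starting point $$w_0 = 0$$ and step size $$\eta = 0.1$$ are illustrative choices, not values given in the problem), running gradient descent on this loss shows the loss decreasing at every step:

```python
# Gradient descent on L(w) = (w - 3)^2 + 2.
# The starting point w0 and step size eta are illustrative choices,
# not values specified in the problem.

def loss(w):
    return (w - 3) ** 2 + 2

def grad(w):
    # dL/dw = 2(w - 3)
    return 2 * (w - 3)

w = 0.0      # illustrative starting weight
eta = 0.1    # illustrative step size (must satisfy 0 < eta < 1 for this loss)

for k in range(10):
    w = w - eta * grad(w)  # step opposite to the gradient
    print(f"step {k + 1}: w = {w:.4f}, L(w) = {loss(w):.4f}")

# The printed loss decreases monotonically toward its minimum value of 2 at w = 3.
```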