Negative Gradient
1. **Problem statement:**
We are given the loss function $$L(w) = (w - 3)^2 + 2$$ and the weight update rule in gradient descent: $$w_{k+1} = w_k - \eta \nabla L(w_k)$$.
Part (d) asks: explain why moving in the direction of the negative gradient minimizes the loss function.
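To make the setup concrete, the gradient of the given loss and the resulting update step work out to:
$$
\nabla L(w) = 2(w - 3), \qquad w_{k+1} = w_k - 2\eta\,(w_k - 3).
$$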
2. **Recall the gradient and its meaning:**
The gradient $$\nabla L(w)$$ points in the direction of the steepest increase of the function $$L(w)$$.
3. **Why move in the negative gradient direction?**
Since the gradient points uphill (increasing $$L$$), moving in the opposite direction (negative gradient) moves downhill, decreasing $$L$$.
4. **Mathematical intuition:**
For a small step size $$\eta$$, a first-order Taylor expansion of $$L$$ around $$w_k$$ gives the change in loss when moving from $$w_k$$ to $$w_{k+1} = w_k - \eta \nabla L(w_k)$$:
$$
L(w_{k+1}) \approx L(w_k) + \nabla L(w_k)^\top (w_{k+1} - w_k) = L(w_k) - \eta \|\nabla L(w_k)\|^2
$$
Because $$\|\nabla L(w_k)\|^2$$ is nonnegative, and strictly positive whenever the gradient is nonzero, this correction term can only reduce the loss: for a sufficiently small $$\eta$$, $$L(w_{k+1}) < L(w_k)$$ unless $$w_k$$ is already a stationary point.
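Because the loss in this problem is quadratic, the effect of one update can also be computed exactly, which shows why the step size must be small enough (here $$0 < \eta < 1$$):
$$
w_{k+1} - 3 = (1 - 2\eta)(w_k - 3) \;\Longrightarrow\; L(w_{k+1}) - 2 = (1 - 2\eta)^2 \bigl(L(w_k) - 2\bigr),
$$
so the excess loss above its minimum value of $$2$$ shrinks by the factor $$(1 - 2\eta)^2 < 1$$ whenever $$0 < \eta < 1$$.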
5. **Conclusion:**
Moving in the negative gradient direction decreases the loss at each step (provided the step size $$\eta$$ is small enough), which is why gradient descent converges to a minimum; for the convex loss $$L(w) = (w - 3)^2 + 2$$ in this problem, the iterates approach the global minimizer $$w = 3$$, where $$L = 2$$. A short numerical sketch of this behavior follows below.
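As a minimal numerical sketch (the starting point $$w_0 = 0$$ and step size $$\eta = 0.1$$ are illustrative choices, not values given in the problem), running gradient descent on this loss shows the loss decreasing at every step:

```python
# Gradient descent on L(w) = (w - 3)^2 + 2.
# The starting point w0 and step size eta are illustrative choices,
# not values specified in the problem.

def loss(w):
    return (w - 3) ** 2 + 2

def grad(w):
    # dL/dw = 2(w - 3)
    return 2 * (w - 3)

w = 0.0      # illustrative starting weight
eta = 0.1    # illustrative step size (must satisfy 0 < eta < 1 for this loss)

for k in range(10):
    w = w - eta * grad(w)  # step opposite to the gradient
    print(f"step {k + 1}: w = {w:.4f}, L(w) = {loss(w):.4f}")

# The printed loss decreases monotonically toward its minimum value of 2 at w = 3.
```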