Gradient Descent
1. **Problem Statement:** We are given the loss function $$L(w) = (w - 3)^2 + 2$$ and the weight update rule in gradient descent: $$w_{k+1} = w_k - \eta \nabla L(w_k)$$. We need to find the differential at $$w_0 = 1$$, compute $$w_1$$ with learning rate $$\eta = 0.1$$, interpret the relationship between the differential and weight update, and explain why moving in the negative gradient direction minimizes the loss.
2. **Formula and Rules:** For a single scalar weight, the gradient (here also called the differential) of $$L(w)$$ is just the ordinary derivative, $$\nabla L(w) = \frac{dL}{dw}$$. Gradient descent updates the weight by moving opposite to this gradient in order to reduce the loss. A small reference sketch follows.
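For reference in the steps below, the loss and the update rule can be written as a minimal Python sketch; the names `loss`, `grad_loss`, and `gd_step` are illustrative choices, not part of the problem statement.

```python
def loss(w):
    # L(w) = (w - 3)^2 + 2
    return (w - 3) ** 2 + 2

def grad_loss(w):
    # dL/dw = 2(w - 3)
    return 2 * (w - 3)

def gd_step(w, eta):
    # Gradient descent update: w_{k+1} = w_k - eta * dL/dw evaluated at w_k
    return w - eta * grad_loss(w)
```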
3. **Step (a): Find the differential $$dL$$ at $$w_0 = 1$$**
Calculate the derivative:
$$\frac{dL}{dw} = 2(w - 3)$$
Evaluate at $$w_0 = 1$$:
$$\nabla L(1) = 2(1 - 3) = 2(-2) = -4$$
So, the differential (slope) of the loss at $$w_0 = 1$$ is $$\nabla L(w_0) = -4$$.
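As a quick numerical check of this derivative, here is a small sketch using a central finite-difference approximation; the step size `h` and variable names are illustrative assumptions.

```python
def loss(w):
    return (w - 3) ** 2 + 2

# Central finite difference approximating dL/dw at w0 = 1
h = 1e-6
w0 = 1.0
numeric = (loss(w0 + h) - loss(w0 - h)) / (2 * h)
print(numeric)  # ~ -4.0, agreeing with 2*(1 - 3) = -4
```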
4. **Step (b): Find $$w_1$$ using learning rate $$\eta = 0.1$$**
Apply the update rule:
$$w_1 = w_0 - \eta \nabla L(w_0) = 1 - 0.1 \times (-4) = 1 + 0.4 = 1.4$$
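The same update written as a short Python sketch (variable and function names are illustrative):

```python
def grad_loss(w):
    # dL/dw = 2(w - 3)
    return 2 * (w - 3)

eta = 0.1
w0 = 1.0
w1 = w0 - eta * grad_loss(w0)  # 1 - 0.1 * (-4)
print(w1)  # 1.4, matching the hand calculation
```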
5. **Step (c): Interpret the relationship between $$dL$$ and the weight update**
The differential $$\nabla L(w_0) = -4$$ gives the slope of the loss function at $$w_0$$. Because the slope is negative, the loss decreases as $$w$$ increases, so the update term $$-\eta \nabla L(w_0) = +0.4$$ pushes $$w$$ from $$1$$ toward the minimizer $$w^* = 3$$, i.e. in the direction that reduces the loss.
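This can be verified directly by comparing the loss before and after the update:
$$L(w_0) = L(1) = (1-3)^2 + 2 = 6, \qquad L(w_1) = L(1.4) = (1.4-3)^2 + 2 = 4.56,$$
so the single step lowered the loss from $$6$$ to $$4.56$$.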
6. **Step (d): Explain why the negative gradient direction minimizes the loss function**
The gradient points in the direction of steepest increase of the function, so its negative points in the direction of steepest decrease. To first order, $$L(w - \eta \nabla L(w)) \approx L(w) - \eta \,(\nabla L(w))^2 \le L(w)$$, so for a sufficiently small learning rate each step cannot increase the loss. Repeating such steps moves $$w$$ toward lower loss values, thus minimizing $$L(w)$$; this is the fundamental principle behind gradient descent. A minimal sketch of repeated updates follows.
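The sketch below iterates the update rule for this problem and shows $$w$$ approaching the minimizer $$w^* = 3$$; the iteration count of 20 is an arbitrary illustrative choice.

```python
def grad_loss(w):
    return 2 * (w - 3)  # dL/dw for L(w) = (w - 3)^2 + 2

eta = 0.1
w = 1.0
for k in range(20):
    w = w - eta * grad_loss(w)  # step opposite the gradient
print(w)  # ~ 2.977, approaching the minimizer w* = 3
```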
**Final answers:**
(a) $$dL = -4$$ at $$w_0 = 1$$
(b) $$w_1 = 1.4$$
(c) The negative differential guides the weight update to increase $$w$$, reducing loss.
(d) Moving opposite the gradient decreases the loss, leading to minimization.