Gradient Descent
1. **Problem Statement:** We are given the loss function $$L(w) = (w - 3)^2 + 2$$ and the weight update rule in gradient descent: $$w_{k+1} = w_k - \eta \nabla L(w_k)$$. We need to find the differential at $$w_0 = 1$$, compute $$w_1$$ with learning rate $$\eta = 0.1$$, interpret the relationship between the differential and weight update, and explain why moving in the negative gradient direction minimizes the loss.
2. **Formula and Rules:** For a single scalar weight, the gradient (here also called the differential) of $$L(w)$$ is just the ordinary derivative, $$\nabla L(w) = \frac{dL}{dw}$$. Gradient descent updates the weight by moving opposite to this gradient in order to reduce the loss. A small reference sketch follows.
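For reference in the steps below, the loss and the update rule can be written as a minimal Python sketch; the names `loss`, `grad_loss`, and `gd_step` are illustrative choices, not part of the problem statement.

```python
def loss(w):
    # L(w) = (w - 3)^2 + 2
    return (w - 3) ** 2 + 2

def grad_loss(w):
    # dL/dw = 2(w - 3)
    return 2 * (w - 3)

def gd_step(w, eta):
    # Gradient descent update: w_{k+1} = w_k - eta * dL/dw evaluated at w_k
    return w - eta * grad_loss(w)
```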
3. **Step (a): Find the differential $$dL$$ at $$w_0 = 1$$**
Calculate the derivative:
$$\frac{dL}{dw} = 2(w - 3)$$
Evaluate at $$w_0 = 1$$:
$$\nabla L(1) = 2(1 - 3) = 2(-2) = -4$$
So, the differential (slope) of the loss at $$w_0 = 1$$ is $$\nabla L(w_0) = -4$$.
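As a quick numerical check of this derivative, here is a small sketch using a central finite-difference approximation; the step size `h` and variable names are illustrative assumptions.

```python
def loss(w):
    return (w - 3) ** 2 + 2

# Central finite difference approximating dL/dw at w0 = 1
h = 1e-6
w0 = 1.0
numeric = (loss(w0 + h) - loss(w0 - h)) / (2 * h)
print(numeric)  # ~ -4.0, agreeing with 2*(1 - 3) = -4
```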
4. **Step (b): Find $$w_1$$ using learning rate $$\eta = 0.1$$**
Apply the update rule:
$$w_1 = w_0 - \eta \nabla L(w_0) = 1 - 0.1 \times (-4) = 1 + 0.4 = 1.4$$
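The same update written as a short Python sketch (variable and function names are illustrative):

```python
def grad_loss(w):
    # dL/dw = 2(w - 3)
    return 2 * (w - 3)

eta = 0.1
w0 = 1.0
w1 = w0 - eta * grad_loss(w0)  # 1 - 0.1 * (-4)
print(w1)  # 1.4, matching the hand calculation
```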
5. **Step (c): Interpret the relationship between $$dL$$ and the weight update**
The differential $$\nabla L(w_0) = -4$$ gives the slope of the loss function at $$w_0$$. Because the slope is negative, the loss decreases as $$w$$ increases, so the update term $$-\eta \nabla L(w_0) = +0.4$$ pushes $$w$$ from $$1$$ toward the minimizer $$w^* = 3$$, i.e. in the direction that reduces the loss.
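This can be verified directly by comparing the loss before and after the update:
$$L(w_0) = L(1) = (1-3)^2 + 2 = 6, \qquad L(w_1) = L(1.4) = (1.4-3)^2 + 2 = 4.56,$$
so the single step lowered the loss from $$6$$ to $$4.56$$.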
6. **Step (d): Explain why the negative gradient direction minimizes the loss function**
The gradient points in the direction of steepest increase of the function, so its negative points in the direction of steepest decrease. To first order, $$L(w - \eta \nabla L(w)) \approx L(w) - \eta \,(\nabla L(w))^2 \le L(w)$$, so for a sufficiently small learning rate each step cannot increase the loss. Repeating such steps moves $$w$$ toward lower loss values, thus minimizing $$L(w)$$; this is the fundamental principle behind gradient descent. A minimal sketch of repeated updates follows.
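The sketch below iterates the update rule for this problem and shows $$w$$ approaching the minimizer $$w^* = 3$$; the iteration count of 20 is an arbitrary illustrative choice.

```python
def grad_loss(w):
    return 2 * (w - 3)  # dL/dw for L(w) = (w - 3)^2 + 2

eta = 0.1
w = 1.0
for k in range(20):
    w = w - eta * grad_loss(w)  # step opposite the gradient
print(w)  # ~ 2.977, approaching the minimizer w* = 3
```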
**Final answers:**
(a) $$dL = -4$$ at $$w_0 = 1$$
(b) $$w_1 = 1.4$$
(c) The negative differential guides the weight update to increase $$w$$, reducing loss.
(d) Moving opposite the gradient decreases the loss, leading to minimization.