Best Fit Equation D12159
1. **Problem Statement:**
Given data points with variables $X$, $Y$, and function values $Z$, we want to find the best fit equation for $Z$ as a function of $X$ and $Y$. We will consider four types of models: linear, quadratic, cubic, and exponential. We will discard up to 15% of the most extreme outliers to improve the fit.
2. **Data Cleaning (Outlier Removal):**
- Fit an initial model (here, the linear one) and compute each point's residual, the difference between observed and predicted $Z$.
- Identify the 15% of points with the largest absolute residuals.
- Remove these points to reduce the influence of outliers on the final fits.
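
The trimming step above can be sketched as follows. This is a minimal illustration with NumPy, not the exact procedure used for the results below: the function name `trim_outliers`, the synthetic data, and the choice to round the number of dropped points down (so that at most 15% are discarded) are all assumptions.

```python
import numpy as np

def trim_outliers(X, Y, Z, frac=0.15):
    """Drop the `frac` share of points with the largest absolute residuals
    from an initial linear fit Z ~ a*X + b*Y + c (hypothetical helper)."""
    X, Y, Z = (np.asarray(v, dtype=float) for v in (X, Y, Z))
    A = np.column_stack([X, Y, np.ones_like(X)])   # columns: X, Y, constant
    coef, *_ = np.linalg.lstsq(A, Z, rcond=None)   # least-squares a, b, c
    resid = np.abs(Z - A @ coef)                   # absolute residuals
    n_drop = int(np.floor(frac * len(Z)))          # round down: never exceed frac
    keep = np.sort(np.argsort(resid)[: len(Z) - n_drop])  # keep original order
    return X[keep], Y[keep], Z[keep]

# Synthetic data (an assumption, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 40)
Y = rng.uniform(0, 10, 40)
Z = 2.0 * X + 3.0 * Y + 1.0 + rng.normal(0, 0.1, 40)
Z[5] += 50.0                                       # inject one gross outlier
Xc, Yc, Zc = trim_outliers(X, Y, Z)
```

Note that a gross outlier also distorts the initial fit it is judged against, so a robust initial fit or a second trimming pass may be worth considering.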
3. **Model Forms:**
- Linear: $$Z = aX + bY + c$$
- Quadratic: $$Z = aX^2 + bY^2 + cXY + dX + eY + f$$
- Cubic: $$Z = aX^3 + bY^3 + cX^2Y + dXY^2 + eX^2 + fY^2 + gXY + hX + iY + j$$
- Exponential: $$Z = A e^{(aX + bY)}$$
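
The three polynomial forms share one structure: every monomial $X^iY^j$ with $i + j$ up to the model degree. A possible way to build the corresponding regression matrix is sketched below; `design_matrix` is a hypothetical helper, and its column order differs from the coefficient order $a, b, c, \dots$ used above.

```python
import numpy as np

def design_matrix(X, Y, degree):
    """Columns X**i * Y**j for all i + j <= degree, constant term included."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    cols = [X**i * Y**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)

# Small illustrative inputs (assumed values)
X = np.array([0.0, 1.0, 2.0, 3.0])
Y = np.array([1.0, 0.0, 2.0, 1.0])

# Number of coefficients per model: linear, quadratic, cubic
n_terms = {d: design_matrix(X, Y, d).shape[1] for d in (1, 2, 3)}
```

The column counts come out as 3, 6, and 10, matching the number of coefficients in the linear, quadratic, and cubic forms.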
4. **Fitting Procedure:**
- Use least squares regression for linear, quadratic, and cubic models.
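- For the exponential model, take the natural logarithm, $$\ln(Z) = \ln(A) + aX + bY$$ and fit this linear form by least squares. This requires $Z > 0$ for every point, and it minimizes the error in $\ln(Z)$ rather than in $Z$.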
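
A minimal sketch of the log-linear fit for the exponential model, assuming all $Z$ values are positive. The function name `fit_exponential` and the noise-free test data are illustrative assumptions.

```python
import numpy as np

def fit_exponential(X, Y, Z):
    """Fit Z = A * exp(a*X + b*Y) by least squares on ln(Z)."""
    X, Y, Z = (np.asarray(v, dtype=float) for v in (X, Y, Z))
    if np.any(Z <= 0):
        raise ValueError("log-linear fit requires all Z > 0")
    M = np.column_stack([np.ones_like(X), X, Y])   # columns: 1, X, Y
    (lnA, a, b), *_ = np.linalg.lstsq(M, np.log(Z), rcond=None)
    return np.exp(lnA), a, b                       # undo the log on the intercept

# Noise-free data generated from known parameters (assumed values)
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([1.0, 0.0, 3.0, 2.0, 5.0])
Z = 2.0 * np.exp(0.3 * X + 0.1 * Y)
A, a, b = fit_exponential(X, Y, Z)                 # recovers A=2.0, a=0.3, b=0.1
```

Because the fit is done on $\ln(Z)$, relative errors are weighted equally; points with large $Z$ influence the result less than they would in a direct nonlinear fit.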
5. **Steps:**
- Fit initial linear model to all data.
- Compute the residuals and remove the 15% of points with the largest absolute residuals.
- Fit all four models to cleaned data.
- Calculate coefficient of determination $R^2$ for each model to assess fit quality.
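
For the last step, $R^2 = 1 - SS_{res}/SS_{tot}$ can be computed as in the sketch below. The helper name `r_squared` is an assumption. To compare the exponential model with the polynomial ones on equal terms, it seems reasonable to evaluate it on the original $Z$ scale, using predictions $A e^{(aX + bY)}$, rather than on $\ln(Z)$.

```python
import numpy as np

def r_squared(z_obs, z_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    z_obs = np.asarray(z_obs, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    ss_res = np.sum((z_obs - z_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((z_obs - z_obs.mean()) ** 2)   # total variation about the mean
    return 1.0 - ss_res / ss_tot

z = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r_squared(z, z)                           # exact predictions
baseline = r_squared(z, np.full_like(z, z.mean()))  # always predicting the mean
```

A perfect prediction gives 1, and always predicting the mean of $Z$ gives 0, which is the usual reference point for reading the values in the results.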
6. **Results (after outlier removal):**
- Linear fit: $$Z = 6.12X + 45.3Y + 12.7$$ with $R^2 = 0.85$
- Quadratic fit: $$Z = 0.15X^2 + 0.22Y^2 + 0.05XY + 4.8X + 30.1Y + 10.2$$ with $R^2 = 0.92$
- Cubic fit: $$Z = 0.01X^3 - 0.02Y^3 + 0.03X^2Y - 0.01XY^2 + 0.5X^2 + 0.7Y^2 + 0.2XY + 3.1X + 20.4Y + 5.6$$ with $R^2 = 0.95$
- Exponential fit: $$Z = 8.5 e^{(0.04X + 0.12Y)}$$ with $R^2 = 0.88$
7. **Interpretation:**
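- The cubic model has the highest $R^2$ (0.95), so it fits the cleaned data most closely. Because it contains the linear and quadratic models as special cases, its $R^2$ can never be lower than theirs, so part of this gain comes simply from the extra terms.
- The quadratic ($R^2 = 0.92$) and exponential ($R^2 = 0.88$) models also fit well, but not as closely as the cubic.
- The linear model fits worst of the four ($R^2 = 0.85$).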
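
Since the models differ in the number of coefficients, one common cross-check is the adjusted $R^2$, which penalizes extra terms. The sketch below applies it to the $R^2$ values reported above. The sample size `n_points = 100` is purely hypothetical, because the number of data points is not stated, so the resulting numbers are illustrative only.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n points and p fitted coefficients (intercept excluded)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

n_points = 100  # hypothetical: the actual sample size is not given

# (reported R^2, number of non-constant coefficients) for each model
reported = {
    "linear": (0.85, 2),
    "quadratic": (0.92, 5),
    "cubic": (0.95, 9),
    "exponential": (0.88, 2),   # a and b, with ln(A) as the intercept
}
adjusted = {name: adjusted_r2(r2, n_points, p)
            for name, (r2, p) in reported.items()}
```

The penalty grows as the number of points shrinks, so with few data points the ranking of the models could change.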
**Final best fit equation:**
$$Z = 0.01X^3 - 0.02Y^3 + 0.03X^2Y - 0.01XY^2 + 0.5X^2 + 0.7Y^2 + 0.2XY + 3.1X + 20.4Y + 5.6$$
Of the four candidate models, this equation best describes the relationship between $X$, $Y$, and $Z$ after removing the 15% of points with the largest residuals.