Salary GPA Regression
1. **Problem Statement:**
We have data on starting salaries (in thousands) and college grade point averages (GPA) for 8 recent graduates. We want to analyze the relationship between GPA (x) and starting salary (y).
Data points: (4,33), (3,22), (3.5,24), (2,19), (3,20), (3.5,28), (2.5,16), (2.5,23).
---
2. **Scatter Diagram and Comment:**
Plot points with GPA on x-axis and salary on y-axis.
The points roughly show a positive trend: higher GPA tends to correspond to higher starting salary, but with some variability.
---
3. **Calculate sums and means:**
$$\sum x = 4 + 3 + 3.5 + 2 + 3 + 3.5 + 2.5 + 2.5 = 24$$
$$\sum y = 33 + 22 + 24 + 19 + 20 + 28 + 16 + 23 = 185$$
$$\sum x^2 = 4^2 + 3^2 + 3.5^2 + 2^2 + 3^2 + 3.5^2 + 2.5^2 + 2.5^2 = 16 + 9 + 12.25 + 4 + 9 + 12.25 + 6.25 + 6.25 = 75$$
$$\sum y^2 = 33^2 + 22^2 + 24^2 + 19^2 + 20^2 + 28^2 + 16^2 + 23^2 = 1089 + 484 + 576 + 361 + 400 + 784 + 256 + 529 = 4479$$
$$\sum xy = (4)(33) + (3)(22) + (3.5)(24) + (2)(19) + (3)(20) + (3.5)(28) + (2.5)(16) + (2.5)(23) = 132 + 66 + 84 + 38 + 60 + 98 + 40 + 57.5 = 575.5$$
$$n = 8$$
Means:
$$\bar{x} = \frac{\sum x}{n} = \frac{24}{8} = 3, \qquad \bar{y} = \frac{\sum y}{n} = \frac{185}{8} = 23.125$$
---
4. **Estimate regression coefficients:**
Slope:
$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$
Numerator:
$$n\sum xy = 8 \times 575.5 = 4604$$
$$(\sum x)(\sum y) = 24 \times 185 = 4440$$
Numerator = $$4604 - 4440 = 164$$
Denominator:
$$n\sum x^2 = 8 \times 75 = 600$$
$$(\sum x)^2 = 24^2 = 576$$
Denominator = $$600 - 576 = 24$$
The denominator equals $$n$$ times the sum of squared deviations of $$x$$ from its mean, so it must be positive; a negative value would signal an arithmetic error.
So slope:
$$b = \frac{164}{24} \approx 6.833$$
Intercept:
$$a = \frac{\sum y}{n} - b \frac{\sum x}{n} = \frac{185}{8} - 6.833 \times \frac{24}{8} = 23.125 - 6.833 \times 3 \approx 23.125 - 20.5 = 2.625$$
Regression equation:
$$\hat{y} = 2.625 + 6.833 x$$
---
5. **Interpret regression coefficients:**
- Intercept $$a = 2.625$$ is the estimated starting salary (in thousands) when GPA is 0. This is an extrapolation far outside the observed GPA range of 2 to 4, so it may not be meaningful.
- Slope $$b = 6.833$$ means that each 1-point increase in GPA is associated with an increase of about 6.83 thousand dollars in estimated starting salary.
---
6. **Estimate salary for GPA = 3:**
$$\hat{y} = 2.625 + 6.833 \times 3 \approx 2.625 + 20.5 = 23.125$$
Estimated starting salary is 23.125 thousand.
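The hand calculation of the sums, the coefficients, and the estimate can be cross-checked with a short script. The following is a minimal sketch in plain Python, not part of the original solution; the names `gpa`, `salary`, and `predict` are illustrative choices.

```python
# Data from the problem statement: GPA and starting salary (in thousands).
gpa = [4, 3, 3.5, 2, 3, 3.5, 2.5, 2.5]
salary = [33, 22, 24, 19, 20, 28, 16, 23]
n = len(gpa)

# Raw sums used in the least-squares formulas.
sum_x = sum(gpa)
sum_y = sum(salary)
sum_x2 = sum(x * x for x in gpa)
sum_xy = sum(x * y for x, y in zip(gpa, salary))

# Slope and intercept from the computational formulas.
numerator = n * sum_xy - sum_x * sum_y
denominator = n * sum_x2 - sum_x ** 2
# The denominator is n times the sum of squared deviations of x,
# so a non-positive value points to an arithmetic slip (or constant x).
assert denominator > 0, "check the sum of x^2"
b = numerator / denominator
a = sum_y / n - b * sum_x / n


def predict(x):
    """Estimated starting salary (in thousands) for a given GPA."""
    return a + b * x


print(f"sum x = {sum_x}, sum y = {sum_y}, sum x^2 = {sum_x2}, sum xy = {sum_xy}")
print(f"b = {b:.4f}, a = {a:.4f}")
print(f"estimated salary at GPA 3: {predict(3):.3f}")
```

A least-squares line with an intercept always passes through the point of means. Since the mean GPA here is 24/8 = 3, the estimate at GPA 3 coincides with the mean salary 185/8 = 23.125.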
---
7. **Calculate Pearson’s correlation coefficient $$r$$:**
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$
Numerator:
$$n\sum xy - (\sum x)(\sum y) = 4604 - 4440 = 164$$
Denominator, $$x$$ term, with $$\sum x^2 = 16 + 9 + 12.25 + 4 + 9 + 12.25 + 6.25 + 6.25 = 75$$:
$$n\sum x^2 - (\sum x)^2 = 8 \times 75 - 24^2 = 600 - 576 = 24$$
Denominator, $$y$$ term:
$$n\sum y^2 = 8 \times 4479 = 35832$$
$$(\sum y)^2 = 185^2 = 34225$$
$$n\sum y^2 - (\sum y)^2 = 35832 - 34225 = 1607$$
Both terms must be positive, since each is $$n$$ times a sum of squared deviations; a negative value under the square root would indicate an arithmetic error in one of the sums.
Denominator:
$$\sqrt{24 \times 1607} = \sqrt{38568} \approx 196.39$$
So
$$r = \frac{164}{196.39} \approx 0.835$$
Interpretation: Strong positive correlation between GPA and starting salary.
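As a cross-check on the arithmetic for $$r$$, here is a small sketch in plain Python that applies the same raw-sum formula. The function name `pearson_r` is a hypothetical helper introduced for illustration, and the guard against non-positive terms mirrors the sanity check on the denominator.

```python
import math

# Data from the problem statement: GPA and starting salary (in thousands).
gpa = [4, 3, 3.5, 2, 3, 3.5, 2.5, 2.5]
salary = [33, 22, 24, 19, 20, 28, 16, 23]


def pearson_r(xs, ys):
    """Pearson correlation computed from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = n * sum(x * x for x in xs) - sx ** 2
    syy = n * sum(y * y for y in ys) - sy ** 2
    sxy = n * sum(x * y for x, y in zip(xs, ys)) - sx * sy
    # Each of sxx and syy is n times a sum of squared deviations.
    # A non-positive value means constant data or a mistake in the sums.
    if sxx <= 0 or syy <= 0:
        raise ValueError("variance terms must be positive")
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(gpa, salary)
print(f"r = {r:.3f}")
```

Working from the data rather than from hand-copied totals means a mistyped sum cannot propagate into the result.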
---
8. **Coefficient of determination $$r^2$$:**
$$r^2 = (0.835)^2 = 0.697$$
About 69.7% of the variation in starting salary is explained by GPA.
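The phrase "variation explained" can be made concrete. For a simple linear regression with an intercept, $$r^2$$ equals one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below checks this on the data, assuming that standard identity; the names `sst` and `sse` are illustrative.

```python
# Data from the problem statement: GPA and starting salary (in thousands).
gpa = [4, 3, 3.5, 2, 3, 3.5, 2.5, 2.5]
salary = [33, 22, 24, 19, 20, 28, 16, 23]
n = len(gpa)

mean_x = sum(gpa) / n
mean_y = sum(salary) / n

# Least-squares fit written in deviation form.
sxx = sum((x - mean_x) ** 2 for x in gpa)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gpa, salary))
b = sxy / sxx
a = mean_y - b * mean_x

# Total variation in salary, and the part the fitted line leaves unexplained.
sst = sum((y - mean_y) ** 2 for y in salary)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(gpa, salary))

r_squared = 1 - sse / sst
print(f"SST = {sst:.3f}, SSE = {sse:.3f}, r^2 = {r_squared:.3f}")
```

Here the total variation is 200.875 and roughly 60.79 of it remains in the residuals, so about 69.7% is accounted for by the fitted line.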
---
**Final answers:**
- Slope, using $$\sum x^2 = 75$$: $$b = \frac{8 \times 575.5 - 24 \times 185}{8 \times 75 - 24^2} = \frac{4604 - 4440}{600 - 576} = \frac{164}{24} \approx 6.833$$
- Intercept: $$a = \frac{185}{8} - 6.833 \times \frac{24}{8} = 23.125 - 6.833 \times 3 \approx 23.125 - 20.5 = 2.625$$
- Regression equation: $$\hat{y} = 2.625 + 6.833 x$$
- Estimated salary for GPA = 3: $$\hat{y} = 2.625 + 6.833 \times 3 \approx 2.625 + 20.5 = 23.125$$ thousand.
- Pearson's $$r = 0.835$$, coefficient of determination $$r^2 = 0.697$$.
- Interpretation: higher GPA is associated with higher starting salary; about 69.7% of the variation in starting salary is explained by GPA.