Sales T Test Vaccine Regression Correlation
1. **Problem:** Determine whether the average sales of salesmen A and B differ significantly, using a two-sample t-test at the 5% significance level ($H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \neq \mu_2$).
2. **Given:**
- Salesman A: $n_1=20$, $\bar{x}_1=170$, $s_1=20$
- Salesman B: $n_2=18$, $\bar{x}_2=205$, $s_2=25$
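3. **Formula:** For two independent samples whose variances are not assumed equal (Welch's t-test), the test statistic is
$$t=\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
The degrees of freedom are approximated by the Welch-Satterthwaite equation:
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$$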
4. **Calculate:**
$$t=\frac{170 - 205}{\sqrt{\frac{20^2}{20} + \frac{25^2}{18}}} = \frac{-35}{\sqrt{20 + 34.72}} = \frac{-35}{\sqrt{54.72}} = \frac{-35}{7.397} = -4.73$$
5. **Degrees of freedom:**
$$df = \frac{(\frac{400}{20} + \frac{625}{18})^2}{\frac{(\frac{400}{20})^2}{19} + \frac{(\frac{625}{18})^2}{17}} = \frac{(20 + 34.72)^2}{\frac{400}{19} + \frac{1205.63}{17}} = \frac{54.72^2}{21.05 + 70.92} = \frac{2994.5}{91.97} \approx 32.56$$
6. **Critical t-value:** For $df \approx 33$ at 5% significance (two-tailed), $t_{crit} \approx \pm 2.034$
7. **Decision:** Since $|t|=4.73 > 2.034$, reject the null hypothesis. There is a significant difference in average sales.
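The calculation above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `welch_t` is a hypothetical helper introduced here, and the critical value is the one quoted in the text rather than computed.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t, df) for Welch's two-sample t-test from summary statistics."""
    v1 = sd1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2  # estimated variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics for salesmen A and B, taken from the problem statement
t, df = welch_t(20, 170, 20, 18, 205, 25)

T_CRIT = 2.034  # two-tailed 5% critical value quoted in the text, not computed here
reject = abs(t) > T_CRIT
print(f"t = {t:.2f}, df = {df:.2f}, reject H0: {reject}")
```

Working from unrounded intermediate values avoids the small drift that hand rounding introduces into the degrees of freedom.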
---
1. **Problem:** Test whether the influenza vaccine is effective using a chi-square test of independence at the 5% significance level (95% confidence).
2. **Data:**
| Outcome | Vaccine | Placebo |
|---------|---------|---------|
| Yes | 22 | 83 |
| No | 225 | 226 |
3. **Totals:**
- Vaccine total: $22 + 225 = 247$
- Placebo total: $83 + 226 = 309$
- Total yes: $22 + 83 = 105$
- Total no: $225 + 226 = 451$
- Grand total: $247 + 309 = 556$
4. **Expected frequencies:**
$$E_{ij} = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$
- $E_{yes,vaccine} = \frac{105 \times 247}{556} = 46.65$
- $E_{yes,placebo} = \frac{105 \times 309}{556} = 58.35$
- $E_{no,vaccine} = \frac{451 \times 247}{556} = 200.35$
- $E_{no,placebo} = \frac{451 \times 309}{556} = 250.65$
5. **Chi-square statistic:**
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(22-46.65)^2}{46.65} + \frac{(83-58.35)^2}{58.35} + \frac{(225-200.35)^2}{200.35} + \frac{(226-250.65)^2}{250.65}$$
$$= \frac{(-24.65)^2}{46.65} + \frac{24.65^2}{58.35} + \frac{24.65^2}{200.35} + \frac{(-24.65)^2}{250.65}$$
$$= 13.03 + 10.41 + 3.03 + 2.42 = 28.89$$
6. **Degrees of freedom:** $(rows-1)(columns-1) = (2-1)(2-1) = 1$
7. **Critical value:** At the 5% significance level (95% confidence) with $df = 1$, $\chi^2_{crit} = 3.841$
8. **Decision:** Since $28.89 > 3.841$, reject the null hypothesis that the outcome is independent of the treatment. The association between vaccination and outcome is statistically significant. The observed "Yes" count in the vaccine group (22) is well below its expected value (46.65), which supports vaccine effectiveness if "Yes" denotes contracting influenza.
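The expected-frequency and chi-square steps above can be sketched in a few lines of Python. This is an illustrative check only: `chi_square_stat` is a hypothetical helper name, no continuity (Yates) correction is applied, matching the hand calculation, and the critical value is the one quoted in the text.

```python
def chi_square_stat(table):
    """Pearson chi-square statistic and df for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: outcome Yes / No; columns: Vaccine / Placebo (from the data table)
observed = [[22, 83], [225, 226]]
chi2, dof = chi_square_stat(observed)

CHI2_CRIT = 3.841  # 5% critical value for df = 1, as quoted in the text
print(f"chi-square = {chi2:.2f}, df = {dof}, reject H0: {chi2 > CHI2_CRIT}")
```

The same function works for larger contingency tables, since the expected counts and degrees of freedom are computed from the table's own dimensions.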
---
1. **Problem:** Find the regression equation of $y$ on $x$ and estimate the cost for 4.5 hectares.
2. **Data:**
| Piece | $X$ (hectares) | $Y$ (cost) |
|-------|----------------|------------|
| C | 4.2 | 450 |
| D | 3.3 | 310 |
| E | 5.2 | 550 |
| F | 6.0 | 590 |
| G | 7.3 | 740 |
| H | 8.4 | 850 |
| J | 5.6 | 530 |
3. **Calculate means:**
$$\bar{X} = \frac{4.2+3.3+5.2+6.0+7.3+8.4+5.6}{7} = \frac{40.0}{7} = 5.714$$
$$\bar{Y} = \frac{450+310+550+590+740+850+530}{7} = \frac{4020}{7} = 574.29$$
4. **Calculate sums:**
$$S_{XY} = \sum (X_i - \bar{X})(Y_i - \bar{Y})$$
$$S_{XX} = \sum (X_i - \bar{X})^2$$
Calculate each:
- C: $(4.2-5.714)(450-574.29) = (-1.514)(-124.29) = 188.18$
- D: $(3.3-5.714)(310-574.29) = (-2.414)(-264.29) = 638.00$
- E: $(5.2-5.714)(550-574.29) = (-0.514)(-24.29) = 12.49$
- F: $(6.0-5.714)(590-574.29) = 0.286(15.71) = 4.49$
- G: $(7.3-5.714)(740-574.29) = 1.586(165.71) = 262.82$
- H: $(8.4-5.714)(850-574.29) = 2.686(275.71) = 740.56$
- J: $(5.6-5.714)(530-574.29) = (-0.114)(-44.29) = 5.05$
Sum $S_{XY} = 188.18 + 638.00 + 12.49 + 4.49 + 262.82 + 740.56 + 5.05 = 1851.59$
Similarly for $S_{XX}$:
- C: $(-1.514)^2 = 2.29$
- D: $(-2.414)^2 = 5.83$
- E: $(-0.514)^2 = 0.26$
- F: $0.286^2 = 0.08$
- G: $1.586^2 = 2.52$
- H: $2.686^2 = 7.21$
- J: $(-0.114)^2 = 0.01$
Sum $S_{XX} = 2.29 + 5.83 + 0.26 + 0.08 + 2.52 + 7.21 + 0.01 = 18.20$
The terms above use means rounded to three decimals. With unrounded means the sums are $S_{XY} = 1851.57$ and $S_{XX} = 18.2086$, and these values are used below.
5. **Calculate slope and intercept:**
$$b = \frac{S_{XY}}{S_{XX}} = \frac{1851.57}{18.2086} = 101.687$$
$$a = \bar{Y} - b\bar{X} = 574.29 - 101.687 \times 5.7143 = 574.29 - 581.07 = -6.78$$
6. **Regression equation:**
$$y = -6.78 + 101.687x$$
7. **Estimate cost for 4.5 hectares:**
$$y = -6.78 + 101.687 \times 4.5 = -6.78 + 457.59 = 450.81$$
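Because the slope is a ratio of two sums of small deviations, hand rounding can shift the result noticeably. The least-squares steps above can be reproduced without intermediate rounding in a short script. This is a minimal sketch in plain Python; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the regression of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Pieces C, D, E, F, G, H, J from the data table
hectares = [4.2, 3.3, 5.2, 6.0, 7.3, 8.4, 5.6]
cost = [450, 310, 550, 590, 740, 850, 530]

a, b = fit_line(hectares, cost)
estimate = a + b * 4.5  # predicted cost for 4.5 hectares
print(f"y = {a:.2f} + {b:.3f}x, estimate at 4.5 ha = {estimate:.2f}")
```

Note that 4.5 hectares lies inside the observed range of 3.3 to 8.4 hectares, so the estimate is an interpolation rather than an extrapolation.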
---
1. **Problem:** Determine the rank correlation between average monthly temperature and ice cream sales.
2. **Data:**
| Month | Temp (°C) | Sales (000) |
|-------|-----------|-------------|
| Jan | 4 | 73 |
| Feb | 4 | 57 |
| Mar | 7 | 81 |
| Apr | 8 | 94 |
| May | 12 | 110 |
| June | 15 | 124 |
| July | 16 | 134 |
| Aug | 17 | 139 |
| Sep | 14 | 124 |
| Oct | 11 | 103 |
| Nov | 7 | 81 |
| Dec | 5 | 80 |
3. **Assign ranks:**
- Rank temps (average rank for ties):
4 (Jan, Feb) = rank 1.5, 5 (Dec) = 3, 7 (Mar, Nov) = 4.5, 8 (Apr) = 6, 11 (Oct) = 7, 12 (May) = 8, 14 (Sep) = 9, 15 (June) = 10, 16 (July) = 11, 17 (Aug) = 12
- Rank sales:
57 (Feb) = 1, 73 (Jan) = 2, 80 (Dec) = 3, 81 (Mar, Nov) = 4.5, 94 (Apr) = 6, 103 (Oct) = 7, 110 (May) = 8, 124 (June, Sep) = 9.5, 134 (July) = 11, 139 (Aug) = 12
4. **Calculate differences $d_i$ and $d_i^2$:**
| Month | Rank Temp | Rank Sales | $d_i$ | $d_i^2$ |
|-------|-----------|------------|-------|---------|
| Jan | 1.5 | 2 | -0.5 | 0.25 |
| Feb | 1.5 | 1 | 0.5 | 0.25 |
| Mar | 4.5 | 4.5 | 0 | 0 |
| Apr | 6 | 6 | 0 | 0 |
| May | 8 | 8 | 0 | 0 |
| June | 10 | 9.5 | 0.5 | 0.25 |
| July | 11 | 11 | 0 | 0 |
| Aug | 12 | 12 | 0 | 0 |
| Sep | 9 | 9.5 | -0.5 | 0.25 |
| Oct | 7 | 7 | 0 | 0 |
| Nov | 4.5 | 4.5 | 0 | 0 |
| Dec | 3 | 3 | 0 | 0 |
Sum $d_i^2 = 0.25 + 0.25 + 0.25 + 0.25 = 1$
5. **Spearman's rank correlation coefficient:**
$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 1}{12(144 - 1)} = 1 - \frac{6}{1716} = 1 - 0.0035 = 0.9965$$
6. **Interpretation:** $r_s \approx 0.997$ indicates a very strong positive monotonic association: months with higher average temperature have higher ice cream sales.
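The ranking and the $\sum d_i^2$ formula above can be sketched as follows. The helper names `average_ranks` and `spearman_simple` are hypothetical. The script applies the same simplified formula as the text, which is only approximate when ties are present; for this data the tie-adjusted value (Pearson correlation of the ranks) agrees to four decimal places.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_simple(xs, ys):
    """Spearman's r_s via 1 - 6*sum(d^2) / (n(n^2 - 1)), using average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    d2 = sum((p - q) ** 2 for p, q in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Jan .. Dec, from the data table
temps = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

r_s = spearman_simple(temps, sales)
print(f"r_s = {r_s:.4f}")
```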