Sampling Variance Hypothesis
1. **Problem 1: Variance of sample distributions**
Given population $S = \{1, 2, 3, 4, 5\}$, population size $N=5$, sample size $n=2$.
**Sampling with replacement:**
- Each sample of size 2 is drawn with replacement.
- Population mean $\mu = \frac{1+2+3+4+5}{5} = 3$.
- Population variance $\sigma^2 = \frac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5} = \frac{4+1+0+1+4}{5} = 2$.
- Variance of sample mean with replacement: $Var(\bar{X}) = \frac{\sigma^2}{n} = \frac{2}{2} = 1$.
**Sampling without replacement:**
- Variance of sample mean without replacement is $Var(\bar{X}) = \frac{\sigma^2}{n} \times \frac{N-n}{N-1} = 1 \times \frac{5-2}{5-1} = 1 \times \frac{3}{4} = 0.75$.
**Interpretation:**
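- Sampling without replacement shrinks the variance of the sample mean by the finite population correction $\frac{N-n}{N-1} = \frac{3}{4}$: no element can be drawn twice, so every draw adds new information about the population.
- Sampling with replacement keeps the full variance $\frac{\sigma^2}{n}$, because the draws are independent and the same element can appear more than once.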
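Both variances can be checked by brute force, since the population is small enough to list every possible sample. The sketch below is only a verification aid, not part of the original solution; the helper name `variance_of_sample_mean` is made up for illustration, and it assumes every listed sample is equally likely.

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3, 4, 5]
n = 2  # sample size


def variance_of_sample_mean(samples):
    """Variance of the sample means, treating all samples as equally likely."""
    means = [Fraction(sum(s), len(s)) for s in samples]
    grand_mean = sum(means) / len(means)
    return sum((m - grand_mean) ** 2 for m in means) / len(means)


# With replacement: all 5^2 = 25 ordered pairs are equally likely.
with_replacement = list(product(population, repeat=n))
# Without replacement: all C(5, 2) = 10 unordered pairs are equally likely.
without_replacement = list(combinations(population, n))

var_with = variance_of_sample_mean(with_replacement)
var_without = variance_of_sample_mean(without_replacement)

print(var_with)     # 1
print(var_without)  # 3/4
```

Exact fractions are used so the results match $\frac{\sigma^2}{n}$ and $\frac{\sigma^2}{n} \times \frac{N-n}{N-1}$ without rounding error.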
2. **Problem 2a: Hypothesis testing for spot remover**
- Null hypothesis $H_0: p=0.7$, alternative $H_a: p>0.7$.
- Sample size $n=10$.
- Reject $H_0$ if number of spots removed $\geq 6$.
I. **Type I error probability ($\alpha$):**
- Type I error is rejecting $H_0$ when $p=0.7$.
- $\alpha = P(X \geq 6 | p=0.7) = 1 - P(X \leq 5)$ where $X \sim Binomial(10,0.7)$.
- Calculate $P(X \leq 5) = \sum_{k=0}^5 \binom{10}{k} 0.7^k 0.3^{10-k}$.
- Using binomial CDF, $P(X \leq 5) \approx 0.1503$.
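- So, $\alpha = 1 - 0.1503 = 0.8497$. This is far above conventional levels such as 0.05, because the rejection region $X \geq 6$ contains the most likely outcomes when $p=0.7$ (the expected count is $np = 7$).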
II. **Power of the test when $p=0.5$:**
- Power = $P(\text{reject } H_0 | p=0.5) = P(X \geq 6 | p=0.5)$.
- Calculate $P(X \geq 6) = 1 - P(X \leq 5)$ for $X \sim Binomial(10,0.5)$.
- $P(X \leq 5) \approx 0.6230$.
- Power $= 1 - 0.6230 = 0.3770$.
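The two tail probabilities above can be reproduced with the standard library alone. This is a minimal sketch under the rejection rule stated in the problem ($X \geq 6$ with $n=10$); the function names `binom_pmf` and `prob_reject` are illustrative, not taken from the source.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_reject(p, n=10, cutoff=6):
    """P(X >= cutoff) for X ~ Binomial(n, p), i.e. the chance of rejecting H0."""
    return sum(binom_pmf(k, n, p) for k in range(cutoff, n + 1))


alpha = prob_reject(0.7)  # rejection probability when H0 (p = 0.7) is true
power = prob_reject(0.5)  # rejection probability when p = 0.5

print(f"alpha = {alpha:.4f}")           # alpha = 0.8497
print(f"power at p=0.5 = {power:.4f}")  # power at p=0.5 = 0.3770
```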
3. **Problem 2b: Testing newborn babies weights**
- Data: $4.0, 3.7, 2.5, 4.1, 6.7, 6.0, 2.8, 5.0, 2.35$.
- Sample size $n=9$.
**Step 1: Calculate sample mean $\bar{x}$**
$$\bar{x} = \frac{4.0 + 3.7 + 2.5 + 4.1 + 6.7 + 6.0 + 2.8 + 5.0 + 2.35}{9} = \frac{37.15}{9} \approx 4.128$$
**Step 2: Calculate sample standard deviation $s$**
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$$
Calculate squared deviations:
$(4.0-4.128)^2=0.0164$, $(3.7-4.128)^2=0.1832$, $(2.5-4.128)^2=2.6504$, $(4.1-4.128)^2=0.0008$, $(6.7-4.128)^2=6.6152$, $(6.0-4.128)^2=3.5044$, $(2.8-4.128)^2=1.7636$, $(5.0-4.128)^2=0.7604$, $(2.35-4.128)^2=3.1613$.
Sum $\approx 18.656$
$$s = \sqrt{\frac{18.656}{8}} = \sqrt{2.332} \approx 1.527$$
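Hand-summing nine squared deviations is error-prone, so it is worth cross-checking Steps 1 and 2 numerically. A minimal sketch using Python's `statistics` module, whose `stdev` uses the same $n-1$ denominator as the formula above:

```python
import statistics

weights = [4.0, 3.7, 2.5, 4.1, 6.7, 6.0, 2.8, 5.0, 2.35]

x_bar = statistics.mean(weights)
# Sum of squared deviations from the sample mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in weights)
# Sample standard deviation (divides by n - 1).
s = statistics.stdev(weights)

print(f"mean = {x_bar:.3f}")                     # mean = 4.128
print(f"sum of squares = {sum_sq_dev:.3f}")      # sum of squares = 18.656
print(f"s = {s:.3f}")                            # s = 1.527
```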
**Step 3: Hypothesis test for equal weights**
- Null hypothesis $H_0: \mu = \mu_0$ against the two-sided alternative $H_a: \mu \neq \mu_0$, where $\mu$ is the mean birth weight.
- The problem gives no reference mean, so $\mu_0 = 4$ is used here as an illustrative hypothesized value.
- Test statistic: $$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
- For $\mu_0=4$, $$t = \frac{4.128 - 4}{s/\sqrt{9}} \approx \frac{0.128}{0.509} \approx 0.251$$
- Degrees of freedom $df=8$.
- Critical t-values for two-tailed test at 5%: $\pm 2.306$, at 1%: $\pm 3.355$.
- Since $|t|=0.251 < 2.306$, fail to reject null at 5% and 1% significance.
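The test in Step 3 can be sketched in a few lines. The value `mu_0 = 4.0` is the illustrative hypothesized mean assumed in the text, and the critical values are copied from the $t$ table figures quoted above rather than computed, so that only the standard library is needed.

```python
import math
import statistics

weights = [4.0, 3.7, 2.5, 4.1, 6.7, 6.0, 2.8, 5.0, 2.35]
mu_0 = 4.0  # hypothesized mean (an assumption; the problem gives none)

n = len(weights)
std_err = statistics.stdev(weights) / math.sqrt(n)
t_stat = (statistics.mean(weights) - mu_0) / std_err

# Two-tailed critical values for df = 8, taken from the text.
critical = {0.05: 2.306, 0.01: 3.355}
reject = {level: abs(t_stat) > t_crit for level, t_crit in critical.items()}

print(f"t = {t_stat:.3f}")  # t = 0.251
print(reject)               # {0.05: False, 0.01: False}
```

If SciPy is available, `scipy.stats.ttest_1samp(weights, 4.0)` should give the same statistic along with a p-value.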
**Step 4: Confidence intervals for mean**
- 95% CI: $$\bar{x} \pm t_{0.025,8} \times \frac{s}{\sqrt{n}} = 4.128 \pm 2.306 \times 0.509 = 4.128 \pm 1.174 = (2.95, 5.30)$$
- 99% CI: $$\bar{x} \pm t_{0.005,8} \times \frac{s}{\sqrt{n}} = 4.128 \pm 3.355 \times 0.509 = 4.128 \pm 1.708 = (2.42, 5.84)$$
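The interval endpoints follow directly from $\bar{x} \pm t \times s/\sqrt{n}$, and a short script makes the arithmetic easy to recheck. In this sketch the helper `mean_confidence_interval` is a hypothetical name, and the critical values 2.306 and 3.355 are again taken from the text for $df = 8$.

```python
import math
import statistics


def mean_confidence_interval(data, t_crit):
    """Return (lower, upper) for the mean, given a t critical value."""
    x_bar = statistics.mean(data)
    half_width = t_crit * statistics.stdev(data) / math.sqrt(len(data))
    return x_bar - half_width, x_bar + half_width


weights = [4.0, 3.7, 2.5, 4.1, 6.7, 6.0, 2.8, 5.0, 2.35]

ci_95 = mean_confidence_interval(weights, 2.306)  # t_{0.025, 8}
ci_99 = mean_confidence_interval(weights, 3.355)  # t_{0.005, 8}

print(f"95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")  # 95% CI: (2.95, 5.30)
print(f"99% CI: ({ci_99[0]:.2f}, {ci_99[1]:.2f})")  # 99% CI: (2.42, 5.84)
```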
**Interpretation:**
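- The test finds no significant difference between the sample mean and the hypothesized mean of 4 at either the 5% or the 1% level.
- The confidence intervals give the range of plausible values for the population mean at the 95% and 99% levels. Both contain 4, which agrees with the test result.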