Frequency Correlation Skewness


1. **Problem Statement:** We have student score data and need to:

- a) Create a frequency table with class width 10, including cumulative frequency (cf), relative frequency, and cumulative relative frequency.
- b) Compute the mean, median, 75th, 34th, and 56th percentiles, and the sample standard deviation.
- c) Compute Spearman's rank correlation and the Pearson correlation between the given X and Y.
- d) Interpret the correlations.
- e) Determine the nature and causes of skewness.
- f) Represent the data using an ogive and a scatter graph.

---

2. **Frequency Table Construction:**

- Class intervals: 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85-94 (width = 10)
- Frequencies (Y): 13, 26, 1, 7, 32, 40, 16
- Cumulative frequency (cf) = sum of the frequencies up to and including each class.
- Relative frequency = frequency / total frequency.
- Cumulative relative frequency = cumulative frequency / total frequency.

Total frequency $N = 13 + 26 + 1 + 7 + 32 + 40 + 16 = 135$

| Class Interval | Frequency (f) | Cumulative Frequency (cf) | Relative Frequency (f/N) | Cumulative Relative Frequency (cf/N) |
|---|---|---|---|---|
| 25-34 | 13 | 13 | $\frac{13}{135} \approx 0.096$ | 0.096 |
| 35-44 | 26 | 39 | $\frac{26}{135} \approx 0.193$ | 0.289 |
| 45-54 | 1 | 40 | $\frac{1}{135} \approx 0.007$ | 0.296 |
| 55-64 | 7 | 47 | $\frac{7}{135} \approx 0.052$ | 0.348 |
| 65-74 | 32 | 79 | $\frac{32}{135} \approx 0.237$ | 0.585 |
| 75-84 | 40 | 119 | $\frac{40}{135} \approx 0.296$ | 0.881 |
| 85-94 | 16 | 135 | $\frac{16}{135} \approx 0.119$ | 1.000 |

---

3. **Mean Calculation:**

Mean $\bar{x} = \frac{\sum f x}{N}$, where $x$ is the midpoint of each class.

Midpoints $x$: 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5

Calculate $\sum f x$:

$$\sum f x = 13\times29.5 + 26\times39.5 + 1\times49.5 + 7\times59.5 + 32\times69.5 + 40\times79.5 + 16\times89.5$$

$$= 383.5 + 1027 + 49.5 + 416.5 + 2224 + 3180 + 1432 = 8712.5$$

Mean:

$$\bar{x} = \frac{8712.5}{135} \approx 64.54$$

---

4. **Median Calculation:**

The median class is the first class whose cumulative frequency is $\geq \frac{N}{2} = 67.5$. From the cf column, the median class is 65-74 (cf = 79).

Median formula:

$$\text{Median} = L + \left(\frac{\frac{N}{2} - F}{f_m}\right) \times w$$

Where:

- $L = 64.5$ (lower boundary of the median class)
- $F = 47$ (cf before the median class)
- $f_m = 32$ (frequency of the median class)
- $w = 10$ (class width)

Calculate:

$$\text{Median} = 64.5 + \left(\frac{67.5 - 47}{32}\right) \times 10 = 64.5 + \left(\frac{20.5}{32}\right) \times 10 = 64.5 + 6.41 = 70.91$$

---

5. **Percentiles Calculation:**

The percentile $P_k$ is the value below which $k\%$ of the data fall. Use the formula:

$$P_k = L + \left(\frac{kN/100 - F}{f_m}\right) \times w$$

Here $L$, $F$, and $f_m$ refer to the class that contains $P_k$, that is, the first class whose cumulative frequency is $\geq kN/100$.

- For the 75th percentile ($k=75$):

$$kN/100 = 0.75 \times 135 = 101.25$$

The percentile class is 75-84 (cf before = 79, f = 40).

$$P_{75} = 74.5 + \left(\frac{101.25 - 79}{40}\right) \times 10 = 74.5 + \left(\frac{22.25}{40}\right) \times 10 = 74.5 + 5.56 = 80.06$$

- For the 34th percentile ($k=34$):

$$kN/100 = 0.34 \times 135 = 45.9$$

The percentile class is 55-64 (cf before = 40, f = 7).

$$P_{34} = 54.5 + \left(\frac{45.9 - 40}{7}\right) \times 10 = 54.5 + \left(\frac{5.9}{7}\right) \times 10 = 54.5 + 8.43 = 62.93$$

- For the 56th percentile ($k=56$):

$$kN/100 = 0.56 \times 135 = 75.6$$

The percentile class is 65-74 (cf before = 47, f = 32).

$$P_{56} = 64.5 + \left(\frac{75.6 - 47}{32}\right) \times 10 = 64.5 + \left(\frac{28.6}{32}\right) \times 10 = 64.5 + 8.94 = 73.44$$

---

6. **Sample Standard Deviation Calculation:**

Formula:

$$s = \sqrt{\frac{\sum f x^2 - \frac{(\sum f x)^2}{N}}{N-1}}$$

Midpoints squared:

$$29.5^2=870.25,\ 39.5^2=1560.25,\ 49.5^2=2450.25,\ 59.5^2=3540.25,\ 69.5^2=4830.25,\ 79.5^2=6320.25,\ 89.5^2=8010.25$$

Calculate $\sum f x^2$:

$$\sum f x^2 = 13\times870.25 + 26\times1560.25 + 1\times2450.25 + 7\times3540.25 + 32\times4830.25 + 40\times6320.25 + 16\times8010.25$$

$$= 11313.25 + 40566.5 + 2450.25 + 24781.75 + 154568 + 252810 + 128164 = 614653.75$$

Calculate the variance:

$$s^2 = \frac{614653.75 - \frac{(8712.5)^2}{135}}{134} = \frac{614653.75 - \frac{75907656.25}{135}}{134} = \frac{614653.75 - 562278.94}{134} = \frac{52374.81}{134} \approx 390.86$$

Sample standard deviation:

$$s = \sqrt{390.86} \approx 19.77$$

This is a grouped-data estimate based on class midpoints; the raw scores would give a slightly different value.

---

7. **Spearman's Rank Correlation ($r_s$):**

Given:

$$X = [25, 35, 45, 55, 65, 75, 85]$$

$$Y = [13, 26, 1, 7, 32, 40, 16]$$

Rank X and Y in ascending order:

- Rank X: 1 to 7 (X is already sorted).
- Rank Y: 1 → rank 1, 7 → rank 2, 13 → rank 3, 16 → rank 4, 26 → rank 5, 32 → rank 6, 40 → rank 7.

Calculate the rank differences $d_i = \text{Rank } X_i - \text{Rank } Y_i$ and $d_i^2$:

| X | Rank X | Y | Rank Y | $d_i$ | $d_i^2$ |
|---|---|---|---|---|---|
| 25 | 1 | 13 | 3 | -2 | 4 |
| 35 | 2 | 26 | 5 | -3 | 9 |
| 45 | 3 | 1 | 1 | 2 | 4 |
| 55 | 4 | 7 | 2 | 2 | 4 |
| 65 | 5 | 32 | 6 | -1 | 1 |
| 75 | 6 | 40 | 7 | -1 | 1 |
| 85 | 7 | 16 | 4 | 3 | 9 |

Sum $\sum d_i^2 = 4 + 9 + 4 + 4 + 1 + 1 + 9 = 32$

Formula:

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 32}{7(49 - 1)} = 1 - \frac{192}{336} = 1 - 0.571 = 0.429$$

---

8. **Pearson Correlation ($r$):**

Calculate the means:

$$\bar{X} = \frac{25+35+45+55+65+75+85}{7} = 55$$

$$\bar{Y} = \frac{13+26+1+7+32+40+16}{7} \approx 19.29$$

Calculate the sum of cross-products and the standard deviations. Because $\sum (X_i - \bar{X}) = 0$, the cross-product sum equals $\sum (X_i - \bar{X}) Y_i$:

$$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = (25-55)(13-19.29) + \dots + (85-55)(16-19.29)$$

$$= (-30)(13) + (-20)(26) + (-10)(1) + (0)(7) + (10)(32) + (20)(40) + (30)(16) = 680$$

$$s_X = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n-1}} = \sqrt{\frac{2800}{6}} = \sqrt{466.67} \approx 21.602$$

$$s_Y = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{n-1}} = \sqrt{\frac{1171.43}{6}} \approx 13.973$$

Pearson correlation:

$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{(n-1)s_X s_Y} = \frac{680}{6 \times 21.602 \times 13.973} \approx \frac{680}{1811.1} \approx 0.375$$

---

9. **Interpretation:**

- Spearman's $r_s = 0.429$ indicates a moderate positive monotonic relationship.
- Pearson's $r = 0.375$ indicates a weak-to-moderate positive linear relationship.
- Higher Assessment scores tend to be associated with higher Instruction scores, but the relationship is not very strong.

---

10. **Skewness Nature:**

- The data have a long tail on the lower side (a cluster of low scores in the 25-44 classes, with most scores in 65-94), and the mean (64.54) is below the median (70.91). Both indicate **left (negative) skewness**.

11. **Causes of Skewness:**

- Outliers or low scores that pull the mean to the left.
- Uneven distribution of student performance.

12. **Graphs:**

- Ogive: plot cumulative frequency against the upper class boundary (34.5, 44.5, ..., 94.5).
- Scatter graph: plot the $(X, Y)$ pairs from step 7, with X on the horizontal axis and Y on the vertical axis.

---

Final answers:

- Mean $\approx 64.54$
- Median $\approx 70.91$
- 75th percentile $\approx 80.06$
- 34th percentile $\approx 62.93$
- 56th percentile $\approx 73.44$
- Sample standard deviation $\approx 19.77$
- Spearman's $r_s \approx 0.429$
- Pearson's $r \approx 0.375$
- Skewness: Negative (left skewed)
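
The grouped-data steps (frequency table, mean, median, percentiles, and sample standard deviation) can be checked numerically. The following is a minimal sketch, assuming the class limits and frequencies listed in step 2 and the midpoint and interpolation formulas from steps 3 to 6; the variable and function names (`lower_limits`, `grouped_percentile`, and so on) are illustrative and not part of the original solution.

```python
import math

# Class lower limits, width, and frequencies as listed in step 2.
lower_limits = [25, 35, 45, 55, 65, 75, 85]
freqs = [13, 26, 1, 7, 32, 40, 16]
width = 10

n = sum(freqs)
midpoints = [lo + 4.5 for lo in lower_limits]    # 29.5, 39.5, ..., 89.5
boundaries = [lo - 0.5 for lo in lower_limits]   # 24.5, 34.5, ..., 84.5

# Cumulative frequencies (cf column of the table).
cum = []
running = 0
for f in freqs:
    running += f
    cum.append(running)

def grouped_percentile(k):
    """Interpolated k-th percentile: P_k = L + ((kN/100 - F) / f) * w."""
    target = k * n / 100
    for i, cf in enumerate(cum):
        if cf >= target:
            before = cum[i - 1] if i > 0 else 0
            return boundaries[i] + (target - before) / freqs[i] * width
    raise ValueError("k must be between 0 and 100")

sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

mean = sum_fx / n
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)   # sample variance (N - 1)
std_dev = math.sqrt(variance)
median = grouped_percentile(50)

print("N =", n, "cf =", cum)
print("mean =", round(mean, 2), "median =", round(median, 2))
for k in (75, 34, 56):
    print(f"P{k} =", round(grouped_percentile(k), 2))
print("sample std dev =", round(std_dev, 2))
```

Because the sample variance is a sum of squared deviations divided by $N-1$, the computed `variance` can never be negative; a negative value in a hand calculation signals an arithmetic slip in $\sum f x^2$ or $(\sum f x)^2$.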
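
The two correlation coefficients in steps 7 and 8 can be recomputed the same way. This sketch uses only the standard library and assumes the X and Y lists given in step 7 with no tied values; `ranks`, `spearman`, and `pearson` are hypothetical helper names introduced here for illustration.

```python
import math

x = [25, 35, 45, 55, 65, 75, 85]
y = [13, 26, 1, 7, 32, 40, 16]

def ranks(values):
    """Ascending ranks starting at 1; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(a, b):
    """r_s = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)), valid when there are no ties."""
    d2 = sum((p - q) ** 2 for p, q in zip(ranks(a), ranks(b)))
    m = len(a)
    return 1 - 6 * d2 / (m * (m * m - 1))

def pearson(a, b):
    """r = S_xy / sqrt(S_xx * S_yy); the (n - 1) factors cancel."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    s_ab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    s_aa = sum((p - mean_a) ** 2 for p in a)
    s_bb = sum((q - mean_b) ** 2 for q in b)
    return s_ab / math.sqrt(s_aa * s_bb)

r_s = spearman(x, y)
r = pearson(x, y)

print("rank Y =", ranks(y))
print("Spearman r_s =", round(r_s, 3))
print("Pearson r =", round(r, 3))
```

Writing Pearson's $r$ as $S_{xy} / \sqrt{S_{xx} S_{yy}}$ is algebraically the same as dividing by $(n-1) s_X s_Y$, and it avoids rounding the two standard deviations before the final division.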