Cluster Variance


1. **Problem Statement:** We want to calculate the variance $v_c^2$ of each cluster using the formula

   $$ v_c^2 = \frac{1}{n_c - 1} \sum_{i=1}^{n_c} \left( d_i - \overline{d} \right)^2 $$

   where $n_c$ is the number of data points in cluster $c$, $d_i$ are the data points, and $\overline{d}$ is the mean of the data points in the cluster.

2. **Dataset and Clustering:** Given points: $(1,1), (3,1), (1,3), (3,3), (7,7), (9,9)$. Assume two clusters are formed:
   - Cluster 1: $(1,1), (3,1), (1,3), (3,3)$
   - Cluster 2: $(7,7), (9,9)$

3. **Calculate Mean of Each Cluster:**
   - For Cluster 1:

     $$ \overline{x} = \frac{1+3+1+3}{4} = 2, \quad \overline{y} = \frac{1+1+3+3}{4} = 2 $$

   - For Cluster 2:

     $$ \overline{x} = \frac{7+9}{2} = 8, \quad \overline{y} = \frac{7+9}{2} = 8 $$

4. **Calculate Variance for Each Cluster:** The points are two-dimensional, so the squared deviation $\left( d_i - \overline{d} \right)^2$ is computed separately per dimension and then summed: $(x_i - \overline{x})^2 + (y_i - \overline{y})^2$.

   For Cluster 1:

   $$ v_1^2 = \frac{1}{4-1} \sum_{i=1}^{4} \left( (x_i - 2)^2 + (y_i - 2)^2 \right) $$

   Calculations for each point in Cluster 1:
   - $(1,1): (1-2)^2 + (1-2)^2 = 1 + 1 = 2$
   - $(3,1): (3-2)^2 + (1-2)^2 = 1 + 1 = 2$
   - $(1,3): (1-2)^2 + (3-2)^2 = 1 + 1 = 2$
   - $(3,3): (3-2)^2 + (3-2)^2 = 1 + 1 = 2$

   Sum $= 2 + 2 + 2 + 2 = 8$, so

   $$ v_1^2 = \frac{8}{3} \approx 2.6667 $$

   For Cluster 2:

   $$ v_2^2 = \frac{1}{2-1} \sum_{i=1}^{2} \left( (x_i - 8)^2 + (y_i - 8)^2 \right) $$

   Calculations for each point in Cluster 2:
   - $(7,7): (7-8)^2 + (7-8)^2 = 1 + 1 = 2$
   - $(9,9): (9-8)^2 + (9-8)^2 = 1 + 1 = 2$

   Sum $= 2 + 2 = 4$, so

   $$ v_2^2 = \frac{4}{1} = 4 $$

5. **Final Answer:**
   - Variance of Cluster 1: $v_1^2 = \frac{8}{3} \approx 2.67$
   - Variance of Cluster 2: $v_2^2 = 4$
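
The calculation above can be checked with a short Python sketch. It assumes the same reading of the formula as the worked solution: squared deviations from the cluster mean are summed over both coordinates and divided by $n_c - 1$. The function name `cluster_variance` and the variable names are illustrative and are not part of the original problem.

```python
def cluster_variance(points):
    """Sum of squared deviations from the cluster mean, divided by (n - 1)."""
    n = len(points)
    if n < 2:
        # The (n - 1) denominator is zero for a single point.
        raise ValueError("need at least two points in a cluster")
    dims = len(points[0])
    # Mean of the cluster, one coordinate at a time.
    mean = [sum(p[k] for p in points) / n for k in range(dims)]
    # Squared deviations, summed over all points and all dimensions.
    total = sum((p[k] - mean[k]) ** 2 for p in points for k in range(dims))
    return total / (n - 1)


cluster_1 = [(1, 1), (3, 1), (1, 3), (3, 3)]
cluster_2 = [(7, 7), (9, 9)]

v1 = cluster_variance(cluster_1)
v2 = cluster_variance(cluster_2)
print(round(v1, 4), v2)  # prints: 2.6667 4.0
```

The printed values match the hand calculation: $8/3 \approx 2.6667$ for Cluster 1 and $4$ for Cluster 2.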