Variable Classification And Data Analysis
1. Problem: Classify each variable as categorical or numerical; if numerical, state whether it is discrete or continuous; and identify its measurement scale.
1.a Number of telephones per household:
- This is a numerical variable because it counts telephones.
- It is discrete since telephones are countable in whole numbers.
- Measurement scale: Ratio scale (has a true zero and meaningful ratios).
1.b Length (in minutes) of the longest telephone call made in a month:
- Numerical variable because it measures time.
- Continuous, since a duration can take any value within an interval (limited only by measurement precision).
- Measurement scale: Ratio scale.
1.c Whether someone in the household owns a Wi-Fi-capable cell phone:
- Categorical variable (yes/no).
- Measurement scale: Nominal scale (categories without order).
1.d Whether there is a high-speed Internet connection in the household:
- Categorical variable (yes/no).
- Measurement scale: Nominal scale.
2. Problem: Given the sample data set $\{7,4,9,7,3,12\}$ with $n=6$, compute descriptive statistics and describe the data.
2.a Compute mean, median, and mode:
- Mean: $$\frac{7+4+9+7+3+12}{6} = \frac{42}{6} = 7$$
- Median: Sort data: $\{3,4,7,7,9,12\}$; median is average of middle two: $$\frac{7+7}{2} = 7$$
- Mode: Most frequent value is 7.
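The three measures of central tendency above can be cross-checked with a minimal sketch using Python's standard `statistics` module. The variable names are illustrative and not part of the original solution.

```python
import statistics

# Sample from Problem 2
data = [7, 4, 9, 7, 3, 12]

mean = statistics.mean(data)      # arithmetic mean: 42 / 6
median = statistics.median(data)  # average of the two middle sorted values
mode = statistics.mode(data)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)
```

All three values come out equal to 7, matching the hand calculation.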
2.b Compute range, variance, standard deviation, coefficient of variation:
- Range: $12 - 3 = 9$
- Variance: Calculate squared deviations from mean 7:
$$(7-7)^2=0, (4-7)^2=9, (9-7)^2=4, (7-7)^2=0, (3-7)^2=16, (12-7)^2=25$$
- Sum of squared deviations: $0+9+4+0+16+25=54$
- Sample variance: $$s^2 = \frac{54}{6-1} = \frac{54}{5} = 10.8$$
- Standard deviation: $$s = \sqrt{10.8} \approx 3.29$$
- Coefficient of variation: $$CV = \frac{s}{\text{mean}} = \frac{3.29}{7} \approx 0.47$$ that is, the standard deviation is about 47% of the mean.
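The measures of variation above can be reproduced with a short sketch, again assuming Python's standard `statistics` module. Note that `statistics.variance` and `statistics.stdev` use the sample ($n-1$) denominator, which matches the formula used here.

```python
import statistics

data = [7, 4, 9, 7, 3, 12]

data_range = max(data) - min(data)           # 12 - 3
sample_variance = statistics.variance(data)  # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(data)           # square root of the sample variance
cv = sample_sd / statistics.mean(data)       # coefficient of variation as a proportion

print(f"range = {data_range}")
print(f"s^2   = {sample_variance:.2f}")
print(f"s     = {sample_sd:.2f}")
print(f"CV    = {cv:.2%}")
```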
2.c Compute Z scores and check for outliers:
- Z score formula: $$Z = \frac{x - \text{mean}}{s}$$
- Calculate for each value, in the original data order:
  - $x = 7$: $Z = \frac{7-7}{3.29} = 0$
  - $x = 4$: $Z = \frac{4-7}{3.29} \approx -0.91$
  - $x = 9$: $Z = \frac{9-7}{3.29} \approx 0.61$
  - $x = 7$: $Z = \frac{7-7}{3.29} = 0$
  - $x = 3$: $Z = \frac{3-7}{3.29} \approx -1.22$
  - $x = 12$: $Z = \frac{12-7}{3.29} \approx 1.52$
- Outlier rule used here: $|Z| > 2$ (many texts use the stricter cutoff $|Z| > 3$). The largest absolute Z score is $1.52$, so no value is an outlier under either cutoff.
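The Z score calculation and outlier check can be sketched as follows. The name `CUTOFF` is an illustrative assumption set to the $|Z| > 2$ rule from the text; it can be changed to 3 for the stricter convention.

```python
import statistics

data = [7, 4, 9, 7, 3, 12]
mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

CUTOFF = 2.0  # assumed cutoff, matching the |Z| > 2 rule in the text

# Z score of each observation, in the original data order
z_scores = [(x - mean) / s for x in data]

# Observations whose absolute Z score exceeds the cutoff
outliers = [x for x, z in zip(data, z_scores) if abs(z) > CUTOFF]

for x, z in zip(data, z_scores):
    print(f"x = {x:>2}  Z = {z:+.2f}")
print("outliers:", outliers)
```

The outlier list is empty for this sample, in agreement with the hand calculation.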
2.d Describe shape of data set:
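- The mean, median, and mode are all equal to 7, which points to an approximately symmetric, unimodal distribution.
- The upper tail is slightly longer than the lower tail (12 lies 5 above the mean, while 3 lies 4 below), so any skew is mild and to the right.
- No value has $|Z| > 2$, so there are no outliers distorting the shape. With only $n=6$ observations, this description is tentative.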
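As a rough numerical cross-check of the shape, one could compute a moment-based skewness coefficient, $g_1 = m_3 / m_2^{3/2}$. This measure is not part of the original solution; it is shown only as a sketch, and with six observations the result should be read loosely.

```python
data = [7, 4, 9, 7, 3, 12]
n = len(data)
mean = sum(data) / n

# Second and third central moments (population form, dividing by n)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n

# Moment coefficient of skewness: 0 for perfect symmetry, > 0 for right skew
skewness = m3 / m2 ** 1.5

print(f"skewness = {skewness:.3f}")
```

A value close to zero is consistent with describing the data as approximately symmetric; a small positive value indicates a slightly longer right tail.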