Correlation Regression 9B340F


1. **Problem Statement:** Given advertising expenditure $X$ (in $100$ units) and sales revenue $Y$ (in $1000$ units) as:

   $$X = [1, 2, 3, 4, 5]$$
   $$Y = [3, 3, 5, 5, 6]$$

   We need to:

   a) Calculate the coefficient of correlation between $X$ and $Y$.

   b) Fit the linear regression model $y = a + bx$.

   c) Calculate the reliability of the model.

---

2. **Formulas and Important Rules:**

- Mean of $X$: $\bar{X} = \frac{\sum X_i}{n}$
- Mean of $Y$: $\bar{Y} = \frac{\sum Y_i}{n}$
- Covariance: $S_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n}$
- Variance of $X$: $S_{XX} = \frac{\sum (X_i - \bar{X})^2}{n}$
- Variance of $Y$: $S_{YY} = \frac{\sum (Y_i - \bar{Y})^2}{n}$
- Coefficient of correlation: $r = \frac{S_{XY}}{\sqrt{S_{XX} S_{YY}}}$
- Regression coefficients:
  $$b = \frac{S_{XY}}{S_{XX}}, \quad a = \bar{Y} - b \bar{X}$$
- Reliability (coefficient of determination): $r^2$

---

3. **Calculations:**

- Number of data points: $n=5$
- Calculate means:
  $$\bar{X} = \frac{1+2+3+4+5}{5} = 3$$
  $$\bar{Y} = \frac{3+3+5+5+6}{5} = \frac{22}{5} = 4.4$$
- Calculate deviations and products:

| $X_i$ | $Y_i$ | $X_i - \bar{X}$ | $Y_i - \bar{Y}$ | $(X_i - \bar{X})(Y_i - \bar{Y})$ | $(X_i - \bar{X})^2$ | $(Y_i - \bar{Y})^2$ |
|-------|-------|-----------------|-----------------|----------------------------------|---------------------|---------------------|
| 1 | 3 | $1-3=-2$ | $3-4.4=-1.4$ | $(-2)(-1.4)=2.8$ | 4 | 1.96 |
| 2 | 3 | $2-3=-1$ | $3-4.4=-1.4$ | $(-1)(-1.4)=1.4$ | 1 | 1.96 |
| 3 | 5 | $3-3=0$ | $5-4.4=0.6$ | $0 \times 0.6=0$ | 0 | 0.36 |
| 4 | 5 | $4-3=1$ | $5-4.4=0.6$ | $1 \times 0.6=0.6$ | 1 | 0.36 |
| 5 | 6 | $5-3=2$ | $6-4.4=1.6$ | $2 \times 1.6=3.2$ | 4 | 2.56 |

- Sum these values:
  $$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = 2.8 + 1.4 + 0 + 0.6 + 3.2 = 8$$
  $$\sum (X_i - \bar{X})^2 = 4 + 1 + 0 + 1 + 4 = 10$$
  $$\sum (Y_i - \bar{Y})^2 = 1.96 + 1.96 + 0.36 + 0.36 + 2.56 = 7.2$$
- Calculate variances and covariance (using $n=5$):
  $$S_{XY} = \frac{8}{5} = 1.6$$
  $$S_{XX} = \frac{10}{5} = 2$$
  $$S_{YY} = \frac{7.2}{5} = 1.44$$
- Calculate coefficient of correlation:
  $$r = \frac{1.6}{\sqrt{2 \times 1.44}} = \frac{1.6}{\sqrt{2.88}} = \frac{1.6}{1.697} \approx 0.9428$$
- Calculate regression coefficients:
  $$b = \frac{S_{XY}}{S_{XX}} = \frac{1.6}{2} = 0.8$$
  $$a = \bar{Y} - b \bar{X} = 4.4 - 0.8 \times 3 = 4.4 - 2.4 = 2$$
- Regression equation:
  $$y = 2 + 0.8x$$
- Reliability (coefficient of determination):
  $$r^2 = (0.9428)^2 \approx 0.889$$

---

4. **Interpretation:**

- The correlation coefficient $r \approx 0.943$ indicates a strong positive linear relationship between advertising expenditure and sales revenue.
- The regression line $y = 2 + 0.8x$ can be used to predict sales revenue from advertising expenditure.
- The reliability $r^2 \approx 0.889$ means about 88.9% of the variation in sales revenue is explained by the advertising expenditure.

---

**Final answers:**

a) Coefficient of correlation $r \approx 0.943$

b) Regression line $y = 2 + 0.8x$

c) Reliability of the model $r^2 \approx 0.889$
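
The hand calculation above can be cross-checked numerically. The following is a minimal Python sketch, not part of the original solution. It applies the same divide-by-$n$ formulas from step 2 to the given data. The function name `fit_line` and the variable names are illustrative assumptions.

```python
import math

# Data from the problem statement.
X = [1, 2, 3, 4, 5]  # advertising expenditure
Y = [3, 3, 5, 5, 6]  # sales revenue


def fit_line(xs, ys):
    """Return (r, a, b, r_squared) for the least-squares line y = a + b*x.

    Hypothetical helper for illustration; uses the divide-by-n
    covariance and variance formulas listed in step 2.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs) / n
    s_yy = sum((y - y_bar) ** 2 for y in ys) / n
    r = s_xy / math.sqrt(s_xx * s_yy)  # coefficient of correlation
    b = s_xy / s_xx                    # slope
    a = y_bar - b * x_bar              # intercept
    return r, a, b, r ** 2


r, a, b, r_squared = fit_line(X, Y)
print(f"r   = {r:.4f}")
print(f"a   = {a:.4f}, b = {b:.4f}")
print(f"r^2 = {r_squared:.4f}")
```

Because the factor $n$ cancels in both $r$ and $b$, dividing by $n-1$ instead (the sample convention) would give the same correlation, slope, and intercept. Only the intermediate values $S_{XY}$, $S_{XX}$, and $S_{YY}$ would change.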