What is Variance?
Variance measures how far each number in a dataset is from the mean. It is the square of the standard deviation. Variance gives more weight to outliers because differences are squared.
Sample Variance s² = Σ(xᵢ − x̄)² / (N−1)
Variance vs Standard Deviation
- Variance = Standard Deviation²
- Variance is in squared units; SD is in original units
- SD is more interpretable; variance is used in calculations
Variance Calculation — Step by Step
Dataset: {5, 10, 15, 20, 25}
Squared deviations: (5−15)²=100, (10−15)²=25, (15−15)²=0, (20−15)²=25, (25−15)²=100
Sum = 250
Population Variance σ² = 250/5 = 50
Sample Variance s² = 250/4 = 62.5
Population SD σ = √50 ≈ 7.07
Sample SD s = √62.5 ≈ 7.91
Why Sample Variance Divides by (n−1)?
When you have a sample rather than the full population, dividing by (n−1) rather than n gives an unbiased estimator of the true population variance. This correction is called Bessel's correction.
With n−1: estimate is unbiased on average across many samples
As n → ∞, the difference becomes negligible
Variance in Probability and Statistics
| Property | Formula |
|---|---|
| Var(X) = E[(X−μ)²] | Definition |
| Var(aX) = a²·Var(X) | Scaling |
| Var(X+Y) = Var(X)+Var(Y) | Independent variables |
| SD = √Variance | Relationship |
Real-World Applications
- Finance: Variance measures investment risk (mean-variance portfolio theory)
- Quality control: Low variance means consistent manufacturing output
- ANOVA: Analysis of Variance is a statistical test using variance ratios
- Machine learning: Bias-variance tradeoff in model selection
Variance in Finance — Modern Portfolio Theory
Variance is the cornerstone of Modern Portfolio Theory (MPT), developed by Harry Markowitz in 1952 (Nobel Prize 1990). Investors use variance to measure investment risk:
where w = weights, σ² = variance, ρ = correlation
Key insight: diversification reduces portfolio variance
even when individual stock variances are high
Coefficient of Variation (CV)
CV allows fair comparison of variability between datasets with different means:
Dataset A: mean=$100, SD=$20 → CV=20%
Dataset B: mean=$1000, SD=$100 → CV=10%
Dataset B is relatively LESS variable despite larger SD
Variance in ANOVA
Analysis of Variance (ANOVA) is a statistical test that compares variance between groups to variance within groups to determine if group means are significantly different. It is widely used in scientific research, clinical trials, and quality control.