What is Variance? – Foundations of Data Science Tutorial

Section 01

The Story That Explains Variance

Imagine two pizza delivery services in your city — QuickSlice and PizzaPal. Both claim an average delivery time of 30 minutes. You order from both over ten days and record the times.

Day	QuickSlice (mins)	PizzaPal (mins)
1	29	10
2	31	55
3	30	8
4	28	62
5	32	15
6	30	48
7	29	20
8	31	70
9	30	12
10	30	0

Both have a mean of exactly 30 minutes. But QuickSlice always arrives between 28–32 minutes. PizzaPal arrives anywhere from 0 to 70 minutes. The mean tells you nothing about this difference. Variance does.

💡

What Variance Tells You

Variance measures how much each value in a dataset deviates from the mean. A low variance means values stay close to the mean (like QuickSlice). A high variance means values are all over the place (like PizzaPal). Same mean — completely different story.

Section 02

The Formula

Population Variance

σ² = Σ(x − μ)² / N

Use when you have every single data point in the population

Sample Variance

s² = Σ(x − x̄)² / (n − 1)

Use when your data is a sample from a larger population

📐

Why (n − 1) and not n?

This is called Bessel's correction. When you work with a sample, your data tends to underestimate how spread out the full population is. Dividing by (n − 1) instead of n corrects for this bias and gives a more accurate estimate of the true population variance. This is why it is called an "unbiased estimator."

Section 03

Step-by-Step Calculation

Let us calculate the variance for QuickSlice: [29, 31, 30, 28, 32, 30, 29, 31, 30, 30] — Mean = 30

🧮 QuickSlice Variance

Step 1

Subtract the mean from each value.
(29−30)=−1 (31−30)=1 (30−30)=0 (28−30)=−2 (32−30)=2 (30−30)=0 (29−30)=−1 (31−30)=1 (30−30)=0 (30−30)=0

Step 2

Square each difference (removes negatives).
1 + 1 + 0 + 4 + 4 + 0 + 1 + 1 + 0 + 0 = 12

Step 3

Divide by (n − 1) — sample variance.
12 / (10 − 1) = 12 / 9 = 1.33 min²

Now let us calculate PizzaPal: [10, 55, 8, 62, 15, 48, 20, 70, 12, 0] — Mean = 30

🧮 PizzaPal Variance

Step 1

Subtract the mean from each value.
(10−30)=−20 (55−30)=25 (8−30)=−22 (62−30)=32 (15−30)=−15 (48−30)=18 (20−30)=−10 (70−30)=40 (12−30)=−18 (0−30)=−30

Step 2

Square each difference.
400 + 625 + 484 + 1024 + 225 + 324 + 100 + 1600 + 324 + 900 = 6006

Step 3

Divide by (n − 1).
6006 / 9 = 667.33 min²

🧮

Result

QuickSlice variance = 1.33 | PizzaPal variance = 667.33
Same mean. Variance of 501× higher for PizzaPal. Now you have proof — not just a feeling — that PizzaPal is wildly unpredictable.

Section 04

Python Implementation

Manual calculation

quickslice = [29, 31, 30, 28, 32, 30, 29, 31, 30, 30]
pizzapal   = [10, 55,  8, 62, 15, 48, 20, 70, 12,  0]

def sample_variance(data):
    n    = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

print(f"QuickSlice variance: {sample_variance(quickslice):.2f}")
# QuickSlice variance: 1.33

print(f"PizzaPal variance:   {sample_variance(pizzapal):.2f}")
# PizzaPal variance:   667.33

Using the statistics module

import statistics

quickslice = [29, 31, 30, 28, 32, 30, 29, 31, 30, 30]
pizzapal   = [10, 55,  8, 62, 15, 48, 20, 70, 12,  0]

# Sample variance (divides by n-1)
print(statistics.variance(quickslice))   # 1.3333
print(statistics.variance(pizzapal))     # 667.3333

# Population variance (divides by n)
print(statistics.pvariance(quickslice))  # 1.2
print(statistics.pvariance(pizzapal))    # 600.6

Using NumPy

import numpy as np

quickslice = [29, 31, 30, 28, 32, 30, 29, 31, 30, 30]
pizzapal   = [10, 55,  8, 62, 15, 48, 20, 70, 12,  0]

# ddof=1 → sample variance
# ddof=0 → population variance (NumPy default)
print(np.var(quickslice, ddof=1))   # 1.3333
print(np.var(pizzapal,   ddof=1))   # 667.3333

⚠️

NumPy Default Warning

np.var() uses ddof=0 by default — population variance. In almost all data science work you want ddof=1. Always specify it explicitly to avoid a subtle but impactful mistake.

Section 05

The One Problem with Variance

Variance is powerful but has one frustrating issue — its unit is squared. If your data is in minutes, variance is in minutes squared. That is hard to interpret in the real world.

Data	Unit	Variance Unit	Makes sense?
Delivery times	minutes	minutes²	Hard to interpret
Heights	cm	cm²	Hard to interpret
Prices	£	£²	Hard to interpret
Temperatures	°C	°C²	Hard to interpret

🎯

Solution: Standard Deviation

Taking the square root of variance gives you the Standard Deviation — which is in the same unit as your original data. This is why standard deviation is more commonly used for interpretation, while variance is used in mathematical calculations and models. The next tutorial covers standard deviation in full.

Section 06

Where Variance is Used in Data Science

PCA

📊

Finds directions of max variance
Reduces dimensions
Removes redundant features

ANOVA

📐

Compares group variances
Tests if groups differ
Used in A/B testing

Finance

📈

Measures investment risk
Portfolio optimisation
Volatility analysis

Section 07

Golden Rules

🎯 Variance — Key Rules

Variance is always zero or positive. It can never be negative because you square the differences before summing them. A variance of zero means every value is identical.

Use sample variance (n−1) when your data is a sample. Use population variance (n) only when you have the complete dataset for the entire population — which is rare in practice.

Variance is highly sensitive to outliers. Because differences are squared, a single extreme value inflates variance dramatically. Always check for outliers before interpreting variance.

For human interpretation use Standard Deviation (square root of variance). For mathematical operations in ML models use variance — it has better algebraic properties.