The Bridge Engineer & The Freak Storm
In 1998, a bridge engineer was reviewing 50 years of wind-speed data for a major crossing. The average wind speed looked perfectly safe. The standard deviation looked acceptable too. Every standard model gave the green light.
But one senior analyst noticed something the summary statistics had hidden: the wind data had an unusually heavy tail. Extreme gusts — the kind that happen once every 30 years — were far more likely than a normal distribution would predict. The bridge was redesigned. A near-disaster was avoided.
Mean tells you where your data is centred. Standard deviation tells you how spread out it is. But neither tells you how likely extreme values are. Kurtosis fills that gap — it measures the weight of a distribution's tails relative to a normal distribution.
In finance, risk models that assumed normal distributions while real returns had fat tails contributed to multi-billion-dollar losses in 2008. In manufacturing, kurtosis reveals whether defects cluster near the mean or explode at the extremes. In machine learning, it helps you understand feature distributions before modelling.
The Three Shapes — Mesokurtic, Leptokurtic, Platykurtic
Every distribution's tail behaviour falls into one of three categories. Think of them as three different exam score distributions in a school — same average, same spread, but very different risk of extreme outcomes.
Mesokurtic
- Normal distribution
- Baseline reference
- Moderate tails
- Excess kurtosis = 0

Leptokurtic
- Tall, sharp peak
- Fat / heavy tails
- More extreme outliers
- E.g. stock returns

Platykurtic
- Flat, broad peak
- Thin / light tails
- Fewer extreme values
- E.g. uniform-like data
Lepto = thin (Greek) → thin waist, energy pushed to the tails → fat tails, sharp peak.
Platy = broad/flat (Greek) → broad shape, energy pulled from tails → thin tails, flat peak.
Meso = middle (Greek) → right in between — the normal distribution.
Visual Diagram — Tail Behaviour at a Glance
The diagram below compares the three kurtosis types overlaid on the same axis. Notice that all three curves share the same mean and approximate spread — kurtosis only changes the peak height and tail weight.
Blue (Mesokurtic) — the normal bell curve, your reference baseline.
Amber (Leptokurtic) — taller and narrower peak, but fatter tails that stretch further out.
Purple (Platykurtic) — flatter and wider peak, but thinner tails that drop off quickly.
Notice how the leptokurtic curve has a taller, sharper spike at the centre but its tails extend further and drop more slowly — meaning extreme values occur more often than you would expect from a normal distribution. The platykurtic curve is flatter overall, with very little probability mass in the extreme tails.
The Formula — Kurtosis & Excess Kurtosis
Kurtosis is the standardised fourth central moment of a distribution. The fourth power makes it extremely sensitive to values far from the mean — which is exactly what we want when measuring tail heaviness.
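Written out, the definition looks like this. This is the population (divide-by-N) form, which matches the step-by-step calculation later in this article; the symbols N (number of values), μ (mean) and σ (population standard deviation) are the notation assumed here.

$$
\text{Kurtosis} = \frac{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^4}{\sigma^4},
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2
$$

$$
\text{Excess kurtosis} = \text{Kurtosis} - 3
$$

A normal distribution has kurtosis 3, so subtracting 3 puts the normal baseline at 0.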
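SciPy scipy.stats.kurtosis() returns excess kurtosis (normal = 0) by default; pass fisher=False to get raw kurtosis.
Pandas .kurt() also returns excess kurtosis, with a sample-size bias correction applied.
Excel KURT() returns excess kurtosis, also with a sample-size correction.
Always check the documentation: confusing raw kurtosis (normal = 3) with excess kurtosis (raw kurtosis minus 3, normal = 0) leads to completely wrong conclusions.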
Step-by-Step Calculation
Let's manually calculate kurtosis for a small dataset of daily temperature readings (°C): [18, 20, 20, 21, 21, 21, 22, 22, 23, 32].
The last value (32) is an outlier — we want to see how it affects kurtosis.
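Step 1. Mean: μ = (18+20+20+21+21+21+22+22+23+32) / 10 = 220 / 10 = 22.0
Step 2. Fourth power of each deviation, (xᵢ−μ)⁴:
18→(−4)⁴=256 | 20→(−2)⁴=16 | 20→16
21→(−1)⁴=1 | 21→1 | 21→1
22→(0)⁴=0 | 22→0 | 23→(1)⁴=1
32→(10)⁴ = 10,000 ← outlier dominates!
Step 3. Sum of fourth powers: 256+16+16+1+1+1+0+0+1+10000 = 10,292, so the fourth central moment is 10,292 / 10 = 1,029.2
Step 4. Variance = Σ(xᵢ−μ)² / N = (16+4+4+1+1+1+0+0+1+100)/10 = 12.8, so σ⁴ = 12.8² = 163.84
Step 5. Kurtosis = 1,029.2 / 163.84 ≈ 6.28, and excess kurtosis ≈ 6.28 − 3 = 3.28
This is leptokurtic: the single outlier at 32°C makes the tails far heavier than a normal distribution's.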
The outlier value of 32°C contributed 10,000 to the sum of fourth powers, while all other nine values combined contributed only 292. This demonstrates why kurtosis is so sensitive to outliers — the fourth power amplifies extreme deviations dramatically. One bad day can make your whole distribution look leptokurtic.
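The manual calculation can be reproduced with a few lines of plain Python. This is a minimal sketch using the population (divide-by-N) formula from the steps above; the helper name `population_kurtosis` is made up for illustration and is not a library function.

```python
data = [18, 20, 20, 21, 21, 21, 22, 22, 23, 32]


def population_kurtosis(values):
    """Raw kurtosis: fourth central moment divided by the squared population variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2


raw = population_kurtosis(data)
excess = raw - 3

# How much of the fourth-power sum comes from the single largest deviation?
mean = sum(data) / len(data)
fourth_powers = [(x - mean) ** 4 for x in data]
outlier_share = max(fourth_powers) / sum(fourth_powers)

# Same calculation with the 32 °C reading dropped
without_outlier = population_kurtosis(data[:-1])

print(f"Raw kurtosis:     {raw:.2f}")
print(f"Excess kurtosis:  {excess:.2f}")
print(f"Outlier share of fourth-power sum: {outlier_share:.1%}")
print(f"Raw kurtosis without the outlier:  {without_outlier:.2f}")
```

Dropping the single 32°C reading brings the raw kurtosis back to roughly the normal baseline of 3, which is the sensitivity described above.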
Real-World Stories for Each Type
Leptokurtic — Stock Market Returns
Daily stock market returns are the classic example of a leptokurtic distribution. On most days, returns cluster tightly near zero — forming a sharp central peak. But market crashes and rallies (Black Monday 1987, the 2008 crisis, COVID March 2020) produce extreme returns that a normal distribution assigns near-zero probability. Traders who assumed normality were wiped out. This is the infamous "fat tails" problem in finance.
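In August 2007, as the crisis began to unfold, Goldman Sachs' CFO David Viniar famously described market moves that were "25 standard deviations" from the norm, several days in a row. Under a normal distribution, even one such event is so improbable that it should essentially never happen, even over many lifetimes of the universe. Under a fat-tailed, leptokurtic distribution, moves of that size are far more plausible. Risk models that ignored kurtosis badly underestimated the losses that followed.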
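To get a feel for how extreme "25 standard deviations" is under a normal model, the one-sided tail probability can be computed with the standard library. This is only an illustrative sketch: the helper name `normal_upper_tail` is hypothetical, and the conversion to years assumes roughly 252 trading days per year.

```python
import math


def normal_upper_tail(k):
    """P(Z > k) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2))


TRADING_DAYS_PER_YEAR = 252  # assumption for the waiting-time estimate

for k in (3, 5, 25):
    p = normal_upper_tail(k)
    expected_wait_years = 1 / (p * TRADING_DAYS_PER_YEAR)
    print(f"{k:>2} sigma: P(Z > k) = {p:.3e}, "
          f"about one day every {expected_wait_years:.3e} years")
```

The 25-sigma probability is so small that the implied waiting time dwarfs the age of the universe, which is why observing such moves points to a wrong distributional assumption and not to bad luck.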
Platykurtic — Student Exam Scores (Uniform Marking)
Imagine a school that caps grades between 40 and 80, with grades spread fairly evenly across that range. No student scores 5 or 98 because the extremes are clipped. This distribution is platykurtic: a broad, flat peak with very thin tails. In manufacturing, a tightly controlled production process (e.g. a machine stamping parts to ±0.1 mm tolerance) can produce platykurtic output, where extreme defects are nearly impossible.
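A quick check of the capped-grades example, assuming a hypothetical class with exactly one student at every mark from 40 to 80 (a perfectly even spread). The helper `excess_kurtosis` is written out here for illustration and uses the population formula.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2^2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3


# Hypothetical data: one student at each mark between 40 and 80 inclusive
scores = list(range(40, 81))

k = excess_kurtosis(scores)
print(f"Excess kurtosis of evenly spread, capped scores: {k:.2f}")
```

The result lands close to −1.2, the theoretical excess kurtosis of a continuous uniform distribution, regardless of where the caps sit.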
Mesokurtic — Human Heights
Adult heights in a large population follow approximately a normal distribution, which is the textbook mesokurtic case. There are very tall and very short people, but they occur at the rate the normal distribution predicts — no surprises in the tails. Excess kurtosis ≈ 0.
Leptokurtic: Stock returns, insurance claim sizes, earthquake magnitudes, network traffic spikes, NLP token frequency distributions.
Platykurtic: Uniformly distributed random numbers, clamped sensor readings, bounded test scores, certain bimodal distributions.
Mesokurtic: Heights, IQ scores, measurement errors, many natural phenomena under the Central Limit Theorem.
Python Implementation
Using SciPy
from scipy import stats

# Temperature data with one outlier
data = [18, 20, 20, 21, 21, 21, 22, 22, 23, 32]

# scipy returns EXCESS kurtosis by default (normal = 0)
excess_kurt = stats.kurtosis(data)
print(f"Excess Kurtosis: {excess_kurt:.4f}")
# Output: Excess Kurtosis: 3.2817

# Raw kurtosis (fisher=False disables the -3 correction)
raw_kurt = stats.kurtosis(data, fisher=False)
print(f"Raw Kurtosis: {raw_kurt:.4f}")
# Output: Raw Kurtosis: 6.2817

# Interpret the result
if excess_kurt > 0:
    print("Leptokurtic: fat tails, sharp peak")
elif excess_kurt < 0:
    print("Platykurtic: thin tails, flat peak")
else:
    print("Mesokurtic: normal-like tails")
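One practical wrinkle: a sample's excess kurtosis is almost never exactly 0, so the final branch above rarely fires on real data. A possible refinement is to treat values within a tolerance band around 0 as mesokurtic. The sketch below is one way to do that, not a standard API: `classify_kurtosis` is a hypothetical helper, the band uses the large-sample approximation SE ≈ sqrt(24 / n) (which holds for normally distributed data), and the multiplier z=2.0 is an arbitrary illustrative choice.

```python
import math


def classify_kurtosis(excess_kurt, n, z=2.0):
    """Label a sample excess kurtosis, treating values within about
    z standard errors of zero as mesokurtic.

    Assumes SE ~ sqrt(24 / n), a large-sample approximation for normal data.
    """
    tolerance = z * math.sqrt(24 / n)
    if excess_kurt > tolerance:
        return "leptokurtic"
    if excess_kurt < -tolerance:
        return "platykurtic"
    return "mesokurtic"


print(classify_kurtosis(0.021, n=10_000))   # mesokurtic
print(classify_kurtosis(-1.193, n=10_000))  # platykurtic
print(classify_kurtosis(1.0, n=10))         # mesokurtic (band is wide for tiny samples)
```

With only a handful of observations the band is very wide, which reflects how unreliable kurtosis estimates are on small samples.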
Using Pandas
import pandas as pd

data = [18, 20, 20, 21, 21, 21, 22, 22, 23, 32]
s = pd.Series(data)

# .kurt() returns excess kurtosis with a sample-size (bias) correction
print(f"Excess Kurtosis: {s.kurt():.4f}")
# Output: Excess Kurtosis: 6.7659 (higher than SciPy's default because of the correction)

# On a DataFrame: kurtosis of each column
df = pd.DataFrame({
    'temperature': [18, 20, 20, 21, 21, 21, 22, 22, 23, 32],
    'humidity': [55, 60, 62, 60, 58, 61, 63, 64, 62, 59]
})
print(df.kurt())
# temperature    6.765...
# humidity       0.719...
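The gap between the SciPy and pandas results comes from the sample-size correction. The sketch below reproduces both numbers in plain Python, assuming pandas uses the usual adjusted estimator G2 = (n−1)/((n−2)(n−3)) · ((n+1)·g2 + 6), where g2 is the uncorrected excess kurtosis. The function names are illustrative only. SciPy should give the corrected value as well if you pass bias=False.

```python
def population_excess_kurtosis(values):
    """Uncorrected excess kurtosis g2 = m4 / m2^2 - 3 (SciPy's default)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3


def sample_excess_kurtosis(values):
    """Bias-adjusted excess kurtosis G2; needs at least 4 values."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 values for the adjusted estimator")
    g2 = population_excess_kurtosis(values)
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


data = [18, 20, 20, 21, 21, 21, 22, 22, 23, 32]
g2 = population_excess_kurtosis(data)
G2 = sample_excess_kurtosis(data)
print(f"Uncorrected (SciPy default): {g2:.4f}")
print(f"Bias-adjusted (pandas-style): {G2:.4f}")
```

On ten observations the correction roughly doubles the estimate; on thousands of observations the two versions are nearly identical.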
Comparing All Three Types Numerically
from scipy import stats
import numpy as np

np.random.seed(42)

# Mesokurtic: standard normal
mesokurtic = np.random.normal(loc=0, scale=1, size=10000)

# Leptokurtic: Student's t-distribution (low df = fatter tails)
leptokurtic = np.random.standard_t(df=3, size=10000)

# Platykurtic: uniform distribution
platykurtic = np.random.uniform(low=-3, high=3, size=10000)

for name, dist in [("Mesokurtic (normal)", mesokurtic),
                   ("Leptokurtic (t, df=3)", leptokurtic),
                   ("Platykurtic (uniform)", platykurtic)]:
    k = stats.kurtosis(dist)
    print(f"{name} → Excess Kurtosis = {k:.3f}")

# Expected pattern (exact digits depend on the seed and NumPy version):
# Mesokurtic (normal)   → close to 0
# Leptokurtic (t, df=3) → large and positive, varies a lot between seeds
# Platykurtic (uniform) → close to -1.2
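The t-distribution is leptokurtic by design. For df > 4 its excess kurtosis is 6 / (df − 4), so as degrees of freedom (df) increase it converges to the normal distribution and kurtosis → 0. At df=5, excess kurtosis equals 6. At df=3 (and for any df ≤ 4) the theoretical excess kurtosis is infinite, so the sample value printed above is unstable and changes a lot between seeds. At df=30+, the t-distribution is close to normal for most practical purposes. These heavier tails are how a t-test accounts for the extra uncertainty of estimating the standard deviation from a small sample, which a z-test ignores.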
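The relationship between degrees of freedom and tail weight can be tabulated directly from the closed-form expression 6 / (df − 4). This is a small sketch; `t_excess_kurtosis` is a helper defined here for illustration, not a SciPy function.

```python
import math


def t_excess_kurtosis(df):
    """Theoretical excess kurtosis of Student's t-distribution.

    6 / (df - 4) for df > 4, infinite for 2 < df <= 4, undefined for df <= 2.
    """
    if df > 4:
        return 6 / (df - 4)
    if df > 2:
        return math.inf
    raise ValueError("kurtosis is undefined for df <= 2")


for df in (3, 5, 10, 30, 100):
    print(f"df={df:>3}: excess kurtosis = {t_excess_kurtosis(df):.3f}")
```

The values shrink steadily toward 0 as df grows, which is the convergence to the normal distribution described above.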
Kurtosis Comparison Table
| Type | Raw Kurtosis | Excess Kurtosis | Peak Shape | Tail Weight | Real-world Example |
|---|---|---|---|---|---|
| Mesokurtic | = 3 | = 0 | Medium bell | Moderate | Human heights, IQ |
| Leptokurtic | > 3 | > 0 | Tall, sharp | Fat / Heavy | Stock returns, earthquakes |
| Platykurtic | < 3 | < 0 | Flat, broad | Thin / Light | Uniform data, bounded scores |

Kurtosis vs. skewness:

| Property | Kurtosis | Skewness |
|---|---|---|
| Measures | Tail heaviness / peak sharpness | Asymmetry / lean direction |
| Moment used | 4th central moment | 3rd central moment |
| Normal value | 3 (raw) / 0 (excess) | 0 |
| Sensitive to outliers? | Extremely | Yes |
| SciPy function | stats.kurtosis() | stats.skew() |