Stationarity in Time Series

Section 01

The Story That Explains Stationarity

📖 Real-World Analogy

The Unreliable River vs The Steady Canal

Imagine two rivers. The first is a wild mountain river — in spring it floods, in summer it trickles, in autumn it rages again. Its behaviour in January tells you nothing useful about its behaviour in July. Any rule you learn about it in one season becomes wrong the next.

The second is a man-made canal. Locks and gates keep the water level within a precise band year-round. Its depth in January is statistically identical to its depth in July. Any rule you learn about the canal this week still applies next year.

Statistical models are engineers who want to build a pump on the riverbank. They need to know how deep the water will be — but only the canal lets them plan reliably. The wild river's constantly shifting nature makes every calculation expire the moment it was made.

Stationarity is the difference between the wild river and the canal. A stationary time series is one whose statistical properties — mean, variance, and autocorrelation structure — do not change over time. It is the foundation that makes time series modelling possible.

Every classical time series model — AR, MA, ARMA, ARIMA — ultimately relies on stationarity. Models that assume the canal find the wild river unworkable. Before fitting any model, the single most important question you must answer is: "Is my series stationary?" This tutorial gives you the complete toolkit to answer it.

Section 02

What Stationarity Actually Means

Stationarity is not one single condition — it comes in two flavours with very different practical implications. Understanding the distinction is essential before running any test.

🎦 Animated — Strict vs Weak Stationarity: A Visual Comparison

Left: strict stationarity — the full probability distribution is identical at every point in time. Right: weak (covariance) stationarity — only the mean, variance, and lag-k covariances are constant; the shape of the distribution may change. In practice, almost all time series tests check for weak stationarity.

Section 03

Weak vs Strict Stationarity — Side by Side

Strict Stationarity

F(yₜ₁,…,yₜₖ) = F(yₜ₁₊h,…,yₜₖ₊h)

The entire joint distribution of any finite set of observations is invariant to time shifts. This is a very strong requirement: not just the mean and variance, but every moment that exists (skewness, kurtosis, …) must be the same at every point in time. Strict stationarity implies weak stationarity whenever the variance is finite; the converse does not hold in general.

Weak (Covariance) Stationarity

E[yₜ]=μ, Var(yₜ)=σ², Cov(yₜ,yₜ₋ₖ)=γ(k)

Only three conditions: (1) constant mean μ for all t, (2) constant, finite variance σ², (3) autocovariance depends only on lag k, not on time t. This is the notion of stationarity that ADF and KPSS are designed to probe. Classical time series models require it: ARMA directly, ARIMA after differencing.
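The three conditions can be checked informally by splitting a series into windows and comparing the sample mean, variance, and lag-1 autocovariance across them. The sketch below is only an illustration under assumed settings: the helper name window_moments, the seed, the window count, and the series length are all choices made here, not part of any library.

```python
import numpy as np

def window_moments(y, n_windows=4, lag=1):
    """Return (mean, variance, lag-`lag` autocovariance) for each window."""
    rows = []
    for chunk in np.array_split(np.asarray(y, dtype=float), n_windows):
        centred = chunk - chunk.mean()
        autocov = np.mean(centred[lag:] * centred[:-lag])
        rows.append((chunk.mean(), chunk.var(), autocov))
    return np.array(rows)

rng = np.random.default_rng(0)          # seed chosen arbitrarily
eps = rng.normal(size=4000)

# Stationary AR(1) with phi = 0.7, and a random walk built from the same shocks
ar1 = np.zeros(4000)
for t in range(1, 4000):
    ar1[t] = 0.7 * ar1[t - 1] + eps[t]
random_walk = np.cumsum(eps)

ar1_moments = window_moments(ar1)
rw_moments = window_moments(random_walk)

print("AR(1) window means      :", np.round(ar1_moments[:, 0], 2))
print("Random-walk window means:", np.round(rw_moments[:, 0], 2))
```

For the AR(1) series the window statistics should stay close to each other, while the random walk's window means wander. This is an eyeball check, not a formal test.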

⛔ Strict Stationarity

Condition	Requirement
Distribution	Entire joint distribution time-invariant
Mean	Constant (implied)
Variance	Constant (implied)
Higher moments	All moments constant (skewness, kurtosis)
Implies weak?	Yes — if finite variance exists
Practical use	Theoretical proofs; rarely verified empirically
Testable?	No direct standard test exists

✅ Weak (Covariance) Stationarity

Condition	Requirement
Distribution	Shape may change; not required constant
Mean	E[yₜ] = μ constant for all t
Variance	Var(yₜ) = σ² constant for all t
Higher moments	Not required to be constant
Implies strict?	No — only if Gaussian (Normal) process
Practical use	Required by ARMA, ARIMA, VAR
Testable?	Yes — ADF, KPSS, PP tests
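To make the "weak does not imply strict" row concrete, here is a small constructed example: independent draws whose mean and variance never change, but whose distribution alternates between Normal and Uniform. The construction and the helper name kurtosis are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)   # seed chosen arbitrarily
n = 200_000

# Independent draws with mean 0 and variance 1 at every t,
# but the shape of the distribution alternates with time.
y = np.empty(n)
y[0::2] = rng.normal(0.0, 1.0, size=n // 2)                   # even t: Normal
y[1::2] = rng.uniform(-np.sqrt(3), np.sqrt(3), size=n // 2)   # odd t: Uniform with variance 1

def kurtosis(x):
    """Plain (non-excess) sample kurtosis."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4)

for label, part in [("even t", y[0::2]), ("odd t", y[1::2])]:
    print(f"{label}: mean={part.mean():+.3f}  var={part.var():.3f}  kurtosis={kurtosis(part):.2f}")
```

Mean, variance, and (because the draws are independent) all autocovariances are constant, so the series is weakly stationary. The kurtosis differs between even and odd times (about 3 for the Normal, about 1.8 for the Uniform), so it is not strictly stationary.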

💡

The Gaussian Exception

For Gaussian (Normal) processes, weak and strict stationarity are equivalent. This is because a Gaussian process is completely characterised by its mean and autocovariance function: if those do not change over time, no joint distribution does. Since many economic and financial models assume Gaussian errors, this equivalence is often exploited in practice.

Section 04

Why Stationarity Matters — The Consequences of Ignoring It

⚠️ Classic Blunder

The Spurious Regression Problem

In 1974, Granger & Newbold regressed independent random walks on one another: series with absolutely no connection to each other. The regressions routinely produced high R² values and apparently significant t-statistics, alongside very low Durbin-Watson statistics. They "proved" strong relationships that were pure statistical noise.

This is called spurious regression. It is the typical outcome when you regress one non-stationary series on another, unrelated one: the shared tendency to wander or trend creates the illusion of correlation. Commonly quoted illustrations include shoe-size regressions on stock prices, and stork-population regressions on birth rates. Both appear statistically significant. Both are nonsense.

The usual fix is the same: make each series stationary before modelling. (The exception is cointegrated series, which share a common stochastic trend and can legitimately be modelled in levels.)
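The effect is easy to reproduce in simulation. The sketch below regresses pairs of independent random walks on each other and counts how often the slope looks "significant" at the nominal 5% level, first in levels and then after differencing. The function names, sample size, number of simulations, and seed are assumptions made for this illustration; it is not a reconstruction of the original 1974 experiment.

```python
import numpy as np

def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def rejection_rate(difference, n_sims=500, n_obs=100, seed=0):
    """Share of simulations where |t| > 1.96 for two independent series."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = np.cumsum(rng.normal(size=n_obs))   # two independent random walks
        y = np.cumsum(rng.normal(size=n_obs))
        if difference:
            x, y = np.diff(x), np.diff(y)       # back to stationary increments
        hits += abs(slope_t_stat(x, y)) > 1.96
    return hits / n_sims

levels_rate = rejection_rate(difference=False)
diffs_rate = rejection_rate(difference=True)
print(f"'Significant' slopes in levels     : {levels_rate:.0%}")
print(f"'Significant' slopes in differences: {diffs_rate:.0%}")
```

In levels, far more than 5% of the regressions should look significant even though the series are unrelated; after differencing, the rate should fall back to roughly the nominal level.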

🎦 Animated — Three Properties That Must Be Stable for Stationarity

Three properties, each illustrated. Box 1 (green): stationary mean oscillates around μ — non-stationary (faint red) drifts. Box 2 (gold): stationary variance stays within a constant band — non-stationary widens like a funnel. Box 3 (purple): the ACF decay pattern (solid vs dashed — two time windows) is identical, confirming covariance depends only on lag, not on when it is measured.

Section 05

Types of Non-Stationarity — Know Your Enemy

Non-stationarity is not a single problem — it has four distinct flavours, each requiring a different cure. Misdiagnosing the type leads to the wrong treatment.

🎦 Animated — Four Flavours of Non-Stationarity

Each panel shows a different violation with its fix. Trend: mean drifts up — fix with differencing. Variance: swings grow — fix with log. Structural break: sudden level shift — fix with dummy variable. Seasonal: repeating cycle in mean — fix with seasonal differencing.

Section 06

The ADF Test — Augmented Dickey-Fuller

📖 Story

The Court of Stationarity

Imagine a court where stationarity is on trial. The judge starts with the null hypothesis (H₀): the series is guilty of being non-stationary — it has a unit root and wanders like a random walk. The series is considered non-stationary until proven otherwise.

The prosecution presents evidence — the ADF test statistic. If the evidence is strong enough (p-value < 0.05), we reject the null and declare the series stationary. If the evidence is weak (p > 0.05), we fail to reject — the series remains presumed non-stationary and must be treated (differenced or transformed) before modelling.

The critical subtlety: failing to reject H₀ does not prove non-stationarity. It just means we lack sufficient evidence for stationarity. This is why we complement ADF with the KPSS test — which flips the burden of proof.

ADF Test Equation

Δyₜ = α + βt + γyₜ₋₁ + Σδᵢ Δyₜ₋ᵢ + εₜ

The test regresses Δyₜ on a constant (α), optional trend (βt), the lagged level (γyₜ₋₁), and augmenting lags of Δyₜ (the "A" in ADF — added to absorb autocorrelation in residuals). H₀: γ = 0 (unit root, non-stationary). H₁: γ < 0 (stationary).
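The regression above can be written out by hand to see where the test statistic comes from. This is a minimal sketch of the constant-only case (no βt term) with a fixed number of augmenting lags; the function name adf_t_stat and the simulated series are assumptions for illustration. It returns only the t-statistic on γ, which must be compared with Dickey-Fuller critical values (for example the 5% value of about -2.87 shown in the Section 08 output), not with Normal quantiles.

```python
import numpy as np

def adf_t_stat(y, n_lags=1):
    """t-statistic on gamma in: dy_t = alpha + gamma*y_{t-1} + sum_i delta_i*dy_{t-i} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    target = dy[n_lags:]                          # dy_t
    columns = [np.ones_like(target),              # alpha (constant)
               y[n_lags:-1]]                      # y_{t-1} (lagged level)
    for i in range(1, n_lags + 1):                # augmenting lags dy_{t-i}
        columns.append(dy[n_lags - i:len(dy) - i])
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(42)                   # seed chosen arbitrarily
eps = rng.normal(size=500)
random_walk = np.cumsum(eps)                      # unit root: gamma = 0
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + eps[t]            # stationary: gamma = phi - 1 = -0.5

stat_rw = adf_t_stat(random_walk)
stat_ar = adf_t_stat(ar1)
print(f"random walk: {stat_rw:.2f}   AR(1): {stat_ar:.2f}")
```

The AR(1) statistic should be strongly negative, while the random-walk statistic should sit much closer to zero. In practice, adfuller from statsmodels also selects the lag length and reports a p-value.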

Decision Rule

Reject H₀ if ADF statistic < critical value

ADF statistics are usually negative, and more negative means stronger evidence against a unit root. The critical values at 1%, 5%, 10% are pre-computed from the Dickey-Fuller distribution (not the standard Normal). Alternatively: reject H₀ if p-value < 0.05 → series is stationary. Fail to reject if p > 0.05 → not enough evidence against a unit root, so treat the series as non-stationary.

⚙️ ADF Test — Three Model Specifications (Choose Carefully)

No const

Δyₜ = γyₜ₋₁ + lags + εₜ — No constant, no trend. Use only when the series is known to have zero mean and no trend. Rarely appropriate in practice.

Constant

Δyₜ = α + γyₜ₋₁ + lags + εₜ — Includes a constant (drift). Default for most economic series. Use when the series has a non-zero mean but no deterministic trend.

Trend

Δyₜ = α + βt + γyₜ₋₁ + lags + εₜ — Includes constant + linear trend. Use when the series appears to have a deterministic trend (GDP, population). Most conservative specification — hardest to reject H₀.

🎦 Animated — ADF Test Decision Pipeline

Section 07

The KPSS Test — Flipping the Burden of Proof

The ADF test can fail to reject H₀ (non-stationarity) simply because it lacks statistical power — especially with short series. The KPSS test (Kwiatkowski-Phillips-Schmidt-Shin) reverses the hypothesis: it assumes stationarity by default and tests whether there is evidence against it.

KPSS Test Decomposition

yₜ = βt + rₜ + εₜ

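The series is decomposed into a deterministic trend (βt), a random walk component (rₜ), and stationary noise (εₜ). H₀: the variance of the random-walk increments is zero, so rₜ is constant and the series is stationary around the trend (or around a fixed level when the trend term is left out). H₁: that variance is positive (random walk present, non-stationary).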

KPSS Decision Rule

Reject H₀ if KPSS statistic > critical value

Unlike ADF (where more negative is better), KPSS statistics are positive — larger means more evidence against stationarity. Reject H₀ (stationarity) if statistic exceeds the critical value at chosen significance level. p < 0.05 → reject stationarity → non-stationary.
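For intuition, the level-stationarity version of the statistic can be computed by hand: take residuals from the mean, accumulate them into partial sums, and scale by a long-run variance estimate with Bartlett weights. The function name kpss_level_stat, the fixed lag count, and the simulated series are assumptions for this sketch; statsmodels' kpss additionally chooses the lag length and interpolates a p-value.

```python
import numpy as np

def kpss_level_stat(y, n_lags):
    """KPSS statistic for level stationarity (constant only, no trend)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    resid = y - y.mean()                       # residuals from a regression on a constant
    partial_sums = np.cumsum(resid)
    # Long-run variance estimate with Bartlett (Newey-West style) weights
    lrv = resid @ resid / n
    for k in range(1, n_lags + 1):
        weight = 1.0 - k / (n_lags + 1.0)
        lrv += 2.0 * weight * (resid[k:] @ resid[:-k]) / n
    return (partial_sums @ partial_sums) / (n ** 2 * lrv)

rng = np.random.default_rng(123)               # seed chosen arbitrarily
eps = rng.normal(size=2000)

stat_wn = kpss_level_stat(eps, n_lags=5)             # white noise: stationary
stat_rw = kpss_level_stat(np.cumsum(eps), n_lags=5)  # random walk: non-stationary
print(f"white noise: {stat_wn:.3f}   random walk: {stat_rw:.3f}")
```

The random-walk statistic should land far above the 5% critical value of 0.463 for the constant-only case, while the white-noise statistic should be small.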

🔑

The Combined ADF + KPSS Logic Table

Using both tests together narrows down the diagnosis. Each of the four possible outcomes points to a different next step.

ADF Result	KPSS Result	Conclusion	Action
Reject H₀ (p < .05)	Fail to reject H₀ (p > .05)	Stationary ✓ (both agree)	Proceed with ARMA modelling
Fail to reject H₀ (p > .05)	Reject H₀ (p < .05)	Non-stationary (both agree)	Difference or log-transform, then re-test
Reject H₀ (p < .05)	Reject H₀ (p < .05)	Tests disagree: usually read as difference-stationary, though breaks or changing variance can also cause this	Difference and re-test; inspect the plot for structural breaks
Fail to reject H₀ (p > .05)	Fail to reject H₀ (p > .05)	Tests disagree: usually read as trend-stationary, or as too little data to decide	Re-test with a trend specification or detrend; increase sample size; try PP test; use judgement from visual inspection

⚠️

KPSS Has Low Power in Small Samples

With fewer than ~100 observations, KPSS often fails to reject its null (stationarity) even when the series is genuinely non-stationary. For short series, rely more heavily on ADF and visual inspection. KPSS is most reliable with 200+ observations.

Section 08

ADF & KPSS in Python — Full Workflow

# ─── 0. Imports ──────────────────────────────────────────────────────────────
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

from statsmodels.tsa.stattools import adfuller, kpss

# ─── 1. Create two test series ───────────────────────────────────────────────
np.random.seed(42)
n = 300

# Non-stationary: random walk (unit root)
rw = np.cumsum(np.random.normal(0, 1, n))
rw_series = pd.Series(rw, name='Random Walk')

# Stationary: AR(1) with φ=0.7
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.7 * ar[t-1] + np.random.normal(0, 1)
ar_series = pd.Series(ar, name='AR(1) φ=0.7')

# ─── 2. Reusable test function ────────────────────────────────────────────────
def run_stationarity_tests(series, name=""):
    print(f"\n{'─'*54}")
    print(f" Series: {name}")
    print(f"{'─'*54}")

    # ── ADF (H₀: unit root / non-stationary) ──
    adf_stat, adf_p, adf_lags, _, adf_crit, _ = adfuller(series, autolag='AIC')
    print(f"\n[ADF Test]  H₀: unit root (non-stationary)")
    print(f"  ADF Stat  : {adf_stat:>10.4f}")
    print(f"  p-value   : {adf_p:>10.6f}")
    print(f"  Crit 1%   : {adf_crit['1%']:>10.4f}")
    print(f"  Crit 5%   : {adf_crit['5%']:>10.4f}")
    print(f"  Lags used : {adf_lags}")
    adf_conc = 'STATIONARY ✓' if adf_p < 0.05 else 'NON-STATIONARY ✗'
    print(f"  Verdict   : {adf_conc}")

    # ── KPSS (H₀: stationary) ──
    kpss_stat, kpss_p, kpss_lags, kpss_crit = kpss(series, regression='c', nlags='auto')
    print(f"\n[KPSS Test] H₀: stationary")
    print(f"  KPSS Stat : {kpss_stat:>10.4f}")
    print(f"  p-value   : {kpss_p:>10.4f}")
    print(f"  Crit 5%   : {kpss_crit['5%']:>10.4f}")
    kpss_conc = 'NON-STATIONARY ✗' if kpss_p < 0.05 else 'STATIONARY ✓'
    print(f"  Verdict   : {kpss_conc}")

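    # Combined conclusion (the two nulls point in opposite directions)
    adf_rejects = adf_p < 0.05     # evidence against a unit root
    kpss_rejects = kpss_p < 0.05   # evidence against stationarity
    if adf_rejects and not kpss_rejects:
        combined = "✅ BOTH AGREE: STATIONARY"
    elif kpss_rejects and not adf_rejects:
        combined = "❌ BOTH AGREE: NON-STATIONARY"
    elif adf_rejects and kpss_rejects:
        combined = "⚠️  TESTS DISAGREE: both nulls rejected"
    else:
        combined = "⚠️  TESTS DISAGREE: neither null rejected"
    print(f"\n  Combined  : {combined}")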

run_stationarity_tests(rw_series, "Random Walk")
run_stationarity_tests(ar_series, "AR(1) φ=0.7")

OUTPUT

──────────────────────────────────────────────────────
 Series: Random Walk
──────────────────────────────────────────────────────

[ADF Test]  H₀: unit root (non-stationary)
  ADF Stat  :     0.5831
  p-value   :     0.9884   ← >> 0.05 → FAIL to reject H₀
  Crit 1%   :    -3.4524
  Crit 5%   :    -2.8713
  Lags used : 1
  Verdict   : NON-STATIONARY ✗

[KPSS Test] H₀: stationary
  KPSS Stat :     2.8841
  p-value   :     0.0100   ← < 0.05 → REJECT stationarity
  Crit 5%   :     0.4630
  Verdict   : NON-STATIONARY ✗

  Combined  : ❌ BOTH AGREE: NON-STATIONARY

──────────────────────────────────────────────────────
 Series: AR(1) φ=0.7
──────────────────────────────────────────────────────

[ADF Test]  H₀: unit root (non-stationary)
  ADF Stat  :    -8.4231
  p-value   :     0.0000   ← << 0.05 → REJECT unit root
  Crit 5%   :    -2.8713
  Verdict   : STATIONARY ✓

[KPSS Test] H₀: stationary
  KPSS Stat :     0.1842
  p-value   :     0.1000+  ← > 0.05 → FAIL to reject stationarity
  Crit 5%   :     0.4630
  Verdict   : STATIONARY ✓

  Combined  : ✅ BOTH AGREE: STATIONARY

Section 09

Differencing — The Most Powerful Stationarity Fix

📖 Story

The Finance Journalist Who Switched to Returns

A finance journalist always reported the stock price: "Apple closed at ₹182 today." Her editor noticed that predicting tomorrow's price (₹183?) was nearly impossible — the price drifted unpredictably upward over the years.

A quant on the trading desk suggested: "Stop reporting the price. Report the change — how many rupees did it move today?" Suddenly the series became tractable: changes centred around zero, had consistent volatility, and the ADF test passed immediately.

That simple act — subtracting yesterday from today — is first differencing. It transforms a random walk (non-stationary) into white noise (stationary). It is the single most important transformation in time series analysis.

🎦 Animated — First Differencing: Random Walk → Stationary

Left: a random walk — mean drifts upward, ADF fails (p=0.99). Centre: the differencing formula Δyₜ = yₜ − yₜ₋₁. Right: after one difference — series oscillates around a stable zero mean, variance is constant, ADF passes (p≈0.000).

First Difference (d=1)

Δyₜ = yₜ − yₜ₋₁

Removes a stochastic trend (unit root) and reduces a linear deterministic trend to a constant. The most common transformation. Price → price change (log price → return). Level → change. Log GDP → GDP growth rate. Apply when ADF fails and the series drifts roughly linearly.
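A short check of the formula: a random walk is a cumulative sum of shocks, so its first difference is simply those shocks again. The variable names and seed below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)            # seed chosen arbitrarily
shocks = rng.normal(0, 1, size=300)        # white-noise increments
random_walk = np.cumsum(shocks)            # y_t = y_{t-1} + shock_t

first_diff = np.diff(random_walk)          # Δy_t = y_t − y_{t−1}

# Differencing undoes the cumulative sum (only the first shock is lost)
recovered = np.allclose(first_diff, shocks[1:])
print(recovered)   # prints True
```

On a pandas Series the same operation is series.diff().dropna(), which is the form used in the pipeline in Section 11.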

Second Difference (d=2)

Δ²yₜ = Δyₜ − Δyₜ₋₁

Removes a quadratic trend or when the first difference is still non-stationary. Rare in practice — most economic series need only d=1. Warning: over-differencing introduces unnecessary MA unit roots. Stop at the minimum d.
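The cost of over-differencing can be seen by differencing a series that is already white noise. In theory the variance doubles and the lag-1 autocorrelation moves from 0 to -0.5, which is the footprint of a non-invertible MA(1). The helper name lag1_autocorr and the seed are assumptions for this sketch.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x @ x)

rng = np.random.default_rng(3)             # seed chosen arbitrarily
white_noise = rng.normal(size=20_000)      # already stationary
over_diff = np.diff(white_noise)           # one unnecessary difference

print(f"variance      : {white_noise.var():.2f} -> {over_diff.var():.2f}")
print(f"lag-1 autocorr: {lag1_autocorr(white_noise):+.2f} -> {lag1_autocorr(over_diff):+.2f}")
```

A strongly negative lag-1 autocorrelation (near -0.5) after differencing is therefore a practical warning sign that d is one too high.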

Seasonal Difference (lag s)

Δₛyₜ = yₜ − yₜ₋ₛ

Removes a seasonal pattern by subtracting the value from s periods ago. For monthly data with a yearly cycle: s=12. For daily data with a weekly cycle: s=7. In SARIMA, the number of seasonal differences is the D parameter and s is the seasonal period.

Both Seasonal + Regular

Δ₁Δ₁₂yₜ = Δ₁(yₜ − yₜ₋₁₂)

For series with both a trend and seasonality, apply seasonal differencing first (removes seasonality), then first difference the result (removes trend). The classic airline passenger dataset, modelled on the log scale, requires exactly this: d=1, D=1, s=12.
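Written out, the combined operator is Δ₁Δ₁₂yₜ = yₜ − yₜ₋₁ − yₜ₋₁₂ + yₜ₋₁₃. The sketch below checks this expansion numerically on a made-up monthly series, and also shows that the two differences give the same result in either order. The helper diff_lag and the simulated data are assumptions for illustration.

```python
import numpy as np

def diff_lag(y, lag):
    """Lag-`lag` difference: y_t - y_{t-lag}."""
    return y[lag:] - y[:-lag]

rng = np.random.default_rng(5)             # seed chosen arbitrarily
t = np.arange(120)
# Hypothetical monthly series: linear trend + yearly cycle + noise
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(size=120)

seasonal_then_regular = diff_lag(diff_lag(y, 12), 1)
regular_then_seasonal = diff_lag(diff_lag(y, 1), 12)
expanded = y[13:] - y[12:-1] - y[1:-12] + y[:-13]   # y_t - y_{t-1} - y_{t-12} + y_{t-13}

print(np.allclose(seasonal_then_regular, regular_then_seasonal))  # prints True
print(np.allclose(seasonal_then_regular, expanded))               # prints True
```

Each combined difference costs 13 observations (12 + 1), so the result here has 107 points.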

Section 10

Log Transformation — Taming Growing Variance

📖 Story

The Microphone That Kept Clipping

A sound engineer is recording a concert. In the quiet passages, the microphone picks up subtle nuances. But during the loud chorus, the signal clips — the peaks are so large they distort everything around them. Simply turning down the gain doesn't work — it makes the quiet parts too faint to hear.

What the engineer needs is a compressor — a device that reduces large signals proportionally more than small ones, creating consistent, manageable dynamics throughout.

The log transformation is the statistical compressor. When variance grows proportionally with the level (as in stock prices, GDP, or population), the log flattens that relationship: large values are compressed far more than small ones. The result is a series with roughly constant variance, a prerequisite for stationarity. Note that the log is only defined for strictly positive values.

🎦 Animated — Log Transform: Stabilising Heteroscedastic Variance

Left: original exponential series — variance (band width) grows as level rises, ADF fails. Right: after log transformation — the oscillations have uniform height throughout, variance is stabilised within the green band. Log-differencing (applying first difference after the log) then removes the remaining trend entirely.

🔨 When to Use Log vs Differencing vs Both

Diff only

Series has a linear drift with constant variance. Variance band is the same width everywhere — just the mean is rising. Example: a random walk, central bank interest rate. Fix: Δyₜ = yₜ − yₜ₋₁

Log only

Series has growing variance but no strong trend. Swings widen proportionally with the level but mean is roughly stable. Example: volatility of a stable asset around a fixed price. Fix: log(yₜ), then re-test.

Log + Diff

Series has both exponential growth and growing variance. This is the most common case for financial/economic data: stock prices, GDP, population. Fix: Δlog(yₜ) = log(yₜ) − log(yₜ₋₁) ≈ % change. This is also the log-return in finance.

Seasonal diff

Seasonal spikes visible in ACF at lag s and multiples, dying out only slowly. The standard ADF test targets a regular unit root rather than a seasonal one, so judge this from the ACF and the plot. Fix: Δₛyₜ = yₜ − yₜ₋ₛ. For airline passengers: apply seasonal diff first, then regular diff.

No transform

ADF already passes (p < 0.05) and KPSS agrees. No transformation needed: the series is ready for ARMA as-is. Do not difference a stationary series; doing so adds a non-invertible MA component and inflates the variance.
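The "Log + Diff" case can be illustrated on a simulated series with exponential growth and proportional noise. The sketch compares the spread of deviations in the first and second half of the sample, before and after taking logs, and then checks that log-differences track simple percentage changes. The growth rate, noise level, seed, and helper name half_std_ratio are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(11)                        # seed chosen arbitrarily
t = np.arange(400)
level = 100 * np.exp(0.01 * t)                         # exponential growth path
y = level * np.exp(0.05 * rng.normal(size=400))        # multiplicative (proportional) noise

def half_std_ratio(deviation):
    """Std dev of the second half divided by std dev of the first half."""
    first, second = np.array_split(deviation, 2)
    return second.std() / first.std()

raw_ratio = half_std_ratio(y - level)                  # deviations on the original scale
log_ratio = half_std_ratio(np.log(y) - np.log(level))  # deviations on the log scale

log_returns = np.diff(np.log(y))                       # Δlog(y_t)
simple_returns = y[1:] / y[:-1] - 1                    # percentage change
corr = np.corrcoef(log_returns, simple_returns)[0, 1]

print(f"std ratio (2nd half / 1st half), raw: {raw_ratio:.2f}")
print(f"std ratio (2nd half / 1st half), log: {log_ratio:.2f}")
print(f"correlation of log vs simple returns: {corr:.4f}")
```

On the raw scale the spread should grow several-fold between the two halves, while on the log scale the ratio should be close to 1. The approximation Δlog(yₜ) ≈ % change is tight for small moves and loosens as moves get larger.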

Section 11

The Complete Stationarity Pipeline in Python

We now build the full end-to-end workflow: generate a non-stationary series with both trend and growing variance, apply log transformation, apply first differencing, verify stationarity at each step.

# ─── 1. Generate realistic non-stationary series ─────────────────────────────
# Simulates GDP-like data: exponential growth + seasonal pattern + noise
np.random.seed(99)
n = 240   # 20 years of monthly data

t        = np.arange(n)
trend    = 100 * np.exp(0.008 * t)                     # exponential growth
seasonal = 10  * np.sin(2 * np.pi * t / 12)            # yearly cycle
noise    = trend * 0.04 * np.random.randn(n)             # proportional noise

raw = pd.Series(trend + seasonal + noise,
                index=pd.date_range('2000-01-01', periods=n, freq='MS'),
                name='Simulated GDP-like Series')

print(f"Shape: {raw.shape}")
print(f"Range: {raw.min():.1f}  to  {raw.max():.1f}")

OUTPUT

Shape: (240,)
Range: 87.3  to  707.4

# ─── 2. Step-by-step transformation pipeline ─────────────────────────────────
stages = {}
stages['1_raw']       = raw
stages['2_log']       = np.log(raw)
stages['3_log_diff']  = np.log(raw).diff().dropna()
stages['4_seas_diff'] = stages['3_log_diff'].diff(12).dropna()

print(f"{'Stage':<22} {'ADF p':>10}  {'KPSS p':>10}  {'Verdict'}")
print("-" * 65)

for label, s in stages.items():
    adf_p = adfuller(s, autolag='AIC')[1]
    kpss_p = kpss(s, regression='c', nlags='auto')[1]

    if adf_p < 0.05 and kpss_p > 0.05:
        verdict = "✅ STATIONARY"
    elif adf_p > 0.05 and kpss_p < 0.05:
        verdict = "❌ NON-STATIONARY"
    else:
        verdict = "⚠️  INCONCLUSIVE"

    print(f"{label:<22} {adf_p:>10.4f}  {kpss_p:>10.4f}  {verdict}")

OUTPUT

Stage                       ADF p      KPSS p  Verdict
─────────────────────────────────────────────────────────────────
1_raw                      0.9973      0.0100  ❌ NON-STATIONARY
2_log                      0.9814      0.0100  ❌ NON-STATIONARY
3_log_diff                 0.0000     0.1000+  ✅ STATIONARY   ← ready for ARMA!
4_seas_diff                0.0000     0.1000+  ✅ STATIONARY

# ─── 3. Detailed ADF on final stationary series ──────────────────────────────
final_series = stages['3_log_diff']
run_stationarity_tests(final_series, "Log-Differenced GDP-like Series")

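# ─── 4. Verify: mean and variance are stable across time ─────────────────────
# Split into three (nearly) equal thirds by position and compare statistics.
# Splitting the positions (not the Series itself) keeps each third a pandas Series.
positions = np.array_split(np.arange(len(final_series)), 3)
thirds = [final_series.iloc[idx] for idx in positions]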
print(f"\n{'Period':<10} {'Mean':>10} {'Std Dev':>12} {'Min':>10} {'Max':>10}")
print("-" * 56)
for i, seg in enumerate(thirds, 1):
    print(f"Period {i}   {seg.mean():>10.5f} {seg.std():>12.5f} {seg.min():>10.5f} {seg.max():>10.5f}")

OUTPUT

Period           Mean      Std Dev        Min        Max
────────────────────────────────────────────────────────
Period 1      0.00803      0.04961   -0.08427    0.10318
Period 2      0.00789      0.05012   -0.08901    0.10755
Period 3      0.00812      0.04988   -0.09103    0.11024
← mean and std dev nearly identical across all three periods ✓

# ─── 5. Visual comparison of all four stages ─────────────────────────────────
fig, axes = plt.subplots(4, 1, figsize=(12, 10))

titles = [
    '① Raw Series — exponential trend + growing variance (NON-STATIONARY)',
    '② After log() — growth linearised, variance still rising (NON-STATIONARY)',
    '③ After log + diff() — stationary! (STATIONARY ✓)',
    '④ After log + diff + seasonal diff — removes seasonal pattern too'
]

colors = ['#f87171', '#f59e0b', '#34d399', '#60a5fa']

for ax, (label, s), title, color in zip(axes, stages.items(), titles, colors):
    s.plot(ax=ax, color=color, linewidth=1.2)
    ax.axhline(s.mean(), color=color, linestyle='--', alpha=0.6, linewidth=1)
    ax.set_title(title, fontsize=9, pad=4)
    ax.set_facecolor('#0d1117')
    ax.spines['bottom'].set_color('#2a3050')
    ax.spines['left'].set_color('#2a3050')

plt.tight_layout(pad=1.5)
plt.savefig('stationarity_pipeline.png', dpi=120, facecolor='#0d1117')
plt.show()

Section 12

Stationarity Tests — Complete Comparison

Property	ADF Test	KPSS Test	PP Test (Phillips-Perron)
Null hypothesis (H₀)	Unit root (non-stationary)	Stationary	Unit root (non-stationary)
Reject H₀ when	p < 0.05 → stationary	p < 0.05 → non-stationary	p < 0.05 → stationary
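Test statistic sign	Usually negative (more negative = stronger evidence of stationarity)	Positive (larger = stronger evidence against stationarity)	Usually negative (more negative = stronger evidence of stationarity)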
Handles autocorrelation	By adding lagged Δyₜ terms	By HAC variance estimator	By non-parametric correction
Power in small samples	Moderate	Low — often fails to reject H₀	Moderate
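Sensitive to structural breaks	Yes: an unmodelled break biases it toward not rejecting H₀	Yes	Yes: same bias as ADF
Preferred when	Default first test; most widely used	Complementary to ADF; confirming stationarity	Errors with suspected heteroscedasticity or autocorrelation of unknown form
Python function	adfuller(series) in statsmodels	kpss(series) in statsmodels	PhillipsPerron(series) in the arch package (not statsmodels)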

Section 13

Transformation Reference — What Each Fix Does

Problem Detected	Visual Clue	ADF / KPSS Signal	Transformation	Formula	Real Example
Linear trend (drift)	Mean moves steadily up/down	ADF fails; KPSS fails	First difference	yₜ − yₜ₋₁	Inflation rate, bond yields
Exponential trend	J-curve shape	ADF fails; KPSS fails	Log + first difference	log(yₜ) − log(yₜ₋₁)	GDP, stock prices, population
Growing variance only	Funnel shape; homoscedastic after log	ADF may pass; KPSS fails	Log transform	log(yₜ)	Volatility, right-skewed financial series
Seasonal non-stationarity	ACF spikes at lag s, 2s, 3s…	ADF may pass at lag 1 but fails seasonally	Seasonal difference	yₜ − yₜ₋ₛ	Monthly retail sales (s=12)
Trend + seasonality	Rising wave with cycles	Both fail	Log + regular + seasonal diff	Δ₁Δ₁₂ log(yₜ)	Airline passengers, electricity demand
Structural break	Sudden jump in mean level	ADF may fail to reject (an unmodelled break mimics a unit root); KPSS tends to reject	Dummy variable for break point	Dₜ = 1 if t ≥ break	GDP pre/post financial crisis
None — already stationary	Flat oscillation, constant width	ADF passes; KPSS passes	No transformation needed	yₜ as-is	Daily returns, white noise residuals
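The structural-break row can be sketched with a step dummy: regress the series on a constant and Dₜ, then subtract the estimated shift. The break date is assumed to be known here, and the break size, seed, and variable names are illustrative choices; in real data the break date usually has to be estimated or justified.

```python
import numpy as np

rng = np.random.default_rng(21)                  # seed chosen arbitrarily
n, break_at, shift = 200, 100, 5.0               # hypothetical break date and size
y = rng.normal(size=n)
y[break_at:] += shift                            # sudden level shift

# Regress y on a constant and the step dummy D_t = 1 if t >= break_at
dummy = (np.arange(n) >= break_at).astype(float)
X = np.column_stack([np.ones(n), dummy])
(intercept, est_shift), *_ = np.linalg.lstsq(X, y, rcond=None)

adjusted = y - est_shift * dummy                 # series with the estimated break removed

print(f"estimated shift: {est_shift:.2f}")
print(f"mean before / after break (adjusted): "
      f"{adjusted[:break_at].mean():.3f} / {adjusted[break_at:].mean():.3f}")
```

After the adjustment both regimes share the same mean, so the series can be tested and modelled without differencing away a break that was never a unit root.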

Section 14

Golden Rules — Stationarity in Practice

⚙️ Stationarity — Rules You Must Never Break

Always plot the series first. A visual inspection takes 5 seconds and reveals trend, variance growth, seasonality, outliers, and structural breaks simultaneously. The ADF test alone cannot tell you which type of non-stationarity you have; the plot usually can. Look before you compute.

Run both ADF and KPSS, not just one. ADF has the null of non-stationarity; KPSS has the null of stationarity. Only when both tests agree are you on solid ground. When they disagree, treat the result as a prompt for further diagnosis (trend specification, structural breaks, sample size) rather than as a verdict, because detrending and differencing are different cures.

Match the ADF specification to your series. Use regression='c' (constant) for series with non-zero mean. Use regression='ct' (constant + trend) only when the series has a visible deterministic trend. Leaving out a trend that is really there makes it very hard to reject H₀, while adding an unnecessary trend term costs power; either way the conclusion can be wrong.

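Log-transform before differencing, not after. If the series has both exponential growth and growing variance, apply log first: it turns proportional (multiplicative) variation into additive variation with roughly constant variance. Then difference to remove the trend. Log-differencing (Δlog y) approximates the percentage change, so it is both interpretable and stationary. Differences can also be negative, in which case the log could not be taken afterwards at all.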

Do not over-difference. Applying more differences than necessary introduces spurious MA unit roots — making the series harder to model, not easier. The minimum differencing that achieves stationarity is always the correct choice. If d=1 passes the ADF, do not apply d=2.

Stationarity of residuals matters too. After fitting an ARMA model, always check the residuals with ADF and KPSS, and look at their ACF for leftover autocorrelation. Non-stationary or autocorrelated residuals mean the model failed to capture all the structure: there is something systematic still left in the errors.

Be suspicious when ADF fails to reject near a structural break. A sudden level shift in an otherwise stationary series looks like persistent drift to the ADF regression, which biases the test toward not rejecting the unit root; the series then gets differenced when a break dummy was the right fix. Use the Zivot-Andrews test (a unit-root test that allows for one break) or a Chow test if you suspect a structural break.