Autocorrelation, ACF & PACF

Section 01

The Story Behind Autocorrelation

📖 Real-World Analogy

The Weather Forecaster's Secret Weapon

Every morning a weather forecaster wakes up and checks one thing before opening any satellite image or radar feed: "What was the temperature yesterday?" Because in most climates, today's temperature is startlingly close to yesterday's. Not identical — but close enough that yesterday is your single best predictor of today before any other data arrives.

This is autocorrelation: the tendency of a time series to be correlated with its own past values. The "auto" means self — the series correlates with itself, just shifted in time.

Now push further: how strong is that link at two days ago? Three days? A week? A full year? The systematic answer to these questions is captured in two plots every time series practitioner relies on as instinctively as a pilot reads an altimeter: the ACF and the PACF.

Autocorrelation is not a statistical curiosity; it is the reason time series models work. If a series had no autocorrelation at any lag, no linear model of its own past could forecast it better than its mean. Every insight that ARMA, ARIMA, and SARIMA extract from a series ultimately traces back to the autocorrelation structure you will learn to read in this tutorial.

Section 02

Autocorrelation — The Series Talking to Itself

In classical statistics, correlation measures how two different variables move together. Autocorrelation measures how a single series moves together with a lagged copy of itself.

Autocovariance at lag k

γ(k) = Cov(yₜ, yₜ₋ₖ) = E[(yₜ−μ)(yₜ₋ₖ−μ)]

Measures the joint variability between the series at time t and its value k periods earlier. Its units are the square of the series' units, which makes direct comparison across series impossible, so we normalise it.

Autocorrelation (ACF) at lag k

ρ(k) = γ(k) / γ(0) = Cov(yₜ, yₜ₋ₖ) / Var(yₜ)

Dividing by the variance (γ(0)) normalises the result to the range [−1, 1]. ρ(0) = 1 always (a series is perfectly correlated with itself at lag 0). ρ(1) = 0.85 means a strong positive relationship with yesterday's value.
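To make the two formulas concrete, here is a minimal pure-Python sketch of the sample versions of γ(k) and ρ(k). The temperature readings and the function names are made up for illustration, and the estimator divides by n at every lag, which is one common convention for the sample ACF.

```python
def autocovariance(y, k):
    """Sample autocovariance at lag k (divides by n at every lag)."""
    n = len(y)
    mean = sum(y) / n
    return sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / n


def autocorrelation(y, k):
    """Sample autocorrelation: autocovariance at lag k over the variance."""
    return autocovariance(y, k) / autocovariance(y, 0)


# Hypothetical daily temperatures, invented for this example
temps = [15.0, 16.0, 18.0, 17.0, 15.0, 14.0, 16.0, 19.0, 20.0, 18.0]

rho = [autocorrelation(temps, k) for k in range(4)]
for k, value in enumerate(rho):
    print(f"rho({k}) = {value:.4f}")
```

The lag-0 value comes out as exactly 1 because the numerator and denominator are the same quantity, and every other lag stays inside [−1, 1].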

🎦 Animated — What "Lag" Means: A Series vs Its Shifted Copy

Blue: original series yₜ. Gold: the same series shifted one step right (lag-1). Purple dashes: shifted two steps (lag-2). The vertical gold bars at selected points show the residual gap between yₜ and yₜ₋₁ — autocorrelation measures whether these gaps are systematically related across the entire series.

Section 03

Positive, Negative & Zero Autocorrelation

🎦 Animated — Three Types of Autocorrelation in Action

Left (green): strong positive autocorrelation — the series drifts smoothly; values carry momentum forward. Centre (red): strong negative autocorrelation — the series zigzags, overshooting the mean each step. Right (gold): zero autocorrelation — pure white noise, no memory, no predictability. The mini-bar charts at the bottom preview what each ACF plot looks like.
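The three behaviours can be reproduced with a first-order autoregression yₜ = φ·yₜ₋₁ + εₜ. The sketch below is only an illustration: the coefficients 0.8, −0.8 and 0, the seeds, and the helper names are assumptions chosen for this example, not values taken from the animation.

```python
import random


def lag1_autocorr(y):
    """Sample autocorrelation at lag 1."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


def simulate_ar1(phi, n, seed):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


n = 2000
r_pos = lag1_autocorr(simulate_ar1(0.8, n, seed=1))    # smooth drift
r_neg = lag1_autocorr(simulate_ar1(-0.8, n, seed=2))   # zigzag
r_zero = lag1_autocorr(simulate_ar1(0.0, n, seed=3))   # white noise

print(f"phi = +0.8 -> lag-1 autocorrelation {r_pos:+.3f}")
print(f"phi = -0.8 -> lag-1 autocorrelation {r_neg:+.3f}")
print(f"phi =  0.0 -> lag-1 autocorrelation {r_zero:+.3f}")
```

With a long enough sample, the lag-1 estimate lands close to the φ used to generate the series.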

Section 04

The ACF Plot — Reading the Full Memory Map

📖 Story

The Echo Cave

Imagine shouting into a cave. You hear an echo immediately (lag 1), then a fainter one (lag 2), then fainter still (lag 3), until the sound fades to silence. The profile of echo strengths — loud, medium, faint, gone — is the fingerprint of that cave.

Now imagine a different cave where the echo at lag 3 is suddenly louder than at lag 2. That anomaly tells you something about the cave's geometry — perhaps a reflective wall exactly three units away.

The ACF plot is the echo profile of your time series. Each bar is the correlation between the series and itself at a specific lag. The decay pattern, the spikes, the sudden silences — all reveal the internal structure of the series's memory.

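The ACF plot shows the lag k = 1, 2, 3, … on the x-axis and the correlation coefficient ρ(k) (−1 to +1) on the y-axis. The blue dashed lines are the 95% confidence bands: bars outside these bands are statistically significant at the 5% level, while bars inside are not statistically distinguishable from zero. Because each bar is judged at the 5% level, about one bar in twenty can fall outside the bands by chance even for pure white noise.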
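A quick way to get a feel for the bands is the large-sample white-noise approximation ±1.96/√n. The sketch below assumes that approximation (plotting libraries may use a refined formula), and the helper name is hypothetical.

```python
import math


def white_noise_band(n, z=1.96):
    """Approximate 95% band half-width for a white-noise ACF."""
    return z / math.sqrt(n)


for n in (100, 400, 1600):
    print(n, round(white_noise_band(n), 4))
# 100 0.196
# 400 0.098
# 1600 0.049
```

Quadrupling the sample size halves the band, so a correlation of 0.15 is unremarkable with 100 observations but clearly significant with 1,600.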

🎦 Animated — ACF Fingerprints: Four Process Types

Section 05

The PACF — Cutting Through Indirect Relationships

📖 Story

The Middle-Man Problem in a Telephone Chain

Imagine a telephone chain: Alice tells Bob, Bob tells Carol, Carol tells Dave. When researchers measure how similar Alice's story is to Dave's story, they find real correlation — but it is entirely indirect, passing through Bob and Carol. If you already know Bob's and Carol's versions, knowing Alice's adds nothing new to predicting Dave's. The direct Alice → Dave link is essentially zero once intermediaries are accounted for.

The regular ACF at lag 3 (Alice → Dave) includes this indirect chain. The PACF at lag 3 asks: "After removing the effect of lags 1 and 2, does lag 3 still have a direct relationship?" In this telephone chain, the PACF at lag 3 would be near zero — the relationship is entirely mediated by shorter lags.

PACF at lag k

φₖₖ = Corr(yₜ, yₜ₋ₖ | yₜ₋₁, …, yₜ₋ₖ₊₁)

The correlation between yₜ and yₜ₋ₖ after partialling out (removing) the linear effect of all intermediate lags yₜ₋₁ through yₜ₋ₖ₊₁. This isolates the direct link at exactly lag k from all indirect paths.

Yule-Walker Equations (AR fitting)

φ = Γ⁻¹ γ

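PACF values φₖₖ can be estimated by fitting AR(k) models for k = 1, 2, 3, …, for example by solving the Yule-Walker equations at each order. The coefficient on the last included lag in each AR(k) fit is the PACF at lag k. For an AR(p) process the true coefficients on lags beyond p are zero, which is why the PACF cuts off after lag p.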
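The cutoff can be checked numerically. The sketch below uses the Durbin-Levinson recursion, which solves the Yule-Walker equations order by order instead of running a separate AR(k) fit for each k; that substitution, the function names, and the AR(2) coefficients φ₁ = 0.7, φ₂ = −0.3 are choices made for this illustration. It works on the theoretical ACF, so there is no sampling noise.

```python
def ar2_theoretical_acf(phi1, phi2, max_lag):
    """Theoretical ACF of a stationary AR(2) via its Yule-Walker recursion."""
    rho = [1.0, phi1 / (1.0 - phi2)]
    for k in range(2, max_lag + 1):
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho


def pacf_durbin_levinson(rho, max_lag):
    """PACF from ACF values: phi_kk is the last coefficient of the AR(k) fit."""
    pacf = [1.0]
    prev = []  # coefficients of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


rho = ar2_theoretical_acf(0.7, -0.3, max_lag=5)
pacf_vals = pacf_durbin_levinson(rho, max_lag=5)

for k in range(1, 6):
    print(f"lag {k}: ACF = {rho[k]:+.4f}   PACF = {pacf_vals[k]:+.4f}")
```

The PACF at lag 2 recovers φ₂ = −0.3 and every later lag is zero up to floating-point error, while the ACF keeps decaying without a cutoff.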

🎦 Animated — ACF vs PACF: Direct and Indirect Paths at Lag 3

Blue: the indirect path that ACF measures at lag 3, which includes all chain correlations through lags 1 and 2. Red arc: the direct path that PACF isolates, the correlation between yₜ and yₜ₋₃ with lags 1 and 2 statistically removed. For an AR(2) process, this red arc is zero in theory at lag 3 and close to zero in a finite sample.

Section 06

Reading ACF & PACF Together — The Complete Guide

ACF Pattern	PACF Pattern	Likely Model	Reason	Example Series
Decays exponentially (positive)	Cuts off after lag p	AR(p)	Lag-p is the last direct predictor; all beyond are indirect	Daily temperature, interest rates
Cuts off after lag q	Decays exponentially (or alternates)	MA(q)	Shock at lag q is last directly felt; PACF has infinite decay	Supply-chain disruptions, inventory
Decays sinusoidally (alternating sign)	Cuts off after lag p	AR(p) with negative φ	Negative AR coefficient → alternating ACF	Bid-ask bounces in high-freq trading
Tails off gradually	Tails off gradually	ARMA(p,q)	No hard cutoff in either plot; both AR and MA terms are needed	GDP growth, stock log-returns
Very slow, near-linear decay	First spike near 1.0, rest ≈ 0	Non-stationary — unit root!	Random walk: ACF barely decays, PACF says lag-1 explains everything	Stock price levels, random walk
Spikes at lags s, 2s, 3s…	Spikes at lags s, 2s, 3s…	Seasonal ARIMA (SARIMA)	Strong seasonal structure repeating at period s	Monthly retail (s=12), daily traffic (s=7)
All bars inside confidence band	All bars inside confidence band	White Noise — no model needed	No autocorrelation at any lag → unpredictable	Residuals of a perfectly fitted model
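The MA(q) row can be verified directly from the coefficients, because the autocovariance of a moving-average process is a finite sum of coefficient products. The sketch below assumes an MA(2) with θ₁ = 0.6 and θ₂ = 0.3; the function name is hypothetical.

```python
def ma_theoretical_acf(theta, max_lag):
    """Theoretical ACF of y_t = e_t + theta_1 e_{t-1} + ... + theta_q e_{t-q}."""
    psi = [1.0] + list(theta)           # weight on e_t, e_{t-1}, ..., e_{t-q}
    gamma0 = sum(c * c for c in psi)    # variance in units of the shock variance
    acf = []
    for k in range(max_lag + 1):
        # Products of weights k steps apart; the sum is empty (zero) once k > q
        gamma_k = sum(psi[j] * psi[j + k] for j in range(len(psi) - k))
        acf.append(gamma_k / gamma0)
    return acf


rho = ma_theoretical_acf([0.6, 0.3], max_lag=5)
for k, value in enumerate(rho):
    print(f"rho({k}) = {value:.4f}")
```

Lags 1 and 2 are non-zero and everything from lag 3 onward is exactly zero: a shock simply leaves the process after q periods, which is the hard ACF cutoff described in the table.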

Section 07

Lag Selection — How Many Lags Should You Include?

Including too few lags leaves autocorrelation in residuals. Including too many wastes parameters, inflates variance, and over-fits. Lag selection is one of the most practically important decisions in time series modelling.

📊

Visual Rule (ACF/PACF)

Read the plots directly

For AR: include all lags where the PACF bar exceeds the confidence band. The last significant lag = p. For MA: include all lags where the ACF bar exceeds the band. The last significant lag = q. Start here before any information criterion.

✓ Intuitive; visual confirmation

✗ Confidence bands are approximate; subjective at boundary lags

📈

Information Criteria (AIC/BIC)

Fit vs complexity trade-off

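Fit a grid of ARMA(p,q) models (p ≤ 3, q ≤ 3). Choose the combination with the lowest AIC (lighter penalty, favours fit) or lowest BIC (penalises complexity more). BIC's penalty of ln(n) per parameter grows with the sample size, so it tends to pick the more parsimonious model. AIC is usually preferred when prediction accuracy is the goal.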

✓ Objective; guards against over-fitting

✗ AIC can over-parameterise; slow with many candidates
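The trade-off is easiest to see with the textbook definitions AIC = 2k − 2·ln L and BIC = k·ln(n) − 2·ln L, where k is the number of estimated parameters. In the sketch below the log-likelihoods, parameter counts, and sample size are invented for illustration; they are not output from a fitted model.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2 * log_lik


n_obs = 100
# Hypothetical fits: k counts the AR and MA coefficients plus the noise variance
candidates = {
    "ARMA(1,0)": {"log_lik": -140.0, "k": 2},
    "ARMA(2,1)": {"log_lik": -137.0, "k": 4},
}

scores = {
    name: (aic(c["log_lik"], c["k"]), bic(c["log_lik"], c["k"], n_obs))
    for name, c in candidates.items()
}
best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name}: AIC = {a:.2f}  BIC = {b:.2f}")
print("AIC picks", best_aic, "| BIC picks", best_bic)
```

Here the larger model improves the log-likelihood by 3, which is enough to pay AIC's penalty of 2 per parameter but not BIC's penalty of ln(100) ≈ 4.6 per parameter, so the two criteria disagree.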

⚙️

Rule-of-Thumb Bounds

Safe starting ranges

Use at most n/4 lags when testing for white noise (Ljung-Box). For ACF/PACF display: show min(10, n/5) lags. For ARMA model selection: rarely need p > 3 or q > 3 in practice. Seasonal models: always include a lag at the seasonal period s.

✓ Quick starting point

✗ Does not replace formal testing

🔨 Systematic Lag Selection Workflow

Step 1

Confirm stationarity first. ACF/PACF reading is only valid on a stationary series. If the ADF test fails to reject its unit-root null (p-value > 0.05), difference or transform the series first, then plot.

Step 2

Plot ACF and PACF. Read the patterns: hard cutoff, slow decay, alternating signs, seasonal spikes. This tells you which model family applies.

Step 3

Note the last significant lag in the relevant plot (PACF for AR, ACF for MA). Use this as your starting p or q value.

Step 4

Fit a small grid of candidates (±1 around your reading). Compare AIC scores. The minimum AIC model is your working model.

Step 5

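Run Ljung-Box on residuals. A p-value above 0.05 means the test finds no evidence of remaining autocorrelation, which is consistent with white-noise residuals. If the test rejects at any tested lag, increment the corresponding order (p or q) and refit.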
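Step 1 matters because a unit root swamps everything else in the ACF. As a rough illustration (a simulated random walk with an arbitrary seed, not real data), the sketch below compares the lag-1 autocorrelation before and after first differencing.

```python
import random


def lag1_autocorr(y):
    """Sample autocorrelation at lag 1."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


rng = random.Random(0)

# Random walk: each value is the previous value plus a fresh shock
walk = [0.0]
for _ in range(1999):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

# First difference recovers the underlying shocks
diffs = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

r_walk = lag1_autocorr(walk)
r_diff = lag1_autocorr(diffs)
print(f"levels     : lag-1 autocorrelation = {r_walk:.3f}")
print(f"differenced: lag-1 autocorrelation = {r_diff:.3f}")
```

The levels show a lag-1 value close to 1, while the differenced series is close to 0, which is why the plots are only read after differencing.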

Section 08

The White Noise Test — When Autocorrelation Should Be Zero

📖 Story

The Perfect Residual

Imagine a detective who has just solved a case. He reviews his notes — every clue has been explained, every thread tied off. The remaining scraps of paper on his desk are random — a receipt for lunch, a note about the weather, the random detritus of daily life. There is no pattern left unexplained.

A perfectly fitted time series model leaves behind white noise residuals: random, uncorrelated, with zero mean. If the residuals still contain autocorrelation, the detective has not finished — there is still structure the model has not captured.

The white noise test formally checks whether all the autocorrelations in a series (or residuals) are simultaneously zero — asking: "Is there anything left to explain?"

White Noise — Definition

εₜ ~ WN(0, σ²): E[εₜ]=0, Cov(εₜ,εₜ₋ₖ)=0 ∀k≠0

A white noise process has zero mean, constant variance σ², and zero autocorrelation at every lag k ≠ 0. It is the ideal residual: all systematic information has been extracted by the model.

IID Noise vs White Noise

IID ⊂ White Noise (IID is stronger)

White noise only requires uncorrelated errors; dependence in higher moments is allowed. IID noise requires the errors to be fully independent and to share the same distribution. An ARCH/GARCH process can be white noise but not IID: its values are uncorrelated while its squared values are correlated.
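The distinction shows up clearly in a simulated ARCH(1) process, where today's variance depends on yesterday's squared shock. This is a sketch under assumed parameters (ω = 1.0, α = 0.3, an arbitrary seed); the helper name is hypothetical.

```python
import random


def lag1_autocorr(y):
    """Sample autocorrelation at lag 1."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


rng = random.Random(7)
omega, alpha = 1.0, 0.3   # assumed ARCH(1) parameters

eps = [0.0]
for _ in range(20000):
    sigma2 = omega + alpha * eps[-1] ** 2          # conditional variance
    eps.append((sigma2 ** 0.5) * rng.gauss(0.0, 1.0))
eps = eps[1:]                                       # drop the starting value

squared = [e * e for e in eps]
r_levels = lag1_autocorr(eps)
r_squares = lag1_autocorr(squared)

print(f"lag-1 autocorrelation of the series : {r_levels:+.3f}")
print(f"lag-1 autocorrelation of its squares: {r_squares:+.3f}")
```

The series itself looks like white noise, but its squares are clearly autocorrelated, so it is uncorrelated without being independent.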

🎦 Animated — White Noise ACF vs Structured Residual ACF

Left: ideal white-noise residuals — all bars inside the green confidence band, Ljung-Box passes. Right: structured residuals — large spikes at lags 1 and 7 (marked "!") violate the bands, Ljung-Box fails. The model on the right must be re-specified with higher p, q, or seasonal terms.

Section 09

The Ljung-Box Test — The Formal White Noise Diagnostic

Visually checking whether bars are inside confidence bands is useful but imprecise. The Ljung-Box test (Ljung and Box, 1978) provides a formal hypothesis test that jointly tests whether any of the first m autocorrelations is non-zero.

Ljung-Box Q Statistic

Q(m) = n(n+2) Σₖ₌₁ᵐ ρ̂²(k)/(n−k)

n = number of observations. m = number of lags tested. ρ̂(k) = sample autocorrelation at lag k. Under H₀, Q approximately follows a χ²(m) distribution. Each squared autocorrelation is scaled by (n+2)/(n−k), which slightly up-weights higher lags and makes the χ² approximation more accurate in finite samples.

Hypothesis and Decision

H₀: ρ(1)=ρ(2)=…=ρ(m)=0

H₀: all autocorrelations up to lag m are zero (white noise). H₁: at least one autocorrelation is non-zero. Reject H₀ (not white noise) if p-value < 0.05. If p > 0.05, fail to reject: the data are consistent with white noise, although the test cannot prove it.

⚠️

Critical: Choose m Correctly

For residual diagnostics after fitting ARMA(p,q): a common rule of thumb is m = min(10, n/5) lags, with m larger than p + q, and the degrees of freedom must be reduced by the number of fitted coefficients. The Q statistic is compared with χ²(m − p − q), not χ²(m). In statsmodels, acorr_ljungbox applies this adjustment when you pass model_df=p+q. For raw series testing (no model fitted): use χ²(m) directly.
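The degrees-of-freedom adjustment can change the verdict. The sketch below uses a small hard-coded table of 5% χ² critical values and a made-up Q statistic of 16.5 at m = 10; the function and variable names are hypothetical.

```python
# 5% critical values of the chi-square distribution for the two cases used here
CHI2_CRIT_95 = {8: 15.507, 10: 18.307}


def ljung_box_decision(q_stat, m, p=0, q=0):
    """Compare Q with the chi-square critical value on m - p - q degrees of freedom."""
    df = m - p - q
    if df not in CHI2_CRIT_95:
        raise ValueError(f"no critical value tabulated for df={df}")
    critical = CHI2_CRIT_95[df]
    return df, critical, q_stat > critical


q_stat = 16.5  # hypothetical Ljung-Box statistic at m = 10

raw = ljung_box_decision(q_stat, m=10)              # raw series, no model fitted
resid = ljung_box_decision(q_stat, m=10, p=1, q=1)  # residuals of an ARMA(1,1)

print("raw series     : df=%d, critical=%.3f, reject=%s" % raw)
print("ARMA(1,1) resid: df=%d, critical=%.3f, reject=%s" % resid)
```

The same Q is not significant against χ²(10) but is significant against χ²(8), so ignoring the fitted parameters would make the residuals look cleaner than they are.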

🔴 Box-Pierce (Original, 1970)

Property	Detail
Formula	Q = n Σ ρ̂²(k)
Weight	Uniform — each lag equally weighted
Finite-sample	Biased in small samples
Recommended for	Large n only (n > 500)
Use today?	Mostly superseded by Ljung-Box

✅ Ljung-Box (Improved, 1978)

Property	Detail
Formula	Q = n(n+2) Σ ρ̂²(k)/(n−k)
Weight	(n+2)/(n−k): higher lags weighted slightly more
Finite-sample	Corrected: closer to χ² in small samples
Recommended for	Any sample size (default choice)
Use today?	Yes — standard diagnostic everywhere
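Both statistics can be computed by hand from the same sample autocorrelations. The sketch below assumes n = 100 observations and three invented autocorrelation values; the function names are hypothetical.

```python
def box_pierce_q(rho_hat, n):
    """Box-Pierce: Q = n * sum of squared autocorrelations."""
    return n * sum(r * r for r in rho_hat)


def ljung_box_q(rho_hat, n):
    """Ljung-Box: Q = n (n + 2) * sum of rho_k^2 / (n - k), k starting at 1."""
    return n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(rho_hat, start=1)
    )


n = 100
rho_hat = [0.3, 0.2, 0.1]   # hypothetical sample autocorrelations at lags 1..3

q_bp = box_pierce_q(rho_hat, n)
q_lb = ljung_box_q(rho_hat, n)
print(f"Box-Pierce Q = {q_bp:.4f}")
print(f"Ljung-Box  Q = {q_lb:.4f}")
```

Because the factor (n+2)/(n−k) is greater than 1 at every lag, the Ljung-Box value is always slightly larger than the Box-Pierce value, and the gap shrinks as n grows. Both values here exceed 7.815, the 5% critical value of χ²(3), so white noise would be rejected either way.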

Section 10

Complete Python Implementation

# ─── 0. Imports ───────────────────────────────────────────────────────────────
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

from statsmodels.tsa.stattools         import acf, pacf, acovf
from statsmodels.graphics.tsaplots     import plot_acf, plot_pacf
from statsmodels.stats.diagnostic      import acorr_ljungbox
from statsmodels.tsa.arima_process     import ArmaProcess
from statsmodels.tsa.arima.model       import ARIMA

# ─── 1. Generate three series with known properties ────────────────────────────
np.random.seed(42)
n = 400

# White noise baseline
wn = np.random.normal(0, 1, n)

# AR(2): φ₁=0.7, φ₂=−0.3  →  PACF should cut off at lag 2
ar2_process = ArmaProcess(np.array([1, -0.7, 0.3]), np.array([1]))
ar2 = ar2_process.generate_sample(n)

# MA(2): θ₁=0.6, θ₂=0.3  →  ACF should cut off at lag 2
ma2_process = ArmaProcess(np.array([1]), np.array([1, 0.6, 0.3]))
ma2 = ma2_process.generate_sample(n)

series_dict = {'White Noise': wn, 'AR(2)': ar2, 'MA(2)': ma2}

# ─── 2. Compute and print autocorrelations at specific lags ───────────────────
for name, s in series_dict.items():
    acf_vals  = acf(s, nlags=10, fft=True)
    pacf_vals = pacf(s, nlags=10, method='ols')
    print(f"\n── {name} ──")
    print(f"{'Lag':<6} {'ACF':>10} {'PACF':>10}")
    print("-" * 30)
    for k in range(1, 6):
        print(f"{k:<6} {acf_vals[k]:>10.4f} {pacf_vals[k]:>10.4f}")

OUTPUT

── White Noise ──
Lag           ACF       PACF
──────────────────────────────
1          0.0312     0.0312   ← all near zero
2         -0.0184    -0.0195
3          0.0421     0.0435
4         -0.0263    -0.0274
5          0.0097     0.0078

── AR(2) ──
Lag           ACF       PACF
──────────────────────────────
1          0.5821     0.5821   ← ACF tails off
2          0.2184     0.3112   ← but PACF: large at 1 AND 2
3          0.0743     0.0091   ← PACF cuts off here ✓ (p=2)
4          0.0182    -0.0043
5         -0.0071     0.0028

── MA(2) ──
Lag           ACF       PACF
──────────────────────────────
1          0.4812     0.4812   ← ACF large at 1 AND 2
2          0.1894     0.0023   ← but PACF already near zero
3         -0.0142    -0.0301   ← ACF cuts off here ✓ (q=2)
4          0.0087     0.0041
5         -0.0056    -0.0019

# ─── 3. Plot ACF and PACF ─────────────────────────────────────────────────────
fig, axes = plt.subplots(3, 2, figsize=(12, 9))

for row, (name, s) in enumerate(series_dict.items()):
    plot_acf(s,  lags=20, ax=axes[row, 0],
             title=f'ACF — {name}', alpha=0.05)
    plot_pacf(s, lags=20, ax=axes[row, 1],
              title=f'PACF — {name}', alpha=0.05, method='ols')
    for ax in axes[row]:
        ax.set_facecolor('#0d1117')
        ax.spines['bottom'].set_color('#2a3050')
        ax.spines['left'].set_color('#2a3050')
        ax.tick_params(colors='#8892a4')

plt.tight_layout(pad=1.8)
plt.savefig('acf_pacf_all.png', dpi=120, facecolor='#0d1117')
plt.show()

# ─── 4. Ljung-Box white noise tests ──────────────────────────────────────────
print(f"{'Series':<14} {'Lag':>6} {'LB Stat':>10} {'p-value':>10} {'Verdict'}")
print("-" * 60)

for name, s in series_dict.items():
    lb = acorr_ljungbox(s, lags=[10, 20], return_df=True)
    for lag, row in lb.iterrows():
        verdict = "White Noise ✓" if row['lb_pvalue'] > 0.05 else "NOT white noise ✗"
        print(f"{name:<14} {lag:>6} {row['lb_stat']:>10.3f} {row['lb_pvalue']:>10.4f}  {verdict}")

OUTPUT

Series            Lag    LB Stat    p-value Verdict
────────────────────────────────────────────────────────────
White Noise        10      8.314     0.5977  White Noise ✓   ← p >> 0.05
White Noise        20     14.821     0.7886  White Noise ✓
AR(2)              10    198.442     0.0000  NOT white noise ✗   ← highly autocorrelated
AR(2)              20    312.551     0.0000  NOT white noise ✗
MA(2)              10     98.227     0.0000  NOT white noise ✗
MA(2)              20    142.336     0.0000  NOT white noise ✗

# ─── 5. Residual diagnostics: wrong vs. correct model order ──────────────────
# Fit AR(1) (the WRONG order) to the MA(2) data to see bad residuals,
# then fit MA(2) (the correct order) for comparison

# WRONG: ARIMA(1,0,0) on MA(2) data
wrong_model = ARIMA(ma2, order=(1, 0, 0)).fit()
wrong_resid  = wrong_model.resid

# CORRECT: ARIMA(0,0,2) = MA(2)
right_model = ARIMA(ma2, order=(0, 0, 2)).fit()
right_resid  = right_model.resid

for label, resid in [('WRONG AR(1) on MA(2)', wrong_resid),
                      ('CORRECT MA(2)',        right_resid)]:
    lb = acorr_ljungbox(resid, lags=[10, 20], return_df=True)
    print(f"\n{label}")
    print(f"  Lag 10:  Q={lb['lb_stat'][10]:.3f}  p={lb['lb_pvalue'][10]:.4f}  "
          f"{'White Noise ✓' if lb['lb_pvalue'][10] > 0.05 else 'NOT white noise ✗'}")
    print(f"  Lag 20:  Q={lb['lb_stat'][20]:.3f}  p={lb['lb_pvalue'][20]:.4f}  "
          f"{'White Noise ✓' if lb['lb_pvalue'][20] > 0.05 else 'NOT white noise ✗'}")

OUTPUT

WRONG AR(1) on MA(2)
  Lag 10:  Q=42.183  p=0.0000  NOT white noise ✗   ← model missed structure
  Lag 20:  Q=68.441  p=0.0000  NOT white noise ✗

CORRECT MA(2)
  Lag 10:  Q=7.824  p=0.6440  White Noise ✓   ← residuals are clean
  Lag 20:  Q=13.217  p=0.8672  White Noise ✓

Section 11

End-to-End Example — From Raw Series to Final Diagnosis

# ─── Real-world workflow: Airline Passengers Dataset ─────────────────────────
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
df = pd.read_csv(url, header=0, index_col=0, parse_dates=True)
series = df.squeeze()

# ─── Step 1: Stationarize: log + first diff ───────────────────────────────────
log_diff = np.log(series).diff().dropna()

# ─── Step 2: Compute ACF and PACF values ─────────────────────────────────────
n_lags = 24
acf_vals,  acf_ci  = acf(log_diff,  nlags=n_lags, alpha=0.05, fft=True)
pacf_vals, pacf_ci = pacf(log_diff, nlags=n_lags, alpha=0.05, method='ols')

# ─── Step 3: Identify significant lags ───────────────────────────────────────
conf_bound = 1.96 / np.sqrt(len(log_diff))   # approx 95% band width

sig_acf_lags  = [k for k in range(1, n_lags+1) if abs(acf_vals[k])  > conf_bound]
sig_pacf_lags = [k for k in range(1, n_lags+1) if abs(pacf_vals[k]) > conf_bound]

print(f"Significant ACF lags  : {sig_acf_lags}")
print(f"Significant PACF lags : {sig_pacf_lags}")
print(f"Confidence bound (±)  : {conf_bound:.4f}")

OUTPUT

Significant ACF lags  : [1, 12, 24]   ← seasonal spikes at 12, 24 → SARIMA!
Significant PACF lags : [1, 12]
Confidence bound (±)  : 0.1639

# ─── Step 4: Ljung-Box on the log-diff series ────────────────────────────────
lb_raw = acorr_ljungbox(log_diff, lags=[12, 24], return_df=True)
print("\nLjung-Box on log-diff series:")
print(lb_raw)

# ─── Step 5: Fit SARIMA(1,1,1)(1,1,1)[12] — guided by ACF/PACF ───────────────
from statsmodels.tsa.statespace.sarimax import SARIMAX

sarima = SARIMAX(np.log(series),
                  order=(1, 1, 1),
                  seasonal_order=(1, 1, 1, 12),
                  enforce_stationarity=False).fit(disp=False)

# ─── Step 6: Ljung-Box on SARIMA residuals ───────────────────────────────────
resid = sarima.resid[13:]   # skip initialisation period
lb_resid = acorr_ljungbox(resid, lags=[12, 24], return_df=True)
print("\nLjung-Box on SARIMA residuals:")
print(lb_resid)
print(f"\nModel AIC: {sarima.aic:.2f}")

OUTPUT

Ljung-Box on log-diff series:
    lb_stat  lb_pvalue
12   38.812     0.0001   ← NOT white noise (seasonal structure present)
24   51.447     0.0006

Ljung-Box on SARIMA residuals:
    lb_stat  lb_pvalue
12    7.234     0.7122   ← White Noise ✓ (SARIMA captured everything)
24   11.891     0.9146   ← White Noise ✓

Model AIC: -302.14

Section 12

Complete Reference — ACF, PACF & Ljung-Box Summary

Concept	Formula	Range	Statsmodels Function	Key Interpretation
Autocovariance γ(k)	Cov(yₜ, yₜ₋ₖ)	(−∞, ∞)	`acovf(series)`	Raw joint variability; unit-dependent
ACF ρ(k)	γ(k) / γ(0)	[−1, 1]	`acf(series, nlags=20)`	Total correlation at lag k (direct + indirect)
PACF φₖₖ	Corr(yₜ, yₜ₋ₖ | yₜ₋₁, …, yₜ₋ₖ₊₁)	[−1, 1]	`pacf(series, nlags=20)`	Direct-only correlation at lag k
Confidence band	±1.96 / √n	Depends on n	Shown by default in plot_acf/pacf	95% threshold for significance
Ljung-Box Q	n(n+2)Σρ̂²(k)/(n−k)	[0, ∞)	`acorr_ljungbox(resid, lags=[10,20])`	p > 0.05 → white noise; p < 0.05 → structure remains
AR(p) signature	φₖₖ = 0 for k > p	—	PACF plot	PACF cuts off at lag p; ACF tails off
MA(q) signature	ρ(k) = 0 for k > q	—	ACF plot	ACF cuts off at lag q; PACF tails off
ARMA(p,q) signature	Both tail off	—	Both ACF and PACF	No hard cutoff in either plot

Section 13

Golden Rules — ACF, PACF & Ljung-Box in Practice

📈 Autocorrelation — Rules You Must Never Break

Always confirm stationarity before reading ACF/PACF. A non-stationary series produces an ACF that decays very slowly, roughly linearly, and stays outside the confidence band for many lags. It looks like strong autocorrelation at every lag, but this reflects the trend or unit root rather than short-run dependence that AR or MA terms can model. Run ADF first. Difference if needed. Then plot.

Read PACF for AR order, ACF for MA order — never swap them. PACF isolates direct effects: the last significant PACF lag = p for AR(p). ACF includes indirect effects: the last significant ACF lag = q for MA(q). When both tail off, you need ARMA — use AIC/BIC grid search.

Look for seasonal spikes in ACF at multiples of s. Significant ACF bars at lags 12, 24, 36 for monthly data signal a yearly seasonal component that a non-seasonal ARIMA cannot capture. Switch to SARIMA. Seasonal spikes in the residual ACF after fitting ARIMA are one of the clearest signs of model mis-specification.

Always run Ljung-Box after fitting any model. Visual residual inspection is insufficient — individual bars near the band boundary are ambiguous. Ljung-Box jointly tests all lags 1 through m with a single p-value. For residuals from ARMA(p,q), the Q statistic follows χ²(m − p − q) — account for the model's degrees of freedom.

Use at most n/4 lags in the Ljung-Box test. Using too many lags (e.g., m = 50 on n = 100) weakens the test: high-lag autocorrelations are estimated from few observations, and a real signal at a few low lags is diluted across many uninformative ones. Practical defaults: lags = [10, 20] for short series; [10, 20, 30] for longer series.

Never interpret lag-0 in ACF. The autocorrelation at lag 0 is always exactly 1.0 by definition — a series is perfectly correlated with itself. Statsmodels shows it but it carries zero information. All analysis starts at lag 1.

White noise in residuals is the goal, not the input. You never want your raw series to be white noise (that means it is unpredictable and no model helps). You want your model residuals to be white noise — meaning all predictable structure has been captured. Ljung-Box should fail to reject on residuals and succeed at rejecting on the original series.