Data Structure 📂 Sorting & Order Statistics · 5 of 10 31 min read

Radix Sort Explained: Hands-On Animation, Benefits, Use Cases & Python Code

A hands-on, animated walkthrough of Radix Sort. Watch each line of Python execute live as numbers fall into digit buckets, then explore why it beats comparison sorts, where to use it (and where not), its O(n·k) complexity, LSD vs MSD, and battle-tested golden rules.

Section 01

The Story That Explains Radix Sort

The Old Post Office Sorting Wall
Picture a 1950s mail room. A clerk faces 5,000 letters that must be ordered by a 5-digit postcode. He does not compare letters two at a time — that would take forever. Instead the wall behind him has 10 pigeonholes labelled 0–9.

He looks only at the last digit of every postcode and drops each letter into the matching pigeonhole. Then he scoops the holes back up in order 0→9, and repeats — this time on the second-to-last digit, then the third, and so on. After five quick passes, with zero comparisons between any two postcodes, the entire stack is perfectly sorted.

That clerk just performed Radix Sort.

Radix Sort is a non-comparative integer sorting algorithm. Instead of asking “is A bigger than B?”, it sorts numbers digit by digit, using a stable bucketing pass for each digit position. Because it never compares whole keys, it sidesteps the famous O(n log n) lower bound that limits comparison sorts.

📝
The Core Insight

If keys are built from a small fixed alphabet of symbols (digits 0–9, or bytes 0–255), you can sort them by repeatedly bucketing on one symbol position at a time. Do it from the least significant digit (LSD) to the most significant, keep each pass stable, and the array falls into perfect order.


Section 02

The Two Ingredients of Radix Sort

Radix Sort is really just two simple ideas working together. Neither alone sorts anything — together they are unstoppable.

💷
Ingredient 1 — Digit Extraction
(num // place) % 10
For each number, peel off one digit at a chosen place value. place = 1 gives the ones digit, place = 10 the tens, place = 100 the hundreds. Integer division then modulo — that is the whole trick.
🧮
Ingredient 2 — A Stable Bucket Pass
Counting / Bucket Sort
Drop every number into one of 10 buckets by its current digit, then read the buckets back 0→9. Because items keep their relative order, the pass is stable — the secret that lets earlier passes survive later ones.
🔄
Repeat — LSD to MSD
d passes total
Run one stable pass per digit, moving from least to most significant. After the pass for the highest digit of the largest number, the array is fully sorted. No comparisons. Ever.
🔑
Why Stability Is Non-Negotiable

When the tens-digit pass groups numbers, ties (same tens digit) must keep the order the ones pass already gave them. An unstable bucket pass would scramble that earlier work and the final result would be wrong. Stability is what chains the passes together.


Section 03

How It Works — Step by Step

01
Find the Largest Number
Its digit count tells you how many passes you need. Largest = 839 → 3 digits → 3 passes.
02
Bucket by the Current Digit
Create 10 empty buckets. For every number, compute its digit at the current place and append it to that bucket.
03
Collect Buckets 0→9
Read the buckets back into the array in label order. Equal digits keep their existing order — the pass stays stable.
04
Move to the Next Digit
Multiply place by 10 (ones → tens → hundreds) and repeat steps 2–3.
05
Stop When Digits Run Out
Once the most significant digit of the largest value has been processed, the array is fully sorted.

Section 04

Hands-On — Watch Each Line Do Its Job

This is the heart of the tutorial. Press Step (or Auto) and watch: the highlighted code line is the one currently executing, the caption explains exactly what that line is doing, and the array & buckets animate the effect of that line in real time.

⚡ Radix Sort — Live Line Tracer arr = [329, 457, 657, 839, 436, 720, 355]
1def radix_sort(arr):
2 max_val = max(arr)
3 place = 1
4 while max_val // place > 0:
5 buckets = [[] for _ in range(10)]
6 for num in arr:
7 digit = (num // place) % 10
8 buckets[digit].append(num)
9 arr = [n for b in buckets for n in b]
10 place *= 10
11 return arr
place = 1 (ones)
Working Array
Buckets 0 – 9
Press Step or Auto to begin the trace.
0 / 0
🎯
What to Notice

Watch line 7 extract the digit, line 8 drop the number into a green bucket, and line 9 scoop them all back in order. Two numbers with the same current digit always land in the same bucket in the order they arrived — that is stability happening before your eyes.


Section 05

The Same Run, as a Table

Here is exactly what the animation does, frozen into three passes for [329, 457, 657, 839, 436, 720, 355].

PassSort OnArray After This Pass
Start329, 457, 657, 839, 436, 720, 355
Pass 1Ones digit720, 355, 436, 457, 657, 329, 839
Pass 2Tens digit720, 329, 436, 839, 355, 457, 657
Pass 3Hundreds digit329, 355, 436, 457, 657, 720, 839
👀
Read It Closely

After Pass 1 the numbers are ordered by their last digit only. After Pass 2 they are ordered by the last two digits. After Pass 3, by all three — which is the full order. Each pass refines the previous one because every pass is stable.


Section 06

Full Python Implementation

LSD Radix Sort (the version animated above)

def radix_sort(arr):
    # Edge case: empty list is already sorted
    if not arr:
        return arr

    # 1. Largest value decides how many digit-passes we need
    max_val = max(arr)
    place   = 1                       # start at the ones digit

    # 2. One stable pass per digit, LSD -> MSD
    while max_val // place > 0:
        buckets = [[] for _ in range(10)]   # 10 empty buckets, digits 0-9

        # 3. Distribute every number by its current digit
        for num in arr:
            digit = (num // place) % 10     # peel off one digit
            buckets[digit].append(num)      # append keeps it stable

        # 4. Collect buckets 0->9 back into the array
        arr = [num for bucket in buckets for num in bucket]

        place *= 10                     # advance to the next digit

    return arr


# ── Demo ──────────────────────────────────────────────
data = [329, 457, 657, 839, 436, 720, 355]
print("Before:", data)
print("After: ", radix_sort(data))
OUTPUT
Before: [329, 457, 657, 839, 436, 720, 355] After: [329, 355, 436, 457, 657, 720, 839]

Handling Negative Numbers (a common gotcha)

Plain radix sort only works on non-negative integers. The simplest robust fix: split the array, sort the absolute values of the negatives, reverse and negate them, then concatenate.

def radix_sort_signed(arr):
    neg = [-x for x in arr if x < 0]   # magnitudes of negatives
    pos = [x  for x in arr if x >= 0]

    neg = radix_sort(neg)               # sort magnitudes ascending
    pos = radix_sort(pos)

    # negatives go first, reversed and re-negated
    return [-x for x in reversed(neg)] + pos
OUTPUT
radix_sort_signed([3, -1, -7, 4, 0, -2]) -> [-7, -2, -1, 0, 3, 4]

Section 07

LSD vs MSD — Two Flavours

➡️ LSD (Least Significant Digit)
PropertyBehaviour
DirectionOnes → highest digit
PassesAlways d (fixed)
StabilityStable, simple
Best forFixed-width integers
RecursionNone — flat loop
⬅️ MSD (Most Significant Digit)
PropertyBehaviour
DirectionHighest digit → ones
PassesCan stop early per group
StabilityNeeds care to stay stable
Best forStrings, variable length
RecursionRecursive on sub-buckets

The animated version above is LSD — the most common and easiest to reason about. MSD shines for sorting strings (like a dictionary), because it can finish ordering a group as soon as a prefix is decided, without touching the rest of every word.


Section 08

Complexity & The Cost of Speed

📐
The Numbers

With n items, d digits per key, and a radix (base) of k buckets, each pass is O(n + k) and there are d passes:

Time = O(d · (n + k))
Space = O(n + k)

When d and k are small constants, this collapses to a near-linear O(n) — faster than any comparison sort.

AlgorithmBestAverageWorstSpaceStable
Radix SortO(n·k)O(n·k)O(n·k)O(n+k)Yes
Quick SortO(n log n)O(n log n)O(n²)O(log n)No
Merge SortO(n log n)O(n log n)O(n log n)O(n)Yes
Heap SortO(n log n)O(n log n)O(n log n)O(1)No
⚠️
The Catch — It's Not Magic

That O(n) hides a real cost. For huge keys, d grows; if d ≈ log n, radix sort is no better than quicksort. It also needs extra memory for buckets and works cleanly only on fixed-length, discrete keys (integers, fixed strings). It is a specialist, not a general-purpose replacement.


Section 09

When To Use Radix Sort (and When Not To)

Fixed-Length Numeric Keys
Postcodes, phone numbers, IP addresses, account IDs, timestamps — uniform-width integers are radix sort's home turf.
zip codes, phone numbers, IPs
Massive Integer Datasets
When n is in the millions and keys fit in 32/64-bit words, the linear O(n·k) beats comparison sorts and parallelises beautifully.
databases, big-data pipelines
Graphics & GPUs
Depth/z-buffer ordering and particle sorting use radix sort because it is non-branching and maps superbly onto GPU parallel hardware.
z-buffer, particle systems
Variable-Length / Mixed Keys
Floats, arbitrary-length strings, or custom comparison logic don't map to fixed digit positions cleanly. Reach for comparison sorts.
floats, free-text, custom order
Tiny or Memory-Tight Inputs
For small n the bucket overhead and extra O(n+k) memory aren't worth it. An in-place quicksort or insertion sort wins.
small arrays, embedded RAM limits
Huge Key Range, Few Items
Sorting 10 numbers that span 0 to a billion means many passes for almost nothing. d dominates and the linear promise evaporates.
sparse, wide-range keys

Section 10

Benefits at a Glance

⭐ Why Engineers Reach For Radix Sort
Speed
Linear time O(n·k) on fixed-width keys — beats the O(n log n) comparison-sort barrier because it never compares whole keys.
Stable
Preserves the relative order of equal keys — essential for multi-field sorts (e.g. sort by day, then month, then year).
No Compare
Works purely on digit values, so it is branch-light and ideal for SIMD and GPU acceleration.
Parallel
Each digit is independent, so passes and buckets parallelise cleanly across cores or shaders.
Predictable
No worst-case O(n²) blowup like quicksort — runtime depends only on n and digit count, not on data order.

Section 11

Golden Rules

🧮 Radix Sort — Non-Negotiable Rules
1
The inner pass must be stable. Always append to buckets and read them back in order. Swap in an unstable subroutine and every earlier pass is destroyed.
2
Pass count = digits of the largest value. Compute it from max(arr); never hard-code the number of passes.
3
Handle negatives explicitly. Vanilla radix sort assumes non-negative integers. Split, sort magnitudes, then recombine — or bias all values into a non-negative range first.
4
Pick the radix for your hardware. Base 256 (one byte per pass) is the sweet spot for 32/64-bit integers: 4–8 passes total and cache-friendly bucket sizes.
5
Don't force it on floats or free-text. Use it for fixed-width discrete keys. For everything else, a good comparison sort is simpler and just as fast.
6
Mind the memory. Buckets cost O(n + k) extra space. On memory-constrained systems, account for it before choosing radix sort over an in-place algorithm.