5 Types of Recommendation Systems

Section 01

The Five Families of Recommendation Systems

📖 Real-World Analogy

Five Advisors, One Goal

Imagine you walk into a bookshop and five different advisors greet you, each with a different method for helping you find your next read.

The Content-Based advisor picks up your last book, studies its genre, themes, and writing style — then finds novels with the same fingerprint. The Collaborative advisor ignores the book entirely and asks: "Customers who loved exactly what you loved — what did they read next?" The Hybrid advisor uses both approaches at once, playing them off each other. The Knowledge-Based advisor sits you down and asks structured questions: "Do you prefer fast or slow pacing? Historical or contemporary?" And the Session-Based advisor watches you browse for ten minutes and whispers recommendations based entirely on what you just picked up and put down — no history needed, just now.

These are your five families. Each one solves a different problem. Each one has a domain where it dominates. Large production systems, such as those at Netflix, Amazon, and Spotify, typically combine several of them rather than relying on one.

📐 The Five Families of Recommendation Systems — At a Glance

All five families feed into a unified recommendation engine. Production systems blend signals from multiple families to generate a single ranked output for each user.

Family	Core Signal	Cold Start?	Best For	Used By
Content-Based	Item features (genre, tags, metadata)	Handles item cold start	News, articles, niche catalogues	Pandora, Medium
Collaborative	User-item interaction history	Suffers badly	Movies, music, e-commerce	Netflix, Amazon
Hybrid	Interactions + features + context	Mitigated	Large-scale platforms	YouTube, Spotify
Knowledge-Based	Explicit user constraints & requirements	No cold start at all	High-stakes, infrequent purchases	Finance, real estate, travel
Session-Based	Current session clicks / events only	Designed for it	E-commerce, anonymous users	Zalando, Booking.com

Section 02

Content-Based Filtering

📖 Story

Pandora's Music Genome Project

In 2000, a team of musicologists at Pandora spent months manually annotating every song in their library across 450 distinct musical attributes — tempo, key, vocal style, instrumentation, harmonic complexity, lyrical themes. They called it the Music Genome Project.

When you typed in "Radiohead", Pandora didn't ask who else liked Radiohead. It dissected Radiohead's DNA: minor key, complex rhythmic structure, abstract lyrics, layered electric guitar, falsetto vocals. Then it found every song in the library with a similar genome — and played them for you.

That is pure content-based filtering. The system never consulted another user. It understood the item deeply and matched items to items.

Content-Based Filtering recommends items similar to those a user has interacted with positively in the past — based purely on the features of the items themselves. It builds a profile of the user's taste from item attributes, then finds items whose feature vectors are closest to that profile.

⚙️ How Content-Based Filtering Works — Step by Step

Step 1

Feature extraction: Represent each item as a feature vector. For movies: genre, director, cast, release year, runtime, plot keywords (via TF-IDF). For music: tempo, energy, danceability. For articles: TF-IDF word vectors.

Step 2

User profile construction: Aggregate feature vectors of items the user has liked. The user profile is a weighted average of positively-rated item vectors — capturing their taste fingerprint.

Step 3

Similarity computation: Compute cosine similarity (or another distance metric) between the user profile vector and every candidate item vector in the catalogue.

Step 4

Ranking & filtering: Sort all items by similarity score descending, remove items already interacted with, and return the top-k recommendations.
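The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production design: the feature columns, item names, and ratings are invented, and a rating-weighted average is only one reasonable way to build the profile.

```python
import numpy as np

# Step 1 (assumed done): hypothetical item features [sci-fi, thriller, drama, war]
item_names = ["A", "B", "C", "D", "E"]
item_features = np.array([
    [1.0, 1.0, 0.0, 0.0],  # A
    [1.0, 0.0, 0.0, 0.0],  # B
    [1.0, 1.0, 0.2, 0.0],  # C
    [0.0, 0.0, 1.0, 1.0],  # D
    [0.0, 1.0, 1.0, 0.0],  # E
])

liked = {"A": 5.0, "B": 4.0}  # item -> rating (made-up data)

def build_profile(liked, names, features):
    # Step 2: rating-weighted average of the liked items' feature vectors
    idx = [names.index(n) for n in liked]
    weights = np.array(list(liked.values()))
    return weights @ features[idx] / weights.sum()

def cosine_to_all(profile, features):
    # Step 3: cosine similarity between the profile and every item
    norms = np.linalg.norm(features, axis=1) * np.linalg.norm(profile)
    return features @ profile / norms

def recommend(liked, names, features, k=2):
    # Step 4: sort by similarity, drop items already seen, keep the top-k
    sims = cosine_to_all(build_profile(liked, names, features), features)
    order = np.argsort(-sims)
    return [(names[i], float(sims[i])) for i in order if names[i] not in liked][:k]

recs = recommend(liked, item_names, item_features)
```

With this toy data the profile leans towards sci-fi thrillers, so item C ranks first among the unseen items and the war drama D ranks last.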

📐 Content-Based Filtering — Feature Space Diagram

The user profile is an aggregated feature vector of liked items. Items are ranked by cosine similarity to that profile. Item C scores 0.94 — nearly identical feature fingerprint — and gets recommended.

TF-IDF for Text-Based Content Filtering

For text-heavy items (articles, job listings, product descriptions), the most common feature representation is TF-IDF (Term Frequency – Inverse Document Frequency). It weights words by how distinctive they are to a document relative to the whole corpus — suppressing common words like "the" and amplifying rare, meaningful ones.

Term Frequency

TF(t,d) = count(t in d) / len(d)

How often term t appears in document d, normalised by document length.

Inverse Document Frequency

IDF(t) = log(N / df(t))

Penalises terms that appear in many documents. Rare terms get high IDF — they are more informative.

TF-IDF Score

TF-IDF(t,d) = TF(t,d) × IDF(t)

Combined weight: high when term is frequent in this document but rare across the corpus.

Cosine Similarity

sim(A,B) = (A·B) / (‖A‖ × ‖B‖)

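Measures the angle between two feature vectors. In general it ranges from −1 (opposite direction) through 0 (orthogonal / unrelated) to 1 (identical direction); for non-negative vectors such as TF-IDF weights it stays between 0 and 1.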
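To make the four formulas concrete, here is a small from-scratch sketch that follows them literally, using the natural logarithm for IDF. The three toy documents are invented. Note that scikit-learn's TfidfVectorizer uses a smoothed IDF and L2-normalises each row by default, so its numbers will differ from this textbook version.

```python
import math

# Three tiny, made-up "documents", already tokenised
docs = [
    "space epic space travel".split(),
    "heist thriller dream".split(),
    "space thriller".split(),
]

def tf(term, doc):
    # TF(t, d) = count(t in d) / len(d)
    return doc.count(term) / len(doc)

def idf(term, docs):
    # IDF(t) = log(N / df(t)); only call this for terms that occur somewhere
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)

def tfidf_vector(doc, docs, vocab):
    # One TF-IDF weight per vocabulary term
    return [tf(t, doc) * idf(t, docs) for t in vocab]

def cosine(a, b):
    # sim(A, B) = (A . B) / (|A| * |B|), defined as 0 for an all-zero vector
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

vocab = sorted({t for d in docs for t in d})
vectors = [tfidf_vector(d, docs, vocab) for d in docs]
```

Documents 0 and 2 share the term "space" and therefore have a positive similarity, while documents 0 and 1 share nothing and score exactly 0.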

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# ── Sample movie dataset with text descriptions ──────────────
movies = pd.DataFrame({
    'title': [
        'Inception', 'Interstellar', 'Dune', 'The Matrix',
        'Arrival', 'Parasite', '1917', 'Tenet'
    ],
    'soup': [
        'sci-fi thriller dream heist nolan subconscious layered',
        'sci-fi space epic nolan wormhole relativity emotional',
        'sci-fi epic desert chosen-one villeneuve political',
        'sci-fi action simulation reality hacker rebellion',
        'sci-fi alien language time thriller emotional villeneuve',
        'thriller drama class-divide korean dark comedy',
        'war drama one-shot wwii emotional sacrifice',
        'sci-fi action espionage time-reversal nolan thriller',
    ]
})

# ── Build TF-IDF matrix ───────────────────────────────────────
tfidf   = TfidfVectorizer(stop_words='english')
tfidf_m = tfidf.fit_transform(movies['soup'])

# ── Pairwise cosine similarity ────────────────────────────────
cos_sim = cosine_similarity(tfidf_m, tfidf_m)

# ── Recommend function ────────────────────────────────────────
def recommend_cb(title, top_n=3):
    idx    = movies[movies['title'] == title].index[0]
    scores = list(enumerate(cos_sim[idx]))
    scores = sorted(scores, key=lambda x: -x[1])
    top    = [movies['title'].iloc[i] for i, _ in scores[1:top_n+1]]
    return top

print("Because you watched Inception:")
print(recommend_cb('Inception'))

print("\nBecause you watched Arrival:")
print(recommend_cb('Arrival'))

OUTPUT

Because you watched Inception:
['Tenet', 'Interstellar', 'Arrival']

Because you watched Arrival:
['Tenet', 'Dune', 'Interstellar']

⚠️

The Over-Specialisation Trap

Content-based systems have a subtle failure mode: over-specialisation. If you only ever watch Nolan films, you will only ever be recommended Nolan films — you'll never discover Villeneuve or Tarkovsky. The system optimises for similarity, not surprise. This is why serendipity mechanisms (random exploration, diversity penalties) must be deliberately injected into pure content-based systems.
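One way to apply a diversity penalty is Maximal Marginal Relevance (MMR), a greedy re-ranking technique that is not part of the pipeline above but is often bolted onto it. The sketch below assumes you already have a relevance score per item and an item-item similarity matrix; the numbers and the `lam` trade-off parameter are illustrative.

```python
import numpy as np

def mmr_rerank(relevance, item_sim, k, lam=0.7):
    """Greedy MMR: trade relevance against similarity to items already picked."""
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            # Redundancy = similarity to the closest item chosen so far
            redundancy = max((item_sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical data: items 0 and 1 are near-duplicates, item 2 is different
relevance = np.array([0.90, 0.88, 0.60])
item_sim = np.array([
    [1.00, 0.95, 0.10],
    [0.95, 1.00, 0.15],
    [0.10, 0.15, 1.00],
])
picked = mmr_rerank(relevance, item_sim, k=2, lam=0.5)
```

With `lam=0.5` the second slot goes to item 2 instead of the near-duplicate item 1; setting `lam=1.0` switches the penalty off and recovers a pure relevance ranking.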

Aspect	Content-Based Filtering
Strengths	No cold start for items; transparent ("because you liked X"); works for unique users; no need for other users' data
Weaknesses	Over-specialisation; requires rich item metadata; user cold start still applies; cannot leverage crowd wisdom
Best domains	News articles, music (audio features), job listings, real estate
Key algorithms	TF-IDF + Cosine Similarity, BM25, Word2Vec / Sentence Transformers, k-NN in feature space

Section 03

Collaborative Filtering

📖 Story

The Village Square

Before the internet, word-of-mouth was the recommendation engine of civilisation. You asked your neighbour — someone with similar taste — what book they'd just read. The insight was never about the book itself. It was about who else loved what you love — and what they chose next.

Collaborative filtering is that village square, scaled to 300 million people. It asks: "Of all the users in this system, who behaves most like me? What have they interacted with that I haven't?" The item doesn't need to be described. The crowd's collective wisdom does the work.

Collaborative Filtering (CF) predicts a user's preferences by collecting and analysing interaction data from many users — collaborating to filter the item space. It operates entirely on the user-item interaction matrix, requiring no knowledge of item content or user demographics.

Two Variants: Memory-Based vs Model-Based

👥

User-User CF

Memory-Based

Find users most similar to the target user (nearest neighbours). Recommend items those neighbours loved that the target hasn't seen. Intuitive but slow at scale — must compare every user to every other user.

📦

Item-Item CF

Memory-Based

Find items most similar to items the target user liked (based on co-interaction patterns). Popularised by Amazon, which filed its item-to-item collaborative filtering patent in 1998. More stable than user-user CF because item similarity changes less frequently than user similarity.

🔢

Matrix Factorisation

Model-Based

Decompose the interaction matrix into low-rank user and item latent factor matrices, using algorithms such as SVD-style factorisation, ALS, or NMF. The model learns hidden features (the "action-lover" or "intellectual" dimensions) without them being explicitly defined. Matrix factorisation was a core component of the Netflix Prize-winning ensemble.
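As a rough illustration of the memory-based variants, the sketch below implements item-item CF with cosine similarity over a tiny binary interaction matrix. The matrix and item labels are invented, and real systems would use sparse matrices and keep only the top neighbours per item.

```python
import numpy as np

# Hypothetical binary interaction matrix: rows = users, columns = items
items = ["A", "B", "C", "D"]
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

def item_similarity(R):
    """Cosine similarity between item columns, based only on who interacted."""
    norms = np.linalg.norm(R, axis=0)
    sim = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)  # an item should not recommend itself
    return sim

def recommend_item_item(user_row, sim, k=2):
    scores = sim @ user_row          # total similarity to everything the user touched
    scores[user_row > 0] = -np.inf   # hide items the user has already seen
    order = np.argsort(-scores)[:k]
    return [items[i] for i in order]

sim = item_similarity(R)
recs = recommend_item_item(R[0], sim)
```

User 0 has interacted with A and B, and item C co-occurs with both, so it ranks first. No item description was needed at any point, only the interaction pattern.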

📐 Matrix Factorisation — Decomposing the Interaction Matrix

Green highlighted cells are the predicted ratings for previously missing interactions. The top-k unseen items with highest predicted scores become recommendations.

import pandas as pd
from surprise import SVD, Dataset, Reader
from surprise.model_selection import cross_validate

# ── Build a synthetic ratings dataset ────────────────────────
ratings_dict = {
    'userID':  ['Ali','Ali','Ali','Ben','Ben','Ben','Cara','Cara','Dev','Dev'],
    'itemID':  ['A','B','C','A','B','D','B','C','A','C'],
    'rating':  [5,4,3,4,5,2,3,5,5,4],
}
df     = pd.DataFrame(ratings_dict)
reader = Reader(rating_scale=(1, 5))
data   = Dataset.load_from_df(df[['userID', 'itemID', 'rating']], reader)

# ── SVD Matrix Factorisation ──────────────────────────────────
svd = SVD(n_factors=20, n_epochs=30, lr_all=0.005, reg_all=0.02)

# ── 5-fold cross-validation ───────────────────────────────────
results = cross_validate(svd, data, measures=['RMSE', 'MAE'], cv=5, verbose=False)
print(f"SVD RMSE: {results['test_rmse'].mean():.4f}")
print(f"SVD MAE : {results['test_mae'].mean():.4f}")

# ── Predict a missing rating: Ali on item D ───────────────────
trainset = data.build_full_trainset()
svd.fit(trainset)
pred = svd.predict('Ali', 'D')
print(f"\nPredicted rating — Ali on Item D: {pred.est:.2f}")

OUTPUT

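SVD RMSE: 0.9842
SVD MAE : 0.7631

Predicted rating — Ali on Item D: 2.14

(Illustrative values. No random seed is set, so the fold split and factor initialisation, and therefore the numbers, change from run to run.)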

🧮

Why SVD Works — The Latent Factor Intuition

SVD learns hidden dimensions, called latent factors, that explain the patterns in ratings. These factors are never labelled, but they might correspond to concepts like "prefers cerebral sci-fi" or "likes dark psychological thrillers". A user is represented as a vector in this latent space; an item is another vector. The predicted rating rises with the dot product of the two vectors, which is large when they point in similar directions. The Surprise SVD model used above also adds a global mean and per-user and per-item bias terms to that dot product.
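The latent factor idea can be shown with a bare-bones factorisation trained by stochastic gradient descent. This is a simplified sketch under stated assumptions: it omits the bias terms, the ratings are made up, and the hyperparameters (`k`, `lr`, `reg`, `epochs`) are arbitrary choices rather than tuned values.

```python
import numpy as np

def factorise(ratings, n_users, n_items, k=2, epochs=200, lr=0.05, reg=0.02, seed=0):
    """Fit user factors P and item factors Q so that P[u] @ Q[i] approximates each rating."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error on one known rating
            p_old = P[u].copy()            # keep the old user vector for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    return float(np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings])))

# (user index, item index, rating): a tiny made-up sample
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 1, 5), (1, 2, 2), (2, 1, 3), (2, 2, 5)]
P, Q = factorise(ratings, n_users=3, n_items=3)

train_rmse = rmse(ratings, P, Q)
pred_missing = float(P[0] @ Q[2])  # user 0 never rated item 2
```

The prediction for the missing cell is just the dot product of a user row and an item row, which is the mechanism described above. On seven ratings the fit says nothing about generalisation; it only shows the mechanics.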

Method	Complexity	Scalability	Accuracy	Cold Start
User-User CF	O(U² × I)	Poor — O(n²) users	Good for small sets	Severe
Item-Item CF	O(I² × U)	Better — item count stable	Very Good	Severe
SVD / ALS	O(k × U × I)	Excellent	Excellent	Severe
Neural CF (NCF)	Flexible	Excellent with GPU	State-of-the-art	Partial mitigation

Section 04

Hybrid Recommendation Systems

📖 Story

The Orchestra Conductor

A great orchestra conductor doesn't choose between the strings or the brass — they blend them. The strings carry the melody's warmth; the brass provides power and structure; the percussion sets the rhythm. Individually, each section is compelling. Together, they achieve something none could alone.

Hybrid recommender systems are conductors. Content-based filtering provides item-level understanding. Collaborative filtering provides social wisdom. Context models provide situational awareness. The hybrid engine learns how much weight to give each voice and produces recommendations none could generate independently. Large platforms such as Netflix are reported to blend many individual models in this way.

Three Hybridisation Strategies

⚖️

Weighted Hybrid

Simplest approach

Combine scores from multiple models linearly: score = α × CF_score + β × CB_score. Weights (α, β) are tuned on validation data. Easy to implement and interpret. Used by most production systems as a baseline blending strategy.

🔀

Switching Hybrid

Context-dependent

Route to different models based on conditions. New user? Use content-based. Established user? Use collaborative. New item? Content-based wins. A common pattern is to switch on how much history a user has accumulated.

🔗

Feature Augmentation

Deep integration

Use one model's output as features for another. CF latent vectors become input features for a content-based neural network. The two approaches are not blended — they are nested. More complex but often more accurate.
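Of the strategies above, the switching hybrid is the easiest to sketch. The example below assumes each model has already produced a score per item; the `min_history` threshold of 5 interactions and the score values are hypothetical.

```python
def switching_recommend(n_user_interactions, cf_scores, cb_scores, min_history=5):
    """Pick one model's scores per request instead of blending them."""
    if n_user_interactions < min_history:
        # Too little history for collaborative filtering: fall back to content
        chosen, source = cb_scores, "content-based"
    else:
        chosen, source = cf_scores, "collaborative"
    ranked = sorted(chosen, key=chosen.get, reverse=True)
    return source, ranked

# Made-up scores from the two underlying models
cf = {"Dune": 0.91, "Arrival": 0.72, "Parasite": 0.44}
cb = {"Dune": 0.78, "Arrival": 0.61, "Parasite": 0.82}

new_user = switching_recommend(1, cf, cb)    # routed to content-based
regular = switching_recommend(40, cf, cb)    # routed to collaborative
```

The same user can therefore see a different ranking before and after crossing the threshold, which is worth monitoring if you adopt this pattern.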

📐 Production Hybrid Architecture — Weighted Score Fusion

Production systems blend three or more signals through a learned weight combiner, then apply a re-ranker to enforce diversity and business rules (e.g. don't recommend the same genre three times in a row).
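The re-ranking step can be as simple as a greedy pass over the scored list. The sketch below enforces the "no more than two of the same genre in a row" rule mentioned above; the genre labels and the input order are assumptions made for the example.

```python
def rerank_max_run(ranked, genre_of, max_run=2):
    """Greedy re-rank: never place more than `max_run` items of one genre in a row."""
    remaining = list(ranked)  # assumed already sorted by score, best first
    output = []
    while remaining:
        tail = [genre_of[x] for x in output[-max_run:]]
        # A genre is blocked only when the last `max_run` picks all share it
        blocked = tail[0] if len(tail) == max_run and len(set(tail)) == 1 else None
        # Take the best-scoring item that is not blocked; if all are blocked, relax the rule
        pick = next((x for x in remaining if genre_of[x] != blocked), remaining[0])
        output.append(pick)
        remaining.remove(pick)
    return output

ranked = ["Dune", "Inception", "Arrival", "Interstellar", "1917"]
genre_of = {"Dune": "sci-fi", "Inception": "sci-fi", "Arrival": "sci-fi",
            "Interstellar": "sci-fi", "1917": "war"}
reranked = rerank_max_run(ranked, genre_of)
```

Here the war film is pulled up into third place to break the run of sci-fi titles, at a small cost in raw score order.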

import numpy as np

# ── Simulated scores from three separate models ───────────────
# Each array: scores for 8 candidate items for a specific user
cf_scores   = np.array([0.91, 0.44, 0.72, 0.88, 0.31, 0.65, 0.52, 0.79])
cb_scores   = np.array([0.78, 0.82, 0.61, 0.55, 0.90, 0.47, 0.69, 0.33])
ctx_scores  = np.array([0.60, 0.70, 0.85, 0.45, 0.55, 0.90, 0.40, 0.75])

items = ['Dune', 'Parasite', 'Arrival', 'Inception',
         'Midsommar', '1917', 'Annihilation', 'Interstellar']

# ── Weighted hybrid fusion ────────────────────────────────────
alpha, beta, gamma = 0.55, 0.30, 0.15
hybrid_scores = alpha * cf_scores + beta * cb_scores + gamma * ctx_scores

# ── Rank by hybrid score ──────────────────────────────────────
ranked = sorted(zip(items, hybrid_scores), key=lambda x: -x[1])

print("Hybrid Recommendations (Top-5):")
for rank, (item, score) in enumerate(ranked[:5], 1):
    print(f"  {rank}. {item:15s}  score={score:.2f}")

OUTPUT

Hybrid Recommendations (Top-5):
  1. Dune             score=0.82
  2. Inception        score=0.72
  3. Arrival          score=0.71
  4. Interstellar     score=0.65
  5. 1917             score=0.63

Section 05

Knowledge-Based Recommendation Systems

📖 Story

The Wealth Manager Who Asks Before They Advise

You walk into a private bank wanting to invest £500,000. The advisor doesn't glance at what other investors with your salary bought — this is too important and too personal for that. Instead, they sit down and ask structured questions:

"What is your investment horizon? How would you feel if this dropped 30%? Do you need liquidity within 2 years? Do you have ethical restrictions on certain sectors?"

Only after building a complete picture of your constraints and requirements do they make a recommendation — one that is not "popular" or "similar to what you've bought before" but specifically correct for your situation.

This is knowledge-based recommendation. Not data-driven in the interaction sense, but constraint-driven — powered by domain knowledge and explicit user requirements. It dominates wherever the stakes are high and interaction data is sparse.

Knowledge-Based Systems recommend items using explicit knowledge about users' requirements, preferences, and item attributes — combined with domain-expert knowledge about what makes items suitable for certain needs. They do not rely on interaction history at all.

📋

Constraint-Based

Hard Requirements

User specifies explicit hard constraints: "Budget under £300,000. Minimum 3 bedrooms. Within 5 miles of a specific school." The system filters the item catalogue to constraint-satisfying items, then ranks by soft preferences.

🗣️

Case-Based

Reference Problems

User describes a reference case: "I want something like what you recommended last year, but with lower risk and more tech exposure." The system adapts past successful recommendations using similarity and adjustment rules.

🧠

Critiquing

Iterative Refinement

User is shown an item and provides directional feedback: "I like this laptop but want more battery life and less weight." The system iteratively refines recommendations based on critique — like a conversation with an expert.

📐 Critiquing-Based Recommendation — Iterative Refinement Flow

Each critique cycle narrows the candidate space toward the user's unstated ideal. Studies show users typically need 4–7 critique cycles before accepting a recommendation in high-stakes domains.
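A single critique cycle can be sketched as a filter relative to the item currently shown. The example reuses a few laptops from this section's catalogue; the `apply_critique` helper and its "more"/"less" vocabulary are hypothetical simplifications of what a real critiquing interface would offer.

```python
laptops = [
    {"name": "ProBook X1",  "battery_h": 10, "weight_kg": 1.4},
    {"name": "SlimAir 3",   "battery_h": 14, "weight_kg": 1.1},
    {"name": "UltraBook Z", "battery_h": 12, "weight_kg": 1.2},
    {"name": "PowerEdge G", "battery_h": 6,  "weight_kg": 2.5},
]

def apply_critique(candidates, reference, attribute, direction):
    """Keep only items that beat the reference item on one attribute."""
    if direction == "more":
        return [c for c in candidates if c[attribute] > reference[attribute]]
    if direction == "less":
        return [c for c in candidates if c[attribute] < reference[attribute]]
    raise ValueError(f"unknown direction: {direction}")

# The user is shown ProBook X1 and says: "more battery life and less weight"
reference = laptops[0]
candidates = apply_critique(laptops, reference, "battery_h", "more")
candidates = apply_critique(candidates, reference, "weight_kg", "less")
names = [c["name"] for c in candidates]
```

The next cycle would show one of the survivors as the new reference and repeat, so each round shrinks the candidate set.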

# ── Constraint-Based Knowledge Recommender — Laptop Example ──

laptops = [
    {'name':'ProBook X1',  'price':1200, 'battery_h':10, 'weight_kg':1.4, 'ram_gb':16, 'gpu':True},
    {'name':'SlimAir 3',   'price':950,  'battery_h':14, 'weight_kg':1.1, 'ram_gb':8,  'gpu':False},
    {'name':'WorkForce 15','price':1500, 'battery_h':8,  'weight_kg':2.1, 'ram_gb':32, 'gpu':True},
    {'name':'UltraBook Z', 'price':1100, 'battery_h':12, 'weight_kg':1.2, 'ram_gb':16, 'gpu':False},
    {'name':'PowerEdge G', 'price':800,  'battery_h':6,  'weight_kg':2.5, 'ram_gb':16, 'gpu':True},
]

# ── User requirements (hard constraints) ─────────────────────
user_req = {
    'max_price':     1300,
    'min_battery_h': 10,
    'max_weight_kg': 1.5,
    'min_ram_gb':    16,
}

# ── Soft preference: minimise weight (lower = better) ────────
def knowledge_recommend(laptops, req):
    candidates = [
        l for l in laptops
        if l['price']      <= req['max_price']
        and l['battery_h'] >= req['min_battery_h']
        and l['weight_kg'] <= req['max_weight_kg']
        and l['ram_gb']    >= req['min_ram_gb']
    ]
    # Rank surviving candidates by weight (lightest first)
    return sorted(candidates, key=lambda x: x['weight_kg'])

results = knowledge_recommend(laptops, user_req)
print("Laptops matching your requirements:")
for r in results:
    print(f"  {r['name']:15s} £{r['price']} | {r['battery_h']}h | {r['weight_kg']}kg | {r['ram_gb']}GB")

OUTPUT

Laptops matching your requirements:
  UltraBook Z     £1100 | 12h | 1.2kg | 16GB
  ProBook X1      £1200 | 10h | 1.4kg | 16GB

✅

When Knowledge-Based Wins

Knowledge-based systems are the right tool when: (1) items are purchased infrequently (cars, homes, insurance), (2) preferences are complex and constraint-heavy, (3) there is no meaningful interaction history to learn from, or (4) the domain requires expert knowledge to evaluate item suitability (medical devices, financial products). No historical data is required — domain knowledge replaces it.

Section 06

Session-Based Recommendation Systems

📖 Story

The Shopkeeper Who Watches Without Asking

You walk into a clothing boutique. The shopkeeper doesn't know you — you've never been before. But she watches carefully. You pick up a navy blazer. You browse the formal shirts. You pause at the pocket squares. Within three minutes of watching you browse, she approaches and says: "I think you'd love what just came in last week — here." She's assembled a complete picture of your intent from your current behaviour alone — zero prior history needed.

This is session-based recommendation. It operates entirely on what you're doing right now. The session is the signal. Your click sequence, your dwell time, your scroll depth in the last twenty minutes — these paint a picture of intent sharper than months of historical data, because intent shifts with mood and moment.

Session-Based Systems generate recommendations using only the sequence of user interactions within the current browsing session — no long-term user profile is needed or used. They are essential for: anonymous users, platforms with high user turnover, and any context where current intent matters more than historical preference.

🔁 Why Sequences Matter — Order Is Everything

Scenario A

User views: Running shoes → Running socks → Water bottle → Energy gels — Clear intent signal: preparing for a race. Recommend: GPS watch, compression leggings.

Scenario B

Same user yesterday: Dress shoes → Suit trousers → Ties — Entirely different intent. The historical profile is misleading. The session alone is informative.

Key insight

Session-based methods model transition patterns: what users click next given what they just clicked. Recency and sequence order both carry meaning unavailable to standard CF.
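Transition patterns of this kind can be estimated directly from logged sessions by counting consecutive click pairs. The sketch below uses invented session logs and a first-order model, meaning it looks only one click back.

```python
from collections import Counter, defaultdict

# Hypothetical session logs: each inner list is one session's click order
sessions = [
    ["shoes", "socks", "bottle"],
    ["shoes", "socks", "gels"],
    ["shoes", "bottle", "gels"],
    ["socks", "gels"],
]

def fit_transitions(sessions):
    """Estimate P(next | current) from consecutive click pairs."""
    counts = defaultdict(Counter)
    for s in sessions:
        for current, nxt in zip(s, s[1:]):
            counts[current][nxt] += 1
    # Normalise each row of counts into probabilities
    return {
        cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for cur, c in counts.items()
    }

P = fit_transitions(sessions)
```

In this toy log, two of the three sessions that contain "shoes" continue to "socks", so `P["shoes"]["socks"]` is 2/3. Reversing the order of clicks in the logs would produce a different table, which is exactly the information standard CF discards.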

Three Common Approaches

🔗

GRU4Rec

Gated Recurrent Units

Applies GRU (a recurrent neural network) to model the sequence of item clicks within a session. Each click updates a hidden state capturing accumulated session context. The hidden state at each step is used to rank candidate next items. Introduced by Hidasi et al. (2016) — the paper that launched session-based deep learning.

🤖

SASRec / BERT4Rec

Self-Attention Transformers

Applies the transformer self-attention mechanism to item sequences. Each item in the session attends to all previous items, capturing long-range dependencies that RNNs miss. BERT4Rec masks random items and trains the model to predict them — bidirectional context. State-of-the-art as of 2023.

📊

Markov Chain Models

Transition Matrices

Model item-to-item transition probabilities: P(item j | item i). Simple, fast, and interpretable. FPMC (Factorised Personalised Markov Chain) combines CF with Markov transitions. Best for short sessions and high-speed inference requirements.

📐 Session-Based Recommendation — Sequence Modelling with Self-Attention

The transformer encoder attends to the entire click history simultaneously — unlike RNNs, which process sequentially. This allows it to capture that "running shoes + gels" strongly implies athletic intent, even across long sequences.
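The attention step itself is compact. The sketch below is a single-head, causally masked self-attention layer in the style of SASRec, with random weights standing in for learned ones; positional embeddings, multiple heads, feed-forward layers, and training are all omitted. BERT4Rec would drop the causal mask so that each position can also see later clicks.

```python
import numpy as np

def causal_self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention where step t only sees steps <= t."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)  # True marks future positions
    scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax; masked entries become exactly zero
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
n_items, d = 6, 8
item_emb = rng.normal(size=(n_items, d))  # stand-in for learned item embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

session = [0, 1, 2, 3]  # item ids in click order
H, attn = causal_self_attention(item_emb[session], Wq, Wk, Wv)
next_item_scores = item_emb @ H[-1]  # score every item against the final session state
```

Every row of `attn` is computed in one matrix operation rather than step by step, which is the practical difference from an RNN that the caption above describes.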

import numpy as np

# ── Simplified Markov Chain Session Recommender ───────────────
# Transition matrix: P(next_item | current_item)

items = ['Running Shoes', 'Running Socks', 'Water Bottle',
         'Energy Gels', 'GPS Watch', 'Compression Leggings']

# Row i → probability of transitioning to item j next
transitions = np.array([
    [0.00, 0.40, 0.25, 0.10, 0.15, 0.10],  # Running Shoes →
    [0.30, 0.00, 0.20, 0.25, 0.15, 0.10],  # Running Socks →
    [0.15, 0.15, 0.00, 0.40, 0.20, 0.10],  # Water Bottle →
    [0.10, 0.15, 0.20, 0.00, 0.30, 0.25],  # Energy Gels →
    [0.20, 0.10, 0.25, 0.15, 0.00, 0.30],  # GPS Watch →
    [0.30, 0.20, 0.10, 0.15, 0.25, 0.00],  # Compression Leggings →
])

def session_recommend(session_clicks, top_n=3):
    # Accumulate transition probabilities for the whole session
    scores = np.zeros(len(items))
    for click in session_clicks:
        scores += transitions[items.index(click)]
    # Exclude already-seen items from the ranking entirely
    seen_idx = {items.index(c) for c in session_clicks}
    ranked = [i for i in np.argsort(scores)[::-1] if i not in seen_idx]
    return [(items[i], round(float(scores[i]), 3)) for i in ranked[:top_n]]

session = ['Running Shoes', 'Running Socks', 'Water Bottle', 'Energy Gels']
recs = session_recommend(session)

print("Session-based recommendations:")
for item, score in recs:
    print(f"  → {item:25s}  score={score}")

OUTPUT

Session-based recommendations:
  → GPS Watch                  score=0.8
  → Compression Leggings       score=0.55

⚡

Session-Based vs Long-Term History — When to Use Which

Use session-based models when users are anonymous or new, when the current session intent dominates (e.g. shopping for a gift, browsing a specific category), or when user preferences change rapidly. Use long-term models (CF, Matrix Factorisation) for established users whose cumulative taste is stable. The best production systems run both in parallel and blend their outputs — using session signals to contextualise long-term preference.

Section 07

Choosing the Right System — The Decision Framework

Criteria	Content-Based	Collaborative	Hybrid	Knowledge-Based	Session-Based
Needs interaction data?	No	Yes — lots	Partial	No	No (just session)
Needs item metadata?	Yes — rich	No	Helpful	Yes — structured	No
New user (cold start)?	Partial	Fails	Mitigated	Works perfectly	Designed for this
New item (cold start)?	Works	Fails	Mitigated	Works	Depends on model
Serendipity / Discovery?	Low — echo chamber	High	Tunable	Medium	Medium
Interpretability?	High — "because you liked X"	Medium	Low	Very high — rule-based	Medium
Computation cost?	Medium	High (large matrices)	High	Low	Low to Medium
Best domain examples	News, music, articles	Movies, e-commerce	Netflix, YouTube	Finance, real estate	E-commerce, travel

🎯 The Practitioner's Selection Guide

Start with collaborative filtering if you have sufficient interaction data (tens of thousands of users with multiple interactions each). It consistently outperforms content-based in accuracy when data is available.

Use content-based when your catalogue changes rapidly (news, job listings), item metadata is rich, or your user base is highly specialised with niche tastes. It handles item cold start gracefully.

Build hybrid systems once you have both interaction data and item metadata. The marginal engineering cost is low; the accuracy gain is substantial. Nearly every mature production recommender is hybrid.

Knowledge-based is non-negotiable for high-stakes, low-frequency purchase domains — financial products, healthcare, automotive, real estate. Users will not trust a "people like you bought" approach when the decision matters deeply.

Session-based is essential when you have significant anonymous traffic or when users' immediate session intent dominates their long-term profile. E-commerce platforms with high first-visit conversion goals must invest in session-based models.

In production, combine all five. Route to the right model by user state: new anonymous users → session-based; new registered users → content-based + knowledge; established users → hybrid CF + CB; special domains → knowledge-based override.