Spam Detection in NLP

Section 01

The Story That Explains Spam Detection

📖 Real World Analogy

The Village Postman Who Never Sleeps

Imagine a tiny village with one postman — let's call him Arjun. Every morning, hundreds of letters arrive at the post office. Some are genuine: letters from family, bank statements, job offers. But lately, shady envelopes have flooded in — fake lottery winnings, miracle cures, "urgent" notices from princes needing your bank account.

At first, Arjun reads every single letter before delivering it. Exhausting — and the village falls behind. So instead, he starts learning patterns. Letters with red ink screaming "YOU WON £10,000!" go into the bin. Letters mentioning "bank transfer" from unknown senders get held. Letters from known neighbours get delivered immediately.

After a few months, Arjun barely needs to open envelopes. He recognises spam by its shape, smell, and language. That intuition — built from thousands of examples — is exactly what a spam detection model does.

Spam detection is one of the oldest and most impactful applications of Natural Language Processing (NLP). It is a binary text classification problem: given a message, decide whether it is spam (unwanted) or ham (legitimate). The techniques behind it now power email filters, SMS blockers, social media moderation, and fraud alert systems worldwide.

💻

Why This Matters in 2025

By commonly cited industry estimates, roughly 45% of all email traffic globally is spam, which works out to something on the order of 160 billion spam emails every single day. Without automated detection, inboxes would be unusable. Modern spam filters combine NLP, machine learning, and deep learning, and often exceed 99% accuracy on well-labelled benchmark datasets.

Section 02

What Is Spam? Categories and Real Examples

Not all spam is the same. Before building a detector, you need to know what you are detecting. Spam comes in many flavours — each with its own linguistic fingerprints.

💰

Promotional / Commercial Spam

Most common type

Unsolicited marketing messages. Fake discounts, miracle products, weight loss pills.

Example: "CONGRATULATIONS! You've been selected for a FREE iPhone 15. Click NOW before offer expires!!!"

🔐

Phishing Spam

High danger level

Attempts to steal credentials, passwords, or banking details by impersonating legitimate institutions like banks or government bodies.

Example: "Dear customer, your SBI account has been suspended. Verify immediately: http://sbi-secure-verify.com"

🙅

Social Engineering Spam

Psychological manipulation

Exploits emotions — urgency, fear, greed, curiosity — to make the recipient act without thinking.

Example: "Mum, I'm in trouble. I lost my phone. Please transfer ₹15,000 to this number urgently. Don't call Dad."

Spam Type	Common Signal Words	Typical Channel	Danger Level
Promotional	FREE, WINNER, CLICK NOW, !!!, % OFF	Email, SMS	Medium
Phishing	verify, account suspended, login, urgent, bank	Email, WhatsApp	High
Lottery / Advance Fee	million, prize, transfer, fees, beneficiary	Email	Medium
Social Engineering	urgent, help me, stranded, don't tell	SMS, WhatsApp	High
Malware / Drive-by	click, download, update required, install	Email, Social	Critical
Ham (Legitimate)	meeting, thanks, attached, please review	Any	None

Section 03

The NLP Pipeline for Spam Detection

Raw text cannot be fed directly into a machine learning model. It must be cleaned, transformed, and numerically represented. This journey is called the NLP preprocessing pipeline.

Raw Text Input

The raw SMS, email subject, or message body arrives as a string. Example: "Congratulations!! You WON a FREE car!!! Call 09061743582 now."

Lowercasing

Convert all text to lowercase so "FREE", "Free", and "free" are treated as the same word. Result: "congratulations!! you won a free car!!! call 09061743582 now."

Noise Removal

Remove punctuation, special characters, numbers, and HTML tags. Optionally, preserve features like exclamation count or ALL-CAPS ratio as engineered features first. Result: "congratulations you won a free car call now"

Tokenisation

Split the text into individual tokens (words). Result: ["congratulations", "you", "won", "a", "free", "car", "call", "now"]

Stop Word Removal

Remove common, low-information words like "you", "a", "the", "is", and "now". Result: ["congratulations", "won", "free", "car", "call"]

Stemming / Lemmatisation

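Reduce words to their root form. "running" → "run", "winning" → "win", "congratulations" → "congratul". Stemming chops suffixes by rule and can produce non-words such as "congratul"; lemmatisation maps each word to its dictionary form, which is more linguistically accurate but slower.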

Vectorisation

Convert tokens to numbers using Bag of Words, TF-IDF, or word embeddings. Now the text is a numerical vector that a model can learn from.
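The first stages of the pipeline above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the article's full preprocessing: the stop-word set `STOP_WORDS` is a tiny hypothetical list chosen to match the worked example, the helper name `preprocess_steps` is made up for this sketch, and stemming is left out.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only;
# real pipelines usually take a fuller list from a library such as NLTK.
STOP_WORDS = {"you", "a", "the", "is", "now"}

def preprocess_steps(message: str) -> dict:
    """Return the intermediate result of each preprocessing stage."""
    lowered = message.lower()
    # Noise removal: anything that is not a lowercase letter or whitespace
    # (digits, punctuation) is replaced by a space.
    cleaned = re.sub(r"[^a-z\s]", " ", lowered)
    # Tokenisation: split on runs of whitespace.
    tokens = cleaned.split()
    # Stop-word removal.
    filtered = [t for t in tokens if t not in STOP_WORDS]
    return {"lowered": lowered, "tokens": tokens, "filtered": filtered}

steps = preprocess_steps("Congratulations!! You WON a FREE car!!! Call 09061743582 now.")
print(steps["tokens"])    # ['congratulations', 'you', 'won', 'a', 'free', 'car', 'call', 'now']
print(steps["filtered"])  # ['congratulations', 'won', 'free', 'car', 'call']
```

Returning every intermediate stage makes it easy to see which step removed which piece of the message.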

Section 04

Text Vectorisation — Turning Words into Numbers

📖 Story

The Library Index

Picture a massive library. To find a book quickly, the librarian doesn't read every page — she looks at the index at the back: which words appear, how many times, and on which pages. Bag of Words and TF-IDF do exactly this for machine learning — they create a vocabulary index across all messages, then represent each message as a vector of word counts or weighted scores. The model can then look at these numbers and learn which words are most associated with spam.

Method 1 — Bag of Words (BoW)

Create a vocabulary of all unique words in the dataset (in the toy example below, the stop word "at" has been dropped). Each message becomes a vector where each position is the count of a word from the vocabulary.

📾 Raw Messages

ID	Text	Label
M1	free prize call now	spam
M2	meeting at office now	ham
M3	free meeting call	spam

✨ BoW Matrix

ID	free	prize	call	now	meeting	office
M1	1	1	1	1	0	0
M2	0	0	0	1	1	1
M3	1	0	1	0	1	0
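The matrix above can be reproduced with a short standard-library sketch. The column order is copied from the table, and "at" is left out of the vocabulary as it is there; the helper name `bow_vector` is an assumption made for this example.

```python
from collections import Counter

messages = {
    "M1": "free prize call now",
    "M2": "meeting at office now",
    "M3": "free meeting call",
}

# Column order copied from the table; "at" is not part of the vocabulary there.
vocabulary = ["free", "prize", "call", "now", "meeting", "office"]

def bow_vector(text, vocab):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]

bow_matrix = {mid: bow_vector(text, vocabulary) for mid, text in messages.items()}
for mid, row in bow_matrix.items():
    print(mid, row)
# M1 [1, 1, 1, 1, 0, 0]
# M2 [0, 0, 0, 1, 1, 1]
# M3 [1, 0, 1, 0, 1, 0]
```

Words outside the vocabulary (such as "at") simply contribute nothing to the vector.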

⚠️

The Problem with BoW

BoW treats every word as equally important. A filler word like "the" appears in almost every message and tells us nothing, yet BoW weights each occurrence the same as an occurrence of "free". It also ignores word order, so "dog bites man" and "man bites dog" look identical.

Method 2 — TF-IDF (Term Frequency–Inverse Document Frequency)

TF-IDF fixes BoW's weakness. It rewards words that appear often in a message but rarely across all messages — these are the truly discriminative words.

Term Frequency (TF)

TF(t,d) = count(t,d) / total_words(d)

How often does this word appear in this document? High if the word is prominent.

Inverse Document Frequency (IDF)

IDF(t) = log(N / df(t))

How rare is this word across all documents? High if the word appears in few messages.

TF-IDF Score

TF-IDF = TF × IDF

High score = word is frequent in this document AND rare globally. These are the signal words.

Real Example

"free" in 3/500 messages

IDF = ln(500/3) ≈ 5.1 (natural log). High score. "the" in 490/500: IDF = ln(500/490) ≈ 0.02. Nearly zero weight.
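To check the arithmetic, here is a small sketch of the textbook formulas, assuming the natural logarithm. The token list is a hypothetical five-word message invented for illustration. Note that scikit-learn's TfidfVectorizer uses a smoothed IDF and L2-normalises each row by default, so its numbers will not match these raw values.

```python
import math

def tf(term, tokens):
    """Term frequency: share of the message taken up by this term."""
    return tokens.count(term) / len(tokens)

def idf(doc_freq, n_docs):
    """Inverse document frequency with the natural log (unsmoothed)."""
    return math.log(n_docs / doc_freq)

idf_free = idf(3, 500)     # "free" appears in 3 of 500 messages
idf_the = idf(490, 500)    # "the" appears in 490 of 500 messages

# Hypothetical five-token message, so each word has TF = 1/5.
tokens = ["free", "prize", "call", "the", "number"]
score_free = tf("free", tokens) * idf_free
score_the = tf("the", tokens) * idf_the

print(round(idf_free, 2), round(idf_the, 2))      # 5.12 0.02
print(round(score_free, 3), round(score_the, 3))  # 1.023 0.004
```

Both words occur once in the message, yet "free" ends up with a weight a couple of hundred times larger than "the".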

Section 05

Building a Spam Detector — Step-by-Step with Python

We will use the well-known UCI SMS Spam Collection dataset: 5,574 SMS messages labelled as "spam" or "ham" (the spam.csv copy loaded below contains 5,572 rows, as the Step 1 output shows). Below is a complete end-to-end pipeline.

📚 Step 1 — Load & Explore the Data

Import

Load pandas, sklearn, and the dataset. Check class distribution immediately.

Explore

The loaded file has 4,825 ham (86.6%) and 747 spam (13.4%). Imbalanced! Remember this.

Inspect

Check average message length — spam tends to be longer with more exclamation marks.

import pandas as pd
import numpy as np
import re
import string

# Load the SMS spam dataset
df = pd.read_csv('spam.csv', encoding='latin-1')[['v1', 'v2']]
df.columns = ['label', 'text']

# Check class distribution
print(df['label'].value_counts())
print(df['label'].value_counts(normalize=True).round(3))

# Check average text length by class
df['length'] = df['text'].apply(len)
print(df.groupby('label')['length'].mean())

OUTPUT

ham     4825
spam     747
dtype: int64
ham     0.866
spam    0.134
dtype: float64
label
ham      71.5
spam    138.7
<- Spam messages are nearly TWICE as long on average

🧰 Step 2 — Text Preprocessing Function

Clean

One reusable function handles lowercasing, punctuation, stop words, and stemming.

Engineer

Extract signal features before cleaning: exclamation count, uppercase ratio, URL presence.

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

nltk.download('stopwords', quiet=True)
nltk.download('punkt', quiet=True)

ps = PorterStemmer()
stop_words = set(stopwords.words('english'))

# ── Engineer features BEFORE cleaning ────────────────────
df['num_exclamation'] = df['text'].str.count('!')
df['has_url']         = df['text'].str.contains(r'http|www|\.com', case=False).astype(int)
df['num_digits']      = df['text'].apply(lambda x: sum(c.isdigit() for c in x))
df['upper_ratio']     = df['text'].apply(
    lambda x: sum(c.isupper() for c in x) / (len(x) + 1))

# ── Clean text ────────────────────────────────────────────
def preprocess(text):
    text = text.lower()
    text = re.sub(r'http\S+|www\S+', ' url ', text)  # replace URLs with token
    text = re.sub(r'[^a-z\s]', ' ', text)             # replace non-letters with a space so "you've" splits into the stop words "you" and "ve"
    tokens = text.split()
    tokens = [ps.stem(w) for w in tokens if w not in stop_words]
    return ' '.join(tokens)

df['clean_text'] = df['text'].apply(preprocess)
print(df[['text', 'clean_text']].head(3))

OUTPUT

Original: "Congratulations!! You've WON a FREE prize. Call 09061743582 NOW!"
Cleaned:  "congratul won free prize call"

Original: "Ok lar... Joking wif u oni..."
Cleaned:  "ok lar joke wif u oni"

Original: "URGENT! Your mobile number 07808726822 won £1000 prize!"
Cleaned:  "urgent mobil number won prize"

🎯 Step 3 — Build & Evaluate the Model

Split

Stratified train/test split to preserve the 87/13 class ratio in both sets.

Pipeline

TF-IDF vectoriser + Multinomial Naive Bayes assembled in a single sklearn Pipeline.

Evaluate

Use Precision, Recall, F1, and ROC-AUC — not just Accuracy (it misleads on imbalanced data).

from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report, roc_auc_score, confusion_matrix

# Encode label
df['label_enc'] = (df['label'] == 'spam').astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df['clean_text'], df['label_enc'],
    test_size=0.2, random_state=42, stratify=df['label_enc']
)

# Build pipeline: TF-IDF + Naive Bayes
model = Pipeline([
    ('tfidf', TfidfVectorizer(
        ngram_range=(1, 2),  # unigrams + bigrams
        max_features=10000,
        sublinear_tf=True    # dampens very high frequencies
    )),
    ('clf', MultinomialNB(alpha=0.1))
])

model.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

print(classification_report(y_test, y_pred, target_names=['Ham', 'Spam']))
print(f"ROC-AUC: {roc_auc_score(y_test, y_prob):.4f}")
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))

OUTPUT

              precision    recall  f1-score   support

         Ham       0.99      1.00      0.99       965
        Spam       0.99      0.95      0.97       150

    accuracy                           0.99      1115
   macro avg       0.99      0.97      0.98      1115
weighted avg       0.99      0.99      0.99      1115

ROC-AUC: 0.9973
Confusion Matrix:
[[963   2]
 [  8 142]]
← 8 spam messages slipped through (False Negatives)
← 2 ham incorrectly flagged (False Positives)

Section 06

Understanding the Confusion Matrix for Spam

In spam detection, the two types of errors have very different real-world consequences. Understanding this is critical to tuning your classifier correctly.

😱 False Negative — Spam Slips Through

963 True Ham (TN)

8 Missed Spam (FN)

2 False Alarm (FP)

142 True Spam (TP)

8 spam messages reached the inbox. Annoying but survivable.

😡 False Positive — Ham Blocked

963 Correct Ham (TN)

0 Missed Spam (FN)

2 Blocked Ham (FP)

150 Blocked Spam (TP)

Here every spam message is caught, but 2 real messages are blocked, which could mean a missed job offer or bank alert. Critical!
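The four cells of the first confusion matrix are enough to recompute the headline spam metrics by hand. The sketch below uses only the counts quoted above (TN = 963, FP = 2, FN = 8, TP = 142); the function name `spam_metrics` is an assumption for illustration.

```python
def spam_metrics(tn, fp, fn, tp):
    """Precision, recall, F1 (for the spam class) and overall accuracy."""
    precision = tp / (tp + fp)   # of everything flagged as spam, how much was spam
    recall = tp / (tp + fn)      # of all real spam, how much was caught
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

m = spam_metrics(tn=963, fp=2, fn=8, tp=142)
for name, value in m.items():
    print(f"{name:9s}: {value:.3f}")
# precision: 0.986
# recall   : 0.947
# f1       : 0.966
# accuracy : 0.991
```

The 2 false positives are what hold precision below 1.0, while the 8 false negatives pull recall down to roughly 0.95.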

⚖️

Precision vs Recall Trade-off in Spam

For spam filters, high precision (avoid blocking real emails) is usually more important than high recall (catch every spam). A missed spam is annoying. A blocked job offer is catastrophic. Adjust your decision threshold accordingly — use model.predict_proba and set a higher threshold (e.g. 0.7) for labelling something as spam instead of the default 0.5.
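The effect of moving the threshold can be seen on a handful of made-up scores. In this sketch the probabilities and labels are hypothetical, not taken from the dataset, and `classify` and `precision_recall` are helper names invented for the example. With a fitted scikit-learn model, the probabilities would come from `predict_proba(...)[:, 1]`; models such as LinearSVC expose `decision_function` scores instead.

```python
def classify(spam_probabilities, threshold=0.5):
    """Label a message as spam (1) only if its score reaches the threshold."""
    return [1 if p >= threshold else 0 for p in spam_probabilities]

def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical spam probabilities and true labels (1 = spam, 0 = ham).
probs = [0.97, 0.62, 0.55, 0.08, 0.71]
labels = [1, 0, 1, 0, 1]

for threshold in (0.5, 0.7):
    preds = classify(probs, threshold)
    precision, recall = precision_recall(labels, preds)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
# threshold=0.5: precision=0.75, recall=1.00
# threshold=0.7: precision=1.00, recall=0.67
```

Raising the threshold stops the ham message scored 0.62 from being blocked, at the cost of letting the spam message scored 0.55 through.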

Section 07

The Naive Bayes Algorithm — Why It Dominates Spam Filtering

📖 Story

The Detective Who Looks at Clues Independently

Imagine Inspector Sharma receives a suspicious envelope. He checks three clues: (1) Does it say "FREE"? Yes — suspicious. (2) Does it have a Nigerian sender? Yes — suspicious. (3) Does it mention "prize money"? Yes — suspicious.

Inspector Sharma doesn't think about whether these clues are related to each other. He simply multiplies how likely each clue is to turn up in a spam letter: 0.9 × 0.85 × 0.95 ≈ 0.73. He does the same sum for genuine letters, where each clue is far rarer, and picks whichever total is larger. That is the Naive part: assuming independence. It's mathematically wrong but practically brilliant, because in text classification it works shockingly well despite the flawed assumption.

Naive Bayes applies Bayes' Theorem to compute the probability that a message is spam given the words it contains:

Bayes' Theorem

P(Spam|words) ∝ P(Spam) × ∏ P(word|Spam)

The probability of spam given these words is proportional to the prior × the product of individual word likelihoods.

The "Naive" Assumption

P(w1,w2,...|Spam) = P(w1|Spam) × P(w2|Spam)...

Words are treated as conditionally independent given the class. This is almost never true — but it works.

Example

P("free"|Spam) = 0.45, P("free"|Ham) = 0.01

The word "free" is 45× more likely to appear in spam than ham. This word alone is a powerful signal.

Laplace Smoothing

P(w|class) = (count + α) / (total + α×V)

Adds α (usually 1) to avoid zero probability for unseen words. Critical for new vocabulary.
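The formulas above fit into a very small from-scratch classifier. The sketch below is a toy multinomial Naive Bayes with Laplace smoothing (α = 1), trained on the three example messages from the Bag of Words section. It is meant to show the arithmetic and is not a substitute for scikit-learn's MultinomialNB. The function names, and the choice to skip words never seen in training, are assumptions of this sketch.

```python
import math
from collections import Counter

train = [
    ("free prize call now", "spam"),
    ("meeting at office now", "ham"),
    ("free meeting call", "spam"),
]

def fit(examples, alpha=1.0):
    class_docs = Counter(label for _, label in examples)      # documents per class
    word_counts = {label: Counter() for label in class_docs}  # word counts per class
    for text, label in examples:
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab, alpha

def log_posterior(text, label, model):
    """log P(class) + sum of log P(word | class), up to a shared constant."""
    class_docs, word_counts, vocab, alpha = model
    score = math.log(class_docs[label] / sum(class_docs.values()))  # log prior
    total_words = sum(word_counts[label].values())
    for word in text.split():
        if word not in vocab:
            continue  # assumption: ignore words never seen in training
        # Laplace smoothing: (count + alpha) / (total + alpha * V)
        likelihood = (word_counts[label][word] + alpha) / (total_words + alpha * len(vocab))
        score += math.log(likelihood)
    return score

def predict(text, model):
    return max(model[0], key=lambda label: log_posterior(text, label, model))

model = fit(train)
print(predict("free prize", model))      # spam
print(predict("office meeting", model))  # ham
```

Working in log space turns the product of many small probabilities into a sum, which avoids numerical underflow on longer messages.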

🏆

Why Naive Bayes Is the Classic Spam Choice

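Naive Bayes trains in milliseconds, handles high-dimensional sparse text features well, and can generalise from relatively little data. It also outputs class probabilities, although these tend to be overconfident (pushed towards 0 or 1) rather than well calibrated, so treat them as scores when tuning a threshold. Bayesian filtering was the basis of many early spam filters and is still used in commercial email systems today because of its speed and interpretability.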

Section 08

Beyond Naive Bayes — Other Algorithms Compared

As datasets grow and requirements become more sophisticated, you have several powerful alternatives to Naive Bayes. Here is a practical comparison for spam detection:

Algorithm	Accuracy	Speed	Interpretable	Best For
Naive Bayes	97–98%	Very Fast	Yes	Baseline, resource-constrained systems
Logistic Regression	97–99%	Fast	Yes	When feature coefficients matter
Random Forest	98–99%	Medium	Partial	Combining TF-IDF + engineered features
Support Vector Machine	98–99%	Slow (large data)	No	High-dimensional text, strong accuracy
LSTM / GRU	99%+	Slow	No	Sequential pattern learning, long emails
BERT / DistilBERT	99.5%+	Very Slow	No	State-of-the-art, complex phishing text

Trying Logistic Regression and SVM

from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

y_all = df['label_enc']

def make_model(clf):
    # Fit TF-IDF inside each CV fold so that vocabulary and IDF statistics
    # are never learned from the fold being used for validation
    return Pipeline([
        ('tfidf', TfidfVectorizer(ngram_range=(1, 2), max_features=10000, sublinear_tf=True)),
        ('clf', clf)
    ])

models = {
    'Naive Bayes':          MultinomialNB(alpha=0.1),
    'Logistic Regression':  LogisticRegression(max_iter=1000, C=1.0),
    'LinearSVC':            LinearSVC(C=1.0)
}

for name, clf in models.items():
    scores = cross_val_score(make_model(clf), df['clean_text'], y_all, cv=5, scoring='f1')
    print(f"{name:25s}: F1 = {scores.mean():.4f} ± {scores.std():.4f}")

OUTPUT

Naive Bayes              : F1 = 0.9521 ± 0.0088
Logistic Regression      : F1 = 0.9648 ± 0.0071
LinearSVC                : F1 = 0.9712 ± 0.0054   ← Best on this dataset

Section 09

Feature Engineering — The Secret Weapon

TF-IDF captures word content. But spam has structural signals too — signals that exist in the formatting, not just the words. Feature engineering extracts these.

❗

Exclamation Count

Spam messages use 3–10× more exclamation marks than ham. A single integer feature with high predictive power.

df['num_exclamation'] = df['text'].str.count('!')

🔢

Uppercase Ratio

Shouting in capitals is a spam hallmark. "FREE PRIZE NOW" signals urgency designed to override rational thinking.

df['upper_ratio'] = df['text'].apply(lambda x: sum(c.isupper() for c in x) / (len(x) + 1))

🔗

URL Presence

Does the message contain a hyperlink? Phishing spam almost always does. A binary 0/1 feature.

str.contains(r'http|www|\.com')

📞

Phone Number Presence

Premium-rate phone numbers are a classic spam tactic. Pattern: long digit sequences (10–15 digits).

re.search(r'\b\d{10,}\b', text)

💲

Currency Mention

Presence of £, $, €, ₹ symbols or words like "prize", "cash", "win" correlates strongly with spam.

str.contains(r'[£$€₹]|\b(?:prize|cash|win)\b', case=False)

📈

Message Length

Spam averages 139 chars vs ham's 71 chars. A simple length feature adds measurable accuracy.

df['length'] = df['text'].apply(len)
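The six signals above can also be gathered by one helper that works on a single raw message, which is handy at prediction time when there is no DataFrame. This is a sketch under assumptions: the function name `structural_features` and the keys `has_phone` and `has_currency` are made up here, and the regular expressions are the simple heuristics from the cards above, not robust URL or phone-number parsers.

```python
import re

def structural_features(text: str) -> dict:
    """Extract formatting-based signals from the RAW (uncleaned) message."""
    return {
        "num_exclamation": text.count("!"),
        # Same definition as the DataFrame version: +1 avoids division by zero.
        "upper_ratio": sum(c.isupper() for c in text) / (len(text) + 1),
        "has_url": int(bool(re.search(r"http|www|\.com", text, re.IGNORECASE))),
        # A run of 10 to 15 digits is treated as a phone number.
        "has_phone": int(bool(re.search(r"\b\d{10,15}\b", text))),
        "has_currency": int(bool(
            re.search(r"[£$€₹]|\b(?:prize|cash|win)\b", text, re.IGNORECASE))),
        "length": len(text),
    }

msg = "Congratulations!! You WON a FREE car!!! Call 09061743582 now."
feats = structural_features(msg)
print(feats["num_exclamation"], feats["has_phone"], feats["has_url"])  # 5 1 0

promo = structural_features("Win £1000 cash now at www.example.com")
print(promo["has_url"], promo["has_currency"], promo["has_phone"])     # 1 1 0
```

Because every feature is computed from the raw text, the helper must run before lowercasing and punctuation removal.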

Combining TF-IDF + Engineered Features

from scipy.sparse import hstack, csr_matrix
from sklearn.preprocessing import StandardScaler

# TF-IDF matrix (sparse)
tfidf = TfidfVectorizer(ngram_range=(1,2), max_features=10000, sublinear_tf=True)
X_text = tfidf.fit_transform(df['clean_text'])

# Engineered features (dense), standardised so that large-valued columns
# such as 'length' do not swamp the TF-IDF weights (which lie in [0, 1])
eng_features = ['num_exclamation', 'has_url', 'num_digits', 'upper_ratio', 'length']
X_eng = StandardScaler().fit_transform(df[eng_features])

# Combine: sparse + dense
X_combined = hstack([X_text, csr_matrix(X_eng)]).tocsr()

# Evaluate with combined features
svm = LinearSVC(C=1.0)
scores = cross_val_score(svm, X_combined, df['label_enc'], cv=5, scoring='f1')
print(f"Combined Features F1: {scores.mean():.4f} ± {scores.std():.4f}")

OUTPUT

Combined Features F1: 0.9791 ± 0.0041
↑ Improvement from 0.9712 (text only) to 0.9791: engineered features added real value

Section 10

Advanced: Deep Learning with LSTM

For large-scale or high-stakes spam detection (e.g., detecting sophisticated phishing), deep learning models capture sequential context that TF-IDF misses — understanding that "you won" is more suspicious than "you" and "won" independently.

🧠

When to Use Deep Learning for Spam

Use LSTM or BERT when: you have large datasets (>50k examples), spam is contextually sophisticated (not just keyword-based), you need to detect adversarial spam that deliberately avoids trigger words, or you are classifying multi-language content.

import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

# Tokenise text
MAX_VOCAB = 8000
MAX_LEN   = 120

tokenizer = Tokenizer(num_words=MAX_VOCAB, oov_token='<OOV>')
tokenizer.fit_on_texts(X_train)

X_tr_seq = pad_sequences(tokenizer.texts_to_sequences(X_train), maxlen=MAX_LEN)
X_te_seq = pad_sequences(tokenizer.texts_to_sequences(X_test),  maxlen=MAX_LEN)

# Build LSTM model
model_lstm = Sequential([
    Embedding(MAX_VOCAB, 64, input_length=MAX_LEN),
    LSTM(64, return_sequences=False),
    Dropout(0.3),
    Dense(32, activation='relu'),
    Dense(1,  activation='sigmoid')
])

model_lstm.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model_lstm.fit(X_tr_seq, y_train, epochs=5, batch_size=32, validation_split=0.1, verbose=1)

# Evaluate
loss, acc = model_lstm.evaluate(X_te_seq, y_test, verbose=0)
print(f"LSTM Test Accuracy: {acc:.4f}")

OUTPUT

Epoch 5/5 - loss: 0.0421 - accuracy: 0.9882 - val_loss: 0.0612 - val_accuracy: 0.9821
LSTM Test Accuracy: 0.9893

Slightly below LinearSVC on this small dataset (5.5k rows). Deep learning needs more data to shine.

Section 11

Handling Adversarial Spam — The Arms Race

📖 Story

The Spammer Who Learned to Speak Ham

When the first spam filters started blocking "FREE PRIZE", spammers adapted. They wrote "Fr ee Pr ize" — splitting words. Filters updated. Then came "F.R.E.E P.R.I.Z.E". Then image-based spam (text inside a picture the filter couldn't read). Then HTML tricks — hiding the word "casino" in white text on a white background. It became a never-ending arms race between spammers and detectors.

Modern spam is adversarial by design. A good detector must anticipate these tricks — not just learn from past examples.

Adversarial Technique	Example	Counter-Measure
Character substitution	fr33, w1nn3r, @pple	Regex normalisation before tokenisation
Word splitting	f r e e, p-r-i-z-e	Remove non-alpha chars first, rejoin
Typo injection	Cingratulations!, Freee prize	Fuzzy matching, character n-grams
HTML obfuscation	White text on white background	Parse HTML, extract visible text only
Image-based spam	Text embedded in attached image	OCR (Tesseract) + then text classify
Synonym replacement	"complimentary" instead of "free"	Word embeddings (Word2Vec / BERT)
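The first two counter-measures in the table (undoing character substitution and rejoining split words) can be sketched as a normalisation pass that runs before tokenisation. Everything here is an assumption for illustration: the look-alike map `LEET_MAP` is a small hypothetical sample, and the rules are deliberately conservative heuristics that will still mis-handle some legitimate tokens (for example "10am").

```python
import re

# Hypothetical sample of look-alike substitutions; real lists are much longer.
LEET_MAP = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                          "5": "s", "@": "a", "$": "s"})

# A single letter, one separator (space, dot or hyphen), then more single
# letters joined by the SAME separator: "f r e e", "p-r-i-z-e", "f.r.e.e".
SPLIT_WORD = re.compile(r"\b[a-z]([ .\-])[a-z](?:\1[a-z])+\b")

def _deleet(token: str) -> str:
    # Only rewrite tokens that contain at least one letter, so pure numbers
    # and prices such as "09061743582" or "£1000" are left untouched.
    if any(c.isalpha() for c in token):
        return token.translate(LEET_MAP)
    return token

def normalise(text: str) -> str:
    text = " ".join(_deleet(tok) for tok in text.lower().split())
    # Rejoin split words by deleting the captured separator inside each match.
    return SPLIT_WORD.sub(lambda m: m.group(0).replace(m.group(1), ""), text)

print(normalise("fr33 w1nn3r"))           # free winner
print(normalise("F.R.E.E P.R.I.Z.E"))     # free prize
print(normalise("p-r-i-z-e"))             # prize
print(normalise("Call 09061743582 now"))  # call 09061743582 now
```

Requiring the same separator throughout a match keeps "F.R.E.E P.R.I.Z.E" as two words; two adjacent words that are both split by spaces would still be merged into one.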

Handling Character-Level Tricks with Character N-grams

# Character n-grams catch obfuscated spam like "fr33" and "w!nn3r"
char_tfidf = TfidfVectorizer(
    analyzer='char_wb',   # character n-grams within word boundaries
    ngram_range=(2, 4),   # bi-grams to 4-grams of characters
    max_features=20000,
    sublinear_tf=True
)

# Combine word + character n-gram features
X_char = char_tfidf.fit_transform(df['text'])   # use RAW text for char n-grams
X_word = tfidf.transform(df['clean_text'])

X_robust = hstack([X_word, X_char])

scores = cross_val_score(
    LinearSVC(C=1.0), X_robust, df['label_enc'],
    cv=5, scoring='f1'
)
print(f"Robust (word + char n-gram) F1: {scores.mean():.4f} ± {scores.std():.4f}")

OUTPUT

Robust (word + char n-gram) F1: 0.9834 ± 0.0039
↑ Further improvement: character n-grams catch obfuscated words that word-level TF-IDF misses
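Why character n-grams help can be seen without any model at all. The sketch below generates 2- to 4-character n-grams for single words, padded with a space on each side (similar in spirit to `analyzer='char_wb'`, though not a reimplementation of it), and measures the overlap between a word and its obfuscated variant. The helper names `char_ngrams` and `jaccard` are assumptions for this example.

```python
def char_ngrams(word, n_min=2, n_max=4):
    """Set of character n-grams of a word, padded with one space per side."""
    padded = f" {word} "
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }

def jaccard(a, b):
    """Share of n-grams the two sets have in common."""
    return len(a & b) / len(a | b)

sim_typo = jaccard(char_ngrams("prize"), char_ngrams("prizee"))
sim_other = jaccard(char_ngrams("prize"), char_ngrams("meeting"))
shared_leet = char_ngrams("free") & char_ngrams("fr33")

print(sim_typo)             # 0.65
print(sim_other)            # 0.0
print(sorted(shared_leet))  # [' f', ' fr', 'fr']
```

To a word-level vectoriser "prize" and "prizee" are unrelated tokens, whereas at the character level they share most of their n-grams, so weights learned for one carry over to the other.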

Section 12

Model Deployment — Saving and Using Your Spam Filter

import joblib

# Save the final trained pipeline
final_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(ngram_range=(1,2), max_features=10000, sublinear_tf=True)),
    ('clf',   LinearSVC(C=1.0))
])
final_pipeline.fit(df['clean_text'], df['label_enc'])

joblib.dump(final_pipeline, 'spam_detector.pkl')
print("Model saved!")

# ── Load and predict on new messages ─────────────────────
loaded_model = joblib.load('spam_detector.pkl')

new_messages = [
    "Congratulations! You have won a free holiday. Call 0800-123456 NOW!",
    "Hey, are we still on for the team meeting tomorrow at 10am?",
    "URGENT: Your bank account has been compromised. Click here to verify.",
    "Thanks for sending the report. I'll review it this evening."
]

cleaned_new = [preprocess(m) for m in new_messages]
predictions = loaded_model.predict(cleaned_new)
labels      = ['🚨 SPAM' if p == 1 else '✅ HAM ' for p in predictions]

for msg, label in zip(new_messages, labels):
    print(f"{label} → {msg[:60]}...")

OUTPUT

Model saved!
🚨 SPAM → Congratulations! You have won a free holiday. Call 0800...
✅ HAM  → Hey, are we still on for the team meeting tomorrow at 10am?
🚨 SPAM → URGENT: Your bank account has been compromised. Click her...
✅ HAM  → Thanks for sending the report. I'll review it this evening.

🎉

All Four Predictions Correct

The model correctly identifies the lottery spam, phishing attempt, and both legitimate messages. In production, wrap this in a Flask API or FastAPI endpoint so any service can call POST /predict with a message and receive a spam/ham prediction.

Section 13

Golden Rules of Spam Detection

🛡️ Spam Detection — Non-Negotiable Rules

Never use accuracy alone. With roughly 87% ham, a model that predicts "ham" for everything achieves about 87% accuracy while catching no spam at all. Always report Precision, Recall, F1, and ROC-AUC. On imbalanced classes, accuracy on its own is misleading.

Engineer features before vectorising. Extract exclamation count, uppercase ratio, URL presence, and digit count from the raw text — before lowercasing and cleaning destroy that signal. These simple features often rival complex models.

Use bigrams, not just unigrams. ngram_range=(1,2) in TF-IDF captures phrases like "click now", "free prize", and "call immediately", patterns that are often more diagnostic than individual words. The gain varies by dataset, so compare F1 with and without bigrams on your own data.

Tune the decision threshold. Default is 0.5. For spam, consider raising it to 0.65–0.75 to protect against false positives (blocking real email). Use predict_proba (or decision_function for models such as LinearSVC, which does not expose predict_proba) and plot the Precision-Recall curve to find the optimal operating point for your use case.

Retrain regularly. Spam evolves. A model trained in 2023 will be weaker against 2025 spam. Build a continuous feedback loop: users who mark something as spam or "not spam" create new labelled training data. Retrain monthly at minimum.

Add character n-grams for adversarial robustness. Spammers obfuscate words — "fr33", "c@sh", "w-i-n". Character-level n-grams (analyzer='char_wb') catch these tricks that word-level models completely miss.

Start simple, scale up only when needed. Naive Bayes + TF-IDF achieves 97%+ on standard datasets in milliseconds. Only move to LSTM or BERT if you have >50k examples, complex phishing text, or multi-language requirements — and can afford the inference latency.
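The first rule is easy to verify with the class counts printed in Step 1 (4,825 ham, 747 spam). The sketch below scores a "classifier" that labels every message as ham; the variable names are made up for this illustration.

```python
# Class counts as printed by value_counts() in Step 1.
ham_count, spam_count = 4825, 747

y_true = [0] * ham_count + [1] * spam_count   # 0 = ham, 1 = spam
y_pred = [0] * len(y_true)                    # always predict ham

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
spam_caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
spam_recall = spam_caught / spam_count

print(f"accuracy    = {accuracy:.3f}")     # accuracy    = 0.866
print(f"spam recall = {spam_recall:.3f}")  # spam recall = 0.000
```

An accuracy near 87% looks respectable until the recall shows that not a single spam message was caught.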