OpenCV Python Tutorial

Section 01

The Story That Explains OpenCV

📖 Real World Analogy

Teaching a Machine to See

Imagine you are holding a photo of a cat. Your brain fires in milliseconds — you recognise fur, ears, eyes, whiskers. Now imagine you want a computer to do the same thing. To a computer, that photo is nothing more than a grid of numbers — millions of tiny values between 0 and 255 representing pixel brightness.

OpenCV (Open Source Computer Vision Library) is the Swiss Army knife that lets you manipulate, analyse, and extract meaning from those grids of numbers. First built at Intel in 1999, it now powers everything from NASA rover cameras to your phone's portrait mode — and it is 100% free.

At its core, every image in OpenCV's Python API is a NumPy array. A colour image has shape (Height, Width, Channels); a grayscale image has shape (Height, Width). The default data type is uint8 (values 0–255). Every algorithm is simply a function that takes that array in and returns a transformed array out.

📈 Fig 1 — OpenCV Pipeline: From Pixels to Insight

OpenCV reads images as BGR arrays by default — not RGB. All processing happens on NumPy arrays in memory.

📷

BGR — Not RGB

OpenCV stores colour channels in Blue-Green-Red order, the reverse of what Matplotlib, PIL, and most other libraries expect. Convert with cv2.cvtColor(img, cv2.COLOR_BGR2RGB) before passing an image to any of them. Forgetting this is one of the most common beginner mistakes.
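
To make the channel order concrete, here is a dependency-free sketch that uses nested Python lists in place of a NumPy array. The helper name bgr_to_rgb is made up for this illustration and is not part of OpenCV; in real code cv2.cvtColor(img, cv2.COLOR_BGR2RGB) does the same job on the whole array.

```python
# Illustrative only: a 1x2 "image" stored as nested lists of (B, G, R) tuples.
# bgr_to_rgb is a hypothetical helper, not an OpenCV function.
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel: (B, G, R) -> (R, G, B)."""
    return [[pixel[::-1] for pixel in row] for row in image]

bgr_image = [[(0, 0, 255), (255, 0, 0)]]   # pure red, then pure blue, in BGR order
rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)   # [[(255, 0, 0), (0, 0, 255)]]
```

Applying the swap twice returns the original pixels, which is why an image that was converted once too often looks just as wrong as one that was never converted.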

🔌 Image Flow — What Happens Step by Step

Input

Camera / File / Stream → raw bytes on disk or in memory

Load

cv2.imread() decodes → NumPy array shape (H, W, 3) dtype uint8

Process

Any OpenCV algorithm — blur, detect, transform, annotate — operates on the array

Output

imshow() display · imwrite() save · pass to another function for further analysis

Installation

# Standard install — everything needed for this tutorial
pip install opencv-python numpy

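# Alternative: full build with the extra contrib modules (e.g. ximgproc, xfeatures2d)
# Install this INSTEAD of opencv-python, not alongside it: both packages provide cv2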
pip install opencv-contrib-python

Section 02

Reading, Writing & Displaying Images

The handful of functions you will type more than any other. Nearly every OpenCV program begins and ends with these.

Function	Purpose	Key Flag / Parameter
`cv2.imread()`	Load image from disk	IMREAD_COLOR · IMREAD_GRAYSCALE · IMREAD_UNCHANGED
`cv2.imshow()`	Display in a pop-up window	window name string, image array
`cv2.imwrite()`	Save image to disk	Extension sets format automatically (.jpg/.png/.bmp)
`cv2.waitKey()`	Pause execution for keyboard input	0 = wait forever · n = wait n ms
`cv2.destroyAllWindows()`	Close all display windows	—

import cv2
import numpy as np

# ── Load an image from disk ───────────────────────────────────
img  = cv2.imread("photo.jpg")                        # BGR, shape (H, W, 3)
gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # shape (H, W)
rgba = cv2.imread("logo.png",  cv2.IMREAD_UNCHANGED)  # keeps alpha channel

# ── Guard against failed loads ────────────────────────────────
if img is None:
    raise FileNotFoundError("Image not found — check the path")

# ── Inspect the array ─────────────────────────────────────────
print(f"Shape : {img.shape}")   # (480, 640, 3)
print(f"Dtype : {img.dtype}")   # uint8
print(f"Pixels: {img.size}")    # H × W × C total values

# ── Display ───────────────────────────────────────────────────
cv2.imshow("Original", img)
cv2.imshow("Grayscale", gray)
cv2.waitKey(0)
cv2.destroyAllWindows()

# ── Save with quality control ─────────────────────────────────
cv2.imwrite("output.jpg", img, [cv2.IMWRITE_JPEG_QUALITY, 95])
cv2.imwrite("output.png", img)   # PNG is lossless

OUTPUT

Shape : (480, 640, 3)
Dtype : uint8
Pixels: 921600

⚠️

imread() Returns None Silently

If the file path is wrong, cv2.imread() does not raise an exception — it returns None. Every subsequent operation will crash with a cryptic error. Always add a None check immediately after loading.

Section 03

Color Spaces & Conversions

📖 Story

The Traffic Light Problem

A self-driving car needs to detect a red traffic light. In BGR, red is (0, 0, 255) — but under noon sun it reads as (20, 30, 210), and at dusk as (10, 15, 140). Matching three channels across lighting conditions is a nightmare.

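Switch to HSV. OpenCV stores 8-bit Hue as degrees divided by two, so the scale runs 0–179, and red sits at both ends of it: roughly 0–10 and 170–179. Those bands stay far more stable under changing sunlight than any BGR triplet. You now write a single two-range mask that holds up in rain, noon glare, and tunnel shadow alike. Choosing the right color space is often more powerful than any algorithm.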

🌈 Fig 2 — Common Color Spaces in OpenCV

■ BGR — default load/save ■ HSV — colour masking ■ LAB — perceptual ■ GRAY — single channel

HSV isolates colour (Hue) from brightness — making masks robust to lighting changes. Use GRAY when you only need intensity.

🏃

BGR / RGB

3 channels · 0–255 each

Default in OpenCV (BGR) and most tools (RGB). Good for display and saving. Poor for colour-based filtering because brightness affects all three channels simultaneously.

🌈

HSV

Hue · Saturation · Value

Best for colour masking and object detection. Hue encodes pure colour largely independently of brightness. Define a range on Hue, plus loose lower bounds on Saturation and Value to reject grey and near-black pixels, and your mask holds up across most lighting conditions.

👁

Grayscale / LAB

1 channel · Perceptual

Grayscale for edge detection, thresholding, and speed (⅓ the data). LAB for perceptually uniform colour comparisons — great for skin tone detection and colour-consistency checks.

import cv2
import numpy as np

img = cv2.imread("scene.jpg")

# ── Common conversions ────────────────────────────────────────
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
hsv  = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
rgb  = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
lab  = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)

# ── Practical: isolate RED objects using HSV ──────────────────
# Red wraps around 0 in Hue, so it needs two separate ranges.
# OpenCV's 8-bit Hue runs 0–179 (degrees / 2); S and V run 0–255.
lower_red1 = np.array([  0, 120, 70])
upper_red1 = np.array([ 10, 255, 255])
lower_red2 = np.array([170, 120, 70])
upper_red2 = np.array([179, 255, 255])

mask1  = cv2.inRange(hsv, lower_red1, upper_red1)
mask2  = cv2.inRange(hsv, lower_red2, upper_red2)
mask   = cv2.bitwise_or(mask1, mask2)
result = cv2.bitwise_and(img, img, mask=mask)

cv2.imshow("Red Objects Only", result)
cv2.waitKey(0)
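
As a rough check on the traffic-light story, the sketch below converts its three BGR readings to OpenCV-style 8-bit HSV using only the standard library. It assumes OpenCV's convention of H = degrees / 2 and S, V scaled to 0–255; bgr_to_opencv_hsv and looks_red are hypothetical helpers, and OpenCV's own rounding may differ by a unit or so.

```python
import colorsys

def bgr_to_opencv_hsv(b, g, r):
    """Approximate OpenCV's 8-bit HSV: H in 0-179, S and V in 0-255."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 180) % 180, round(s * 255), round(v * 255)

def looks_red(b, g, r):
    """Apply the same two hue bands and S/V floors as the mask above."""
    h, s, v = bgr_to_opencv_hsv(b, g, r)
    return (h <= 10 or h >= 170) and s >= 120 and v >= 70

# BGR readings taken from the traffic-light example
samples = {"pure red": (0, 0, 255), "noon": (20, 30, 210), "dusk": (10, 15, 140)}
for name, bgr in samples.items():
    print(name, bgr_to_opencv_hsv(*bgr), looks_red(*bgr))
```

All three readings land in the low hue band even though their BGR values differ widely, which is the property the two-range mask relies on.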

Section 04

Image Filtering & Blurring

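Before detecting edges or finding objects, you almost always need to reduce noise. A linear filter replaces each pixel with a weighted sum of its neighbours, with the weights taken from a small matrix called a kernel. This operation is called convolution. Median and bilateral filters follow the same sliding-window idea but combine the neighbours non-linearly.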

🧰 Fig 3 — How Kernel Convolution Works

The kernel slides one pixel at a time. Different kernel values produce blur, sharpen, edge-detect, or emboss effects.
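
The sliding-kernel idea can be sketched in a few lines of plain Python. This is an illustration, not how OpenCV implements it: apply_kernel is a hypothetical helper, it works on nested lists, and it only computes the "valid" interior, so the output is smaller than the input (OpenCV pads the border to keep the size).

```python
def apply_kernel(image, kernel, divisor=1):
    """Slide `kernel` over `image` (interior only) and clamp results to 0-255."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            # Weighted sum of the neighbourhood under the kernel
            acc = sum(image[y + j][x + i] * kernel[j][i]
                      for j in range(kh) for i in range(kw))
            row.append(max(0, min(255, round(acc / divisor))))
        out.append(row)
    return out

# 5x5 test image: one bright pixel in the middle of a black field
image = [[0] * 5 for _ in range(5)]
image[2][2] = 90

box     = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
sharpen = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]

blurred   = apply_kernel(image, box, divisor=9)   # brightness spread over 9 pixels
sharpened = apply_kernel(image, sharpen)          # centre boosted, negatives clamped to 0
print(blurred)     # [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
print(sharpened)   # [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
```

The same loop produces blur or sharpening depending only on the numbers in the kernel.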

Filter	Function	Removes	Best For
`cv2.blur()`	Box / Average	General noise	Quick pre-processing
`cv2.GaussianBlur()`	Gaussian weighted	Gaussian noise	Before Canny edge detection
`cv2.medianBlur()`	Median of neighbourhood	Salt-and-pepper noise	Scanned documents, old photos
`cv2.bilateralFilter()`	Edge-preserving smooth	Noise, keeps edges	Portrait smoothing, medical
`cv2.filter2D()`	Custom kernel	Any (you define it)	Sharpen, emboss, custom effects

import cv2
import numpy as np

img = cv2.imread("noisy.jpg")

# ── Standard blur methods ─────────────────────────────────────
box       = cv2.blur(img, (5, 5))
gaussian  = cv2.GaussianBlur(img, (5, 5), sigmaX=0)
median    = cv2.medianBlur(img, 5)
bilateral = cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75)

# ── Custom sharpening kernel ──────────────────────────────────
sharpen_k = np.array([[ 0, -1,  0],
                        [-1,  5, -1],
                        [ 0, -1,  0]])
sharp = cv2.filter2D(img, ddepth=-1, kernel=sharpen_k)

compare = np.hstack([img, gaussian, bilateral, sharp])
cv2.imshow("Original | Gaussian | Bilateral | Sharp", compare)
cv2.waitKey(0)

🔑

Kernel Size Must Always Be Odd

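cv2.GaussianBlur() and cv2.medianBlur() require odd kernel sizes (3, 5, 7, …) so there is a defined centre pixel; passing an even number raises a cv2.error. cv2.blur() accepts even sizes, but the kernel then has no true centre and the result is slightly off-centre, so stick to odd sizes there too. Start with (5, 5) and increase for more smoothing.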

Section 05

Edge Detection — Canny, Sobel & Laplacian

📖 Story

The Architect's Blueprint

An architect's blueprint is 95% blank space and 5% lines. Those lines carry all the information. Edge detection is how computers "read" blueprints, stripping colour and texture to reveal only the structure. John Canny published his algorithm in 1986, building on his master's thesis work at MIT. Nearly 40 years later it remains a default choice for single-scale edge detection in medical imaging, autonomous vehicles, and industrial inspection.

📈 Fig 4 — Canny Edge Detection: 5-Step Pipeline

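Canny typically produces cleaner, thinner edges than simpler gradient detectors. Its five steps are Gaussian smoothing, gradient computation, non-maximum suppression, double thresholding, and hysteresis edge tracking, which together eliminate false edges from noise before they can propagate.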

Detector	Strengths	Weaknesses	Use When
Canny	Clean thin edges, best overall	Two thresholds to tune	General purpose — almost always first choice
Sobel	Directional (X or Y separately)	Thicker edges, more noise	You need edge direction information
Laplacian	Isotropic, one pass	Very noise sensitive	Blur/focus detection (is image sharp?)

import cv2
import numpy as np

img  = cv2.imread("building.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)

# ── Canny with Otsu auto-threshold ───────────────────────────
otsu_thresh, _ = cv2.threshold(blur, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)
edges = cv2.Canny(blur,
                   threshold1=otsu_thresh * 0.5,
                   threshold2=otsu_thresh)

# ── Sobel — horizontal and vertical separately ───────────────
sobelX = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
sobelY = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
sobel  = cv2.convertScaleAbs(cv2.magnitude(sobelX, sobelY))

# ── Laplacian — all directions at once ───────────────────────
laplacian = cv2.convertScaleAbs(cv2.Laplacian(gray, cv2.CV_64F, ksize=3))

compare = np.hstack([edges, sobel, laplacian])
cv2.imshow("Canny | Sobel | Laplacian", compare)
cv2.waitKey(0)
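
The Sobel calls above request cv2.CV_64F output rather than uint8. A hand-computed response at a single pixel suggests why. The sketch below applies the standard 3×3 Sobel kernels to two tiny patches; response and saturate_u8 are hypothetical helpers written for this illustration.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def response(patch, kernel):
    """Kernel response at the centre of a 3x3 patch."""
    return sum(patch[j][i] * kernel[j][i] for j in range(3) for i in range(3))

def saturate_u8(value):
    """What a uint8 output would keep of the response."""
    return max(0, min(255, value))

dark_to_bright = [[0, 0, 255]] * 3    # vertical edge, brightness rising left to right
bright_to_dark = [[255, 0, 0]] * 3    # same edge, mirrored

gx_rising  = response(dark_to_bright, SOBEL_X)
gx_falling = response(bright_to_dark, SOBEL_X)
gy         = response(dark_to_bright, SOBEL_Y)

print(gx_rising, gx_falling, gy)                          # 1020 -1020 0
print(saturate_u8(gx_rising), saturate_u8(gx_falling))    # 255 0
```

Both edges are equally strong, but a uint8 result would keep only the rising one. Computing in a signed float type and taking the magnitude afterwards, as the code above does with cv2.magnitude and cv2.convertScaleAbs, keeps both.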

Section 06

Thresholding & Binary Segmentation

Thresholding converts a grayscale image into a clean black-and-white binary image. Each pixel is compared to a threshold value: above → white, below → black. It is the simplest and fastest form of segmentation.

▮ Fig 5 — Global vs Adaptive Thresholding

■ Pixel intensity ⎯ Global threshold (fixed) ~ Adaptive threshold (local mean)

For documents with shadows or uneven scanner lighting, always use Adaptive Gaussian. Global binary is only reliable under perfectly uniform illumination.

Type	Flag	Threshold Chosen By	Best For
Binary	`THRESH_BINARY`	You pick manually	Uniform lighting, clear background
Otsu's Auto	`THRESH_OTSU`	Auto — bimodal histogram	Well-lit scenes, documents
Adaptive Mean	`ADAPTIVE_THRESH_MEAN_C`	Mean of neighbourhood	Uneven lighting
Adaptive Gaussian	`ADAPTIVE_THRESH_GAUSSIAN_C`	Weighted mean — smoother	Shadows, gradients in scene
Inverse Binary	`THRESH_BINARY_INV`	You pick (inverted)	Dark objects on bright background

import cv2

img  = cv2.imread("document.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# ── 1. Manual binary threshold ────────────────────────────────
_, thresh_manual = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# ── 2. Otsu's automatic threshold ────────────────────────────
_, thresh_otsu = cv2.threshold(gray, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# ── 3. Adaptive Gaussian — best for uneven lighting ──────────
thresh_adapt = cv2.adaptiveThreshold(
    gray, 255,
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    cv2.THRESH_BINARY,
    blockSize=11,    # neighbourhood size — must be odd, > 1
    C=2              # constant subtracted from the mean
)

cv2.imshow("Adaptive Threshold", thresh_adapt)
cv2.waitKey(0)
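
To show what "auto" means for Otsu's method, here is a minimal pure-Python sketch that picks the threshold maximising the between-class variance of the histogram. otsu_threshold is a hypothetical helper operating on a flat list of 0–255 values; the value cv2.threshold returns for the same data may sit elsewhere in the gap between the two clusters.

```python
def otsu_threshold(pixels):
    """Return the threshold t that maximises between-class variance.

    Pixels <= t form the background class, pixels > t the foreground class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0 or w_fg == 0:
            continue                      # one class is empty, nothing to compare
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Synthetic bimodal "image": dark ink around 40, bright paper around 200
pixels = [35, 40, 45] * 20 + [195, 200, 205] * 60
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
print(t, binary.count(0), binary.count(255))   # 45 60 180
```

The method only works this cleanly when the histogram has two well-separated peaks, which is why the table lists it for well-lit scenes and documents.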

Section 07

Contour Detection & Shape Analysis

📖 Story

The Factory Quality Robot

A pharmaceutical factory needs to count tablets in a blister pack and reject any pack with a missing or broken tablet — 10,000 packs per hour. The OpenCV solution: threshold the image to isolate white tablets on dark foil, call findContours() to outline each blob, measure the area of each contour, and flag anything outside the expected size range. Zero human involvement. Zero blinks. The robot paid for itself in three weeks.

🔎 Fig 6 — Shape Classification via Contour Vertices

approxPolyDP reduces a contour to its key vertices — triangle=3, rectangle=4, pentagon=5, circle=many. Use area to filter noise blobs.

import cv2
import numpy as np

img  = cv2.imread("shapes.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, bw = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)

contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Found {len(contours)} objects")

output = img.copy()
for cnt in contours:
    area      = cv2.contourArea(cnt)
    perimeter = cv2.arcLength(cnt, closed=True)
    if area < 500: continue

    approx   = cv2.approxPolyDP(cnt, 0.04 * perimeter, True)
    vertices = len(approx)
    shape    = {3: "Triangle", 4: "Rectangle",
                5: "Pentagon"}.get(vertices, "Circle")

    M  = cv2.moments(cnt)
    cx = int(M["m10"] / (M["m00"] + 1e-5))
    cy = int(M["m01"] / (M["m00"] + 1e-5))

    cv2.drawContours(output, [cnt], -1, (0, 255, 0), 2)
    cv2.putText(output, shape, (cx-30, cy),
               cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)

cv2.imshow("Shapes", output)
cv2.waitKey(0)
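
Area and vertex count are the two measurements the loop above relies on. As an illustration of the arithmetic, the sketch below computes polygon area with the shoelace formula, which is the polygon form of the Green's formula that the OpenCV documentation cites for contourArea. polygon_area and classify are hypothetical helpers working on plain (x, y) tuples.

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon from its (x, y) vertices."""
    twice_area = 0
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]    # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2

def classify(vertices):
    """Same vertex-count lookup as the loop above."""
    return {3: "Triangle", 4: "Rectangle", 5: "Pentagon"}.get(len(vertices), "Circle")

rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
triangle  = [(0, 0), (4, 0), (0, 3)]
print(classify(rectangle), polygon_area(rectangle))   # Rectangle 12.0
print(classify(triangle),  polygon_area(triangle))    # Triangle 6.0
```

Note that the area is measured between vertex coordinates, so a filled block of 5×4 pixels has corner vertices 4 and 3 apart and reports an area of 12, not 20.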

Section 08

Drawing & Annotations

OpenCV's drawing functions modify the image array in-place. Always call img.copy() first if you want to keep the original.

Function	Shape	Key Arguments
`cv2.line()`	Straight line	img, pt1, pt2, color, thickness
`cv2.rectangle()`	Axis-aligned box	img, pt1, pt2, color, thickness (-1=filled)
`cv2.circle()`	Circle	img, center, radius, color, thickness
`cv2.ellipse()`	Ellipse / arc	img, center, axes, angle, startAngle, endAngle, color
`cv2.polylines()`	Polygon outline	img, [pts], isClosed, color, thickness
`cv2.putText()`	Text label	img, text, org, fontFace, fontScale, color, thickness
`cv2.arrowedLine()`	Arrow with head	img, pt1, pt2, color, thickness, tipLength

import cv2
import numpy as np

canvas = np.zeros((480, 640, 3), dtype=np.uint8)

RED    = (0,   0,   255)
GREEN  = (0,   255, 0)
BLUE   = (255, 0,   0)
YELLOW = (0,   255, 255)
WHITE  = (255, 255, 255)

cv2.line      (canvas, (50, 50),   (300, 50),  RED,   3)
cv2.rectangle (canvas, (50, 80),   (200, 180), GREEN, 2)
cv2.circle    (canvas, (350, 130), 60,          BLUE,  -1)
cv2.ellipse   (canvas, (500, 130), (70, 40), 30, 0, 360, YELLOW, 2)
pts = np.array([[100,300],[200,250],[300,350],[150,400]], np.int32)
cv2.polylines (canvas, [pts], isClosed=True, color=YELLOW, thickness=2)
cv2.putText   (canvas, "OpenCV Drawing API", (50, 450),
              cv2.FONT_HERSHEY_DUPLEX, 1.0, WHITE, 2)

cv2.imshow("Canvas", canvas)
cv2.waitKey(0)

Section 09

Geometric Transformations — Resize, Rotate, Warp

🔃 Fig 7 — Affine vs Perspective Transform

■ Original ■ Affine (3-point, parallel lines preserved) ■ Perspective (4-point, only straight lines preserved)

Affine needs 3 point pairs (getAffineTransform). Perspective needs 4 point pairs (getPerspectiveTransform) — perfect for document scanner flatten.

Transform	Function	What It Preserves	Real Use
Resize	`cv2.resize()`	Aspect ratio (if fx = fy)	Normalising for ML model input
Rotation	`getRotationMatrix2D` + `warpAffine`	Parallel lines	Deskew scanned documents
Affine Warp	`getAffineTransform` + `warpAffine`	Parallel lines (3-point)	Correct camera tilt
Perspective Warp	`getPerspectiveTransform` + `warpPerspective`	Straight lines only	Document scanner, bird's-eye road
Flip	`cv2.flip()`	Shape and values	Data augmentation
Crop (ROI)	`img[y1:y2, x1:x2]`	Pixel values exactly	Region of interest extraction

import cv2
import numpy as np

img  = cv2.imread("photo.jpg")
h, w = img.shape[:2]

# ── 1. Resize to half ─────────────────────────────────────────
half = cv2.resize(img, (0, 0), fx=0.5, fy=0.5,
                   interpolation=cv2.INTER_AREA)

# ── 2. Rotate 45° around image centre ────────────────────────
M_rot  = cv2.getRotationMatrix2D((w//2, h//2), angle=45, scale=1.0)
rotated = cv2.warpAffine(img, M_rot, (w, h))

# ── 3. Perspective warp — flatten a document ─────────────────
src = np.float32([[73,239],[356,117],[475,265],[187,391]])
dst = np.float32([[0,0],[300,0],[300,400],[0,400]])
M_persp = cv2.getPerspectiveTransform(src, dst)
warped  = cv2.warpPerspective(img, M_persp, (300, 400))

# ── 4. Crop an ROI ────────────────────────────────────────────
roi = img[100:300, 150:400]    # [y1:y2, x1:x2]

cv2.imshow("Rotated", rotated)
cv2.imshow("Warped",  warped)
cv2.waitKey(0)
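
The 2×3 matrix behind the rotation step can be written out by hand. The sketch below follows the formula given in the OpenCV documentation for getRotationMatrix2D (alpha = scale·cos θ, beta = scale·sin θ); rotation_matrix and transform are hypothetical helpers using plain lists rather than NumPy arrays.

```python
import math

def rotation_matrix(center, angle_deg, scale=1.0):
    """Build the 2x3 affine matrix for a rotation about `center`."""
    cx, cy = center
    a = scale * math.cos(math.radians(angle_deg))
    b = scale * math.sin(math.radians(angle_deg))
    return [[a,  b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]

def transform(M, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])

M = rotation_matrix((320, 240), 90)
print(transform(M, (320, 240)))   # the centre stays put (up to float rounding)
print(transform(M, (330, 240)))   # 10 px right of centre ends up about 10 px above it
```

A positive angle rotates counter-clockwise on screen. Because the image y axis points down, the point to the right of the centre moves to a smaller y value.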

Section 10

Histograms & Contrast Equalization

A histogram counts how many pixels exist at each intensity level (0–255). A dark image clusters near 0. Equalization stretches the distribution across the full range — dramatically improving visibility in medical scans and surveillance footage.

📊 Fig 8 — Dark Image vs After CLAHE Equalization

■ Dark image histogram (bunched near 0) ■ After CLAHE (spread across full range)

CLAHE divides the image into tiles and equalises each tile separately (tileGridSize), capping amplification at clipLimit to avoid over-boosting noise.

import cv2
import numpy as np

img  = cv2.imread("dark_xray.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

hist = cv2.calcHist([gray], channels=[0], mask=None,
                     histSize=[256], ranges=[0, 256])

# Global equalization (avoid — amplifies noise in uniform areas)
eq_global = cv2.equalizeHist(gray)

# CLAHE — always prefer this for real images
clahe    = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
eq_clahe = clahe.apply(gray)

cv2.imshow("Original", gray)
cv2.imshow("CLAHE", eq_clahe)
cv2.waitKey(0)
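
Global equalization boils down to a lookup table built from the cumulative histogram. The sketch below uses the common textbook form of that mapping on a flat list of pixel values; equalize is a hypothetical helper that assumes a non-empty input, and cv2.equalizeHist may round slightly differently.

```python
def equalize(pixels):
    """Stretch intensities so the cumulative histogram becomes roughly linear."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution: cdf[v] = number of pixels with value <= v
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return list(pixels)               # flat image: nothing to stretch

    lut = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [lut[p] for p in pixels]

dark = [10, 20, 30, 40]                   # every value bunched near black
print(equalize(dark))                     # [0, 85, 170, 255]
```

CLAHE applies the same idea per tile and clips the histogram first, which is what keeps noise in flat regions from being stretched along with the signal.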

Section 11

Face Detection with Haar Cascades

📖 Story

The Algorithm in Every Camera Since 2008

Paul Viola and Michael Jones published their face detector in 2001. By the late 2000s, detectors of this kind were common in consumer digital cameras: the little green square that appears when a face comes into frame. It works by rapidly testing simple brightness differences (Haar features) at many positions and scales, rejecting most non-faces early through a cascade of increasingly strict classifiers. The pre-trained XML files ship with the OpenCV Python packages. No training required.

👀 Fig 9 — Haar Features & Face Detection Cascade

■ Edge feature ■ Line feature ■ Centre-surround feature ▪ Dark region ▪ Light region

The cascade rejects non-faces early (cheap stages) and only runs expensive stages on candidate regions — enabling real-time detection at 30+ fps.

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade  = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

img  = cv2.imread("group_photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

faces = face_cascade.detectMultiScale(
    gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))
print(f"Detected {len(faces)} face(s)")

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
    roi_gray  = gray[y:y+h, x:x+w]
    roi_color = img [y:y+h, x:x+w]
    eyes = eye_cascade.detectMultiScale(roi_gray, 1.1, 3)
    for (ex, ey, ew, eh) in eyes:
        cv2.circle(roi_color, (ex+ew//2, ey+eh//2), ew//2, (255,0,0), 2)

cv2.imshow("Face Detection", img)
cv2.waitKey(0)

OUTPUT

Detected 4 face(s)
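
A Haar feature is just a difference between the pixel sums of adjacent rectangles. The Viola-Jones paper makes those sums cheap with an integral image, a detail not covered above; the sketch below is a plain-Python illustration of that idea, and integral_image, rect_sum, and edge_feature are hypothetical helpers, not OpenCV calls.

```python
def integral_image(img):
    """ii[y][x] = sum of all pixels above and to the left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of any rectangle from four lookups, regardless of its size."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def edge_feature(ii, x, y, w, h):
    """Two-rectangle Haar edge feature: right half minus left half."""
    half = w // 2
    return rect_sum(ii, x + half, y, half, h) - rect_sum(ii, x, y, half, h)

patch = [[0, 0, 100, 100]] * 4            # dark left half, bright right half
ii = integral_image(patch)
print(edge_feature(ii, 0, 0, 4, 4))       # 800
```

A uniform patch scores 0 on the same feature, so a large response signals a dark-to-bright boundary at that position and scale.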

Section 12

Real-Time Video Processing

Every video is a sequence of images (frames) delivered at a fixed frame rate. OpenCV's VideoCapture reads frames one by one — from a webcam, file, or IP stream — so you can apply any image algorithm to each frame.

🎥 Fig 10 — Real-Time Video Processing Loop

Use waitKey(1) (1 ms) — not waitKey(0) — inside a video loop. waitKey(0) blocks indefinitely and freezes the feed.

import cv2

cap = cv2.VideoCapture(0)     # 0 = default webcam

if not cap.isOpened():
    raise IOError("Cannot access camera")

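cap.set(cv2.CAP_PROP_FRAME_WIDTH,  1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

# The camera may ignore the request, so read back the size it actually delivers.
# A VideoWriter given the wrong frame size typically produces an empty or unplayable file.
width  = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

fourcc = cv2.VideoWriter_fourcc(*"mp4v")
writer = cv2.VideoWriter("output.mp4", fourcc, 30, (width, height))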

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

while True:
    ret, frame = cap.read()
    if not ret: break

    gray  = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x,y), (x+w,y+h), (0,255,0), 2)

    writer.write(frame)
    cv2.imshow("Live — press q to quit", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"): break

cap.release()
writer.release()
cv2.destroyAllWindows()

Section 13

Morphological Operations

📖 Story

Separating Touching Cells

A biologist is counting cells in a microscope image. Problem: several cells are touching each other, so findContours counts them as one blob. Apply Erosion to shrink every white region — the cells separate. Apply Dilation to grow them back — they return to their original size but are now cleanly separated. That sequence is called Opening and it is one of the most powerful tricks in binary image processing.

🧪 Fig 11 — Effect of Each Morphological Operation

■ White region (foreground) ■ Black region (background) ■ Changed pixels

Structuring element shape (RECT, ELLIPSE, CROSS) affects how corners and curves are handled. ELLIPSE is usually best for natural objects.

Operation	Effect on White Regions	Practical Use
Erosion	Shrinks blobs, removes thin protrusions	Separate touching objects
Dilation	Grows blobs, fills small holes	Connect broken contours
Opening (E→D)	Removes small bright blobs	Noise removal without shrinking
Closing (D→E)	Fills small dark holes inside bright regions	Close gaps in contour lines
Morphological Gradient	Outline of objects (D − E)	Edge detection alternative
Top Hat	Bright structures smaller than kernel	Uneven background correction

import cv2
import numpy as np

img  = cv2.imread("cells.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, bw = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

kernel_ellp = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
kernel_rect = cv2.getStructuringElement(cv2.MORPH_RECT,    (5, 5))

eroded   = cv2.erode    (bw, kernel_ellp, iterations=1)
dilated  = cv2.dilate   (bw, kernel_ellp, iterations=1)
opened   = cv2.morphologyEx(bw, cv2.MORPH_OPEN,     kernel_ellp)
closed   = cv2.morphologyEx(bw, cv2.MORPH_CLOSE,    kernel_ellp)
gradient = cv2.morphologyEx(bw, cv2.MORPH_GRADIENT, kernel_rect)
tophat   = cv2.morphologyEx(bw, cv2.MORPH_TOPHAT,   kernel_rect)

compare = np.hstack([bw, eroded, dilated, opened, closed])
cv2.imshow("Original | Erode | Dilate | Open | Close", compare)
cv2.waitKey(0)
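
The cell-separation story can be reproduced on a tiny grid. The sketch below is a dependency-free illustration with a fixed 3×3 square structuring element; erode, dilate, and count_blobs are hypothetical helpers written for this example, and neighbours that fall outside the image are simply ignored.

```python
def _window(img, y, x):
    """In-bounds 3x3 neighbourhood of (y, x), flattened."""
    h, w = len(img), len(img[0])
    return [img[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]

def erode(img):
    """A pixel stays white only if its whole neighbourhood is white."""
    return [[int(all(_window(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]

def dilate(img):
    """A pixel becomes white if any neighbour is white."""
    return [[int(any(_window(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]

def count_blobs(img):
    """Count 4-connected white regions with a simple flood fill."""
    h, w = len(img), len(img[0])
    seen, blobs = set(), 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and (y, x) not in seen:
                blobs += 1
                stack = [(y, x)]
                seen.add((y, x))
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return blobs

# Two 3x3 "cells" joined by a one-pixel bridge at row 2, column 4
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
opened = dilate(erode(image))             # opening = erosion followed by dilation
print(count_blobs(image), count_blobs(opened))   # 1 2
```

Erosion removes the bridge along with the outer ring of each cell, and dilation restores the ring but not the bridge, so the two cells come back at full size as separate blobs.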

Section 14

Template Matching

Template matching slides a small reference image (template) over a larger scene, computing a similarity score at every position. The peak score marks where the template is found.

import cv2
import numpy as np

img      = cv2.imread("scene.jpg")
template = cv2.imread("logo.png")
th, tw   = template.shape[:2]

result    = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
locations = np.where(result >= 0.80)

output = img.copy()
for pt in zip(*locations[::-1]):
    cv2.rectangle(output, pt, (pt[0]+tw, pt[1]+th), (0,0,255), 2)

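# Neighbouring positions around each true match also pass the threshold,
# so this counts candidate positions, not distinct objects.
print(f"Found {len(locations[0])} candidate position(s)")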
cv2.imshow("Template Match", output)
cv2.waitKey(0)

ℹ️

TM_CCOEFF_NORMED — Almost Always the Right Choice

Normalised cross-correlation gives scores between −1 and 1 regardless of image brightness. Template matching only works when the object appears at the same scale and orientation as the template. For scale- or rotation-invariant matching, use ORB feature matching instead, paired with a brute-force Hamming matcher or with FLANN configured for binary descriptors (LSH index).
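
The brightness invariance comes from subtracting the mean of both the template and the window before correlating them. The sketch below writes the normalised correlation coefficient out in plain Python on a toy scene; ccoeff_normed and match_template are hypothetical helpers, and returning 0 for a perfectly flat window is a choice made here to avoid dividing by zero.

```python
import math

def ccoeff_normed(window, template):
    """Normalised correlation coefficient between two equally sized patches."""
    w = [v for row in window for v in row]
    t = [v for row in template for v in row]
    w_mean, t_mean = sum(w) / len(w), sum(t) / len(t)
    wd = [v - w_mean for v in w]          # mean-subtracted window
    td = [v - t_mean for v in t]          # mean-subtracted template
    denom = math.sqrt(sum(v * v for v in wd) * sum(v * v for v in td))
    if denom == 0:
        return 0.0                        # flat patch: score undefined, treat as no match
    return sum(a * b for a, b in zip(wd, td)) / denom

def match_template(scene, template):
    """Return ((x, y), score) of the best-scoring window position."""
    th, tw = len(template), len(template[0])
    best_score, best_xy = -2.0, None
    for y in range(len(scene) - th + 1):
        for x in range(len(scene[0]) - tw + 1):
            window = [row[x:x + tw] for row in scene[y:y + th]]
            score = ccoeff_normed(window, template)
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy, best_score

template = [[10, 200],
            [200, 10]]
# The same pattern sits at x=1, y=1 in the scene, but 50 levels brighter
scene = [
    [50,  50,  50, 50, 50],
    [50,  60, 250, 50, 50],
    [50, 250,  60, 50, 50],
    [50,  50,  50, 50, 50],
]
location, score = match_template(scene, template)
print(location, round(score, 3))          # (1, 1) 1.0
```

The match scores a full 1.0 despite the brightness offset. Rotating or resizing the pattern in the scene would break the score, which is the limitation described above.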

Section 15

Golden Rules

📷 OpenCV — Non-Negotiable Rules

Always check for None after imread(). OpenCV silently returns None on a bad path — every subsequent call crashes with a cryptic AttributeError or shape error. One if img is None: raise saves hours of debugging.

Images are BGR, not RGB. Convert with cv2.cvtColor(img, cv2.COLOR_BGR2RGB) before passing to Matplotlib, PIL, TensorFlow, or any other library. The sky turns orange if you forget.

Always use img.copy() before drawing or modifying if you want to preserve the original. Drawing functions modify arrays in-place with no undo.

Blur before you detect edges or threshold. Canny, Sobel, and Otsu's thresholding all behave noticeably better with a light Gaussian blur first. GaussianBlur(gray, (5,5), 0) is almost never wasted.

Use HSV for colour-based masking, not BGR. In HSV, Hue alone defines the colour — your range works across indoor, outdoor, daylight, and shadow. In BGR you must fight all three channels simultaneously.

Always call cap.release() and cv2.destroyAllWindows() when your video loop exits. Failing to release the camera can leave it locked, so your next run may not be able to open it until the first Python process has fully exited.

Use CLAHE, not equalizeHist() for contrast enhancement. Global equalization amplifies noise. CLAHE limits amplification per local tile and produces visibly better results on medical images, dark video, and satellite imagery.