Files
DS-for-LA/notebooks/05_svd_image_denoising.ipynb
Pawel Sarkowicz 9093ea2c13 to notebooks
2026-03-30 18:45:32 -04:00

474 lines
14 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
{
"cells": [
{
"cell_type": "markdown",
"id": "6ae6c7f8",
"metadata": {},
"source": [
"# Spectral Image Denoising via Truncated SVD\n",
"\n",
"This notebook extracts the image denoising project into a standalone workflow and extends it from **grayscale images** to **actual color images**.\n",
"\n",
"The core idea is the same as in the original write-up: if an image matrix has singular value decomposition\n",
"$$\n",
"A = U \\Sigma V^T,\n",
"$$\n",
"then the best rank-$k$ approximation to $A$ in Frobenius norm is obtained by truncating the SVD. This is the **EckartYoungMirsky theorem**.\n",
"\n",
"For a grayscale image, the image is a single matrix. For an RGB image, we treat the image as **three matrices**, one for each channel, and apply truncated SVD to each channel separately."
]
},
{
"cell_type": "markdown",
"id": "31a665c9",
"metadata": {},
"source": [
"## Outline\n",
"\n",
"1. Load an image from disk\n",
"2. Convert it to grayscale or keep it in RGB\n",
"3. Add synthetic Gaussian noise\n",
"4. Compute a truncated SVD reconstruction\n",
"5. Compare the original, noisy, and denoised images\n",
"6. Measure quality using MSE and PSNR\n",
"\n",
"This notebook is written so that you can use **your own image files** directly."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88584c56",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from PIL import Image\n",
"from pathlib import Path"
]
},
{
"cell_type": "markdown",
"id": "30e96441",
"metadata": {},
"source": [
"## A note on color images\n",
"\n",
"For a grayscale image, SVD applies directly to a single matrix. For a color image $A \\in \\mathbb{R}^{n \\times p \\times 3}$, we write\n",
"$$\n",
"A = (A_R, A_G, A_B),\n",
"$$\n",
"where each channel is an $n \\times p$ matrix. We then compute a rank-$k$ approximation for each channel:\n",
"$$\n",
"A_R \\approx (A_R)_k,\\qquad\n",
"A_G \\approx (A_G)_k,\\qquad\n",
"A_B \\approx (A_B)_k,\n",
"$$\n",
"and stack them back together.\n",
"\n",
"This is the most direct extension of the grayscale method, and it works well as a first linear-algebraic treatment of color denoising."
]
},
{
"cell_type": "markdown",
"id": "f275cbc9",
"metadata": {},
"source": [
"## Helper functions\n",
"\n",
"We begin with some utilities for:\n",
"- loading images,\n",
"- adding Gaussian noise,\n",
"- reconstructing rank-$k$ approximations,\n",
"- computing image-quality metrics."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21adfcaf",
"metadata": {},
"outputs": [],
"source": [
"def load_image(path, mode=\"rgb\"):\n",
" \"\"\"\n",
" Load an image from disk.\n",
"\n",
" Parameters\n",
" ----------\n",
" path : str or Path\n",
" Path to the image file.\n",
" mode : {\"rgb\", \"gray\"}\n",
" Whether to load the image as RGB or grayscale.\n",
"\n",
" Returns\n",
" -------\n",
" np.ndarray\n",
" Float image array scaled to [0, 255].\n",
" Shape is (H, W, 3) for RGB and (H, W) for grayscale.\n",
" \"\"\"\n",
" path = Path(path)\n",
" img = Image.open(path)\n",
"\n",
" if mode.lower() in {\"gray\", \"grayscale\", \"l\"}:\n",
" img = img.convert(\"L\")\n",
" else:\n",
" img = img.convert(\"RGB\")\n",
"\n",
" return np.asarray(img, dtype=np.float64)\n",
"\n",
"\n",
"def show_image(img, title=None):\n",
" \"\"\"Display a grayscale or RGB image.\"\"\"\n",
" plt.figure(figsize=(6, 6))\n",
" if img.ndim == 2:\n",
" plt.imshow(np.clip(img, 0, 255), cmap=\"gray\", vmin=0, vmax=255)\n",
" else:\n",
" plt.imshow(np.clip(img, 0, 255).astype(np.uint8))\n",
" if title is not None:\n",
" plt.title(title)\n",
" plt.axis(\"off\")\n",
" plt.show()\n",
"\n",
"\n",
"def add_gaussian_noise(img, sigma=25, seed=0):\n",
" \"\"\"\n",
" Add Gaussian noise to an image.\n",
"\n",
" Parameters\n",
" ----------\n",
" img : np.ndarray\n",
" Image array in [0, 255].\n",
" sigma : float\n",
" Standard deviation of the noise.\n",
" seed : int\n",
" Random seed for reproducibility.\n",
"\n",
" Returns\n",
" -------\n",
" np.ndarray\n",
" Noisy image clipped to [0, 255].\n",
" \"\"\"\n",
" rng = np.random.default_rng(seed)\n",
" noisy = img + rng.normal(loc=0.0, scale=sigma, size=img.shape)\n",
" return np.clip(noisy, 0, 255)\n",
"\n",
"\n",
"def truncated_svd_matrix(A, k):\n",
" \"\"\"\n",
" Return the rank-k truncated SVD approximation of a 2D matrix A.\n",
" \"\"\"\n",
" U, s, Vt = np.linalg.svd(A, full_matrices=False)\n",
" k = min(k, len(s))\n",
" return (U[:, :k] * s[:k]) @ Vt[:k, :]\n",
"\n",
"\n",
"def truncated_svd_image(img, k):\n",
" \"\"\"\n",
" Apply truncated SVD to a grayscale or RGB image.\n",
"\n",
" For RGB images, truncated SVD is applied channel-by-channel.\n",
"\n",
" Parameters\n",
" ----------\n",
" img : np.ndarray\n",
" Shape (H, W) for grayscale or (H, W, 3) for RGB.\n",
" k : int\n",
" Truncation rank.\n",
"\n",
" Returns\n",
" -------\n",
" np.ndarray\n",
" Reconstructed image clipped to [0, 255].\n",
" \"\"\"\n",
" if img.ndim == 2:\n",
" recon = truncated_svd_matrix(img, k)\n",
" return np.clip(recon, 0, 255)\n",
"\n",
" if img.ndim == 3:\n",
" channels = []\n",
" for c in range(img.shape[2]):\n",
" channel_recon = truncated_svd_matrix(img[:, :, c], k)\n",
" channels.append(channel_recon)\n",
" recon = np.stack(channels, axis=2)\n",
" return np.clip(recon, 0, 255)\n",
"\n",
" raise ValueError(\"Image must be either 2D (grayscale) or 3D (RGB).\")\n",
"\n",
"\n",
"def mse(A, B):\n",
" \"\"\"Mean squared error between two images.\"\"\"\n",
" return np.mean((A.astype(np.float64) - B.astype(np.float64)) ** 2)\n",
"\n",
"\n",
"def psnr(A, B, max_val=255.0):\n",
" \"\"\"Peak signal-to-noise ratio in decibels.\"\"\"\n",
" err = mse(A, B)\n",
" if err == 0:\n",
" return np.inf\n",
" return 10 * np.log10((max_val ** 2) / err)"
]
},
{
"cell_type": "markdown",
"id": "fe1d4932",
"metadata": {},
"source": [
"## Choose an image"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42bafca5",
"metadata": {},
"outputs": [],
"source": [
"IMAGE_PATH = \"../images/bella.jpg\" \n",
"MODE = \"rgb\" # use \"gray\" for grayscale, \"rgb\" for color\n",
"\n",
"img = load_image(IMAGE_PATH, mode=MODE)\n",
"print(\"Image shape:\", img.shape)\n",
"show_image(img, title=f\"Original image ({MODE})\")"
]
},
{
"cell_type": "markdown",
"id": "05e52222",
"metadata": {},
"source": [
"## Add synthetic Gaussian noise\n",
"\n",
"We add noise so that the denoising effect is visible and measurable."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "528e69b3",
"metadata": {},
"outputs": [],
"source": [
"sigma = 25\n",
"seed = 0\n",
"\n",
"img_noisy = add_gaussian_noise(img, sigma=sigma, seed=seed)\n",
"\n",
"img_noisy.save('../images/bella_noisy.jpg')\n",
"\n",
"show_image(img_noisy, title=f\"Noisy image (sigma={sigma})\")"
]
},
{
"cell_type": "markdown",
"id": "1bbcc1d8",
"metadata": {},
"source": [
"## Visualizing rank-$k$ reconstructions\n",
"\n",
"For small $k$, the reconstruction captures only coarse structure.\n",
"As $k$ increases, more detail returns. For denoising, there is often a useful middle ground:\n",
"enough singular values to preserve structure, but not so many that we reintroduce noise."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "563df53a",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import math\n",
"\n",
"ks = [5, 20, 50, 100]\n",
"\n",
"# Collect all images + titles\n",
"images = []\n",
"titles = []\n",
"\n",
"# Original\n",
"images.append(img)\n",
"titles.append(\"Original\")\n",
"\n",
"# Noisy\n",
"images.append(img_noisy)\n",
"titles.append(\"Noisy\")\n",
"\n",
"# Reconstructions\n",
"for k in ks:\n",
" recon = truncated_svd_image(img_noisy, k)\n",
" images.append(recon)\n",
" titles.append(f\"k = {k}\")\n",
"\n",
"# Grid setup\n",
"ncols = 2\n",
"nrows = math.ceil(len(images) / ncols)\n",
"\n",
"fig, axes = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows))\n",
"axes = axes.flatten() # easier indexing\n",
"\n",
"# Plot everything\n",
"for i, (ax, im, title) in enumerate(zip(axes, images, titles)):\n",
" if im.ndim == 2:\n",
" ax.imshow(im, cmap=\"gray\", vmin=0, vmax=255)\n",
" else:\n",
" ax.imshow(np.clip(im, 0, 255).astype(np.uint8))\n",
" \n",
" ax.set_title(title)\n",
" ax.axis(\"off\")\n",
"\n",
"# Hide any unused axes\n",
"for j in range(len(images), len(axes)):\n",
" axes[j].axis(\"off\")\n",
"\n",
"plt.tight_layout()\n",
"plt.savefig('../images/bella_truncated_svd_multiplie_ks.png')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "309579fa",
"metadata": {},
"source": [
"## Quantitative evaluation\n",
"\n",
"We compare each reconstruction against the **clean original image**, not against the noisy one.\n",
"A good denoising rank should typically:\n",
"- reduce MSE relative to the noisy image,\n",
"- increase PSNR relative to the noisy image."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56ce07ee",
"metadata": {},
"outputs": [],
"source": [
"baseline_mse = mse(img, img_noisy)\n",
"baseline_psnr = psnr(img, img_noisy)\n",
"\n",
"print(f\"Noisy image baseline -> MSE: {baseline_mse:.2f}, PSNR: {baseline_psnr:.2f} dB\")\n",
"\n",
"results = []\n",
"for k in ks:\n",
" recon = truncated_svd_image(img_noisy, k)\n",
" results.append((k, mse(img, recon), psnr(img, recon)))\n",
"\n",
"print(\"\\nRank-k reconstructions:\")\n",
"for k, m, p in results:\n",
" print(f\"k = {k:3d} | MSE = {m:10.2f} | PSNR = {p:6.2f} dB\")"
]
},
{
"cell_type": "markdown",
"id": "de9c3f3c",
"metadata": {},
"source": [
"## Automatic search over many values of $k$\n",
"\n",
"This is often useful because the best denoising rank is image-dependent."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e097dcf4",
"metadata": {},
"outputs": [],
"source": [
"candidate_ks = list(range(1, 151, 5))\n",
"\n",
"scores = []\n",
"for k in candidate_ks:\n",
" recon = truncated_svd_image(img_noisy, k)\n",
" scores.append((k, mse(img, recon), psnr(img, recon)))\n",
"\n",
"best_by_mse = min(scores, key=lambda x: x[1])\n",
"best_by_psnr = max(scores, key=lambda x: x[2])\n",
"\n",
"print(\"Best by MSE :\", best_by_mse)\n",
"print(\"Best by PSNR:\", best_by_psnr)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4b9dc5c7",
"metadata": {},
"outputs": [],
"source": [
"best_k = best_by_psnr[0]\n",
"best_recon = truncated_svd_image(img_noisy, best_k)\n",
"\n",
"fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
"\n",
"if img.ndim == 2:\n",
" axes[0].imshow(img, cmap=\"gray\", vmin=0, vmax=255)\n",
" axes[1].imshow(img_noisy, cmap=\"gray\", vmin=0, vmax=255)\n",
" axes[2].imshow(best_recon, cmap=\"gray\", vmin=0, vmax=255)\n",
"else:\n",
" axes[0].imshow(np.clip(img, 0, 255).astype(np.uint8))\n",
" axes[1].imshow(np.clip(img_noisy, 0, 255).astype(np.uint8))\n",
" axes[2].imshow(np.clip(best_recon, 0, 255).astype(np.uint8))\n",
"\n",
"axes[0].set_title(\"Original\")\n",
"axes[1].set_title(\"Noisy\")\n",
"axes[2].set_title(f\"Best reconstruction (k={best_k})\")\n",
"\n",
"for ax in axes:\n",
" ax.axis(\"off\")\n",
"\n",
"plt.tight_layout()\n",
"plt.savefig('../images/bella_best_truncated.png')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "1a3acfe5",
"metadata": {},
"source": [
"## Remarks and possible extensions\n",
"\n",
"- The same rank $k$ was used for every color channel. You could instead choose different ranks per channel.\n",
"- You can test this on photographs, scanned documents, or screenshots.\n",
"- Truncated SVD is excellent for illustrating low-rank structure, but it is not the only denoising method.\n",
"- A more advanced next step would be to compare SVD denoising against:\n",
" - Gaussian blur,\n",
" - median filtering,\n",
" - wavelet denoising,\n",
" - non-local means,\n",
" - autoencoder-based denoising.\n",
"\n",
"For this notebook, though, the point is to keep the method squarely grounded in linear algebra."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}