to notebooks

2026-03-30 18:45:32 -04:00
parent 767673eb7a
commit 9093ea2c13
38 changed files with 41415 additions and 1975 deletions
--- a/notebooks/05_svd_image_denoising.ipynb
+++ b/notebooks/05_svd_image_denoising.ipynb
@@ -0,0 +1,473 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "6ae6c7f8",
+   "metadata": {},
+   "source": [
+    "# Spectral Image Denoising via Truncated SVD\n",
+    "\n",
+    "This notebook extracts the image denoising project into a standalone workflow and extends it from **grayscale images** to **RGB color images**.\n",
+    "\n",
+    "The core idea is the same as in the original write-up: if an image matrix has singular value decomposition\n",
+    "$$\n",
+    "A = U \\Sigma V^T,\n",
+    "$$\n",
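+    "then the best rank-$k$ approximation to $A$ in Frobenius norm is $A_k = \\sum_{i=1}^{k} \\sigma_i u_i v_i^T$, obtained by keeping the $k$ largest singular values $\\sigma_i$ with their left and right singular vectors $u_i$, $v_i$ and discarding the rest. This is the **Eckart–Young–Mirsky theorem**.\n",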
+    "\n",
+    "For a grayscale image, the image is a single matrix. For an RGB image, we treat the image as **three matrices**, one for each channel, and apply truncated SVD to each channel separately."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "31a665c9",
+   "metadata": {},
+   "source": [
+    "## Outline\n",
+    "\n",
+    "1. Load an image from disk\n",
+    "2. Convert it to grayscale or keep it in RGB\n",
+    "3. Add synthetic Gaussian noise\n",
+    "4. Compute a truncated SVD reconstruction\n",
+    "5. Compare the original, noisy, and denoised images\n",
+    "6. Measure quality using MSE and PSNR\n",
+    "\n",
+    "To run it on **your own image files**, change `IMAGE_PATH` in the *Choose an image* cell."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "88584c56",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import matplotlib.pyplot as plt\n",
+    "from PIL import Image\n",
+    "from pathlib import Path"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "30e96441",
+   "metadata": {},
+   "source": [
+    "## A note on color images\n",
+    "\n",
+    "For a grayscale image, SVD applies directly to a single matrix. For a color image $A \\in \\mathbb{R}^{n \\times p \\times 3}$, we write\n",
+    "$$\n",
+    "A = (A_R, A_G, A_B),\n",
+    "$$\n",
+    "where each channel is an $n \\times p$ matrix. We then compute a rank-$k$ approximation for each channel:\n",
+    "$$\n",
+    "A_R \\approx (A_R)_k,\\qquad\n",
+    "A_G \\approx (A_G)_k,\\qquad\n",
+    "A_B \\approx (A_B)_k,\n",
+    "$$\n",
+    "and stack them back together.\n",
+    "\n",
+    "This is the most direct extension of the grayscale method, and it works well as a first linear-algebraic treatment of color denoising."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f275cbc9",
+   "metadata": {},
+   "source": [
+    "## Helper functions\n",
+    "\n",
+    "We begin with some utilities for:\n",
+    "- loading images,\n",
+    "- adding Gaussian noise,\n",
+    "- reconstructing rank-$k$ approximations,\n",
+    "- computing image-quality metrics."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "21adfcaf",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def load_image(path, mode=\"rgb\"):\n",
+    "    \"\"\"\n",
+    "    Load an image from disk.\n",
+    "\n",
+    "    Parameters\n",
+    "    ----------\n",
+    "    path : str or Path\n",
+    "        Path to the image file.\n",
+    "    mode : {\"rgb\", \"gray\"}\n",
+    "        Whether to load the image as RGB or grayscale.\n",
+    "\n",
+    "    Returns\n",
+    "    -------\n",
+    "    np.ndarray\n",
+    "        Float64 image array with values in [0, 255] (no rescaling is applied).\n",
+    "        Shape is (H, W, 3) for RGB and (H, W) for grayscale.\n",
+    "    \"\"\"\n",
+    "    path = Path(path)\n",
+    "    img = Image.open(path)\n",
+    "\n",
+    "    if mode.lower() in {\"gray\", \"grayscale\", \"l\"}:\n",
+    "        img = img.convert(\"L\")\n",
+    "    else:\n",
+    "        img = img.convert(\"RGB\")\n",
+    "\n",
+    "    return np.asarray(img, dtype=np.float64)\n",
+    "\n",
+    "\n",
+    "def show_image(img, title=None):\n",
+    "    \"\"\"Display a grayscale or RGB image.\"\"\"\n",
+    "    plt.figure(figsize=(6, 6))\n",
+    "    if img.ndim == 2:\n",
+    "        plt.imshow(np.clip(img, 0, 255), cmap=\"gray\", vmin=0, vmax=255)\n",
+    "    else:\n",
+    "        plt.imshow(np.clip(img, 0, 255).astype(np.uint8))\n",
+    "    if title is not None:\n",
+    "        plt.title(title)\n",
+    "    plt.axis(\"off\")\n",
+    "    plt.show()\n",
+    "\n",
+    "\n",
+    "def add_gaussian_noise(img, sigma=25, seed=0):\n",
+    "    \"\"\"\n",
+    "    Add Gaussian noise to an image.\n",
+    "\n",
+    "    Parameters\n",
+    "    ----------\n",
+    "    img : np.ndarray\n",
+    "        Image array in [0, 255].\n",
+    "    sigma : float\n",
+    "        Standard deviation of the noise.\n",
+    "    seed : int\n",
+    "        Random seed for reproducibility.\n",
+    "\n",
+    "    Returns\n",
+    "    -------\n",
+    "    np.ndarray\n",
+    "        Noisy image clipped to [0, 255].\n",
+    "    \"\"\"\n",
+    "    rng = np.random.default_rng(seed)\n",
+    "    noisy = img + rng.normal(loc=0.0, scale=sigma, size=img.shape)\n",
+    "    return np.clip(noisy, 0, 255)\n",
+    "\n",
+    "\n",
+    "def truncated_svd_matrix(A, k):\n",
+    "    \"\"\"\n",
+    "    Return the rank-k truncated SVD approximation of a 2D matrix A (k is capped at min(A.shape)).\n",
+    "    \"\"\"\n",
+    "    U, s, Vt = np.linalg.svd(A, full_matrices=False)\n",
+    "    k = min(k, len(s))\n",
+    "    return (U[:, :k] * s[:k]) @ Vt[:k, :]\n",
+    "\n",
+    "\n",
+    "def truncated_svd_image(img, k):\n",
+    "    \"\"\"\n",
+    "    Apply truncated SVD to a grayscale or RGB image.\n",
+    "\n",
+    "    For RGB images, truncated SVD is applied channel-by-channel.\n",
+    "\n",
+    "    Parameters\n",
+    "    ----------\n",
+    "    img : np.ndarray\n",
+    "        Shape (H, W) for grayscale or (H, W, 3) for RGB.\n",
+    "    k : int\n",
+    "        Truncation rank.\n",
+    "\n",
+    "    Returns\n",
+    "    -------\n",
+    "    np.ndarray\n",
+    "        Reconstructed image clipped to [0, 255].\n",
+    "    \"\"\"\n",
+    "    if img.ndim == 2:\n",
+    "        recon = truncated_svd_matrix(img, k)\n",
+    "        return np.clip(recon, 0, 255)\n",
+    "\n",
+    "    if img.ndim == 3:\n",
+    "        channels = []\n",
+    "        for c in range(img.shape[2]):\n",
+    "            channel_recon = truncated_svd_matrix(img[:, :, c], k)\n",
+    "            channels.append(channel_recon)\n",
+    "        recon = np.stack(channels, axis=2)\n",
+    "        return np.clip(recon, 0, 255)\n",
+    "\n",
+    "    raise ValueError(\"Image must be either 2D (grayscale) or 3D (RGB).\")\n",
+    "\n",
+    "\n",
+    "def mse(A, B):\n",
+    "    \"\"\"Mean squared error between two images.\"\"\"\n",
+    "    return np.mean((A.astype(np.float64) - B.astype(np.float64)) ** 2)\n",
+    "\n",
+    "\n",
+    "def psnr(A, B, max_val=255.0):\n",
+    "    \"\"\"Peak signal-to-noise ratio in decibels.\"\"\"\n",
+    "    err = mse(A, B)\n",
+    "    if err == 0:\n",
+    "        return np.inf\n",
+    "    return 10 * np.log10((max_val ** 2) / err)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fe1d4932",
+   "metadata": {},
+   "source": [
+    "## Choose an image"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "42bafca5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "IMAGE_PATH = \"../images/bella.jpg\"\n",
+    "MODE = \"rgb\"               # use \"gray\" for grayscale, \"rgb\" for color\n",
+    "\n",
+    "img = load_image(IMAGE_PATH, mode=MODE)\n",
+    "print(\"Image shape:\", img.shape)\n",
+    "show_image(img, title=f\"Original image ({MODE})\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "05e52222",
+   "metadata": {},
+   "source": [
+    "## Add synthetic Gaussian noise\n",
+    "\n",
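+    "We add zero-mean Gaussian noise with standard deviation `sigma` (on the 0-255 intensity scale) so that the denoising effect is visible and measurable. Because the noisy image is clipped back to [0, 255], the noise is only approximately Gaussian near pure black and pure white."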
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "528e69b3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "sigma = 25\n",
+    "seed = 0\n",
+    "\n",
+    "img_noisy = add_gaussian_noise(img, sigma=sigma, seed=seed)\n",
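+    "# NumPy arrays have no .save(); convert to 8-bit and write the file via PIL.\n",
+    "Image.fromarray(np.round(img_noisy).astype(np.uint8)).save('../images/bella_noisy.jpg')\n",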
+    "\n",
+    "show_image(img_noisy, title=f\"Noisy image (sigma={sigma})\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1bbcc1d8",
+   "metadata": {},
+   "source": [
+    "## Visualizing rank-$k$ reconstructions\n",
+    "\n",
+    "For small $k$, the reconstruction captures only coarse structure.\n",
+    "As $k$ increases, more detail returns. For denoising, there is often a useful middle ground:\n",
+    "enough singular values to preserve structure, but not so many that we reintroduce noise."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "563df53a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import matplotlib.pyplot as plt\n",
+    "import math\n",
+    "\n",
+    "ks = [5, 20, 50, 100]\n",
+    "\n",
+    "# Collect all images + titles\n",
+    "images = []\n",
+    "titles = []\n",
+    "\n",
+    "# Original\n",
+    "images.append(img)\n",
+    "titles.append(\"Original\")\n",
+    "\n",
+    "# Noisy\n",
+    "images.append(img_noisy)\n",
+    "titles.append(\"Noisy\")\n",
+    "\n",
+    "# Reconstructions\n",
+    "for k in ks:\n",
+    "    recon = truncated_svd_image(img_noisy, k)\n",
+    "    images.append(recon)\n",
+    "    titles.append(f\"k = {k}\")\n",
+    "\n",
+    "# Grid setup\n",
+    "ncols = 2\n",
+    "nrows = math.ceil(len(images) / ncols)\n",
+    "\n",
+    "fig, axes = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows))\n",
+    "axes = axes.flatten()  # easier indexing\n",
+    "\n",
+    "# Plot everything\n",
+    "for ax, im, title in zip(axes, images, titles):\n",
+    "    if im.ndim == 2:\n",
+    "        ax.imshow(im, cmap=\"gray\", vmin=0, vmax=255)\n",
+    "    else:\n",
+    "        ax.imshow(np.clip(im, 0, 255).astype(np.uint8))\n",
+    "    \n",
+    "    ax.set_title(title)\n",
+    "    ax.axis(\"off\")\n",
+    "\n",
+    "# Hide any unused axes\n",
+    "for j in range(len(images), len(axes)):\n",
+    "    axes[j].axis(\"off\")\n",
+    "\n",
+    "plt.tight_layout()\n",
+    "plt.savefig('../images/bella_truncated_svd_multiplie_ks.png')\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "309579fa",
+   "metadata": {},
+   "source": [
+    "## Quantitative evaluation\n",
+    "\n",
+    "We compare each reconstruction against the **clean original image**, not against the noisy one.\n",
+    "A good denoising rank should typically:\n",
+    "- reduce MSE relative to the noisy image,\n",
+    "- increase PSNR relative to the noisy image."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "56ce07ee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "baseline_mse = mse(img, img_noisy)\n",
+    "baseline_psnr = psnr(img, img_noisy)\n",
+    "\n",
+    "print(f\"Noisy image baseline -> MSE: {baseline_mse:.2f}, PSNR: {baseline_psnr:.2f} dB\")\n",
+    "\n",
+    "results = []\n",
+    "for k in ks:\n",
+    "    recon = truncated_svd_image(img_noisy, k)\n",
+    "    results.append((k, mse(img, recon), psnr(img, recon)))\n",
+    "\n",
+    "print(\"\\nRank-k reconstructions:\")\n",
+    "for k, m, p in results:\n",
+    "    print(f\"k = {k:3d} | MSE = {m:10.2f} | PSNR = {p:6.2f} dB\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de9c3f3c",
+   "metadata": {},
+   "source": [
+    "## Automatic search over many values of $k$\n",
+    "\n",
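+    "The best denoising rank depends on the image and the noise level, so we sweep a range of candidate ranks and score each reconstruction against the clean image. Because PSNR is a strictly decreasing function of MSE, both criteria select the same rank."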
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e097dcf4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "candidate_ks = list(range(1, 151, 5))\n",
+    "\n",
+    "scores = []\n",
+    "for k in candidate_ks:\n",
+    "    recon = truncated_svd_image(img_noisy, k)\n",
+    "    scores.append((k, mse(img, recon), psnr(img, recon)))\n",
+    "\n",
+    "best_by_mse = min(scores, key=lambda x: x[1])\n",
+    "best_by_psnr = max(scores, key=lambda x: x[2])\n",
+    "\n",
+    "print(\"Best by MSE :\", best_by_mse)\n",
+    "print(\"Best by PSNR:\", best_by_psnr)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4b9dc5c7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "best_k = best_by_psnr[0]\n",
+    "best_recon = truncated_svd_image(img_noisy, best_k)\n",
+    "\n",
+    "fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
+    "\n",
+    "if img.ndim == 2:\n",
+    "    axes[0].imshow(img, cmap=\"gray\", vmin=0, vmax=255)\n",
+    "    axes[1].imshow(img_noisy, cmap=\"gray\", vmin=0, vmax=255)\n",
+    "    axes[2].imshow(best_recon, cmap=\"gray\", vmin=0, vmax=255)\n",
+    "else:\n",
+    "    axes[0].imshow(np.clip(img, 0, 255).astype(np.uint8))\n",
+    "    axes[1].imshow(np.clip(img_noisy, 0, 255).astype(np.uint8))\n",
+    "    axes[2].imshow(np.clip(best_recon, 0, 255).astype(np.uint8))\n",
+    "\n",
+    "axes[0].set_title(\"Original\")\n",
+    "axes[1].set_title(\"Noisy\")\n",
+    "axes[2].set_title(f\"Best reconstruction (k={best_k})\")\n",
+    "\n",
+    "for ax in axes:\n",
+    "    ax.axis(\"off\")\n",
+    "\n",
+    "plt.tight_layout()\n",
+    "plt.savefig('../images/bella_best_truncated.png')\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1a3acfe5",
+   "metadata": {},
+   "source": [
+    "## Remarks and possible extensions\n",
+    "\n",
+    "- The same rank $k$ was used for every color channel. You could instead choose different ranks per channel.\n",
+    "- You can test this on photographs, scanned documents, or screenshots.\n",
+    "- Truncated SVD is excellent for illustrating low-rank structure, but it is not the only denoising method.\n",
+    "- A more advanced next step would be to compare SVD denoising against:\n",
+    "  - Gaussian blur,\n",
+    "  - median filtering,\n",
+    "  - wavelet denoising,\n",
+    "  - non-local means,\n",
+    "  - autoencoder-based denoising.\n",
+    "\n",
+    "For this notebook, though, the point is to keep the method squarely grounded in linear algebra."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.14.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}