{ "cells": [ { "cell_type": "markdown", "id": "6ae6c7f8", "metadata": {}, "source": [ "# Spectral Image Denoising via Truncated SVD\n", "\n", "This notebook extracts the image denoising project into a standalone workflow and extends it from **grayscale images** to **actual color images**.\n", "\n", "The core idea is the same as in the original write-up: if an image matrix has singular value decomposition\n", "$$\n", "A = U \\Sigma V^T,\n", "$$\n", "then the best rank-$k$ approximation to $A$ in Frobenius norm is obtained by truncating the SVD. This is the **Eckart–Young–Mirsky theorem**.\n", "\n", "For a grayscale image, the image is a single matrix. For an RGB image, we treat the image as **three matrices**, one for each channel, and apply truncated SVD to each channel separately." ] }, { "cell_type": "markdown", "id": "31a665c9", "metadata": {}, "source": [ "## Outline\n", "\n", "1. Load an image from disk\n", "2. Convert it to grayscale or keep it in RGB\n", "3. Add synthetic Gaussian noise\n", "4. Compute a truncated SVD reconstruction\n", "5. Compare the original, noisy, and denoised images\n", "6. Measure quality using MSE and PSNR\n", "\n", "This notebook is written so that you can use **your own image files** directly." ] }, { "cell_type": "code", "execution_count": null, "id": "88584c56", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from PIL import Image\n", "from pathlib import Path" ] }, { "cell_type": "markdown", "id": "30e96441", "metadata": {}, "source": [ "## A note on color images\n", "\n", "For a grayscale image, SVD applies directly to a single matrix. For a color image $A \\in \\mathbb{R}^{n \\times p \\times 3}$, we write\n", "$$\n", "A = (A_R, A_G, A_B),\n", "$$\n", "where each channel is an $n \\times p$ matrix. 
We then compute a rank-$k$ approximation for each channel:\n", "$$\n", "A_R \\approx (A_R)_k,\\qquad\n", "A_G \\approx (A_G)_k,\\qquad\n", "A_B \\approx (A_B)_k,\n", "$$\n", "and stack them back together.\n", "\n", "This is the most direct extension of the grayscale method, and it works well as a first linear-algebraic treatment of color denoising." ] }, { "cell_type": "markdown", "id": "f275cbc9", "metadata": {}, "source": [ "## Helper functions\n", "\n", "We begin with some utilities for:\n", "- loading images,\n", "- adding Gaussian noise,\n", "- reconstructing rank-$k$ approximations,\n", "- computing image-quality metrics." ] }, { "cell_type": "code", "execution_count": null, "id": "21adfcaf", "metadata": {}, "outputs": [], "source": [ "def load_image(path, mode=\"rgb\"):\n", "    \"\"\"\n", "    Load an image from disk.\n", "\n", "    Parameters\n", "    ----------\n", "    path : str or Path\n", "        Path to the image file.\n", "    mode : {\"rgb\", \"gray\"}\n", "        Whether to load the image as RGB or grayscale.\n", "\n", "    Returns\n", "    -------\n", "    np.ndarray\n", "        Float image array with values in [0, 255].\n", "        Shape is (H, W, 3) for RGB and (H, W) for grayscale.\n", "    \"\"\"\n", "    path = Path(path)\n", "    img = Image.open(path)\n", "\n", "    if mode.lower() in {\"gray\", \"grayscale\", \"l\"}:\n", "        img = img.convert(\"L\")\n", "    else:\n", "        img = img.convert(\"RGB\")\n", "\n", "    return np.asarray(img, dtype=np.float64)\n", "\n", "\n", "def show_image(img, title=None):\n", "    \"\"\"Display a grayscale or RGB image.\"\"\"\n", "    plt.figure(figsize=(6, 6))\n", "    if img.ndim == 2:\n", "        plt.imshow(np.clip(img, 0, 255), cmap=\"gray\", vmin=0, vmax=255)\n", "    else:\n", "        plt.imshow(np.clip(img, 0, 255).astype(np.uint8))\n", "    if title is not None:\n", "        plt.title(title)\n", "    plt.axis(\"off\")\n", "    plt.show()\n", "\n", "\n", "def add_gaussian_noise(img, sigma=25, seed=0):\n", "    \"\"\"\n", "    Add Gaussian noise to an image.\n", "\n", "    Parameters\n", "    ----------\n", "    img : np.ndarray\n", 
"        Image array in [0, 255].\n", "    sigma : float\n", "        Standard deviation of the noise.\n", "    seed : int\n", "        Random seed for reproducibility.\n", "\n", "    Returns\n", "    -------\n", "    np.ndarray\n", "        Noisy image clipped to [0, 255].\n", "    \"\"\"\n", "    rng = np.random.default_rng(seed)\n", "    noisy = img + rng.normal(loc=0.0, scale=sigma, size=img.shape)\n", "    return np.clip(noisy, 0, 255)\n", "\n", "\n", "def truncated_svd_matrix(A, k):\n", "    \"\"\"\n", "    Return the rank-k truncated SVD approximation of a 2D matrix A.\n", "    \"\"\"\n", "    U, s, Vt = np.linalg.svd(A, full_matrices=False)\n", "    k = min(k, len(s))\n", "    return (U[:, :k] * s[:k]) @ Vt[:k, :]\n", "\n", "\n", "def truncated_svd_image(img, k):\n", "    \"\"\"\n", "    Apply truncated SVD to a grayscale or RGB image.\n", "\n", "    For RGB images, truncated SVD is applied channel-by-channel.\n", "\n", "    Parameters\n", "    ----------\n", "    img : np.ndarray\n", "        Shape (H, W) for grayscale or (H, W, 3) for RGB.\n", "    k : int\n", "        Truncation rank.\n", "\n", "    Returns\n", "    -------\n", "    np.ndarray\n", "        Reconstructed image clipped to [0, 255].\n", "    \"\"\"\n", "    if img.ndim == 2:\n", "        recon = truncated_svd_matrix(img, k)\n", "        return np.clip(recon, 0, 255)\n", "\n", "    if img.ndim == 3:\n", "        channels = []\n", "        for c in range(img.shape[2]):\n", "            channel_recon = truncated_svd_matrix(img[:, :, c], k)\n", "            channels.append(channel_recon)\n", "        recon = np.stack(channels, axis=2)\n", "        return np.clip(recon, 0, 255)\n", "\n", "    raise ValueError(\"Image must be either 2D (grayscale) or 3D (RGB).\")\n", "\n", "\n", "def mse(A, B):\n", "    \"\"\"Mean squared error between two images.\"\"\"\n", "    return np.mean((A.astype(np.float64) - B.astype(np.float64)) ** 2)\n", "\n", "\n", "def psnr(A, B, max_val=255.0):\n", "    \"\"\"Peak signal-to-noise ratio in decibels.\"\"\"\n", "    err = mse(A, B)\n", "    if err == 0:\n", "        return np.inf\n", "    return 10 * np.log10((max_val ** 2) / err)" ] }, { "cell_type": "markdown", "id": "fe1d4932", 
"metadata": {}, "source": [ "## Choose an image" ] }, { "cell_type": "code", "execution_count": null, "id": "42bafca5", "metadata": {}, "outputs": [], "source": [ "IMAGE_PATH = \"../images/bella.jpg\" \n", "MODE = \"rgb\" # use \"gray\" for grayscale, \"rgb\" for color\n", "\n", "img = load_image(IMAGE_PATH, mode=MODE)\n", "print(\"Image shape:\", img.shape)\n", "show_image(img, title=f\"Original image ({MODE})\")" ] }, { "cell_type": "markdown", "id": "05e52222", "metadata": {}, "source": [ "## Add synthetic Gaussian noise\n", "\n", "We add noise so that the denoising effect is visible and measurable." ] }, { "cell_type": "code", "execution_count": null, "id": "528e69b3", "metadata": {}, "outputs": [], "source": [ "sigma = 25\n", "seed = 0\n", "\n", "img_noisy = add_gaussian_noise(img, sigma=sigma, seed=seed)\n", "\n", "# img_noisy is a NumPy array, so convert it to a PIL image before saving\n", "Image.fromarray(img_noisy.astype(np.uint8)).save('../images/bella_noisy.jpg')\n", "\n", "show_image(img_noisy, title=f\"Noisy image (sigma={sigma})\")" ] }, { "cell_type": "markdown", "id": "1bbcc1d8", "metadata": {}, "source": [ "## Visualizing rank-$k$ reconstructions\n", "\n", "For small $k$, the reconstruction captures only coarse structure.\n", "As $k$ increases, more detail returns. For denoising, there is often a useful middle ground:\n", "enough singular values to preserve structure, but not so many that we reintroduce noise." 
] }, { "cell_type": "code", "execution_count": null, "id": "563df53a", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import math\n", "\n", "ks = [5, 20, 50, 100]\n", "\n", "# Collect all images + titles\n", "images = []\n", "titles = []\n", "\n", "# Original\n", "images.append(img)\n", "titles.append(\"Original\")\n", "\n", "# Noisy\n", "images.append(img_noisy)\n", "titles.append(\"Noisy\")\n", "\n", "# Reconstructions\n", "for k in ks:\n", "    recon = truncated_svd_image(img_noisy, k)\n", "    images.append(recon)\n", "    titles.append(f\"k = {k}\")\n", "\n", "# Grid setup\n", "ncols = 2\n", "nrows = math.ceil(len(images) / ncols)\n", "\n", "fig, axes = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows))\n", "axes = axes.flatten() # easier indexing\n", "\n", "# Plot everything\n", "for i, (ax, im, title) in enumerate(zip(axes, images, titles)):\n", "    if im.ndim == 2:\n", "        ax.imshow(im, cmap=\"gray\", vmin=0, vmax=255)\n", "    else:\n", "        ax.imshow(np.clip(im, 0, 255).astype(np.uint8))\n", "\n", "    ax.set_title(title)\n", "    ax.axis(\"off\")\n", "\n", "# Hide any unused axes\n", "for j in range(len(images), len(axes)):\n", "    axes[j].axis(\"off\")\n", "\n", "plt.tight_layout()\n", "plt.savefig('../images/bella_truncated_svd_multiplie_ks.png')\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "309579fa", "metadata": {}, "source": [ "## Quantitative evaluation\n", "\n", "We compare each reconstruction against the **clean original image**, not against the noisy one.\n", "A good denoising rank should typically:\n", "- reduce MSE relative to the noisy image,\n", "- increase PSNR relative to the noisy image." 
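, "\n", "\n", "For reference, a sketch of the two metrics as the `mse` and `psnr` helpers compute them, for a clean image $A$ and a reconstruction $B$ with $N$ entries in total, counting every pixel and channel (the index $j$ and the symbol $N$ are notation introduced here):\n", "$$\n", "\\mathrm{MSE}(A, B) = \\frac{1}{N} \\sum_{j=1}^{N} (A_j - B_j)^2,\n", "\\qquad\n", "\\mathrm{PSNR}(A, B) = 10 \\log_{10} \\frac{255^2}{\\mathrm{MSE}(A, B)}.\n", "$$\n", "Since PSNR is a strictly decreasing function of MSE, lowering MSE and raising PSNR are the same criterion: both always rank reconstructions in the same order." 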
] }, { "cell_type": "code", "execution_count": null, "id": "56ce07ee", "metadata": {}, "outputs": [], "source": [ "baseline_mse = mse(img, img_noisy)\n", "baseline_psnr = psnr(img, img_noisy)\n", "\n", "print(f\"Noisy image baseline -> MSE: {baseline_mse:.2f}, PSNR: {baseline_psnr:.2f} dB\")\n", "\n", "results = []\n", "for k in ks:\n", "    recon = truncated_svd_image(img_noisy, k)\n", "    results.append((k, mse(img, recon), psnr(img, recon)))\n", "\n", "print(\"\\nRank-k reconstructions:\")\n", "for k, m, p in results:\n", "    print(f\"k = {k:3d} | MSE = {m:10.2f} | PSNR = {p:6.2f} dB\")" ] }, { "cell_type": "markdown", "id": "de9c3f3c", "metadata": {}, "source": [ "## Automatic search over many values of $k$\n", "\n", "Searching a grid of candidate ranks is often useful because the best denoising rank is image-dependent." ] }, { "cell_type": "code", "execution_count": null, "id": "e097dcf4", "metadata": {}, "outputs": [], "source": [ "candidate_ks = list(range(1, 151, 5))\n", "\n", "scores = []\n", "for k in candidate_ks:\n", "    recon = truncated_svd_image(img_noisy, k)\n", "    scores.append((k, mse(img, recon), psnr(img, recon)))\n", "\n", "best_by_mse = min(scores, key=lambda x: x[1])\n", "best_by_psnr = max(scores, key=lambda x: x[2])\n", "\n", "print(\"Best by MSE :\", best_by_mse)\n", "print(\"Best by PSNR:\", best_by_psnr)" ] }, { "cell_type": "code", "execution_count": null, "id": "4b9dc5c7", "metadata": {}, "outputs": [], "source": [ "best_k = best_by_psnr[0]\n", "best_recon = truncated_svd_image(img_noisy, best_k)\n", "\n", "fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n", "\n", "if img.ndim == 2:\n", "    axes[0].imshow(img, cmap=\"gray\", vmin=0, vmax=255)\n", "    axes[1].imshow(img_noisy, cmap=\"gray\", vmin=0, vmax=255)\n", "    axes[2].imshow(best_recon, cmap=\"gray\", vmin=0, vmax=255)\n", "else:\n", "    axes[0].imshow(np.clip(img, 0, 255).astype(np.uint8))\n", "    axes[1].imshow(np.clip(img_noisy, 0, 255).astype(np.uint8))\n", "    axes[2].imshow(np.clip(best_recon, 0, 255).astype(np.uint8))\n", 
"\n", "axes[0].set_title(\"Original\")\n", "axes[1].set_title(\"Noisy\")\n", "axes[2].set_title(f\"Best reconstruction (k={best_k})\")\n", "\n", "for ax in axes:\n", "    ax.axis(\"off\")\n", "\n", "plt.tight_layout()\n", "plt.savefig('../images/bella_best_truncated.png')\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "1a3acfe5", "metadata": {}, "source": [ "## Remarks and possible extensions\n", "\n", "- The same rank $k$ was used for every color channel. You could instead choose different ranks per channel.\n", "- You can test this on photographs, scanned documents, or screenshots.\n", "- Truncated SVD is excellent for illustrating low-rank structure, but it is not the only denoising method.\n", "- A more advanced next step would be to compare SVD denoising against:\n", "  - Gaussian blur,\n", "  - median filtering,\n", "  - wavelet denoising,\n", "  - non-local means,\n", "  - autoencoder-based denoising.\n", "\n", "For this notebook, though, the point is to keep the method squarely grounded in linear algebra." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.14.3" } }, "nbformat": 4, "nbformat_minor": 5 }