DS-for-LA/notebooks/05_svd_image_denoising.ipynb
Pawel Sarkowicz 290e8bc3e6 link to website
2026-03-31 16:05:58 -04:00

{
"cells": [
{
"cell_type": "markdown",
"id": "6ae6c7f8",
"metadata": {},
"source": [
"# Spectral Image Denoising via Truncated SVD\n",
"\n",
"This notebook extracts the image denoising project into a standalone workflow and extends it from **grayscale images** to **color (RGB) images**.\n",
"\n",
"The core idea is the same as in the original write-up: if an image matrix has singular value decomposition\n",
"$$\n",
"A = U \\Sigma V^T,\n",
"$$\n",
"then the best rank-$k$ approximation to $A$ in Frobenius norm is obtained by truncating the SVD. This is the **Eckart-Young-Mirsky theorem**.\n",
"\n",
"For a grayscale image, the image is a single matrix. For an RGB image, we treat the image as **three matrices**, one for each channel, and apply truncated SVD to each channel separately."
]
},
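{
"cell_type": "markdown",
"id": "b7f01e00",
"metadata": {},
"source": [
"As a quick numerical sanity check of the Eckart-Young-Mirsky statement (a minimal sketch on a small random matrix, independent of the images used later), the next cell compares the Frobenius error of the truncated-SVD approximation against a rank-$k$ approximation built from a random orthonormal basis. The SVD truncation should never do worse, and its error should equal $\\sqrt{\\sum_{i > k} \\sigma_i^2}$."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7f01e01",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"A = rng.normal(size=(60, 40))\n",
"k = 5\n",
"\n",
"# Best rank-k approximation, via the SVD\n",
"U, s, Vt = np.linalg.svd(A, full_matrices=False)\n",
"A_svd = (U[:, :k] * s[:k]) @ Vt[:k, :]\n",
"\n",
"# Competing rank-k approximation: project onto a random k-dimensional column space\n",
"Q, _ = np.linalg.qr(rng.normal(size=(60, k)))\n",
"A_rand = Q @ (Q.T @ A)\n",
"\n",
"err_svd = np.linalg.norm(A - A_svd, \"fro\")\n",
"err_rand = np.linalg.norm(A - A_rand, \"fro\")\n",
"err_theory = np.sqrt(np.sum(s[k:] ** 2))  # Eckart-Young-Mirsky error\n",
"\n",
"print(f\"SVD rank-{k} error   : {err_svd:.3f} (theory: {err_theory:.3f})\")\n",
"print(f\"random rank-{k} error: {err_rand:.3f}\")"
]
},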
{
"cell_type": "markdown",
"id": "31a665c9",
"metadata": {},
"source": [
"## Outline\n",
"\n",
"1. Load an image from disk\n",
"2. Convert it to grayscale or keep it in RGB\n",
"3. Add synthetic Gaussian noise\n",
"4. Compute a truncated SVD reconstruction\n",
"5. Compare the original, noisy, and denoised images\n",
"6. Measure quality using MSE and PSNR\n",
"\n",
"This notebook is written so that you can use **your own image files** directly."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88584c56",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from PIL import Image\n",
"from pathlib import Path\n",
"\n",
"try:\n",
" from skimage.metrics import structural_similarity as ssim\n",
" HAS_SKIMAGE = True\n",
"except ImportError:\n",
" ssim = None\n",
" HAS_SKIMAGE = False\n",
"\n",
"print(f\"scikit-image available: {HAS_SKIMAGE}\")"
]
},
{
"cell_type": "markdown",
"id": "30e96441",
"metadata": {},
"source": [
"## A note on color images\n",
"\n",
"For a grayscale image, SVD applies directly to a single matrix. For a color image $A \\in \\mathbb{R}^{n \\times p \\times 3}$, we write\n",
"$$\n",
"A = (A_R, A_G, A_B),\n",
"$$\n",
"where each channel is an $n \\times p$ matrix. We then compute a rank-$k$ approximation for each channel:\n",
"$$\n",
"A_R \\approx (A_R)_k,\\qquad\n",
"A_G \\approx (A_G)_k,\\qquad\n",
"A_B \\approx (A_B)_k,\n",
"$$\n",
"and stack them back together.\n",
"\n",
"This is the most direct extension of the grayscale method, and it works well as a first linear-algebraic treatment of color denoising."
]
},
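{
"cell_type": "markdown",
"id": "b7f01e02",
"metadata": {},
"source": [
"The channel-wise recipe can be checked on a tiny synthetic RGB array (a self-contained sketch, not the image loaded below): split along the last axis, truncate each channel, and restack. Each channel of the result has rank $k$, and the overall shape is preserved."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7f01e03",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(1)\n",
"A = rng.uniform(0, 255, size=(8, 6, 3))  # tiny stand-in for an RGB image\n",
"k = 2\n",
"\n",
"def rank_k(M, k):\n",
"    U, s, Vt = np.linalg.svd(M, full_matrices=False)\n",
"    return (U[:, :k] * s[:k]) @ Vt[:k, :]\n",
"\n",
"# Truncate each channel separately, then stack back along the last axis\n",
"recon = np.stack([rank_k(A[:, :, c], k) for c in range(3)], axis=2)\n",
"\n",
"print(\"shape preserved :\", recon.shape == A.shape)\n",
"print(\"channel ranks   :\", [int(np.linalg.matrix_rank(recon[:, :, c])) for c in range(3)])"
]
},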
{
"cell_type": "markdown",
"id": "f275cbc9",
"metadata": {},
"source": [
"## Helper functions\n",
"\n",
"We begin with some utilities for:\n",
"- loading images,\n",
"- adding Gaussian noise,\n",
"- reconstructing rank-$k$ approximations,\n",
"- computing image-quality metrics."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21adfcaf",
"metadata": {},
"outputs": [],
"source": [
"def load_image(path, mode=\"rgb\"):\n",
" \"\"\"\n",
" Load an image from disk.\n",
"\n",
" Parameters\n",
" ----------\n",
" path : str or Path\n",
" Path to the image file.\n",
" mode : {\"rgb\", \"gray\"}\n",
" Whether to load the image as RGB or grayscale.\n",
"\n",
" Returns\n",
" -------\n",
" np.ndarray\n",
" Float image array scaled to [0, 255].\n",
" Shape is (H, W, 3) for RGB and (H, W) for grayscale.\n",
" \"\"\"\n",
" path = Path(path)\n",
" if not path.exists():\n",
" raise FileNotFoundError(f\"Could not find image file: {path}\")\n",
"\n",
" img = Image.open(path)\n",
"\n",
" if mode.lower() in {\"gray\", \"grayscale\", \"l\"}:\n",
" img = img.convert(\"L\")\n",
" else:\n",
" img = img.convert(\"RGB\")\n",
"\n",
" return np.asarray(img, dtype=np.float64)\n",
"\n",
"\n",
"def show_image(img, title=None):\n",
" \"\"\"Display a grayscale or RGB image.\"\"\"\n",
" plt.figure(figsize=(6, 6))\n",
" if img.ndim == 2:\n",
" plt.imshow(np.clip(img, 0, 255), cmap=\"gray\", vmin=0, vmax=255)\n",
" else:\n",
" plt.imshow(np.clip(img, 0, 255).astype(np.uint8))\n",
" if title is not None:\n",
" plt.title(title)\n",
" plt.axis(\"off\")\n",
" plt.tight_layout()\n",
" plt.show()\n",
"\n",
"\n",
"def add_gaussian_noise(img, sigma=25, seed=0):\n",
" \"\"\"\n",
" Add Gaussian noise to an image.\n",
"\n",
" Parameters\n",
" ----------\n",
" img : np.ndarray\n",
" Image array in [0, 255].\n",
" sigma : float\n",
" Standard deviation of the noise.\n",
" seed : int\n",
" Random seed for reproducibility.\n",
"\n",
" Returns\n",
" -------\n",
" np.ndarray\n",
" Noisy image clipped to [0, 255].\n",
" \"\"\"\n",
" rng = np.random.default_rng(seed)\n",
" noisy = img + rng.normal(loc=0.0, scale=sigma, size=img.shape)\n",
" return np.clip(noisy, 0, 255)\n",
"\n",
"\n",
"def truncated_svd_matrix(A, k):\n",
" \"\"\"\n",
" Return the rank-k truncated SVD approximation of a 2D matrix A.\n",
" \"\"\"\n",
" U, s, Vt = np.linalg.svd(A, full_matrices=False)\n",
" k = min(k, len(s))\n",
" return (U[:, :k] * s[:k]) @ Vt[:k, :]\n",
"\n",
"\n",
"def truncated_svd_image(img, k):\n",
" \"\"\"\n",
" Apply truncated SVD to a grayscale or RGB image.\n",
"\n",
" For RGB images, truncated SVD is applied channel-by-channel.\n",
" \"\"\"\n",
" if img.ndim == 2:\n",
" recon = truncated_svd_matrix(img, k)\n",
" return np.clip(recon, 0, 255)\n",
"\n",
" if img.ndim == 3:\n",
" channels = []\n",
" for c in range(img.shape[2]):\n",
" channel_recon = truncated_svd_matrix(img[:, :, c], k)\n",
" channels.append(channel_recon)\n",
" recon = np.stack(channels, axis=2)\n",
" return np.clip(recon, 0, 255)\n",
"\n",
" raise ValueError(\"Image must be either 2D (grayscale) or 3D (RGB).\")\n",
"\n",
"\n",
"def mse(A, B):\n",
" \"\"\"Mean squared error between two images.\"\"\"\n",
" return np.mean((A.astype(np.float64) - B.astype(np.float64)) ** 2)\n",
"\n",
"\n",
"def psnr(A, B, max_val=255.0):\n",
" \"\"\"Peak signal-to-noise ratio in decibels.\"\"\"\n",
" err = mse(A, B)\n",
" if err == 0:\n",
" return np.inf\n",
" return 10 * np.log10((max_val ** 2) / err)\n",
"\n",
"\n",
"def image_ssim(A, B, max_val=255.0):\n",
" \"\"\"\n",
" Structural similarity index.\n",
"\n",
" For RGB images, compute SSIM channel-by-channel and average.\n",
" Returns None when scikit-image is unavailable.\n",
" \"\"\"\n",
" if not HAS_SKIMAGE:\n",
" return None\n",
"\n",
" A = A.astype(np.float64)\n",
" B = B.astype(np.float64)\n",
"\n",
" if A.ndim == 2:\n",
" return float(ssim(A, B, data_range=max_val))\n",
"\n",
" if A.ndim == 3:\n",
" vals = [ssim(A[:, :, c], B[:, :, c], data_range=max_val) for c in range(A.shape[2])]\n",
" return float(np.mean(vals))\n",
"\n",
" raise ValueError(\"Images must be either 2D (grayscale) or 3D (RGB).\")"
]
},
{
"cell_type": "markdown",
"id": "fe1d4932",
"metadata": {},
"source": [
"## Choose an image"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42bafca5",
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"MODE = \"rgb\" # use \"gray\" for grayscale, \"rgb\" for color\n",
"\n",
"candidate_paths = [\n",
" Path(\"../images/bella.jpg\"),\n",
" Path(\"images/bella.jpg\"),\n",
" Path(\"bella.jpg\"),\n",
"]\n",
"\n",
"IMAGE_PATH = None\n",
"for p in candidate_paths:\n",
" if p.exists():\n",
" IMAGE_PATH = p\n",
" break\n",
"\n",
"if IMAGE_PATH is None:\n",
" raise FileNotFoundError(\n",
" \"Could not find bella.jpg. Put it in ../images/, images/, or the notebook folder.\"\n",
" )\n",
"\n",
"img = load_image(IMAGE_PATH, mode=MODE)\n",
"print(\"Using image:\", IMAGE_PATH)\n",
"print(\"Image shape:\", img.shape)\n",
"show_image(img, title=f\"Original image ({MODE})\")"
]
},
{
"cell_type": "markdown",
"id": "05e52222",
"metadata": {},
"source": [
"## Add synthetic Gaussian noise\n",
"\n",
"We add noise so that the denoising effect is visible and measurable."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "528e69b3",
"metadata": {},
"outputs": [],
"source": [
"sigma = 40\n",
"seed = 0\n",
"\n",
"img_noisy = add_gaussian_noise(img, sigma=sigma, seed=seed)\n",
"\n",
"noisy_output_path = IMAGE_PATH.with_name(f\"{IMAGE_PATH.stem}_noisy.png\")\n",
"Image.fromarray(np.clip(img_noisy, 0, 255).astype(np.uint8)).save(noisy_output_path)\n",
"print(\"Saved noisy image to:\", noisy_output_path)\n",
"\n",
"show_image(img_noisy, title=f\"Noisy image (sigma={sigma})\")"
]
},
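{
"cell_type": "markdown",
"id": "b7f01e04",
"metadata": {},
"source": [
"A useful back-of-the-envelope check: before clipping, Gaussian noise with standard deviation $\\sigma$ gives an expected MSE of $\\sigma^2$ against the clean image, so the baseline PSNR should be close to $10 \\log_{10}(255^2 / \\sigma^2)$. The next cell (a self-contained sketch on a constant mid-gray image, not the photo above) compares theory to measurement; clipping at 0 and 255 moves the measured value only slightly."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7f01e05",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"sigma_check = 40.0\n",
"rng = np.random.default_rng(0)\n",
"\n",
"clean = np.full((256, 256), 128.0)\n",
"noisy = np.clip(clean + rng.normal(0.0, sigma_check, clean.shape), 0, 255)\n",
"\n",
"measured_mse = np.mean((clean - noisy) ** 2)\n",
"measured_psnr = 10 * np.log10(255.0 ** 2 / measured_mse)\n",
"theory_psnr = 10 * np.log10(255.0 ** 2 / sigma_check ** 2)\n",
"\n",
"print(f\"theoretical PSNR: {theory_psnr:.2f} dB\")\n",
"print(f\"measured PSNR   : {measured_psnr:.2f} dB\")"
]
},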
{
"cell_type": "markdown",
"id": "1bbcc1d8",
"metadata": {},
"source": [
"## Visualizing rank-$k$ reconstructions\n",
"\n",
"For small $k$, the reconstruction captures only coarse structure.\n",
"As $k$ increases, more detail returns. For denoising, there is often a useful middle ground:\n",
"enough singular values to preserve structure, but not so many that we reintroduce noise."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "563df53a",
"metadata": {},
"outputs": [],
"source": [
"import math  # numpy and matplotlib are already imported above\n",
"\n",
"ks = [5, 20, 50, 100]\n",
"\n",
"# Collect all images + titles\n",
"images = []\n",
"titles = []\n",
"\n",
"# Original\n",
"images.append(img)\n",
"titles.append(\"Original\")\n",
"\n",
"# Noisy\n",
"images.append(img_noisy)\n",
"titles.append(\"Noisy\")\n",
"\n",
"# Reconstructions\n",
"for k in ks:\n",
" recon = truncated_svd_image(img_noisy, k)\n",
" images.append(recon)\n",
" titles.append(f\"k = {k}\")\n",
"\n",
"# Grid setup\n",
"ncols = 2\n",
"nrows = math.ceil(len(images) / ncols)\n",
"\n",
"fig, axes = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows))\n",
"axes = axes.flatten() # easier indexing\n",
"\n",
"# Plot everything\n",
"for ax, im, title in zip(axes, images, titles):\n",
" if im.ndim == 2:\n",
" ax.imshow(im, cmap=\"gray\", vmin=0, vmax=255)\n",
" else:\n",
" ax.imshow(np.clip(im, 0, 255).astype(np.uint8))\n",
" \n",
" ax.set_title(title)\n",
" ax.axis(\"off\")\n",
"\n",
"# Hide any unused axes\n",
"for j in range(len(images), len(axes)):\n",
" axes[j].axis(\"off\")\n",
"\n",
"plt.tight_layout()\n",
"comparison_output_path = IMAGE_PATH.with_name(f\"{IMAGE_PATH.stem}_truncated_svd_multiple_ks.png\")\n",
"plt.savefig(comparison_output_path, bbox_inches=\"tight\")\n",
"print(\"Saved comparison figure to:\", comparison_output_path)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "309579fa",
"metadata": {},
"source": [
"## Quantitative evaluation\n",
"\n",
"We compare each reconstruction against the **clean original image**, not against the noisy one.\n",
"A good denoising rank should typically:\n",
"- reduce MSE relative to the noisy image,\n",
"- increase PSNR relative to the noisy image."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56ce07ee",
"metadata": {},
"outputs": [],
"source": [
"baseline_mse = mse(img, img_noisy)\n",
"baseline_psnr = psnr(img, img_noisy)\n",
"\n",
"print(f\"Noisy image baseline -> MSE: {baseline_mse:.2f}, PSNR: {baseline_psnr:.2f} dB\")\n",
"\n",
"results = []\n",
"for k in ks:\n",
" recon = truncated_svd_image(img_noisy, k)\n",
" results.append((k, mse(img, recon), psnr(img, recon)))\n",
"\n",
"print(\"\\nRank-k reconstructions:\")\n",
"for k, m, p in results:\n",
" print(f\"k = {k:3d} | MSE = {m:10.2f} | PSNR = {p:6.2f} dB\")"
]
},
{
"cell_type": "markdown",
"id": "f2fe6fe2",
"metadata": {},
"source": [
"\n",
"## Efficient search over many values of $k$\n",
"\n",
"A naive implementation would recompute the SVD from scratch for every candidate value of $k$.\n",
"That is extremely expensive: every reconstruction would require a fresh factorization of each\n",
"channel of the noisy image.\n",
"\n",
"A much better approach is:\n",
"\n",
"1. compute the SVD **once** for each channel;\n",
"2. reuse those factors for every candidate $k$;\n",
"3. compare reconstructions using MSE, PSNR, and optionally SSIM.\n",
"\n",
"This is also a nice numerical linear algebra point: all rank-$k$ truncated reconstructions come\n",
"from the **same** singular value decomposition.\n",
"\n",
"We compare two variants:\n",
"\n",
"- **plain truncated SVD**, applied directly to each channel;\n",
"- **centered truncated SVD**, where we subtract each channel's column mean before factorizing and\n",
" add it back after reconstruction.\n",
"\n",
"The centered version sometimes improves reconstruction slightly because the low-rank approximation\n",
"spends less effort representing the mean structure.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4277a913",
"metadata": {},
"outputs": [],
"source": [
"\n",
"def precompute_svd_image(img):\n",
" \"\"\"Precompute plain SVD factors for each channel.\"\"\"\n",
" if img.ndim == 2:\n",
" A = img.astype(np.float64)\n",
" U, s, Vt = np.linalg.svd(A, full_matrices=False)\n",
" return [(U, s, Vt)]\n",
"\n",
" cache = []\n",
" for c in range(img.shape[2]):\n",
" A = img[:, :, c].astype(np.float64)\n",
" U, s, Vt = np.linalg.svd(A, full_matrices=False)\n",
" cache.append((U, s, Vt))\n",
" return cache\n",
"\n",
"\n",
"def precompute_centered_svd_image(img):\n",
" \"\"\"Precompute centered SVD factors for each channel.\"\"\"\n",
" if img.ndim == 2:\n",
" A = img.astype(np.float64)\n",
" col_mean = A.mean(axis=0, keepdims=True)\n",
" A_centered = A - col_mean\n",
" U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)\n",
" return [(U, s, Vt, col_mean)]\n",
"\n",
" cache = []\n",
" for c in range(img.shape[2]):\n",
" A = img[:, :, c].astype(np.float64)\n",
" col_mean = A.mean(axis=0, keepdims=True)\n",
" A_centered = A - col_mean\n",
" U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)\n",
" cache.append((U, s, Vt, col_mean))\n",
" return cache\n",
"\n",
"\n",
"def reconstruct_from_svd_cache(cache, k):\n",
" \"\"\"Reconstruct from precomputed plain SVD factors.\"\"\"\n",
" channels = []\n",
" for U, s, Vt in cache:\n",
" kk = min(k, len(s))\n",
" recon = (U[:, :kk] * s[:kk]) @ Vt[:kk, :]\n",
" channels.append(np.clip(recon, 0, 255))\n",
"\n",
" if len(channels) == 1:\n",
" return channels[0]\n",
" return np.stack(channels, axis=2)\n",
"\n",
"\n",
"def reconstruct_from_centered_svd_cache(cache, k):\n",
" \"\"\"Reconstruct from precomputed centered SVD factors.\"\"\"\n",
" channels = []\n",
" for U, s, Vt, col_mean in cache:\n",
" kk = min(k, len(s))\n",
" recon = (U[:, :kk] * s[:kk]) @ Vt[:kk, :] + col_mean\n",
" channels.append(np.clip(recon, 0, 255))\n",
"\n",
" if len(channels) == 1:\n",
" return channels[0]\n",
" return np.stack(channels, axis=2)\n"
]
},
{
"cell_type": "markdown",
"id": "8d1dacbb",
"metadata": {},
"source": [
"\n",
"## Scoring reconstructions\n",
"\n",
"We first compute a **baseline** by comparing the noisy image to the clean one. Then we score\n",
"rank-$k$ reconstructions. A smaller MSE and a larger PSNR indicate better fidelity to the clean\n",
"image. If `scikit-image` is available, we also compute SSIM.\n",
"\n",
"One conceptual warning is important here:\n",
"\n",
"> The best low-rank approximation in a matrix norm does **not** necessarily produce the image that\n",
"> looks best to a human observer.\n",
"\n",
"Why? Because human perception cares about things like edges, texture, and local contrast, while\n",
"MSE and PSNR are purely pixelwise. A reconstruction can score well numerically and still look too\n",
"smooth, too blurry, or otherwise unnatural.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c48c94cc",
"metadata": {},
"outputs": [],
"source": [
"baseline_mse = mse(img, img_noisy)\n",
"baseline_psnr = psnr(img, img_noisy)\n",
"\n",
"print(\"Baseline noisy vs clean:\")\n",
"print(f\"  MSE : {baseline_mse:.2f}\")\n",
"print(f\"  PSNR: {baseline_psnr:.2f} dB\")\n",
"\n",
"if HAS_SKIMAGE:\n",
" baseline_ssim = image_ssim(img, img_noisy)\n",
" print(f\" SSIM: {baseline_ssim:.4f}\")\n"
]
},
{
"cell_type": "markdown",
"id": "231330d1",
"metadata": {},
"source": [
"\n",
"## Automatic search over many values of $k$\n",
"\n",
"Because all rank-$k$ reconstructions come from the same SVD, we precompute the factorizations\n",
"once and then search efficiently over candidate values of $k$.\n",
"\n",
"For very large images this can still be somewhat expensive, so for exploratory work a coarser grid\n",
"such as `range(5, 151, 5)` is often sufficient. Once a promising region is found, one can refine\n",
"the search around that region.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8bae249d",
"metadata": {},
"outputs": [],
"source": [
"\n",
"candidate_ks = list(range(1, 151, 5))\n",
"\n",
"plain_cache = precompute_svd_image(img_noisy)\n",
"centered_cache = precompute_centered_svd_image(img_noisy)\n",
"\n",
"plain_scores = []\n",
"centered_scores = []\n",
"\n",
"for k in candidate_ks:\n",
" plain = reconstruct_from_svd_cache(plain_cache, k)\n",
" centered = reconstruct_from_centered_svd_cache(centered_cache, k)\n",
"\n",
" plain_row = (k, mse(img, plain), psnr(img, plain))\n",
" centered_row = (k, mse(img, centered), psnr(img, centered))\n",
"\n",
" if HAS_SKIMAGE:\n",
" plain_row = plain_row + (image_ssim(img, plain),)\n",
" centered_row = centered_row + (image_ssim(img, centered),)\n",
"\n",
" plain_scores.append(plain_row)\n",
" centered_scores.append(centered_row)\n",
"\n",
"best_plain_by_mse = min(plain_scores, key=lambda x: x[1])\n",
"best_plain_by_psnr = max(plain_scores, key=lambda x: x[2])\n",
"best_centered_by_mse = min(centered_scores, key=lambda x: x[1])\n",
"best_centered_by_psnr = max(centered_scores, key=lambda x: x[2])\n",
"\n",
"print(\"Plain SVD:\")\n",
"print(\" Best by MSE :\", best_plain_by_mse)\n",
"print(\" Best by PSNR:\", best_plain_by_psnr)\n",
"\n",
"print(\"Centered SVD:\")\n",
"print(\" Best by MSE :\", best_centered_by_mse)\n",
"print(\" Best by PSNR:\", best_centered_by_psnr)\n",
"\n",
"if HAS_SKIMAGE:\n",
" best_plain_by_ssim = max(plain_scores, key=lambda x: x[3])\n",
" best_centered_by_ssim = max(centered_scores, key=lambda x: x[3])\n",
"\n",
" print(\"Plain SVD:\")\n",
" print(\" Best by SSIM:\", best_plain_by_ssim)\n",
" print(\"Centered SVD:\")\n",
" print(\" Best by SSIM:\", best_centered_by_ssim)\n"
]
},
{
"cell_type": "markdown",
"id": "166d0877",
"metadata": {},
"source": [
"\n",
"## Metric curves versus $k$\n",
"\n",
"Plotting the metrics as functions of $k$ is often more informative than looking only at the\n",
"single best value. Frequently the metric is nearly flat across a whole range of ranks, in which\n",
"case several nearby values of $k$ have very similar numerical performance.\n",
"\n",
"That is exactly the situation where visual inspection matters most: among a cluster of nearly tied\n",
"candidates, the one that looks nicest to the eye may not be the exact numerical winner.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0e1000de",
"metadata": {},
"outputs": [],
"source": [
"\n",
"plain_ks = [row[0] for row in plain_scores]\n",
"plain_mses = [row[1] for row in plain_scores]\n",
"plain_psnrs = [row[2] for row in plain_scores]\n",
"\n",
"centered_ks = [row[0] for row in centered_scores]\n",
"centered_mses = [row[1] for row in centered_scores]\n",
"centered_psnrs = [row[2] for row in centered_scores]\n",
"\n",
"plt.figure(figsize=(8, 4))\n",
"plt.plot(plain_ks, plain_mses, label=\"Plain SVD\")\n",
"plt.plot(centered_ks, centered_mses, label=\"Centered SVD\")\n",
"plt.xlabel(\"k\")\n",
"plt.ylabel(\"MSE\")\n",
"plt.title(\"MSE versus rank k\")\n",
"plt.legend()\n",
"plt.tight_layout()\n",
"plt.show()\n",
"\n",
"plt.figure(figsize=(8, 4))\n",
"plt.plot(plain_ks, plain_psnrs, label=\"Plain SVD\")\n",
"plt.plot(centered_ks, centered_psnrs, label=\"Centered SVD\")\n",
"plt.xlabel(\"k\")\n",
"plt.ylabel(\"PSNR\")\n",
"plt.title(\"PSNR versus rank k\")\n",
"plt.legend()\n",
"plt.tight_layout()\n",
"plt.show()\n",
"\n",
"if HAS_SKIMAGE:\n",
" plain_ssims = [row[3] for row in plain_scores]\n",
" centered_ssims = [row[3] for row in centered_scores]\n",
"\n",
" plt.figure(figsize=(8, 4))\n",
" plt.plot(plain_ks, plain_ssims, label=\"Plain SVD\")\n",
" plt.plot(centered_ks, centered_ssims, label=\"Centered SVD\")\n",
" plt.xlabel(\"k\")\n",
" plt.ylabel(\"SSIM\")\n",
" plt.title(\"SSIM versus rank k\")\n",
" plt.legend()\n",
" plt.tight_layout()\n",
" plt.show()\n"
]
},
{
"cell_type": "markdown",
"id": "d008c548",
"metadata": {},
"source": [
"\n",
"## Visual comparison near the best ranks\n",
"\n",
"Finally, we inspect a few reconstructions around the automatically selected ranks. This is important\n",
"because the reconstruction that is optimal in Frobenius norm, MSE, or PSNR is not guaranteed to be\n",
"the reconstruction a human would actually prefer.\n",
"\n",
"Low-rank approximation is mathematically optimal for a precise matrix objective, but photographic\n",
"quality is influenced by far more than that. Fine textures, fur, sharp edges, and local contrast\n",
"can all matter a great deal perceptually, and some of those are exactly the kinds of features that\n",
"get smoothed away by aggressive truncation.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb0c7d57",
"metadata": {},
"outputs": [],
"source": [
"\n",
"# Pick a few candidate ranks around the PSNR-optimal values\n",
"plain_best_k = best_plain_by_psnr[0]\n",
"centered_best_k = best_centered_by_psnr[0]\n",
"\n",
"def ks_around(best_k, radius=10, step=5):\n",
"    \"\"\"Candidate ranks within +/- radius of best_k, clipped at 1.\"\"\"\n",
"    return sorted({k for k in range(best_k - radius, best_k + radius + 1, step) if k >= 1})\n",
"\n",
"plain_inspect_ks = ks_around(plain_best_k)\n",
"centered_inspect_ks = ks_around(centered_best_k)\n",
"\n",
"print(\"Plain SVD ranks to inspect :\", plain_inspect_ks)\n",
"print(\"Centered SVD ranks to inspect:\", centered_inspect_ks)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aab1aff3",
"metadata": {},
"outputs": [],
"source": [
"\n",
"import math\n",
"\n",
"# Build a gallery: original, noisy, then several plain and centered reconstructions\n",
"gallery_images = [img, img_noisy]\n",
"gallery_titles = [\"Original\", f\"Noisy (sigma={sigma})\"]\n",
"\n",
"for k in plain_inspect_ks:\n",
" gallery_images.append(reconstruct_from_svd_cache(plain_cache, k))\n",
" gallery_titles.append(f\"Plain SVD, k={k}\")\n",
"\n",
"for k in centered_inspect_ks:\n",
" gallery_images.append(reconstruct_from_centered_svd_cache(centered_cache, k))\n",
" gallery_titles.append(f\"Centered SVD, k={k}\")\n",
"\n",
"ncols = 2\n",
"nrows = math.ceil(len(gallery_images) / ncols)\n",
"\n",
"fig, axes = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows))\n",
"axes = np.array(axes).reshape(-1)\n",
"\n",
"for ax, im, title in zip(axes, gallery_images, gallery_titles):\n",
" if im.ndim == 2:\n",
" ax.imshow(im, cmap=\"gray\", vmin=0, vmax=255)\n",
" else:\n",
" ax.imshow(np.clip(im, 0, 255).astype(np.uint8))\n",
" ax.set_title(title)\n",
" ax.axis(\"off\")\n",
"\n",
"for ax in axes[len(gallery_images):]:\n",
" ax.axis(\"off\")\n",
"\n",
"plt.tight_layout()\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"id": "2433f279",
"metadata": {},
"source": [
"\n",
"## Remarks and possible extensions\n",
"\n",
"- Truncated SVD provides the best rank-$k$ approximation in Frobenius norm, but that does **not**\n",
" automatically mean it gives the most visually pleasing denoised image.\n",
"- For real photographs, low-rank methods often smooth away texture and local detail along with the\n",
" noise.\n",
"- The visually best image may lie near the metric optimum rather than exactly at it.\n",
"- One can compare this method with more perceptual denoisers such as wavelet methods, bilateral\n",
" filtering, non-local means, or modern learned denoisers.\n",
"- A useful next step would be to compare how the preferred $k$ changes as the noise level\n",
" $\\sigma$ increases.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}