Spectral Image Denoising via Truncated SVDΒΆ

This notebook extracts the image denoising project into a standalone workflow and extends it from grayscale images to actual color images.

The core idea is the same as in the original write-up: if an image matrix has singular value decomposition $$ A = U \Sigma V^T, $$ then the best rank-$k$ approximation to $A$ in Frobenius norm is obtained by truncating the SVD. This is the Eckart–Young–Mirsky theorem.

For a grayscale image, the image is a single matrix. For an RGB image, we treat the image as three matrices, one for each channel, and apply truncated SVD to each channel separately.

OutlineΒΆ

  1. Load an image from disk
  2. Convert it to grayscale or keep it in RGB
  3. Add synthetic Gaussian noise
  4. Compute a truncated SVD reconstruction
  5. Compare the original, noisy, and denoised images
  6. Measure quality using MSE and PSNR

This notebook is written so that you can use your own image files directly.

InΒ [1]:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from pathlib import Path

try:
    from skimage.metrics import structural_similarity as ssim
    HAS_SKIMAGE = True
except ImportError:
    ssim = None
    HAS_SKIMAGE = False

print(f"scikit-image available: {HAS_SKIMAGE}")
scikit-image available: True

A note on color imagesΒΆ

For a grayscale image, SVD applies directly to a single matrix. For a color image $A \in \mathbb{R}^{n \times p \times 3}$, we write $$ A = (A_R, A_G, A_B), $$ where each channel is an $n \times p$ matrix. We then compute a rank-$k$ approximation for each channel: $$ A_R \approx (A_R)_k,\qquad A_G \approx (A_G)_k,\qquad A_B \approx (A_B)_k, $$ and stack them back together.

This is the most direct extension of the grayscale method, and it works well as a first linear-algebraic treatment of color denoising.

Helper functionsΒΆ

We begin with some utilities for:

  • loading images,
  • adding Gaussian noise,
  • reconstructing rank-$k$ approximations,
  • computing image-quality metrics.
InΒ [2]:
def load_image(path, mode="rgb"):
    """
    Load an image from disk.

    Parameters
    ----------
    path : str or Path
        Path to the image file.
    mode : {"rgb", "gray"}
        Whether to load the image as RGB or grayscale.

    Returns
    -------
    np.ndarray
        Float image array scaled to [0, 255].
        Shape is (H, W, 3) for RGB and (H, W) for grayscale.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"Could not find image file: {path}")

    img = Image.open(path)

    if mode.lower() in {"gray", "grayscale", "l"}:
        img = img.convert("L")
    else:
        img = img.convert("RGB")

    return np.asarray(img, dtype=np.float64)


def show_image(img, title=None):
    """Display a grayscale or RGB image."""
    plt.figure(figsize=(6, 6))
    if img.ndim == 2:
        plt.imshow(np.clip(img, 0, 255), cmap="gray", vmin=0, vmax=255)
    else:
        plt.imshow(np.clip(img, 0, 255).astype(np.uint8))
    if title is not None:
        plt.title(title)
    plt.axis("off")
    plt.tight_layout()
    plt.show()


def add_gaussian_noise(img, sigma=25, seed=0):
    """
    Add Gaussian noise to an image.

    Parameters
    ----------
    img : np.ndarray
        Image array in [0, 255].
    sigma : float
        Standard deviation of the noise.
    seed : int
        Random seed for reproducibility.

    Returns
    -------
    np.ndarray
        Noisy image clipped to [0, 255].
    """
    rng = np.random.default_rng(seed)
    noisy = img + rng.normal(loc=0.0, scale=sigma, size=img.shape)
    return np.clip(noisy, 0, 255)


def truncated_svd_matrix(A, k):
    """
    Return the rank-k truncated SVD approximation of a 2D matrix A.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


def truncated_svd_image(img, k):
    """
    Apply truncated SVD to a grayscale or RGB image.

    For RGB images, truncated SVD is applied channel-by-channel.
    """
    if img.ndim == 2:
        recon = truncated_svd_matrix(img, k)
        return np.clip(recon, 0, 255)

    if img.ndim == 3:
        channels = []
        for c in range(img.shape[2]):
            channel_recon = truncated_svd_matrix(img[:, :, c], k)
            channels.append(channel_recon)
        recon = np.stack(channels, axis=2)
        return np.clip(recon, 0, 255)

    raise ValueError("Image must be either 2D (grayscale) or 3D (RGB).")


def mse(A, B):
    """Mean squared error between two images."""
    return np.mean((A.astype(np.float64) - B.astype(np.float64)) ** 2)


def psnr(A, B, max_val=255.0):
    """Peak signal-to-noise ratio in decibels."""
    err = mse(A, B)
    if err == 0:
        return np.inf
    return 10 * np.log10((max_val ** 2) / err)


def image_ssim(A, B, max_val=255.0):
    """
    Structural similarity index.

    For RGB images, compute SSIM channel-by-channel and average.
    Returns None when scikit-image is unavailable.
    """
    if not HAS_SKIMAGE:
        return None

    A = A.astype(np.float64)
    B = B.astype(np.float64)

    if A.ndim == 2:
        return float(ssim(A, B, data_range=max_val))

    if A.ndim == 3:
        vals = [ssim(A[:, :, c], B[:, :, c], data_range=max_val) for c in range(A.shape[2])]
        return float(np.mean(vals))

    raise ValueError("Images must be either 2D (grayscale) or 3D (RGB).")

Choose an imageΒΆ

InΒ [3]:
from pathlib import Path

MODE = "rgb"  # use "gray" for grayscale, "rgb" for color

candidate_paths = [
    Path("../images/bella.jpg"),
    Path("images/bella.jpg"),
    Path("bella.jpg"),
]

IMAGE_PATH = None
for p in candidate_paths:
    if p.exists():
        IMAGE_PATH = p
        break

if IMAGE_PATH is None:
    raise FileNotFoundError(
        "Could not find bella.jpg. Put it in ../images/, images/, or the notebook folder."
    )

img = load_image(IMAGE_PATH, mode=MODE)
print("Using image:", IMAGE_PATH)
print("Image shape:", img.shape)
show_image(img, title=f"Original image ({MODE})")
Using image: ../images/bella.jpg
Image shape: (3456, 5184, 3)
No description has been provided for this image

Add synthetic Gaussian noiseΒΆ

We add noise so that the denoising effect is visible and measurable.

InΒ [4]:
sigma = 40
seed = 0

img_noisy = add_gaussian_noise(img, sigma=sigma, seed=seed)

noisy_output_path = IMAGE_PATH.with_name(f"{IMAGE_PATH.stem}_noisy.png")
Image.fromarray(np.clip(img_noisy, 0, 255).astype(np.uint8)).save(noisy_output_path)
print("Saved noisy image to:", noisy_output_path)

show_image(img_noisy, title=f"Noisy image (sigma={sigma})")
Saved noisy image to: ../images/bella_noisy.png
No description has been provided for this image

Visualizing rank-$k$ reconstructionsΒΆ

For small $k$, the reconstruction captures only coarse structure. As $k$ increases, more detail returns. For denoising, there is often a useful middle ground: enough singular values to preserve structure, but not so many that we reintroduce noise.

InΒ [5]:
import numpy as np
import matplotlib.pyplot as plt
import math

ks = [5, 20, 50, 100]

# Collect all images + titles
images = []
titles = []

# Original
images.append(img)
titles.append("Original")

# Noisy
images.append(img_noisy)
titles.append("Noisy")

# Reconstructions
for k in ks:
    recon = truncated_svd_image(img_noisy, k)
    images.append(recon)
    titles.append(f"k = {k}")

# Grid setup
ncols = 2
nrows = math.ceil(len(images) / ncols)

fig, axes = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows))
axes = axes.flatten()  # easier indexing

# Plot everything
for i, (ax, im, title) in enumerate(zip(axes, images, titles)):
    if im.ndim == 2:
        ax.imshow(im, cmap="gray", vmin=0, vmax=255)
    else:
        ax.imshow(np.clip(im, 0, 255).astype(np.uint8))
    
    ax.set_title(title)
    ax.axis("off")

# Hide any unused axes
for j in range(len(images), len(axes)):
    axes[j].axis("off")

plt.tight_layout()
comparison_output_path = IMAGE_PATH.with_name(f"{IMAGE_PATH.stem}_truncated_svd_multiple_ks.png")
print("Saved comparison figure to:", comparison_output_path)
plt.savefig(comparison_output_path, bbox_inches="tight")
plt.show()
Saved comparison figure to: ../images/bella_truncated_svd_multiple_ks.png
No description has been provided for this image

Quantitative evaluationΒΆ

We compare each reconstruction against the clean original image, not against the noisy one. A good denoising rank should typically:

  • reduce MSE relative to the noisy image,
  • increase PSNR relative to the noisy image.
InΒ [6]:
baseline_mse = mse(img, img_noisy)
baseline_psnr = psnr(img, img_noisy)

print(f"Noisy image baseline -> MSE: {baseline_mse:.2f}, PSNR: {baseline_psnr:.2f} dB")

results = []
for k in ks:
    recon = truncated_svd_image(img_noisy, k)
    results.append((k, mse(img, recon), psnr(img, recon)))

print("\nRank-k reconstructions:")
for k, m, p in results:
    print(f"k = {k:3d} | MSE = {m:10.2f} | PSNR = {p:6.2f} dB")
Noisy image baseline -> MSE: 1426.31, PSNR: 16.59 dB

Rank-k reconstructions:
k =   5 | MSE =     314.20 | PSNR =  23.16 dB
k =  20 | MSE =     120.90 | PSNR =  27.31 dB
k =  50 | MSE =     104.98 | PSNR =  27.92 dB
k = 100 | MSE =     155.79 | PSNR =  26.21 dB

Efficient search over many values of $k$ΒΆ

A naive implementation would recompute the SVD from scratch for every candidate value of $k$. That is extremely expensive: every reconstruction would require a fresh factorization of each channel of the noisy image.

A much better approach is:

  1. compute the SVD once for each channel;
  2. reuse those factors for every candidate $k$;
  3. compare reconstructions using MSE, PSNR, and optionally SSIM.

This is also a nice numerical linear algebra point: all rank-$k$ truncated reconstructions come from the same singular value decomposition.

We compare two variants:

  • plain truncated SVD, applied directly to each channel;
  • centered truncated SVD, where we subtract each channel's column mean before factorizing and add it back after reconstruction.

The centered version sometimes improves reconstruction slightly because the low-rank approximation spends less effort representing the mean structure.

InΒ [7]:
def precompute_svd_image(img):
    """Precompute plain SVD factors for each channel."""
    if img.ndim == 2:
        A = img.astype(np.float64)
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        return [(U, s, Vt)]

    cache = []
    for c in range(img.shape[2]):
        A = img[:, :, c].astype(np.float64)
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        cache.append((U, s, Vt))
    return cache


def precompute_centered_svd_image(img):
    """Precompute centered SVD factors for each channel."""
    if img.ndim == 2:
        A = img.astype(np.float64)
        col_mean = A.mean(axis=0, keepdims=True)
        A_centered = A - col_mean
        U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
        return [(U, s, Vt, col_mean)]

    cache = []
    for c in range(img.shape[2]):
        A = img[:, :, c].astype(np.float64)
        col_mean = A.mean(axis=0, keepdims=True)
        A_centered = A - col_mean
        U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
        cache.append((U, s, Vt, col_mean))
    return cache


def reconstruct_from_svd_cache(cache, k):
    """Reconstruct from precomputed plain SVD factors."""
    channels = []
    for U, s, Vt in cache:
        kk = min(k, len(s))
        recon = (U[:, :kk] * s[:kk]) @ Vt[:kk, :]
        channels.append(np.clip(recon, 0, 255))

    if len(channels) == 1:
        return channels[0]
    return np.stack(channels, axis=2)


def reconstruct_from_centered_svd_cache(cache, k):
    """Reconstruct from precomputed centered SVD factors."""
    channels = []
    for U, s, Vt, col_mean in cache:
        kk = min(k, len(s))
        recon = (U[:, :kk] * s[:kk]) @ Vt[:kk, :] + col_mean
        channels.append(np.clip(recon, 0, 255))

    if len(channels) == 1:
        return channels[0]
    return np.stack(channels, axis=2)

Scoring reconstructionsΒΆ

We first compute a baseline by comparing the noisy image to the clean one. Then we score rank-$k$ reconstructions. A smaller MSE and a larger PSNR indicate better fidelity to the clean image. If scikit-image is available, we also compute SSIM.

A useful conceptual warning is important here:

The best low-rank approximation in a matrix norm does not necessarily produce the image that looks best to a human observer.

Why? Because human perception cares about things like edges, texture, and local contrast, while MSE and PSNR are purely pixelwise. A reconstruction can score well numerically and still look too smooth, too blurry, or otherwise unnatural.

InΒ [8]:
baseline_mse = mse(img, img_noisy)
baseline_psnr = psnr(img, img_noisy)

print(f"Baseline noisy vs clean:")
print(f"  MSE : {baseline_mse:.2f}")
print(f"  PSNR: {baseline_psnr:.2f}")

if HAS_SKIMAGE:
    baseline_ssim = image_ssim(img, img_noisy)
    print(f"  SSIM: {baseline_ssim:.4f}")
Baseline noisy vs clean:
  MSE : 1426.31
  PSNR: 16.59
  SSIM: 0.0674

Automatic search over many values of $k$ΒΆ

Because all rank-$k$ reconstructions come from the same SVD, we precompute the factorizations once and then search efficiently over candidate values of $k$.

For very large images this can still be somewhat expensive, so for exploratory work a coarser grid such as range(5, 151, 5) is often sufficient. Once a promising region is found, one can refine the search around that region.

InΒ [10]:
candidate_ks = list(range(1, 151, 5))

plain_cache = precompute_svd_image(img_noisy)
centered_cache = precompute_centered_svd_image(img_noisy)

plain_scores = []
centered_scores = []

for k in candidate_ks:
    plain = reconstruct_from_svd_cache(plain_cache, k)
    centered = reconstruct_from_centered_svd_cache(centered_cache, k)

    plain_row = (k, mse(img, plain), psnr(img, plain))
    centered_row = (k, mse(img, centered), psnr(img, centered))

    if HAS_SKIMAGE:
        plain_row = plain_row + (image_ssim(img, plain),)
        centered_row = centered_row + (image_ssim(img, centered),)

    plain_scores.append(plain_row)
    centered_scores.append(centered_row)

best_plain_by_mse = min(plain_scores, key=lambda x: x[1])
best_plain_by_psnr = max(plain_scores, key=lambda x: x[2])
best_centered_by_mse = min(centered_scores, key=lambda x: x[1])
best_centered_by_psnr = max(centered_scores, key=lambda x: x[2])

print("Plain SVD:")
print("  Best by MSE :", best_plain_by_mse)
print("  Best by PSNR:", best_plain_by_psnr)

print("Centered SVD:")
print("  Best by MSE :", best_centered_by_mse)
print("  Best by PSNR:", best_centered_by_psnr)

if HAS_SKIMAGE:
    best_plain_by_ssim = max(plain_scores, key=lambda x: x[3])
    best_centered_by_ssim = max(centered_scores, key=lambda x: x[3])

    print("Plain SVD:")
    print("  Best by SSIM:", best_plain_by_ssim)
    print("Centered SVD:")
    print("  Best by SSIM:", best_centered_by_ssim)
Plain SVD:
  Best by MSE : (41, np.float64(100.86643831213188), np.float64(28.093336751115086), 0.6093116647651357)
  Best by PSNR: (41, np.float64(100.86643831213188), np.float64(28.093336751115086), 0.6093116647651357)
Centered SVD:
  Best by MSE : (36, np.float64(100.77445519633338), np.float64(28.097299019055427), 0.6300189753895039)
  Best by PSNR: (36, np.float64(100.77445519633338), np.float64(28.097299019055427), 0.6300189753895039)
Plain SVD:
  Best by SSIM: (1, np.float64(1048.5582022585174), np.float64(17.924878190392008), 0.7729763808220854)
Centered SVD:
  Best by SSIM: (1, np.float64(891.0525380576963), np.float64(18.631770492935846), 0.7731825777973053)

Metric curves versus $k$ΒΆ

Plotting the metrics as functions of $k$ is often more informative than looking only at the single best value. Frequently the metric is nearly flat across a whole range of ranks, in which case several nearby values of $k$ have very similar numerical performance.

That is exactly the situation where visual inspection matters most: among a cluster of nearly tied candidates, the one that looks nicest to the eye may not be the exact numerical winner.

InΒ [11]:
plain_ks = [row[0] for row in plain_scores]
plain_mses = [row[1] for row in plain_scores]
plain_psnrs = [row[2] for row in plain_scores]

centered_ks = [row[0] for row in centered_scores]
centered_mses = [row[1] for row in centered_scores]
centered_psnrs = [row[2] for row in centered_scores]

plt.figure(figsize=(8, 4))
plt.plot(plain_ks, plain_mses, label="Plain SVD")
plt.plot(centered_ks, centered_mses, label="Centered SVD")
plt.xlabel("k")
plt.ylabel("MSE")
plt.title("MSE versus rank k")
plt.legend()
plt.tight_layout()
plt.show()

plt.figure(figsize=(8, 4))
plt.plot(plain_ks, plain_psnrs, label="Plain SVD")
plt.plot(centered_ks, centered_psnrs, label="Centered SVD")
plt.xlabel("k")
plt.ylabel("PSNR")
plt.title("PSNR versus rank k")
plt.legend()
plt.tight_layout()
plt.show()

if HAS_SKIMAGE:
    plain_ssims = [row[3] for row in plain_scores]
    centered_ssims = [row[3] for row in centered_scores]

    plt.figure(figsize=(8, 4))
    plt.plot(plain_ks, plain_ssims, label="Plain SVD")
    plt.plot(centered_ks, centered_ssims, label="Centered SVD")
    plt.xlabel("k")
    plt.ylabel("SSIM")
    plt.title("SSIM versus rank k")
    plt.legend()
    plt.tight_layout()
    plt.show()
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

Visual comparison near the best ranksΒΆ

Finally, we inspect a few reconstructions around the automatically selected ranks. This is important because the reconstruction that is optimal in Frobenius norm, MSE, or PSNR is not guaranteed to be the reconstruction a human would actually prefer.

Low-rank approximation is mathematically optimal for a precise matrix objective, but photographic quality is influenced by far more than that. Fine textures, fur, sharp edges, and local contrast can all matter a great deal perceptually, and some of those are exactly the kinds of features that get smoothed away by aggressive truncation.

InΒ [12]:
# Pick a few candidate ranks around the PSNR-optimal values
plain_best_k = best_plain_by_psnr[0]
centered_best_k = best_centered_by_psnr[0]

plain_inspect_ks = sorted(set(k for k in [plain_best_k - 10, plain_best_k - 5, plain_best_k, plain_best_k + 5, plain_best_k + 10] if k >= 1))
centered_inspect_ks = sorted(set(k for k in [centered_best_k - 10, centered_best_k - 5, centered_best_k, centered_best_k + 5, centered_best_k + 10] if k >= 1))

print("Plain SVD ranks to inspect   :", plain_inspect_ks)
print("Centered SVD ranks to inspect:", centered_inspect_ks)
Plain SVD ranks to inspect   : [31, 36, 41, 46, 51]
Centered SVD ranks to inspect: [26, 31, 36, 41, 46]
InΒ [13]:
import math

# Build a gallery: original, noisy, then several plain and centered reconstructions
gallery_images = [img, img_noisy]
gallery_titles = ["Original", f"Noisy (sigma={sigma})"]

for k in plain_inspect_ks:
    gallery_images.append(reconstruct_from_svd_cache(plain_cache, k))
    gallery_titles.append(f"Plain SVD, k={k}")

for k in centered_inspect_ks:
    gallery_images.append(reconstruct_from_centered_svd_cache(centered_cache, k))
    gallery_titles.append(f"Centered SVD, k={k}")

ncols = 2
nrows = math.ceil(len(gallery_images) / ncols)

fig, axes = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows))
axes = np.array(axes).reshape(-1)

for ax, im, title in zip(axes, gallery_images, gallery_titles):
    if im.ndim == 2:
        ax.imshow(im, cmap="gray", vmin=0, vmax=255)
    else:
        ax.imshow(np.clip(im, 0, 255).astype(np.uint8))
    ax.set_title(title)
    ax.axis("off")

for ax in axes[len(gallery_images):]:
    ax.axis("off")

plt.tight_layout()
plt.show()
No description has been provided for this image

Remarks and possible extensionsΒΆ

  • Truncated SVD provides the best rank-$k$ approximation in Frobenius norm, but that does not automatically mean it gives the most visually pleasing denoised image.
  • For real photographs, low-rank methods often smooth away texture and local detail along with the noise.
  • The visually best image may lie near the metric optimum rather than exactly at it.
  • One can compare this method with more perceptual denoisers such as wavelet methods, bilateral filtering, non-local means, or modern learned denoisers.
  • A useful next step would be to compare how the preferred $k$ changes as the noise level $\sigma$ increases.