SVD
Summary
- SVD factorizes a matrix into orthogonal bases and singular values, enabling low-rank approximation.
- Truncating small singular values removes noise while preserving the main signal components.
- SVD underlies many practical techniques, including PCA and latent-factor models.
Intuition #
SVD decomposes data into independent modes sorted by strength. Retaining only strong modes yields compact representations with controlled information loss.
Detailed Explanation #
1. Why SVD? #
- Any rectangular matrix (A) can be decomposed into interpretable pieces: rotations plus scaling.
- By truncating the singular values we get the best low-rank approximation in the Frobenius norm sense.
- SVD is numerically stable and forms the backbone of many algorithms (LSA, PCA, collaborative filtering).
2. Definition #
For (A \in \mathbb{R}^{m \times n}):
$$A = U \Sigma V^\top$$
- (U): left singular vectors (orthonormal, size (m \times m))
- (\Sigma): diagonal matrix with singular values (\sigma_1 \ge \sigma_2 \ge \dots)
- (V): right singular vectors (orthonormal, size (n \times n))
The top singular values capture most of the energy of the matrix.
3. Prepare an image #
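The original code for this step is missing. As a stand-in, the sketch below builds a synthetic grayscale "image" as a 2-D NumPy array with obvious low-rank structure (a smooth gradient plus a rectangle) and a little noise; the array name `img` and its dimensions are assumptions, not from the source.

```python
import numpy as np

# Synthetic stand-in for a loaded grayscale image: a smooth gradient
# (essentially rank 2) plus a bright rectangle and light noise.
rng = np.random.default_rng(0)
h, w = 128, 96
y = np.linspace(0, 1, h)[:, None]
x = np.linspace(0, 1, w)[None, :]
img = 0.6 * y + 0.4 * x                     # smooth gradient
img[32:64, 24:48] += 0.5                    # block pattern
img += 0.05 * rng.standard_normal((h, w))   # light noise
img = np.clip(img, 0.0, 1.0)                # keep values in [0, 1]
print(img.shape)
```

In practice you would load a real image (e.g. with `imageio` or `PIL`) and convert it to a float grayscale matrix; any (m \times n) array works for the steps that follow.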
4. Compute the SVD #
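The code cell here was lost; a minimal sketch using `np.linalg.svd`, with a random matrix standing in for the image from the previous step:

```python
import numpy as np

# Stand-in for the image matrix from the previous step.
rng = np.random.default_rng(0)
A = rng.standard_normal((128, 96))

# full_matrices=False gives the economy SVD:
# U is (m, r), s holds the r singular values, Vt is (r, n), r = min(m, n).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)

# Singular values come back sorted in descending order,
# and U * diag(s) * Vt reconstructs A.
assert np.all(s[:-1] >= s[1:])
assert np.allclose(U * s @ Vt, A)
```

Note that NumPy returns (V^\top) directly (`Vt`), not (V), and `s` as a 1-D vector rather than the full diagonal matrix (\Sigma).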
5. Low-rank approximation / compression #
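The original compression code is missing; this sketch keeps the top (k) singular triplets and measures the relative Frobenius error, assuming a low-rank-plus-noise matrix as the "image":

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in image: genuinely low-rank structure plus small noise.
A = np.outer(np.linspace(0, 1, 128), np.linspace(0, 1, 96)) \
    + 0.01 * rng.standard_normal((128, 96))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

def low_rank(U, s, Vt, k):
    """Best rank-k approximation in the Frobenius norm (Eckart-Young)."""
    return U[:, :k] * s[:k] @ Vt[:k]

for k in (1, 5, 20):
    err = np.linalg.norm(A - low_rank(U, s, Vt, k)) / np.linalg.norm(A)
    print(f"k={k:2d}  relative Frobenius error {err:.4f}")
```

Storing the truncated factors costs (k(m + n + 1)) numbers instead of (mn), which is the source of the compression.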
A handful of singular values already reconstructs the image reasonably well.
6. Inspect singular vectors #
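The code here is also missing. One way to inspect the modes, sketched below, is to form each rank-1 piece (\sigma_i u_i v_i^\top) explicitly and check that summing the first (k) of them equals the rank-(k) approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 48))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# The i-th mode is the rank-1 matrix sigma_i * u_i v_i^T.
# In an image, each mode is itself an (m x n) "pattern image"
# you could visualize with matplotlib's imshow.
modes = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(3)]

# Summing the first k modes reproduces the rank-k approximation.
partial = sum(modes)
assert np.allclose(partial, U[:, :3] * s[:3] @ Vt[:3])
```

Plotting the individual `modes` (e.g. `plt.imshow(modes[i])`) is assumed here; typically the first mode captures coarse brightness structure and later modes add progressively finer detail.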
Each singular vector pair encodes a particular pattern in the image; adding more pairs refines the detail.
7. Practical tips #
- Compression: choose (k) so that (\sum_{i=1}^k \sigma_i / \sum_j \sigma_j) hits the desired energy.
- Noise reduction: dropping small singular values often acts as a denoiser.
- PCA: simply apply SVD to the centred data matrix; the right singular vectors correspond to principal axes.
- Cost: full SVD is (O(mn \min(m,n))); use truncated or randomized SVD for large matrices.
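The energy criterion from the first tip can be sketched in a few lines; the 90% threshold and the matrix below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 100))

# compute_uv=False returns only the singular values, which is cheaper
# when all you need is the energy profile.
s = np.linalg.svd(A, compute_uv=False)

# Smallest k whose cumulative singular-value energy reaches 90%,
# matching the criterion sum_{i<=k} sigma_i / sum_j sigma_j.
energy = np.cumsum(s) / s.sum()
k = int(np.searchsorted(energy, 0.90)) + 1
print(f"keep k={k} of {s.size} singular values "
      f"({energy[k - 1]:.3f} energy)")
```

For matrices too large for a full SVD, `scipy.sparse.linalg.svds` or scikit-learn's `TruncatedSVD` (randomized solver) compute only the top (k) triplets.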