Singular Value Decomposition (SVD) | The linear-algebra foundation of PCA

Created: 2019-02-19 Last updated: 2020-02-12 Read time: 1 min

Summary

SVD factors a matrix $X$ as $U \Sigma V^\top$, which exposes the structure of the data, such as its principal directions and singular values.
It is at the core of PCA, Latent Semantic Analysis, image compression, and related methods.
Dimensionality is reduced by discarding the smallest singular values.

Formula

For a matrix $X \in \mathbb{R}^{n \times d}$,

$$ X = U \Sigma V^\top, $$

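where, in the thin form and assuming $n \ge d$, $U$ is $n \times d$ with orthonormal columns, $V$ is a $d \times d$ orthogonal matrix, and $\Sigma$ is a $d \times d$ diagonal matrix whose singular values are non-negative and sorted in descending order.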
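The shapes and properties stated above can be checked numerically. The following is a minimal sketch on a small synthetic matrix; the sizes `n = 6`, `d = 4` and the random data are assumptions chosen for illustration, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 4                      # illustrative sizes with n >= d
X = rng.normal(size=(n, d))      # synthetic data, not from the article

# Thin SVD: U is n x d, S holds the d singular values, VT is V^T (d x d)
U, S, VT = np.linalg.svd(X, full_matrices=False)

print(U.shape, S.shape, VT.shape)   # (6, 4) (4,) (4, 4)

# Columns of U and V are orthonormal
assert np.allclose(U.T @ U, np.eye(d))
assert np.allclose(VT @ VT.T, np.eye(d))

# Singular values are non-negative and sorted in descending order
assert np.all(S >= 0) and np.all(np.diff(S) <= 0)

# Multiplying the factors back together recovers X
X_rebuilt = U @ np.diag(S) @ VT
assert np.allclose(X, X_rebuilt)
```

Note that NumPy returns the singular values as a 1-D array rather than as the diagonal matrix $\Sigma$, and returns $V^\top$ rather than $V$.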

Example code

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits

# Each row of X is one 8x8 digit image flattened to 64 pixel values
digits = load_digits()
X = digits.data

# Keep only the first 32 singular values
U, S, VT = np.linalg.svd(X, full_matrices=False)
k = 32
X_approx = (U[:, :k] * S[:k]) @ VT[:k, :]

# Show the first digit before and after the rank-32 approximation
plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(X[0].reshape(8, 8), cmap="gray")
plt.title("Original")
plt.subplot(1, 2, 2)
plt.imshow(X_approx[0].reshape(8, 8), cmap="gray")
plt.title("After keeping 32 singular values")
plt.tight_layout()
plt.show()

Image compression with SVD
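To see how much information a given $k$ discards, one option is to measure the reconstruction error for several values of $k$. The sketch below uses a synthetic 100 x 64 matrix as a stand-in for a data matrix, and the helper names `rank_k_approx` and `relative_error` are hypothetical, introduced only for this illustration. It also checks the known result that the Frobenius-norm error of a rank-$k$ truncation equals the root of the sum of the squared discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))   # synthetic stand-in for a data matrix

U, S, VT = np.linalg.svd(X, full_matrices=False)

def rank_k_approx(U, S, VT, k):
    """Rebuild the matrix from the k largest singular values only."""
    return (U[:, :k] * S[:k]) @ VT[:k, :]

def relative_error(X, X_approx):
    """Frobenius-norm error relative to the Frobenius norm of X."""
    return np.linalg.norm(X - X_approx) / np.linalg.norm(X)

errors = {}
for k in (8, 16, 32, 64):
    errors[k] = relative_error(X, rank_k_approx(U, S, VT, k))
    # The error is determined entirely by the discarded singular values
    predicted = np.sqrt(np.sum(S[k:] ** 2)) / np.sqrt(np.sum(S ** 2))
    assert np.isclose(errors[k], predicted)
    print(f"k={k:2d}  relative error={errors[k]:.4f}")
```

Because the error depends only on the singular values, a suitable $k$ can be chosen by inspecting `S` alone, without rebuilding the matrix for each candidate.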

Tips

Use scipy.sparse.linalg.svds for large or sparse matrices.
Decide whether you need the full SVD or a truncated SVD (for example, TruncatedSVD in scikit-learn).
Connection to PCA: the eigendecomposition of $X^\top X$ is $V \Sigma^2 V^\top$; for column-centered $X$, the columns of $V$ are the principal directions.
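The PCA connection in the last tip can be verified numerically. This is a minimal sketch on synthetic data (the matrix size and column scales are assumptions made so that the eigenvalues are well separated); it compares the SVD of a column-centered matrix with the eigendecomposition of $X^\top X$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data with clearly different spread along each column
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)          # PCA assumes column-centered data

# Route 1: SVD of the centered data
U, S, VT = np.linalg.svd(Xc, full_matrices=False)

# Route 2: eigendecomposition of the symmetric matrix Xc^T Xc
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)
order = np.argsort(eigvals)[::-1]            # eigh returns ascending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Eigenvalues equal the squared singular values
assert np.allclose(eigvals, S ** 2)

# Eigenvectors match the columns of V up to sign
alignment = np.abs(np.sum(eigvecs * VT.T, axis=0))
assert np.allclose(alignment, 1.0)

# Variance explained by each principal component
explained_variance = S ** 2 / (Xc.shape[0] - 1)
print(explained_variance)
```

Working from the SVD of the centered data avoids forming $X^\top X$ explicitly, which squares the condition number of the problem.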

References

Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations. JHU Press.
scikit-learn developers. (2024). TruncatedSVD. https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html