Kernel PCA

Last updated 2026-02-16 · Read time 3 min
Summary
  • Kernel PCA performs PCA in an implicit nonlinear feature space using kernels.
  • Kernel choice and parameters (for example RBF gamma) determine which nonlinear structure becomes separable.
  • It is useful when linear PCA misses curved or interaction-heavy patterns.

Intuition #

Kernel PCA changes the similarity geometry before extracting principal components. Patterns that are nonlinear in input space can become linearly separable in kernel space.

Detailed Explanation #

1. Why go beyond plain PCA? #

  • Standard PCA captures only linear directions of variance, so curved structure gets flattened away or lost.
  • Many real datasets lie on curved surfaces; we would like to capture that curvature while still enjoying PCA’s interpretability.
  • Kernel PCA uses a nonlinear feature map (\phi(x)) and computes inner products via kernels, enabling PCA-style analysis in that richer space.
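The "inner products via kernels" point can be checked numerically. As a small sketch (the feature map below is a standard explicit expansion for the degree-2 polynomial kernel with offset 1, chosen here for illustration), the kernel evaluated in input space equals the inner product in the mapped space:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 polynomial feature map for 2-D input, offset c = 1
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1**2, x2**2,
                     np.sqrt(2) * x1 * x2])

def poly_kernel(x, xp):
    # Same similarity computed directly in input space
    return (x @ xp + 1.0) ** 2

rng = np.random.default_rng(0)
x, xp = rng.normal(size=2), rng.normal(size=2)

print(np.isclose(phi(x) @ phi(xp), poly_kernel(x, xp)))  # True
```

The kernel trick is exactly this identity: we never need to build `phi(x)` explicitly, which matters when the feature space is high- or infinite-dimensional (as with the RBF kernel).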

2. Formulation #

Given samples (x_i \in \mathbb{R}^d) and a feature map (\phi(x)):

  1. Build the kernel (Gram) matrix

    $$ K_{ij} = \langle \phi(x_i), \phi(x_j) \rangle = k(x_i, x_j) $$

    No explicit (\phi) is needed as long as we can evaluate the kernel function.

  2. Centre the kernel matrix in feature space (so the mapped data have zero mean), then solve the eigenproblem

    $$ \tilde{K} v = \lambda v, \qquad \tilde{K} = K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n $$

    where (\mathbf{1}_n) is the (n \times n) matrix with every entry (1/n). The eigenvectors give the coefficients of the principal components in feature space; after normalising them by (\sqrt{\lambda}) we can project samples the same way as in ordinary PCA.

Typical kernels:

  • RBF kernel $$k(x, x') = \exp(-\gamma \lVert x - x' \rVert^2)$$
  • Polynomial kernel $$k(x, x') = (\langle x, x' \rangle + c)^{p}$$ with degree (p) and offset (c)
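The two steps above can be sketched in a few lines of NumPy. This is a minimal implementation assuming an RBF kernel (using `rbf_kernel` from scikit-learn for the Gram matrix); it is a teaching sketch, not a replacement for `sklearn.decomposition.KernelPCA`:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.metrics.pairwise import rbf_kernel

def kernel_pca(X, gamma=3.0, n_components=2):
    """Minimal RBF kernel PCA: build K, centre it, eigendecompose, project."""
    K = rbf_kernel(X, X, gamma=gamma)
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    # Centre the Gram matrix in feature space
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(K_centered)       # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # largest first
    # Projection of the training samples onto the top components
    return eigvecs[:, :n_components] * np.sqrt(np.maximum(eigvals[:n_components], 0))

X, y = make_circles(n_samples=400, factor=0.3, noise=0.15, random_state=0)
Z = kernel_pca(X, gamma=3.0, n_components=2)
print(Z.shape)  # (400, 2)
```

Note the centring step: eigendecomposing the raw `K` without it silently mixes the feature-space mean into the first component.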

3. Create a toy dataset #

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_circles

# Two noisy concentric circles; random_state fixes the draw for reproducibility
X, y = make_circles(n_samples=400, factor=0.3, noise=0.15, random_state=0)

plt.figure(figsize=(8, 8))
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.title("Concentric circles")
plt.show()

Two-circle dataset


4. Run Kernel PCA #

from sklearn.decomposition import KernelPCA

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=3)
X_kpca = kpca.fit_transform(X)

plt.figure(figsize=(8, 8))
plt.scatter(X_kpca[:, 0], X_kpca[:, 1], c=y)
plt.title("2-D embedding via Kernel PCA")
plt.show()

Kernel PCA result

Compared with linear PCA, the concentric circles are now clearly separable.
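The visual difference can be quantified. One rough proxy (my own check, not from the original text) is the training accuracy of a linear classifier on each 2-D embedding, using the same dataset and `gamma=3` as above with a fixed `random_state` assumed for reproducibility:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression

X, y = make_circles(n_samples=400, factor=0.3, noise=0.15, random_state=0)

def linear_separability(Z, y):
    # Training accuracy of a linear classifier on the embedding:
    # a rough proxy for how linearly separable the classes are.
    clf = LogisticRegression().fit(Z, y)
    return clf.score(Z, y)

Z_lin = PCA(n_components=2).fit_transform(X)
Z_rbf = KernelPCA(n_components=2, kernel="rbf", gamma=3).fit_transform(X)

print(f"linear PCA : {linear_separability(Z_lin, y):.2f}")  # near chance
print(f"kernel PCA : {linear_separability(Z_rbf, y):.2f}")  # much higher
```

Linear PCA merely rotates the concentric circles, so a linear boundary stays near chance level; the RBF embedding pulls the two radii apart.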


5. Tips #

  • Nonlinearity: choose kernels (RBF, polynomial) that reflect the type of curvature in your data.
  • Scaling: standardise features before building the kernel matrix to keep (\gamma) meaningful.
  • Computation: the Gram matrix takes (O(n^2)) memory and its eigendecomposition (O(n^3)) time, so both become bottlenecks for large (n).
  • Downstream models: the same kernel ideas appear in SVMs; Kernel PCA can be a useful exploratory step.

Notes #

  • Kernel PCA = kernel trick + PCA; it preserves nonlinear manifolds that ordinary PCA cannot.
  • Inspect how many components you actually need—retaining too many can overfit noise.
  • If scalability is a concern, look into approximate kernels (Nyström, random Fourier features).
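As a sketch of that last point, scikit-learn's `Nystroem` transformer approximates the kernel feature map with a subset of landmark points, after which ordinary linear PCA suffices; the landmark count of 100 below is an arbitrary illustrative choice:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA
from sklearn.kernel_approximation import Nystroem

X, y = make_circles(n_samples=400, factor=0.3, noise=0.15, random_state=0)

# Approximate the RBF feature map with 100 landmarks, then run plain PCA
# in that space -- this avoids forming the full n x n Gram matrix.
feature_map = Nystroem(kernel="rbf", gamma=3, n_components=100, random_state=0)
X_features = feature_map.fit_transform(X)          # shape (400, 100)
X_embedded = PCA(n_components=2).fit_transform(X_features)
print(X_embedded.shape)  # (400, 2)
```

For modest (n) the exact Gram matrix is fine; the approximation pays off once (n^2) entries no longer fit comfortably in memory.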